
group commit

http://kristiannielsen.livejournal.com/12254.html

http://blog.itpub.net/28218939/viewspace-1975809/

http://mysql.taobao.org/index.php?diff=prev&oldid=1070&title=内核月报2015-01-draft

Reference: http://mysqlmusings.blogspot.kr/2012/06/binary-log-group-commit-in-mysql-56.html

 

Group commit: reduces the latency of writing data.

Group commit is an important optimisation for databases that helps mitigate the latency of physically writing data to permanent storage.

XA/2-phase commit:

  1. First, a prepare step, in which the transaction is made durable in the engine(s). After this step, the transaction can still be rolled back; also, in case of a crash after the prepare phase, the transaction can be recovered.
  2. If the prepare step succeeds, the transaction is made durable in the binary log.
  3. Finally, the commit step is run in the engine(s) to make the transaction actually committed. After this step the transaction can no longer be rolled back.

        The idea is that when the system comes back up after a crash, crash recovery will go through the binary log. Any prepared (but not committed) transactions that are found in the binary log will be committed in the storage engine(s). Other prepared transactions will be rolled back. The result is guaranteed consistency between the engines and the binary log. In short, for crash recovery: if steps 1 and 2 both completed, the transaction is committed; if step 1 completed but step 2 did not, it is rolled back.
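The recovery rule above can be sketched as a small function. This is only an illustration under simplifying assumptions, not MySQL's actual recovery code: the name `recover` and its arguments are hypothetical, and transactions are represented by plain integer XIDs.

```python
def recover(prepared_in_engine, xids_in_binlog):
    """Decide the fate of each prepared (but not committed) transaction.

    prepared_in_engine: XIDs the engine reports as prepared after the crash.
    xids_in_binlog:     XIDs found while scanning the binary log.
    Returns (to_commit, to_rollback), preserving the engine's order.
    """
    binlog = set(xids_in_binlog)
    # Steps 1 and 2 both completed: the transaction is in the binlog, so commit it.
    to_commit = [xid for xid in prepared_in_engine if xid in binlog]
    # Step 1 completed but step 2 did not: not in the binlog, so roll it back.
    to_rollback = [xid for xid in prepared_in_engine if xid not in binlog]
    return to_commit, to_rollback
```

For example, with prepared transactions 1, 2 and 3 in the engine and only 1 and 3 present in the binary log, transactions 1 and 3 are committed and 2 is rolled back.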

When the binary log is enabled, MySQL uses XA/2-phase commit to ensure consistency between the binary log and the storage engine. This means that a commit now consists of three steps:

    innobase_xa_prepare()
    write() and fsync() binary log
    innobase_commit()


There is an extra detail to the prepare and commit code in InnoDB. InnoDB locks the prepare_commit_mutex in innobase_xa_prepare(), and does not release it until after the "fast" part of innobase_commit() has completed. This means that while one transaction is executing innobase_commit(), all subsequent transactions will be blocked inside innobase_xa_prepare() waiting for the mutex. As a result, no transactions can queue up to share an fsync(), and group commit is broken with the binary log enabled.

    binlog_prepare (does nothing)
    innobase_xa_prepare (takes the lock, flushes the redo log)
    write() and fsync() binary log
    binlog_commit
    innobase_commit

prepare_commit_mutex ensures that the redo log and the binlog are written in the same order.
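To make the cost of this serialization concrete, here is a toy model that counts binary log fsync() calls. It is a sketch under stated assumptions, not a measurement: the function `count_fsyncs`, its parameters, and the arrival times are all hypothetical. With the mutex held from prepare to commit, each transaction gets its own fsync(); without it, every transaction that has arrived by the time the next fsync() starts can share it.

```python
def count_fsyncs(arrivals, fsync_cost, serialized):
    """Return how many binlog fsync() calls are needed to commit everything.

    arrivals:   sorted arrival times of the transactions (arbitrary units).
    fsync_cost: duration of one fsync() in the same units.
    serialized: True models prepare_commit_mutex (one transaction per fsync);
                False models group commit (waiting transactions share one).
    """
    if serialized:
        # No transaction can queue up behind another, so nothing is shared.
        return len(arrivals)
    fsyncs = 0
    i = 0
    now = 0.0
    while i < len(arrivals):
        # The next fsync starts once the disk is free and a transaction waits.
        now = max(now, arrivals[i])
        # Every transaction that has arrived by then joins this group.
        while i < len(arrivals) and arrivals[i] <= now:
            i += 1
        now += fsync_cost
        fsyncs += 1
    return fsyncs
```

In this model, ten transactions arriving one time unit apart with an fsync() cost of five units need ten fsync() calls when serialized, but only three when they are allowed to group.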

Three-stage commit of the binary log:

Storage engine (InnoDB) Prepare    ---->    server layer (Binary Log)   Flush Stage    ---->    Sync Stage    ---->    storage engine (InnoDB) Commit Stage.
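The staged pipeline can be sketched as a queue that one caller drains on behalf of everyone waiting in it. This is a single-threaded simplification, assuming the first transaction to enter an empty queue acts as the leader for the group; the class `BinlogGroupCommit` and all of its members are hypothetical names, not MySQL identifiers.

```python
class BinlogGroupCommit:
    """Toy model of the Flush -> Sync -> Commit stages for prepared transactions."""

    def __init__(self):
        self.queue = []        # prepared transactions waiting to be flushed
        self.binlog = []       # XIDs written to the (simulated) binary log
        self.fsync_calls = 0   # number of simulated fsync() calls
        self.committed = []    # XIDs committed in the (simulated) engine

    def enqueue(self, xid):
        """Queue a prepared transaction; return True if it became the leader."""
        self.queue.append(xid)
        return len(self.queue) == 1

    def run_leader(self):
        """Process the whole queued group and return it."""
        group, self.queue = self.queue, []
        # Flush stage: write every transaction in the group to the binary log.
        for xid in group:
            self.binlog.append(xid)
        # Sync stage: a single fsync() covers the whole group.
        self.fsync_calls += 1
        # Commit stage: commit in the engine in the same order as the binlog.
        for xid in group:
            self.committed.append(xid)
        return group
```

Because the commit stage walks the group in the order it was written to the binary log, the ordering that prepare_commit_mutex used to enforce is kept in this model while the fsync() is shared.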
