pt-table-checksum工具使用报错一例
图片来自Percona官网
今天同事在用 percona toolkit 工具中的 pt-table-checksum 对主从数据库进行校验,提交命令后,一直提示下面的信息:
Pausing because Threads_running=0
看字面意思是在提示当前活跃线程数为0,但为什么不继续执行呢。这个提示信息有点含糊其辞,该工具是用Perl写的,因此直接打开看脚本跟踪一下,大概就明白怎么回事了,原来是这个工具有负载保护机制,避免运行时对线上数据库产生影响。
和这个机制相关的参数名是: –max-load,其类型是:Array,用法是一个或多个 variables = value 组成的判断条件,然后根据这个规则判断某些条件是否超标。例如,设定 –max-load=”Threads_running=25″,意思是当前活跃线程数如果超过25,就暂停 checksum 工作,直到活跃线程数低于 25。
因此,在我们这个案例中,想要强制让 table-checksum 继续工作的话,可以设定 –max-load 的值,例如:
pt-table-checksum --max-load="Threads_running=25" ...其他选项...
或者
pt-table-checksum --max-load="Threads_connected=25" ...其他选项...
前面的选项意思是判断活跃线程数不要超过25个,后面的选项意思是当前打开的线程数不要超过25个。
下面是 pt-table-checksum 帮助手册里的一段话:
–max-load
type: Array; default: Threads_running=25; group: ThrottleExamine SHOW GLOBAL STATUS after every chunk, and pause if any status variables are higher than the threshold. The option accepts a comma-sep-
arated list of MySQL status variables to check for a threshold. An optional “=MAX_VALUE” (or “:MAX_VALUE”) can follow each variable. If not
given, the tool determines a threshold by examining the current value and increasing it by 20%.For example, if you want the tool to pause when Threads_connected gets too high, you can specify “Threads_connected”, and the tool will check
the current value when it starts working and add 20% to that value. If the current value is 100, then the tool will pause when Threads_con-
nected exceeds 120, and resume working when it is below 120 again. If you want to specify an explicit threshold, such as 110, you can use
either “Threads_connected:110″ or “Threads_connected=110″.The purpose of this option is to prevent the tool from adding too much load to the server. If the checksum queries are intrusive, or if they
cause lock waits, then other queries on the server will tend to block and queue. This will typically cause Threads_running to increase, and the
tool can detect that by running SHOW GLOBAL STATUS immediately after each checksum query finishes. If you specify a threshold for this vari-
able, then you can instruct the tool to wait until queries are running normally again. This will not prevent queueing, however; it will only
give the server a chance to recover from the queueing. If you notice queueing, it is best to decrease the chunk time.
更多建议: