Notes on Facebook's OnlineSchemaChange.php

Standard
...
define('OSC_FLAGS_ACCEPT_VERSION',  0x0400); // bypasses version check
...
public function __construct($input_sock = '', $input_user, $input_password,
  $input_dbname, $input_tablename,
  $input_altercmd, $input_outfile_folder = null,
  $input_flags = 0,
  $input_batchsize_load = 500000,
  $input_batchsize_replay = 500,
  $input_long_xact_time = 30,
  $input_logfile_folder = "/var/tmp/") {
  ...
}
 
...
 
  private function validateVersion() {
    ...
    if ((!($this->flags & OSC_FLAGS_ACCEPT_VERSION)) &&
        ($this->version !== "5.0.84") &&
        ($this->version !== "5.1.47")) {
      $error = "OSC has only been tested on versions 5.0.84 and 5.1.47. ".
               "Running on ".$this->version." is not allowed unless ".
               "OSC_FLAGS_ACCEPT_VERSION flag is set.";
      $this->raiseException($error, false);
    }
    return $this->version;
  }
 
...
 
$user='root';
$pwd='';
$db = 'test';
$table = 't1';
$sock = '/var/lib/mysql/mysql.sock';
 
$alter = "ALTER TABLE `t1` ADD INDEX ( `intcol1` )";
 
$osc = new OnlineSchemaChange($sock, $user, $pwd, $db, $table, $alter, NULL, 0x0400);
$osc->execute();

openarkkit: small MySQL utilities


Download: http://code.google.com/p/openarkkit/

oak-apply-ri: apply referential integrity on two columns with parent-child relationship.
oak-block-account: block or release MySQL users accounts, disabling them or enabling them to login.
oak-chunk-update: perform long, non-blocking UPDATE/DELETE operation in auto managed small chunks.
oak-get-slave-lag: print slave replication lag and terminate with respective exit code.
oak-hook-general-log: hook up and filter general log entries based on entry type or execution plan criteria.
oak-kill-slow-queries: terminate long running queries.
oak-modify-charset: change the character set (and collation) of a textual column.
oak-online-alter-table: perform a non-blocking ALTER TABLE operation.
oak-prepare-shutdown: make for a fast and safe MySQL shutdown.
oak-purge-master-logs: purge master logs, depending on the state of replicating slaves.
oak-repeat-query: repeat query execution until some condition holds.
oak-security-audit: audit accounts, passwords, privileges and other security settings.
oak-show-limits: show AUTO_INCREMENT “free space”.
oak-show-replication-status: show how far behind are replicating slaves on a given master.

Noted here for future reference.

Efficient MySQL Pagination


Reposted from the blog at 超群.com

At the Percona Performance Conference 2009, several engineers from Yahoo presented a talk called "Efficient Pagination Using MySQL" with a lot of good ideas. This post builds on the original.

First, the basics of how pagination works:

mysql> EXPLAIN SELECT * FROM message ORDER BY id DESC LIMIT 10000, 20\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: message
         type: index
possible_keys: NULL
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 10020
        Extra:
1 row in set (0.00 sec)

LIMIT 10000,20 means: scan 10020 matching rows, throw away the first 10000, and return the last 20. That is exactly the problem: LIMIT 100000,100 has to scan 100100 rows. In a high-concurrency application, a query that scans more than 100,000 rows every time will see its performance drop badly. The talk also points out that plain LIMIT n performs fine, because only n rows are scanned.

The talk proposes a "clue" approach: give the pagination some hints. Take SELECT * FROM message ORDER BY id DESC again, paging by descending id, 20 rows per page, currently on page 10, where the largest id on the page is 9527 and the smallest is 9500. If we only offer "previous page" and "next page" links (no jump-to-page-N), then for "previous page" the SQL can be:

SELECT * FROM message WHERE id > 9527 ORDER BY id ASC LIMIT 20;

and for "next page":

SELECT * FROM message WHERE id < 9500 ORDER BY id DESC LIMIT 20;

No matter how far the user pages, every query scans only 20 rows.

The drawback is that only "previous/next" style links are possible, but our product manager is very fond of link bars like "<prev 1 2 3 4 5 6 7 8 9 next>". What then?

If LIMIT m,n is unavoidable, the only way to optimize it is to make m as small as possible. Extending the "clue" approach: same query, descending id, 20 per page, currently on page 10 with a page-max id of 9527 and a page-min id of 9500. To jump to page 8, the SQL can be written as:

SELECT * FROM message WHERE id > 9527 ORDER BY id ASC LIMIT 20,20;

And to jump to page 13:

SELECT * FROM message WHERE id < 9500 ORDER BY id DESC LIMIT 40,20;

The principle is the same: remember the largest and smallest id on the current page and compute the offset between the target page and the current one. Since the pages are close together, that offset is small, so m stays small and far fewer rows are scanned. With a traditional LIMIT m,n the offset is always measured from the first page, which is why the further you page, the worse it gets; the method above does not have that problem.
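The offset arithmetic above can be put into a small helper. A minimal sketch (the function name is made up for this example; the numbers are the ones from the text; jumping to the current page itself is not handled):

```shell
# page_jump_sql <current_page> <target_page> <per_page> <page_max_id> <page_min_id>
# Emits the SQL for jumping from the current page to a nearby page,
# using the current page's max/min id as the "clue".
page_jump_sql() {
  local cur=$1 target=$2 per=$3 max_id=$4 min_id=$5 offset
  if [ "$target" -lt "$cur" ]; then
    # jumping backwards: walk upwards from the current page's largest id
    offset=$(( (cur - target - 1) * per ))
    echo "SELECT * FROM message WHERE id > $max_id ORDER BY id ASC LIMIT $offset,$per;"
  else
    # jumping forwards: walk downwards from the current page's smallest id
    offset=$(( (target - cur - 1) * per ))
    echo "SELECT * FROM message WHERE id < $min_id ORDER BY id DESC LIMIT $offset,$per;"
  fi
}

page_jump_sql 10 8 20 9527 9500   # page 10 -> page 8
page_jump_sql 10 13 20 9527 9500  # page 10 -> page 13
```

The two calls reproduce the page-8 and page-13 statements shown above; an adjacent page comes out with offset 0, which is exactly the plain "previous/next" query.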

Note the ASC and DESC in the statements: if the rows were fetched with ASC, remember to reverse them before displaying.

Tested on a table with 600,000 rows; the effect is very noticeable.

A shell script to repair MySQL master-slave replication without downtime

#!/bin/sh
#yemaosheng.com - upload to the slave DB, adjust, then run!
#master DB info
MDB_IP=192.168.3.30
MDB_USERNAME=op
MDB_PASSWORD=123456
#slave DB info
SDB_IP=127.0.0.1
SDB_USERNAME=root
SDB_PASSWORD=132456
#databases to keep in sync
DB=("db_1" "db_2")
 
/etc/init.d/mysqld restart
 
for((i=0;i<${#DB[@]};i++));do
	echo ${DB[${i}]}".sql Dump..."
	mysqldump -h${MDB_IP} -u${MDB_USERNAME} -p${MDB_PASSWORD} -R --triggers  --single-transaction --flush-logs --master-data --database ${DB[${i}]} > ${DB[${i}]}.sql 
	echo "Ok"
done
 
mysql -u${SDB_USERNAME} -p${SDB_PASSWORD} -h${SDB_IP} -e 'slave stop;'
 
for((i=0;i<${#DB[@]};i++));do
	echo ${DB[${i}]}".sql Input..."
	mysql -u${SDB_USERNAME} -p${SDB_PASSWORD} -h${SDB_IP} < ${DB[${i}]}.sql
	echo "Ok"
done
 
mysql -u${SDB_USERNAME} -p${SDB_PASSWORD} -h${SDB_IP} -e 'slave start;'
 
mysql -u${SDB_USERNAME} -p${SDB_PASSWORD} -h${SDB_IP} -e 'show slave status\G'
 
for((i=0;i<${#DB[@]};i++));do
	echo ${DB[${i}]}".sql Del..."
	rm -f ${DB[${i}]}.sql
	echo "Ok"
done

Extracting a specific table from a MySQL backup file


mysqldump file:

awk 'BEGIN{temp_str = ""; find = "false"; end = "false";}{temp_str=temp_str""$0 ; if(match($0,"INSERT INTO `tb_name`") >0 ) find="true"; if(match($0,";") >0 ) end = "true";  if(end=="true" && find=="false"){ temp_str=""; end="false"}; if(end=="true" && find=="true") {print temp_str; end="false"; find="false"}}' .sql > _awk.sql

binlog file:

awk 'BEGIN{temp_str = ""; find = "false"; end = "false";}{temp_str=temp_str""$0 ; if(match($0,"tb_name") >0 ) find="true"; if(match($0,";") >0 ) end = "true";  if(end=="true" && find=="false"){ temp_str=""; end="false"}; if(end=="true" && find=="true") {print temp_str; end="false"; find="false"}}' .sql > _awk.sql

A shell script to skip MySQL replication errors


There is most likely still a bug somewhere in the application, because it regularly breaks replication across many servers, almost always with duplicate-key style errors.
Skipping the errors by hand on dozens of machines at a time got too tedious, so I wrote a small shell script.

#!/bin/sh
while [ "0" != "$(mysql -h 127.0.0.1 -u yemaosheng_com -ppasswd -e "show slave status\G" | grep "Seconds_Behind_Master: " | awk -F ': ' '{print $2}')" ]; do
  if [ 0 -lt "$(mysql -h 127.0.0.1 -u yemaosheng_com -ppasswd -e "show slave status\G" | grep "Last_Error: Error" | wc -l)" ]; then
    mysql -h 127.0.0.1 -u yemaosheng_com -ppasswd -e "SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; slave start;"
  fi
done

A MySQL disk I/O incident


14:30 noticed battles in the game lagging
14:32 top showed CPU was fine but %wa was very high; tail -n 100 /var/log/messages showed lots of "abort io called" messages
14:35 decided to promote the slave to master
14:38 found the slave's replication had broken some time earlier; posted a notice and took the service down
14:48 started dumping data from the master
15:50 the dump was extremely slow because the master's I/O was already failing
15:55 updated the slave's config and privileges; imported the dump into the slave
16:05 service back up

A multi-host MySQL backup shell script

#!/bin/sh
 
##
## Purpose: back up MySQL databases
## Note:    this version backs up all databases except those explicitly excluded
##
 
## Server configuration
## Notes: 1. Every setting below is an array; separate elements with spaces.
##        2. SERVER is the reference list; PORT, USER, PASS and NO_BAK_DB map to it
##           one-to-one. The last four may be left empty, in which case the GLOBALS_*
##           defaults apply; if one server needs a different value, every element
##           before it must be filled in as well.
 
## How many days of backup files to keep
DAY=15
 
SERVER=("192.168.0.11" "192.168.0.12")
PORT=()
USER=()
PASS=()
## Databases that should not be backed up, separated by "|", one entry per server above
NO_BAK_DB=()
 
GLOBALS_PORT="3306" 
GLOBALS_USER=""
GLOBALS_PASS=""
GLOBALS_NO_BAK_DB="test|mysql|information_schema"
 
DATE_PATH="/data/database"
MYSQLSHOW="/usr/local/mysql/bin/mysqlshow"
MYSQLDUMP="/usr/local/mysql/bin/mysqldump"
LOGFILE="/var/log/bak_mysql.log"
 
_SERVER_=""
_PORT_="${GLOBALS_PORT}"
_USER_="${GLOBALS_USER}"
_PASS_=""
_NO_BAK_DB_="${GLOBALS_NO_BAK_DB}"
_DATE_=`date +%Y%m%d%H`
 
check_dir ()
{
	# process: check_dir function
	# Syntax:  check_dir
	# Returns:
	#       N/A
 
	if [ -n "$1" ]; then
		_PATH_="$1"
	else
		echo "internal script error"
		exit 1
	fi
 
	if [ ! -d "${_PATH_}" ]; then
	   mkdir -p "${_PATH_}"
	   make_log "Creating ${_PATH_}"
	fi
}
 
make_log ()
{
	# process: make_log function
	# Syntax:  make_log
	# Returns:
	#       N/A
 
	check_dir "${LOGFILE%/*}"
	echo -e "[$(date +%Y-%m-%d' '%H:%M:%S)] ${1}" >> "${LOGFILE}"
}
 
tar_sql ()
{
	# process: tar_sql function
	# Syntax:  tar_sql <path>
	# Returns:
	#       N/A
 
	cd "${1%/*}"
	make_log "PWD: ${1%/*}"
 
	make_log "tar zcf ${1##*/}.tgz ${1##*/} >> ${LOGFILE} 2>&1"
	tar zcf ${1##*/}.tgz ${1##*/} >> "${LOGFILE}" 2>&1
 
	make_log "rm -rf ${1##*/} >> ${LOGFILE} 2>&1"
	rm -rf ${1##*/} >> "${LOGFILE}" 2>&1
}
 
rm_tar ()
{
	# process: rm_tar function
	# Syntax:  rm_tar <path>
	# Returns:
	#       N/A
 
	for _sql_file_ in `find "${1}" -name "*[0-9].tgz" -type f -mtime +${DAY}`; do
		make_log "rm -f ${_sql_file_}"
		rm -f ${_sql_file_}
	done 
}
 
for((i=0;i<${#SERVER[@]};i++));do
	if ((${#PORT[@]} > 0 && i < ${#PORT[@]})); then
		if [ "${PORT[${i}]}" = "" ]; then
			_PORT_="--port=${GLOBALS_PORT}"
		else
			_PORT_="--port=${PORT[${i}]}"
		fi
	else
		_PORT_="--port=${GLOBALS_PORT}"
	fi
 
	if ((${#USER[@]} > 0 && i < ${#USER[@]})); then
		if [ "${USER[${i}]}" = "" ]; then
			_USER_="--user=${GLOBALS_USER}"
		else
			_USER_="--user=${USER[${i}]}"
		fi
	else
		_USER_="--user=${GLOBALS_USER}"
	fi
 
	if ((${#PASS[@]} > 0 && i < ${#PASS[@]})); then
		if [ "${PASS[${i}]}" = "" ]; then
			_PASS_=""
		else
			_PASS_="--password=${PASS[${i}]}"
		fi
	else
		_PASS_="--password=${GLOBALS_PASS}"
	fi
 
	if [ "${_PASS_}" != "" ]; then
		_LOG_PASS_="--password=xxxxxxx"
	else
		_LOG_PASS_=""
	fi
 
	if ((${#NO_BAK_DB[@]} > 0 && i < ${#NO_BAK_DB[@]})); then
		_NO_BAK_DB_="${NO_BAK_DB[${i}]}"
	else
		_NO_BAK_DB_="${GLOBALS_NO_BAK_DB}"
	fi
 
	_DATE_PATH_="${DATE_PATH}"
	DATE_PATH="${DATE_PATH}/${SERVER[${i}]}"
 
	check_dir "${DATE_PATH}/${SERVER[${i}]}_${_DATE_}"
 
	for db in `${MYSQLSHOW} --host=${SERVER[${i}]} ${_PORT_} ${_USER_} ${_PASS_} | awk '{++n;if(n>3&&NF>=3&&$2!~"^('${_NO_BAK_DB_}')$")print$2}'`; do
 
		make_log "${MYSQLDUMP} --host=${SERVER[${i}]} ${_PORT_} ${_USER_} ${_LOG_PASS_} -R --triggers --database ${db} > ${DATE_PATH}/${SERVER[${i}]}_${_DATE_}/${db}.sql"
 
		${MYSQLDUMP} --host=${SERVER[${i}]} ${_PORT_} ${_USER_} ${_PASS_} -R --triggers --database ${db} > ${DATE_PATH}/${SERVER[${i}]}_${_DATE_}/${db}.sql
 
	done
 
	tar_sql "${DATE_PATH}/${SERVER[${i}]}_${_DATE_}"
	rm_tar "${DATE_PATH}"
	DATE_PATH="${_DATE_PATH_}"
done

Handling MySQL replication errors


MySQL master-slave replication errors:
Last_Error: Query caused different errors on master and slave. Error on master: 'Deadlock found when trying to get lock; try restarting transaction' (1213), Error on slave: 'no error' (0). Default database: '...'. Query: 'INSERT INTO ...'
Last_Error: Error 'Duplicate entry '...' for key 1' on query. Default database: '...'. Query: '...'

Errors like the above can be resolved as follows

(run on the slave)
SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; # run on the slave to skip one event; it can only be executed while replication has stopped on an error.
slave start;
show slave status\G

If replication still reports errors, repeat the three steps above until it runs clean.

An automated version:

while [ 1 ]; do
  if [ "$(mysql -u root -e "show slave status\G" | grep "Duplicate entry" | wc -l)" -eq 1 ]; then
    mysql -u root -e "stop slave; set global sql_slave_skip_counter=1; start slave;"
  fi
  sleep 1
  mysql -u root -e "show slave status\G"
done

Also, if there is no error but Slave_IO_Running shows No, first try slave stop; slave start;. If it is still No, the only fix left is to rebuild replication by re-importing the data.
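That last check can be scripted too. A minimal sketch that only parses a captured SHOW SLAVE STATUS\G dump (the helper name and the sample status text below are fabricated for the demo; wiring it to a live server and deciding when to rebuild is left to the caller):

```shell
# slave_flag <field>: read "show slave status\G"-style text on stdin
# and print the value of one field, e.g. Slave_IO_Running.
slave_flag() {
  awk -F': ' -v f="$1" '$0 ~ f": " { print $2; exit }'
}

# fabricated sample of a broken IO thread
status='Slave_IO_Running: No
Slave_SQL_Running: Yes'

io=$(printf '%s\n' "$status" | slave_flag Slave_IO_Running)
if [ "$io" = "No" ]; then
  echo "IO thread down: try slave stop / slave start, then re-check"
fi
```

In a real wrapper, `status` would come from `mysql -e "show slave status\G"` and a second "No" after the stop/start cycle would trigger the rebuild path.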