码迷,mamicode.com
首页 > 数据库 > 详细

除了binlog2sql工具外,使用python脚本闪回数据(数据库误操作)

时间:2018-03-21 17:30:26      阅读:238      评论:0      收藏:0      [点我收藏+]

标签:value   update   field   sele   desc   cte   flags   dml   sage   

利用binlog日志恢复数据库误操作数据

在人工手动进行一些数据库写操作的时候(比方说数据修改),尤其是一些不可控的批量更新或删除,通常都建议备份后操作。不过不怕万一,就怕一万,有备无患总是好的。在线上或者测试环境误操作导致数据被删除或者更新后,想要恢复,一般有两种方法。

方法一、利用最近的全量备份+增量binlog备份,恢复到误操作之前的状态,但是随着数据量的增大,binlog的增多,恢复起来很费时。
方法二、如果binlog的格式为row,那么就可以将binlog解析出来生成反向的原始SQL

以下是利用方法二写的一个python脚本binlog_rollback.py,可利用此脚本生成反向的原始SQL。

 

说明:

0、前提是binlog的格式为row
1、要恢复的表操作前后表结构没有发生变更,否则脚本无法解析
2、只生成DML(insert/update/delete)的rollback语句
3、最终生成的SQL是逆序的,所以最新的DML会生成在输入文件的最前面,并且带上了时间戳和偏移点,方便查找目标
4、需要提供一个连接MySQL的只读用户,主要是为了获取表结构
5、如果binlog过大,建议带上时间范围,也可以指定只恢复某个库的SQL
6、SQL生成后,请务必在测试环境上测试恢复后再应用到线上

 

数据库配置:

[mysqld]

server_id = 1
log_bin = /var/lib/mysql/mysql-bin.log
max_binlog_size = 100M
binlog_format = row
binlog_row_image = full

 

脚本代码:

创建py脚本文件binlog_rollback.py如下

#!/bin/env python
# -*- coding:utf-8 -*-

import os,sys,re,getopt
import MySQLdb


host = 127.0.0.1
user = ‘‘
password = ‘‘
port = 3306
start_datetime = 1971-01-01 00:00:00
stop_datetime = 2037-01-01 00:00:00
start_position = 4
stop_position = 18446744073709551615
database = ‘‘
mysqlbinlog_bin = mysqlbinlog -v
binlog = ‘‘
fileContent = ‘‘
output=rollback.sql
only_primary = 0


# ----------------------------------------------------------------------------------------
# 功能:获取参数,生成相应的binlog解析文件
# ----------------------------------------------------------------------------------------
def getopts_parse_binlog():
    global host
    global user
    global password
    global port
    global fileContent
    global output
    global binlog
    global start_datetime
    global stop_datetime
    global start_position
    global stop_position
    global database
    global only_primary
    try:
        options, args = getopt.getopt(sys.argv[1:], "f:o:h:u:p:P:d:", ["help","binlog=","output=","host=","user=","password=","port=","start-datetime=",                                                                       "stop-datetime=","start-position=","stop-position=","database=","only-primary="])
    except getopt.GetoptError:
        print "参数输入有误!!!!!"
        options = []
    if options == [] or options[0][0] in ("--help"):
        usage()
        sys.exit()
    print "正在获取参数....."
    for name, value in options:
        if name == "-f" or name == "--binlog":
            binlog = value
        if name == "-o" or name == "--output":
            output = value
        if name == "-h" or name == "--host":
            host = value
        if name == "-u" or name == "--user":
            user = value
        if name == "-p" or name == "--password":
            password = value
        if name == "-P" or name == "--port":
            port = value
        if name == "--start-datetime":
            start_datetime = value
        if name == "--stop-datetime":
            stop_datetime = value
        if name == "--start-position":
            start_position = value
        if name == "--stop-position":
            stop_position = value
        if name == "-d" or name == "--database":
            database = value
        if name == "--only-primary" :
            only_primary = value

    if binlog == ‘‘ :
        print "错误:请指定binlog文件名!"
        usage()
    if user == ‘‘ :
        print "错误:请指定用户名!"
        usage()
    if password == ‘‘ :
        print "错误:请指定密码!"
        usage()
    if database <> ‘‘ :
       condition_database = "--database=" + "" + database + ""
    else:
        condition_database = ‘‘
    print "正在解析binlog....."
    fileContent=os.popen("%s %s  --base64-output=DECODE-ROWS --start-datetime=‘%s‘ --stop-datetime=‘%s‘ --start-position=‘%s‘ --stop-position=‘%s‘ %s\
                   |grep ### -B 2|sed -e s/### //g -e s/^INSERT/##INSERT/g -e s/^UPDATE/##UPDATE/g -e s/^DELETE/##DELETE/g " \
                   %(mysqlbinlog_bin,binlog,start_datetime,stop_datetime,start_position,stop_position,condition_database)).read()
    #print fileContent



# ----------------------------------------------------------------------------------------
# 功能:初始化binlog里的所有表名和列名,用全局字典result_dict来储存每个表有哪些列
# ----------------------------------------------------------------------------------------
def init_col_name():
    global result_dict
    global pri_dict
    global fileContent
    result_dict = {}
    pri_dict = {}
    table_list = re.findall(`.*`\\.`.*`,fileContent)
    table_list = list(set(table_list))
    #table_list 为所有在这段binlog里出现过的表
    print "正在初始化列名....."
    for table in table_list:
        sname = table.split(.)[0].replace(`,‘‘)
        tname = table.split(.)[1].replace(`,‘‘)
        #连接数据库获取列和列id
        try:
            conn = MySQLdb.connect(host=host,user=user,passwd=password,port=int(port))
            cursor = conn.cursor()
            cursor.execute("select ordinal_position,column_name \
                                                       from information_schema.columns                                                        where table_schema=%s and table_name=%s " %(sname,tname))

            result=cursor.fetchall()
            if result == () :
                print Warning:+sname+.+tname+已删除
                #sys.exit()
            result_dict[sname+.+tname]=result
            cursor.execute("select ordinal_position,column_name   \
                               from information_schema.columns                                where table_schema=%s and table_name=%s and column_key=PRI " %(sname,tname))
            pri=cursor.fetchall()
            #print pri
            pri_dict[sname+.+tname]=pri
            cursor.close()
            conn.close()
        except MySQLdb.Error, e:
            try:
                print "Error %d:%s" % (e.args[0], e.args[1])
            except IndexError:
                print "MySQL Error:%s" % str(e)

            sys.exit()
    #print result_dict
    #print pri_dict

# ----------------------------------------------------------------------------------------
# 功能:拼凑回滚sql,逆序
# ----------------------------------------------------------------------------------------
def gen_rollback_sql():
    global only_primary
    fileOutput = open(output, w)
    #先将文件根据--分块,每块代表一个sql
    area_list=fileContent.split(--\n)
    #逆序读取分块
    print "正在开始拼凑sql....."
    for area in area_list[::-1]:
        #由于一条sql可能影响多行,每个sql又可以分成多个逐条执行的sql
        sql_list = area.split(##)
        #先将pos点和timestamp传入输出文件中
        for sql_head in sql_list[0].splitlines():
            sql_head = #+sql_head+\n
            fileOutput.write(sql_head)
        #逐条sql进行替换更新,逆序
        for sql in sql_list[::-1][0:-1]:
            try:
                if sql.split()[0] == INSERT:
                    rollback_sql = re.sub(^INSERT INTO, DELETE FROM, sql, 1)
                    rollback_sql = re.sub(SET\n, WHERE\n, rollback_sql, 1)
                    tablename_pos = 2
                    table_name = rollback_sql.split()[tablename_pos].replace(`, ‘‘)
                    # 获取该sql中的所有列
                    col_list = sorted(list(set(re.findall(@\d+, rollback_sql))))
                    # 因为第一个列前面没有逗号或者and,所以单独替换
                    rollback_sql = rollback_sql.replace(@1=, result_dict[table_name][0][1]+=)
                    for col in col_list[1:]:
                        i = int(col[1:]) - 1
                        rollback_sql = rollback_sql.replace(col+=, AND  + result_dict[table_name][i][1]+=,1)
                    # 如果only_primary开启且存在主键,where条件里就只列出主键字段
                    if int(only_primary) == 1 and pri_dict[table_name] <> ():
                        sub_where = ‘‘
                        for primary in pri_dict[table_name]:
                            primary_name = primary[1]
                            for condition in rollback_sql.split(WHERE, 1)[1].splitlines():
                                if re.compile(^\s*+primary_name).match(condition) or re.compile(^\s*AND\s*+primary_name).match(condition):
                                    sub_where = sub_where + condition + \n
                        sub_where = re.sub(^\s*AND, ‘‘, sub_where, 1)
                        rollback_sql = rollback_sql.split(WHERE, 1)[0] + WHERE\n + sub_where
                if sql.split()[0] == UPDATE:
                    rollback_sql = re.sub(SET\n, #SET#\n, sql, 1)
                    rollback_sql = re.sub(WHERE\n, SET\n, rollback_sql, 1)
                    rollback_sql = re.sub(#SET#\n, WHERE\n, rollback_sql, 1)
                    tablename_pos = 1
                    table_name = rollback_sql.split()[tablename_pos].replace(`, ‘‘)
                    # 获取该sql中的所有列
                    col_list = sorted(list(set(re.findall(@\d+, rollback_sql))))
                    # 因为第一个列前面没有逗号或者and,所以单独替换
                    rollback_sql = rollback_sql.replace(@1=, result_dict[table_name][0][1] + =)
                    for col in col_list[1:]:
                        i = int(col[1:]) - 1
                        rollback_sql = rollback_sql.replace(col+=, , + result_dict[table_name][i][1]+=, 1).replace(col+=,AND  +result_dict[table_name][i][1]+=)
                    # 如果only_primary开启且存在主键,where条件里就只列出主键字段
                    if int(only_primary) == 1 and pri_dict[table_name] <> ():
                        sub_where = ‘‘
                        for primary in pri_dict[table_name]:
                            primary_name = primary[1]
                            for condition in rollback_sql.split(WHERE, 1)[1].splitlines():
                                if re.compile(^\s* + primary_name).match(condition) or re.compile(^\s*AND\s*+primary_name).match(condition):
                                    sub_where = sub_where + condition + \n
                        sub_where = re.sub(^\s*AND, ‘‘, sub_where, 1)
                        rollback_sql = rollback_sql.split(WHERE, 1)[0] + WHERE\n + sub_where

                if sql.split()[0] == DELETE:
                    rollback_sql = re.sub(^DELETE FROM, INSERT INTO, sql, 1)
                    rollback_sql = re.sub(WHERE\n, SET\n, rollback_sql, 1)
                    tablename_pos = 2
                    table_name = rollback_sql.split()[tablename_pos].replace(`, ‘‘)
                    # 获取该sql中的所有列
                    col_list = sorted(list(set(re.findall(@\d+, rollback_sql))))
                    # 因为第一个列前面没有逗号或者and,所以单独替换
                    rollback_sql = rollback_sql.replace(@1=, result_dict[table_name][0][1] + =)
                    for col in col_list[1:]:
                        i = int(col[1:]) - 1
                        rollback_sql = rollback_sql.replace(col+=, , + result_dict[table_name][i][1]+=,1)

                rollback_sql = re.sub(\n$,;\n,rollback_sql)
                #print rollback_sql
                fileOutput.write(rollback_sql)
            except IndexError,e:
                print "Error:%s" % str(e)
                sys.exit()
    print "done!"

def usage():
    help_info="""==========================================================================================
Command line options :
    --help                  # OUT : print help info
    -f, --binlog            # IN  : binlog file. (required)
    -o, --outfile           # OUT : output rollback sql file. (default rollback.sql)
    -h, --host              # IN  : host. (default 127.0.0.1)
    -u, --user              # IN  : user. (required)
    -p, --password          # IN  : password. (required)
    -P, --port              # IN  : port. (default 3306)
    --start-datetime        # IN  : start datetime. (default 1970-01-01 00:00:00)
    --stop-datetime         # IN  : stop datetime. default 2070-01-01 00:00:00
    --start-position        # IN  : start position. (default 4)
    --stop-position         # IN  : stop position. (default 18446744073709551615)
    -d, --database          # IN  : List entries for just this database (No default value).
    --only-primary          # IN  : Only list primary key in where condition (default 0)

Sample :
   shell> python binlog_rollback.py -f mysql-bin.000001 -o /tmp/rollback.sql -h 192.168.0.1 -u user -p pwd -P 3307 -d dbname
=========================================================================================="""

    print help_info
    sys.exit()



if __name__ == __main__:
    getopts_parse_binlog()
    init_col_name()
    gen_rollback_sql()

 

实例:

mysql> grant select on *.* to jeck@localhost identified by 123;
Query OK, 0 rows affected (0.09 sec)
mysql> create database t2;
Query OK, 1 row affected (0.02 sec)

mysql> use t2;
Database changed
mysql> create table test(
    -> id int(10) not null auto_increment,
    -> gender enum(male,woman),
    -> hobby varchar(20) default null,
    -> primary key(id)
    -> );
Query OK, 0 rows affected (0.64 sec)

mysql> desc test;
+--------+----------------------+------+-----+---------+----------------+
| Field  | Type                 | Null | Key | Default | Extra          |
+--------+----------------------+------+-----+---------+----------------+
| id     | int(10)              | NO   | PRI | NULL    | auto_increment |
| gender | enum(male,woman) | YES  |     | NULL    |                |
| hobby  | varchar(20)          | YES  |     | NULL    |                |
+--------+----------------------+------+-----+---------+----------------+
3 rows in set (0.03 sec)

mysql> alter table test add name varchar(15) after id;
Query OK, 0 rows affected (0.58 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> insert into test values
    -> (1,tom,male,movie),
    -> (2,Nancy,woman,dance),
    -> (3,jeck,male,basketball),
    -> (4,danny,male,game);
Query OK, 4 rows affected (0.04 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> select * from test;
+----+-------+--------+------------+
| id | name  | gender | hobby      |
+----+-------+--------+------------+
|  1 | tom   | male   | movie      |
|  2 | Nancy | woman  | dance      |
|  3 | jeck  | male   | basketball |
|  4 | danny | male   | game       |
+----+-------+--------+------------+
4 rows in set (0.00 sec)

 

更新数据

mysql> update test set hobby=hike where name=jeck;
Query OK, 1 row affected (0.06 sec)
Rows matched: 1  Changed: 1  Warnings: 0

 

删除数据

mysql> delete from test where id=4;
Query OK, 1 row affected (0.05 sec)

mysql> select * from test;
+----+-------+--------+-------+
| id | name  | gender | hobby |
+----+-------+--------+-------+
|  1 | tom   | male   | movie |
|  2 | Nancy | woman  | dance |
|  3 | jeck  | male   | hike  |
+----+-------+--------+-------+
3 rows in set (0.00 sec)

 

生成方向sql语句文件

[root@A mysql]# python binlog_rollback.py -f /var/lib/mysql/mysql-bin.000003  -o rollback.sql -u jeck -p 123 --start-datetime=2018-03-21 16:37:00  --stop-datetime=2018-03-21 16:39:00 -d t2
正在获取参数.....
正在解析binlog.....
正在初始化列名.....
正在开始拼凑sql.....
done!

 

 

数据恢复输出文件 vim rollback.sql

## at 3553
##180321 16:38:58 server id 1  end_log_pos 3605 CRC32 0xd8f72cdb        Delete_rows: table id 72 flags: STMT_END_F
INSERT INTO `t2`.`test`
SET
  id=4
  ,name=danny
  ,gender=1
  ,hobby=game;
## at 3324
##180321 16:37:58 server id 1  end_log_pos 3398 CRC32 0xafd8a964        Update_rows: table id 72 flags: STMT_END_F
UPDATE `t2`.`test`
SET
  id=3
  ,name=jeck
  ,gender=1
  ,hobby=basketball
WHERE
  id=3
  AND name=jeck
  AND gender=1
  AND hobby=hike;

 

检查语句并恢复

[root@A mysql]# mysql -uroot -p123 -S /var/lib/mysql/mysql.sock < rollback.sql
mysql> select * from t2.test;
+----+-------+--------+------------+
| id | name  | gender | hobby      |
+----+-------+--------+------------+
|  1 | tom   | male   | movie      |
|  2 | Nancy | woman  | dance      |
|  3 | jeck  | male   | basketball |
|  4 | danny | male   | game       |
+----+-------+--------+------------+
4 rows in set (0.00 sec)

 

 转载自:http://www.cnblogs.com/prayer21/p/6018736.html

除了binlog2sql工具外,使用python脚本闪回数据(数据库误操作)

标签:value   update   field   sele   desc   cte   flags   dml   sage   

原文地址:https://www.cnblogs.com/dannylinux/p/8618087.html

(0)
(0)
   
举报
评论 一句话评论(0
登录后才能评论!
© 2014 mamicode.com 版权所有  联系我们:gaon5@hotmail.com
迷上了代码!