科技知識動態:從庫crash一直自動重啟(mysqld got signal 11)問題解決

導讀 跟大家講解下有關從庫crash一直自動重啟(mysqld got signal 11)問題解決,相信小伙伴們對這個話題應該也很關注吧,現在就為小伙伴們說說

跟大家講解下有關從庫crash一直自動重啟(mysqld got signal 11)問題解決,相信小伙伴們對這個話題應該也很關注吧,現在就為小伙伴們說說從庫crash一直自動重啟(mysqld got signal 11)問題解決,小編也收集到了有關從庫crash一直自動重啟(mysqld got signal 11)問題解決的相關資料,希望大家看到了會喜歡。

一:問題描述 今天收到郵件報警,遂進數據庫查看slave狀態,發現io進程和sql進程都為NO. mysql show slave status \G;*************************** 1. row*************************** Slave_IO_State: Master_Host: 此處不予顯示,哈哈 Master_User: replic

一:問題描述

今天收到郵件報警,遂進數據庫查看slave狀態,發現io進程和sql進程都為NO.

mysql> show slave status \G;*************************** 1. row*************************** Slave_IO_State: Master_Host: 此處不予顯示,哈哈 Master_User: replica Master_Port: 3306 Connect_Retry: 60 Master_Log_File:master1-bin.001191 Read_Master_Log_Pos: 29214749 Relay_Log_File:web_appdb_10-relay-bin.000663 Relay_Log_Pos: 29213639 Relay_Master_Log_File: master1-bin.001191 Slave_IO_Running: No Slave_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: ccda.%,eip_fileservice.% Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 29213491 Relay_Log_Space: 29215212 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULLMaster_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 01 row in set (0.01 sec) ERROR:No query specified嘗試啟動,然后再次查看狀態,竟然報錯,說連不上數據庫。

mysql> start slave;Query OK, 0 rows affected (0.00 sec) mysql> show slave status \G;ERROR 2006 (HY000): MySQL server has goneawayNo connection. Trying to reconnect...ERROR 2002 (HY000): Can't connect to localMySQL server through socket '/tmp/mysql.sock' (2)ERROR:Can't connect to the server ERROR:No query specified

連續嘗試多次,可以登錄數據庫了,再次查詢復制,發現狀態還是NO.

mysql> show slave status \G;No connection. Trying to reconnect...Connection id: 1Current database: *** NONE *** *************************** 1. row*************************** Slave_IO_State: Master_Host: 10.0.3.34 Master_User: replica Master_Port: 3306 Connect_Retry: 60 Master_Log_File:master1-bin.001191 Read_Master_Log_Pos: 29214749 Relay_Log_File:web_appdb_10-relay-bin.000663 Relay_Log_Pos: 29213639 Relay_Master_Log_File: master1-bin.001191 Slave_IO_Running: No Slave_SQL_Running: No Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: ccda.%,eip_fileservice.% Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 29213491 Relay_Log_Space: 29215426 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: NULLMaster_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 01 row in set (0.00 sec) ERROR:No query specified

發現只要start slave,該服務器數據庫就會自動重啟。

而且start slave io_thread沒問題,當start slave sql_thread時,才會導致數據庫自動重啟。

查看錯誤日志:

160429 9:09:00 [Note] Event Scheduler: Loaded 0 events160429 9:09:00 [Note] /usr/local/mysql/bin/mysqld: ready for connections.Version: '5.5.19-log' socket: '/tmp/mysql.sock' port: 3306 Source distribution160429 11:04:47 [Note] Slave SQL threadinitialized, starting replication in log 'master1-bin.001191' at position29213491, relay log './web_appdb_10-relay-bin.000663' position: 29213639160429 11:04:47 - mysqld got signal 11 ;This could be because you hit a bug. It isalso possible that this binaryor one of the libraries it was linkedagainst is corrupt, improperly built,or misconfigured. This error can also becaused by malfunctioning hardware.We will try our best to scrape up some infothat will hopefully help diagnosethe problem, but since we have alreadycrashed, something is definitely wrongand this may fail. key_buffer_size=268435456read_buffer_size=6291456max_used_connections=3max_threads=2000thread_count=2connection_count=2It is possible that mysqld could use up tokey_buffer_size + (read_buffer_size +sort_buffer_size)*max_threads = 20764878 Kbytes of memoryHope that's ok; if not, decrease somevariables in the equation. Thread pointer: 0x2ab2f1b54740Attempting backtrace. You can use thefollowing information to find outwhere mysqld died. If you see no messagesafter this, something wentterribly wrong...stack_bottom = 0x594310e8 thread_stack0x30000/usr/local/mysql/bin/mysqld(my_print_stacktrace+0x33)[0x765df3]/usr/local/mysql/bin/mysqld(handle_segfault+0x36e)[0x4ee4fe]/lib64/libpthread.so.0[0x31a640ebe0]/usr/local/mysql/bin/mysqld(_ZNK9table_def15compatible_withEP3THDP14Relay_log_infoP5TABLEPS5_+0x31a)[0x74c29a]/usr/local/mysql/bin/mysqld(_ZN14Rows_log_event14do_apply_eventEPK14Relay_log_info+0xcdc)[0x6f0d3c]/usr/local/mysql/bin/mysqld(_Z26apply_event_and_update_posP9Log_eventP3THDP14Relay_log_info+0x14d)[0x5021ed]/usr/local/mysql/bin/mysqld[0x504b19]/usr/local/mysql/bin/mysqld(handle_slave_sql+0xc0a)[0x5061ea]/lib64/libpthread.so.0[0x31a640677d]/lib64/libc.so.6(clone+0x6d)[0x31a54d49ad] Trying to get some variables.Some pointers may be invalid and cause thedump to abort.Query ((nil)): is an invalid pointerConnection ID (thread ID): 353Status: NOT_KILLED The manual page athttp://dev.mysql.com/doc/mysql/en/crashing.html containsinformation that should help you find outwhat is causing the crash.160429 11:04:48 mysqld_safe Number ofprocesses running now: 0160429 11:04:48 mysqld_safe mysqldrestarted160429 11:04:48 InnoDB: The InnoDB memoryheap is disabled160429 11:04:48 InnoDB: Mutexes andrw_locks use GCC atomic builtins160429 11:04:48 InnoDB: Compressed tablesuse zlib 1.2.3160429 11:04:48 InnoDB: Initializing bufferpool, size = 32.0G160429 11:04:50 InnoDB: Completedinitialization of buffer pool160429 11:04:50 InnoDB: highest supportedfile format is Barracuda.InnoDB: The log sequence number in ibdatafiles does not matchInnoDB: the log sequence number in theib_logfiles!160429 11:04:50 InnoDB: Database was not shut down normally!InnoDB: Starting crash recovery.InnoDB: Reading tablespace information fromthe .ibd files...InnoDB: Restoring possible half-writtendata pages from the doublewriteInnoDB: buffer...InnoDB: Last MySQL binlog file position 0112571, file name ./mysql-bin.048292160429 11:04:52 InnoDB: Waiting for the background threads tostart160429 11:04:53 InnoDB: 1.1.8 started; logsequence number 5992159806777160429 11:04:53 [Note] Recovering after acrash using mysql-bin160429 11:04:53 [Note] Starting crashrecovery...160429 11:04:53 [Note] Crash recoveryfinished.160429 11:04:53 [Warning] Neither--relay-log nor --relay-log-index were used; so replication may break when thisMySQL server acts as a slave and has his hostname changed!! Please use'--relay-log=web_appdb_10-relay-bin' to avoid this problem.160429 11:04:53 [Note] Event Scheduler:Loaded 0 events160429 11:04:53 [Note]/usr/local/mysql/bin/mysqld: ready for connections.Version: '5.5.19-log' socket: '/tmp/mysql.sock' port: 3306 Source distribution

對于這個錯誤”mysqld got signal 11”,我在網上查了,有的說是磁盤空間滿了,有的說是內存問題,也有可能是硬件錯誤,也有可能是中繼日志重放位置的sql導致的。

查看中繼日志該位置執行的語句:

Relay_Log_File:web_appdb_10-relay-bin.000663

Relay_Log_Pos: 29213639

# at 29213639#160428 21:29:32 server id 1 end_log_pos 29213559 Query thread_id=624506 exec_time=0 error_code=0SET TIMESTAMP=1461850172;;SET@@session.character_set_client=45,@@session.collation_connection=45,@@session.collation_server=33;BEGIN;# at 29213707#160428 21:29:32 server id 1 end_log_pos 29213657 Table_map: `ccda`.`ess_accounting_relation`mapped to number 311993# at 29213805#160428 21:29:32 server id 1 end_log_pos 29213757 Table_map:`ccda`.`ess_accounting_relation_1` mapped to number 311994# at 29213905#160428 21:29:32 server id 1 end_log_pos 29214051 Update_rows: table id 311993 flags:STMT_END_F BINLOG 'PBAiVxMBAAAAYgAAANnDvQEAALnCBAAAAAEABGNjZGEAF2Vzc19hY2NvdW50aW5nX3JlbGF0aW9uAA4DDw8PDw8PDw8PDw8MAxYsATwAPAAGACwBPAAsAZYAlgCWAJYA/D8=PBAiVxMBAAAAZAAAAD3EvQEAALrCBAAAAAEABGNjZGEAGWVzc19hY2NvdW50aW5nX3JlbGF0aW9uXzEADgMPDw8PDw8PDw8PDwwDFiwBPAA8AAYALAE8ACwBlgCWAJYAlgD8Pw==PBAiVxgBAAAAJgEAAGPFvQEAALnCBAAAAAEADv////8Q4Dnl7wATADEyMTEwMTAxMTE1MTEwMDE2MzUS5bqU5LuY5LuY5qy+5Yet6K+BA0VBUwAACeaKpei0puWNlRYAemhhbmd5MTA0NzE1MTExODEzMTYyMhcxMTExMDExNTExMTg1MDUyMzA2MjE2MQAAAI3GveRVEgAAEOA55e8AEwAxMjExMDEwMTExNTExMDAxNjM1EuW6lOS7mOS7mOasvuWHreivgQNFQVMUADExMTEwMTE1MTEwOTEwMzgwMDg5CeaKpei0puWNlRYAemhhbmd5MTA0NzE1MTExODEzMTYyMhcxMTExMDExNTExMTg1MDUyMzA2MjE2MQAAAI3GveRVEgAA';### UPDATE `ccda`.`ess_accounting_relation`### WHERE### @1=15721785### @2='1211010111511001635'### @3='應付付款憑證'### @4='EAS'### @5=NULL### @6=''### @7='報賬單'### @8='zhangy1047151118131622'### @9='11110115111850523062161'### @10=''### @11=''### @12=''### @13=2016-01-19 16:25:09### @14=NULL### SET### @1=15721785### @2='1211010111511001635'### @3='應付付款憑證'### @4='EAS'### @5=NULL### @6='11110115110910380089'### @7='報賬單'### @8='zhangy1047151118131622'### @9='11110115111850523062161'### @10=''### @11=''### @12=''### @13=2016-01-19 16:25:09### @14=NULL

先備份一下該記錄,然后手動在從庫上更新一下,看是否報錯。

UPDATE `ccda`.`ess_accounting_relation`

SET attachId='11110115110910380089'

WHERE id = 15721785;

結果發現在從庫上也可以正常update呀。

后來我想查看下該表表結構,結果出現錯誤:

mysql> show create table`ccda`.`ess_accounting_relation` \G;

ERROR 144 (HY000): Table'./ccda/ess_accounting_relation_1' is marked as crashed and last (automatic?)repair failed

ERROR:

No query specified

mysql> select count(*) fromccda.ess_accounting_relation_1;

ERROR 144 (HY000): Table'./ccda/ess_accounting_relation_1' is marked as crashed and last (automatic?)repair failed

詭異,剛才還能更新呢,現在卻又不能正常訪問了。

然后,嘗試修復出問題的表:

check table ccda.ess_accounting_relation_1;

repair table ccda.ess_accounting_relation_1;

修復成功后,查看ccda.ess_accounting_relation 表結構,發現該表是個合并表,ess_accounting_relation_1是myisam引擎:

CREATE TABLE `ess_accounting_relation` (

……)

ENGINE=MRG_MyISAM DEFAULT CHARSET=utf8 INSERT_METHOD=LASTUNION=(`ess_accounting_relation_1`)

原本以為修復該表成功后,start slave,就正常了。結果還是會導致數據庫重啟。

此時,再檢查ess_accounting_relation_1,也是正常的,沒有顯示崩潰。

我試驗,在從庫跳過該表的操作(用change master to或者set global sql_slave_skip_counter=n),當執行其他表的操作時,并沒有導致從庫重啟。

我試驗在從庫配置文件里添加參數:replicate_ignore_table=ccda.ess_accounting_relation過濾掉這個表,然后重啟數據庫,再start slave,沒有導致從庫重啟。

最后,我大膽試驗下,在主庫直接操作ess_accounting_relation_1的某條數據(前提是已經注釋掉了上面的參數replicate_ignore_table),發現從庫在應用相應數據時,并沒有導致重啟。

所以,問題就出在了這個mrg_myisam存儲引擎。

這個表,其實每天也都有update,可是為什么最近才出現了這個問題,那就不知道了。因為這個表引用的子表數據量太大了嗎?該表大概1600萬數據。不曉得。

二:出錯原因

一個mrg_myisam存儲引擎的合并表,引用了一個myisam引擎的子表,更新前者導致slave數據庫一直自動重啟,且偶爾子表也會發生崩潰。

這估計是mysql的一個bug吧。

三:解決辦法

由于ess_accounting_relation表數據只來源于ccda.ess_accounting_relation_1這一個表,實際上并沒有合并的意義,而且,通過了解發現,這個myisam表經常更新。myisam容易崩潰,且不支持行鎖,故建議將ccda.ess_accounting_relation_1改成innodb存儲引擎(可以先mysqldump備份下這個表,然后在備份文件里將MyISAM改成innodb),刪掉ccda.ess_accounting_relation,將ccda.ess_accounting_relation_1重命名為ccda.ess_accounting_relation。

--關于mrg_myisam介紹,請參考http://blog.csdn.net/yabingshi_tech/article/details/51320701

--備注:mysql版本5.5.19

來源:php中文網

免責聲明:本文由用戶上傳,如有侵權請聯系刪除!