A question about converting Big5 to UTF-8 - Linux

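Here is the situation.

Following various advice found online,
I want to convert an ancient MySQL + Big5 server, which never declared any encoding, entirely to UTF-8,
but I hit a wall at the dump and reload step.
So far, using various workarounds,
I have managed to produce a dumped .sql file that shows the Chinese text correctly.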

(This is the trick of first reloading the .sql into MySQL as latin1
and then dumping it again as latin1, letting MySQL act as the translator.)
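
A note on why that round trip can work: a latin1 codec maps every byte to exactly one character and back, so Big5 bytes sitting in a latin1 column should survive untouched as long as nothing transcodes them along the way. The Python sketch below only illustrates that idea with a made-up sample string. It is not what MySQL does internally, and MySQL's latin1 is closer to cp1252 than to Python's latin1, so edge-case bytes may behave differently.

```python
# Illustration only: the sample text is hypothetical, not taken from the real dump.
sample = "中文測試"

# Bytes an old Big5 client would have sent to the server.
big5_bytes = sample.encode("big5")

# A latin1 column treats each byte as one character (this looks like garbage).
stored_as_latin1 = big5_bytes.decode("latin1")

# Dumping over a latin1 connection turns each character back into the same byte.
dumped_bytes = stored_as_latin1.encode("latin1")
assert dumped_bytes == big5_bytes

# Reading those bytes as Big5 gives the original text back.
print(dumped_bytes.decode("big5"))

# Dumping over a utf8 connection instead would transcode each pseudo-character,
# producing double-encoded bytes that are no longer the original Big5 bytes.
double_encoded = stored_as_latin1.encode("utf-8")
assert double_encoded != big5_bytes
```

If the reload produces garbage, one possibility worth ruling out is that some step took the second path and double-encoded the data.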

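However, after running iconv on this .sql to convert it to UTF-8 and using sed to change latin1 to utf8 inside the file,
reloading it into a MySQL whose character_set variables have already been set to utf8 produces garbled text.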
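
For reference, the iconv + sed step could be sketched in Python roughly as follows. The function name convert_dump, the target parameter, and the two patterns (CHARSET=latin1 and SET NAMES latin1) are assumptions about what the dump contains, not something confirmed from the actual file. The sketch rewrites only charset declarations instead of every occurrence of the word latin1, so that row data mentioning it is left alone.

```python
import re


def convert_dump(raw: bytes, target: str = "utf8mb4") -> bytes:
    """Decode a Big5 dump, rewrite latin1 charset declarations, return UTF-8 bytes.

    Hypothetical helper: the patterns below are guesses at what the dump declares.
    """
    # Raises UnicodeDecodeError if the bytes are not valid Big5,
    # which is itself a useful signal that the input is not what we think it is.
    text = raw.decode("big5")

    # Rewrite table-level declarations such as "DEFAULT CHARSET=latin1".
    text = re.sub(r"(CHARSET=)latin1\b", lambda m: m.group(1) + target, text)

    # Rewrite connection-level declarations such as "SET NAMES latin1".
    text = re.sub(r"(SET NAMES )latin1\b", lambda m: m.group(1) + target, text)

    return text.encode("utf-8")


# Usage sketch (file names are placeholders):
# with open("dump_big5.sql", "rb") as src, open("dump_utf8mb4.sql", "wb") as dst:
#     dst.write(convert_dump(src.read()))
```

Decoding strictly, rather than ignoring errors, is a deliberate choice here: a failure points at the exact byte offset where the file stops being valid Big5.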

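After checking the locale, something seems off to me,
so I am posting it here in the hope that someone can help work out where the problem is.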

Old system configuration:
CentOS release 5.5 (Final)

$ locale
LANG=zh_TW.BIG5
LC_CTYPE=en_US.ISO8859-1
LC_NUMERIC="zh_TW.BIG5"
LC_TIME="zh_TW.BIG5"
LC_COLLATE="zh_TW.BIG5"
LC_MONETARY="zh_TW.BIG5"
LC_MESSAGES="zh_TW.BIG5"
LC_PAPER="zh_TW.BIG5"
LC_NAME="zh_TW.BIG5"
LC_ADDRESS="zh_TW.BIG5"
LC_TELEPHONE="zh_TW.BIG5"
LC_MEASUREMENT="zh_TW.BIG5"
LC_IDENTIFICATION="zh_TW.BIG5"
LC_ALL=

$ file dump_utf8mb4.sql
dump_utf8mb4.sql: ASCII text, with very long lines
encoding=latin1
fileencoding=

mysql> show variables like 'character%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

mysql> show variables like 'colla%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database   | latin1_swedish_ci |
| collation_server     | latin1_swedish_ci |
+----------------------+-------------------+

New system configuration:
CentOS Linux release 7.6.1810 (Core)

$ locale
LANG=en_US.UTF-8
LC_CTYPE=en_US.ISO8859-1
LC_NUMERIC=en_US.UTF-8
LC_TIME=en_US.UTF-8
LC_COLLATE=en_US.UTF-8
LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

File properties

$ file dump_utf8mb4.sql
dump_utf8mb4.sql: ASCII text, with very long lines
encoding=latin1
fileencoding=utf-8
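
Since file reports "ASCII text" for a dump that is supposed to contain Chinese, it may help to check the bytes directly. As far as I know, file only inspects the beginning of its input, so that verdict might simply mean the first non-ASCII byte appears later in the file. A minimal sketch, with hypothetical helper names sniff_encoding and first_non_ascii_offset:

```python
def sniff_encoding(data: bytes) -> str:
    """Return the first of ascii, utf-8, big5 that decodes the bytes cleanly."""
    for name in ("ascii", "utf-8", "big5"):
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return "unknown"


def first_non_ascii_offset(data: bytes) -> int:
    """Byte offset of the first byte above 0x7F, or -1 if the data is pure ASCII."""
    for offset, byte in enumerate(data):
        if byte > 0x7F:
            return offset
    return -1


# Usage sketch (reads the whole dump into memory, fine for a quick check):
# with open("dump_utf8mb4.sql", "rb") as f:
#     data = f.read()
# print(sniff_encoding(data), first_non_ascii_offset(data))
```

The order of the candidates matters: UTF-8 is tried before Big5 because valid multi-byte UTF-8 is much less likely to occur by accident.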


MySQL character set settings

mysql> show variables like 'character%';
+--------------------------+--------------------------------+
| Variable_name            | Value                          |
+--------------------------+--------------------------------+
| character_set_client     | utf8                           |
| character_set_connection | utf8                           |
| character_set_database   | utf8                           |
| character_set_filesystem | binary                         |
| character_set_results    | utf8                           |
| character_set_server     | utf8                           |
| character_set_system     | utf8                           |
| character_sets_dir       | /usr/share/mysql-8.0/charsets/ |
+--------------------------+--------------------------------+

mysql> show variables like 'colla%';
+----------------------+-----------------+
| Variable_name        | Value           |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database   | utf8_unicode_ci |
| collation_server     | utf8_unicode_ci |
+----------------------+-----------------+

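Under the new system configuration, the .sql file currently displays the Chinese text correctly,
but that
LC_CTYPE=en_US.ISO8859-1
looks odd to me no matter how I look at it.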
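
To make that kind of mismatch easier to spot, here is a small sketch that parses locale output and lists every variable whose codeset differs from the one in LANG. The function name codeset_mismatches is made up for illustration, and the input is a shortened copy of the new-system output above. Whether the mismatch has anything to do with the garbled reload is an open question; LC_CTYPE mainly governs how programs in that shell interpret bytes as characters.

```python
def codeset_mismatches(locale_output: str) -> dict:
    """Map each locale variable whose codeset differs from LANG's to its value."""
    settings = {}
    for line in locale_output.splitlines():
        name, sep, value = line.partition("=")
        value = value.strip().strip('"')
        # Skip lines without "=" and variables that are unset (e.g. "LC_ALL=").
        if sep and value:
            settings[name.strip()] = value

    # The codeset is the part after the dot, e.g. "UTF-8" in "en_US.UTF-8".
    lang_codeset = settings.get("LANG", "").partition(".")[2]
    return {
        name: value
        for name, value in settings.items()
        if value.partition(".")[2] != lang_codeset
    }


new_system = """LANG=en_US.UTF-8
LC_CTYPE=en_US.ISO8859-1
LC_NUMERIC=en_US.UTF-8
LC_PAPER="en_US.UTF-8"
LC_ALL="""

print(codeset_mismatches(new_system))
```

On this input only LC_CTYPE is reported, which matches the suspicion above.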

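Does anyone know how I should change the encoding of the .sql file and of the system
so that the Chinese text displays correctly after reloading into MySQL on the new system?

Any help would be greatly appreciated.
Thank you all m(_ _)m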

--

All Comments

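Mia 2019-03-09
Where exactly do you see the garbled text? Could you post a screenshot?
Kristin 2019-03-14
Off topic, but please use utf8mb4 rather than utf8; emoji are only supported with utf8mb4.
Damian 2019-03-18
Can't you even generate test data in a dev/test environment and paste that? -_-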
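
On the utf8mb4 point raised above: MySQL's utf8 is an alias for utf8mb3, which stores at most 3 bytes per character, while emoji and other characters outside the Basic Multilingual Plane take 4 bytes in UTF-8. A quick way to check whether some text would need utf8mb4, using a hypothetical helper needs_utf8mb4:

```python
def needs_utf8mb4(text: str) -> bool:
    """True if any character lies outside the BMP and so takes 4 bytes in UTF-8."""
    return any(ord(ch) > 0xFFFF for ch in text)


# Ordinary Chinese text fits in 3 bytes per character.
print(needs_utf8mb4("中文"), len("中".encode("utf-8")))

# An emoji needs 4 bytes.
print(needs_utf8mb4("😀"), len("😀".encode("utf-8")))
```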