python 使用 utf8mb4编码 和 MariaDB交互

作者: 麻辣阁 分类: python 发布时间: 2018-12-26 18:39

使用 utf8mb4编码可以保存 emoji 表情等特殊字符, 否则可能会出现类似如下的报错内容:

Traceback (most recent call last):
  File "/site-packages/mysql/connector/connection_cext.py", line 392, in cmd_query
    raw_as_string=raw_as_string)
_mysql_connector.MySQLInterfaceError: Incorrect string value: '\xF0\x9F\x8E\x85\xF0\x9F...' for column 'post_content' at row 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "html_to_wp_ssyd.py", line 865, in <module>
    loop()
  File "html_to_wp_ssyd.py", line 718, in loop
    update_one_html(html_href)
  File "html_to_wp_ssyd.py", line 470, in update_one_html
    update_wp(table_prefix,html_href,entry_num=50)
  File "html_to_wp_ssyd.py", line 604, in update_wp
    cursor.execute(sql)
  File "/site-packages/mysql/connector/cursor_cext.py", line 264, in execute
    raw_as_string=self._raw_as_string)
  File "site-packages/mysql/connector/connection_cext.py", line 395, in cmd_query
    sqlstate=exc.sqlstate)
mysql.connector.errors.DataError: 1366 (22007): Incorrect string value: '\xF0\x9F\x8E\x85\xF0\x9F...' for column 'post_content' at row 1

MariaDB 侧的设置

首先查数据库表的相关字段的编码, 这里假设为 utf8mb4_unicode_520_ci, 修改 /etc/my.cnf 配置:

[mysqld]
#collation-server = utf8_general_ci
#character-set-server = utf8
character_set_server=utf8mb4
collation-server=utf8mb4_unicode_520_ci
init_connect='SET NAMES utf8mb4'
skip-character-set-client-handshake=true

重启 MariaDB.

Python 侧的设置

py文件头添加:

# -*- coding: utf-8 -*-

用以下类似方式创建连接:

# Connect to mysql.
dbc = MySQLdb.connect(host='###', user='###', passwd='###', db='###', use_unicode=True)

# Create a cursor.
cursor = dbc.cursor()

# Enforce UTF-8 for the connection.
cursor.execute('SET NAMES utf8mb4')
cursor.execute("SET CHARACTER SET utf8mb4")
cursor.execute("SET character_set_connection=utf8mb4")

参考: How to encode (utf8mb4) in Python


发表评论

电子邮件地址不会被公开。 必填项已用*标注