MySQL 8.0 から default collation が utf8mb4_0900_ai_ci になった。

MySQL 8.0 changed the utf8mb4 default collation from utf8mb4_general_ci to utf8mb4_0900_ai_ci (for details, see here and here).

ref. An in depth DBA’s guide to migrating a MySQL database from the utf8 to the utf8mb4 charset – Saverio Miroddi – 64K RAM SYSTEM  38911 BASIC BYTES FREE

Collations

mysql 8 で追加された collations

+----------------------------+
| Collation                  |
+----------------------------+
| utf8mb4_0900_ai_ci         |
| utf8mb4_de_pb_0900_ai_ci   |
| utf8mb4_is_0900_ai_ci      |
| utf8mb4_lv_0900_ai_ci      |
| utf8mb4_ro_0900_ai_ci      |
| utf8mb4_sl_0900_ai_ci      |
| utf8mb4_pl_0900_ai_ci      |
| utf8mb4_et_0900_ai_ci      |
| utf8mb4_es_0900_ai_ci      |
| utf8mb4_sv_0900_ai_ci      |
| utf8mb4_tr_0900_ai_ci      |
| utf8mb4_cs_0900_ai_ci      |
| utf8mb4_da_0900_ai_ci      |
| utf8mb4_lt_0900_ai_ci      |
| utf8mb4_sk_0900_ai_ci      |
| utf8mb4_es_trad_0900_ai_ci |
| utf8mb4_la_0900_ai_ci      |
| utf8mb4_eo_0900_ai_ci      |
| utf8mb4_hu_0900_ai_ci      |
| utf8mb4_hr_0900_ai_ci      |
| utf8mb4_vi_0900_ai_ci      |
+----------------------------+

The convention used here is:

  • Character-set name (utf8mb4)
  • Language identifier (de,is etc)
  • Unicode version (0900)
  • Accent and case sensitivity (ai_ci means case and accent insensitive)

ref. MySQL :: New collations in MySQL 8.0.0

collation の比較

+------------+-----------------------+---------+-----------+
| Expression | Collation             | Result  | Comment   |
+------------+-----------------------+---------+-----------+
| 'å'='a'    | utf8mb4_0900_ai_ci    | True    | Correct   |
| 'Å'='a'    | utf8mb4_0900_ai_ci    | True    | Correct   |
| 'Å'='å'    | utf8mb4_0900_ai_ci    | True    | Correct   |
+------------+-----------------------+---------+-----------+
| 'å'='a'    | utf8mb4_da_0900_ai_ci | False   | Correct   |
| 'Å'='a'    | utf8mb4_da_0900_ai_ci | False   | Correct   |
| 'Å'='å'    | utf8mb4_da_0900_ai_ci | True    | Correct   |
+------------+-----------------------+---------+-----------+
| 'å'='a'    | utf8_general_ci       | True    | Correct   |
| 'Å'='a'    | utf8_general_ci       | False   | Incorrect |
| 'Å'='å'    | utf8_general_ci       | False   | Incorrect |
+------------+-----------------------+---------+-----------+
| 'å'='a'    | utf8_danish_ci        | False   | Correct   |
| 'Å'='a'    | utf8_danish_ci        | True    | Incorrect |
| 'Å'='å'    | utf8_danish_ci        | False   | Incorrect |
+------------+-----------------------+---------+-----------+

ref. MySQL :: MySQL 8.0 Collations: The devil is in the details.

実際に試してみよう。

mysql> SELECT 'Å'='a' COLLATE utf8mb4_general_ci AS Result;
+--------+
| Result |
+--------+
|      0 |
+--------+
1 row in set (0.00 sec)

mysql> SELECT 'Å'='a' COLLATE utf8mb4_0900_ai_ci AS Result;
+--------+
| Result |
+--------+
|      1 |
+--------+
1 row in set (0.01 sec)

utf8mb4_0900_ai_ci is new default

utf8mb4_0900_ai_ciが新しいデフォルトになった。

MySQL 5.7 already provide a much newer utf8mb4_unicode_520_ci collation that handles characters in SMP correctly. The Sushi-Beer problem could be solved by switching to this collation. But in MySQL 8.0 we decided to take this a step further by introducing utf8mb4_0900_ai_ci, which is based on the latest Unicode standard, and we intend to set this collation as the default collation for utf8mb4.

MySQL :: Sushi = Beer ?! An introduction of UTF8 support in MySQL 8.0

実際に試してみると下記の通り。

mysql> create database foo;
Query OK, 1 row affected (0.01 sec)

mysql> SELECT schema_name, default_collation_name FROM information_schema.SCHEMATA;
+--------------------+------------------------+
| SCHEMA_NAME        | DEFAULT_COLLATION_NAME |
+--------------------+------------------------+
| mysql              | utf8mb4_0900_ai_ci     |
| information_schema | utf8mb3_general_ci     |
| performance_schema | utf8mb4_0900_ai_ci     |
| sys                | utf8mb4_0900_ai_ci     |
| foo                | utf8mb4_0900_ai_ci     |
+--------------------+------------------------+
5 rows in set (0.00 sec)

参考リンク