2023-07-25 MySQL 8.0 のdefault collationはutf8mb4_0900_ai_ci
MySQL 8.0 から default collation が utf8mb4_0900_ai_ci
になった。
MySQL 8.0 changed the utf8mb4 default collation from
utf8mb4_general_ci
toutf8mb4_0900_ai_ci
(for details, see here and here).
Collations
mysql 8 で追加された collations
+----------------------------+
| Collation |
+----------------------------+
| utf8mb4_0900_ai_ci |
| utf8mb4_de_pb_0900_ai_ci |
| utf8mb4_is_0900_ai_ci |
| utf8mb4_lv_0900_ai_ci |
| utf8mb4_ro_0900_ai_ci |
| utf8mb4_sl_0900_ai_ci |
| utf8mb4_pl_0900_ai_ci |
| utf8mb4_et_0900_ai_ci |
| utf8mb4_es_0900_ai_ci |
| utf8mb4_sv_0900_ai_ci |
| utf8mb4_tr_0900_ai_ci |
| utf8mb4_cs_0900_ai_ci |
| utf8mb4_da_0900_ai_ci |
| utf8mb4_lt_0900_ai_ci |
| utf8mb4_sk_0900_ai_ci |
| utf8mb4_es_trad_0900_ai_ci |
| utf8mb4_la_0900_ai_ci |
| utf8mb4_eo_0900_ai_ci |
| utf8mb4_hu_0900_ai_ci |
| utf8mb4_hr_0900_ai_ci |
| utf8mb4_vi_0900_ai_ci |
+----------------------------+
The convention used here is:
- Character-set name (
utf8mb4
)- Language identifier (
de
,is
etc)- Unicode version (
0900
)- Accent and case sensitivity (
ai_ci
means case and accent insensitive)
ref. MySQL :: New collations in MySQL 8.0.0
collation の比較
+------------+-----------------------+---------+-----------+
| Expression | Collation | Result | Comment |
+------------+-----------------------+---------+-----------+
| 'å'='a' | utf8mb4_0900_ai_ci | True | Correct |
| 'Å'='a' | utf8mb4_0900_ai_ci | True | Correct |
| 'Å'='å' | utf8mb4_0900_ai_ci | True | Correct |
+------------+-----------------------+---------+-----------+
| 'å'='a' | utf8mb4_da_0900_ai_ci | False | Correct |
| 'Å'='a' | utf8mb4_da_0900_ai_ci | False | Correct |
| 'Å'='å' | utf8mb4_da_0900_ai_ci | True | Correct |
+------------+-----------------------+---------+-----------+
| 'å'='a' | utf8_general_ci | True | Correct |
| 'Å'='a' | utf8_general_ci | False | Incorrect |
| 'Å'='å' | utf8_general_ci | False | Incorrect |
+------------+-----------------------+---------+-----------+
| 'å'='a' | utf8_danish_ci | False | Correct |
| 'Å'='a' | utf8_danish_ci | True | Incorrect |
| 'Å'='å' | utf8_danish_ci | False | Incorrect |
+------------+-----------------------+---------+-----------+
ref. MySQL :: MySQL 8.0 Collations: The devil is in the details.
実際に試してみよう。
mysql> SELECT 'Å'='a' COLLATE utf8mb4_general_ci AS Result;
+--------+
| Result |
+--------+
| 0 |
+--------+
1 row in set (0.00 sec)
mysql> SELECT 'Å'='a' COLLATE utf8mb4_0900_ai_ci AS Result;
+--------+
| Result |
+--------+
| 1 |
+--------+
1 row in set (0.01 sec)
utf8mb4_0900_ai_ci is new default
utf8mb4_0900_ai_ci
が新しいデフォルトになった。
MySQL 5.7 already provide a much newer
utf8mb4_unicode_520_ci
collation that handles characters in SMP correctly. The Sushi-Beer problem could be solved by switching to this collation. But in MySQL 8.0 we decided to take this a step further by introducingutf8mb4_0900_ai_ci
, which is based on the latest Unicode standard, and we intend to set this collation as the default collation for utf8mb4.
MySQL :: Sushi = Beer ?! An introduction of UTF8 support in MySQL 8.0
実際に試してみると下記の通り。
mysql> create database foo;
Query OK, 1 row affected (0.01 sec)
mysql> SELECT schema_name, default_collation_name FROM information_schema.SCHEMATA;
+--------------------+------------------------+
| SCHEMA_NAME | DEFAULT_COLLATION_NAME |
+--------------------+------------------------+
| mysql | utf8mb4_0900_ai_ci |
| information_schema | utf8mb3_general_ci |
| performance_schema | utf8mb4_0900_ai_ci |
| sys | utf8mb4_0900_ai_ci |
| foo | utf8mb4_0900_ai_ci |
+--------------------+------------------------+
5 rows in set (0.00 sec)