MyBB Community Forums

Full Version: utf8mb4
Just a tip for anyone trying to convert their DB from the current default of utf8, perhaps to add support for emoji: do not use utf8mb4_unicode_ci for your collation. I found out the hard way that you will run into problems with the username column of the mybb_users table, because unicode_ci considers certain characters to be exactly the same.

Example:
ℬasic
Basic

So the UNIQUE index for that column will not take when two such usernames already exist. If you remove the UNIQUE index, then member logins, and possibly other functions, won't work properly.
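
If you want to check for clashing usernames before converting, here is a minimal Python sketch. It only approximates what unicode_ci does (compatibility decomposition, dropping accents, case folding) and is not MySQL's actual comparison, so treat its results as a hint. The function names approx_key and find_collisions are made up for this example.

```python
import unicodedata
from collections import defaultdict


def approx_key(name: str) -> str:
    """Rough stand-in for a case- and accent-insensitive Unicode collation key.

    Assumption: NFKD + dropping combining marks + casefold is only an
    approximation of utf8mb4_unicode_ci, not the real algorithm.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def find_collisions(usernames):
    """Group usernames that share the same approximate key."""
    groups = defaultdict(list)
    for name in usernames:
        groups[approx_key(name)].append(name)
    # Only groups with more than one member would clash on a UNIQUE index.
    return [names for names in groups.values() if len(names) > 1]


if __name__ == "__main__":
    print(find_collisions(["ℬasic", "Basic", "Alice"]))
```

Feed it the usernames exported from mybb_users; any group it returns is a candidate for the duplicate-key problem described above.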

Anyway, it's been a fun day for me dealing with this and trying to convert my database. I thought I'd share so that maybe someone else will be saved some headache.
What collation did you settle on?
(2020-01-05, 07:10 AM) Ben Cousins wrote: What collation did you settle on?

I am using utf8mb4_general_ci
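Yeah, I have faced the exact same problem.

When MySQL enforces a UNIQUE key, it seems to compare values by the collation's sort weights, so any characters the collation treats as equal will collide. See https://stackoverflow.com/a/53388427/6681141 for an example of how the collations differ: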
utf8_polish_ci      Ł greater than L and less than M
utf8_unicode_ci     Ł greater than L and less than M
utf8_unicode_520_ci Ł equal to L
utf8_general_ci     Ł greater than Z

So for anyone who wants every Latin letter and its variants (maybe including some Greek letters, and the half-width/full-width letters used in CJK) to be allowed as distinct characters in their forum's usernames, avoid utf8/utf8mb4_unicode_ci and utf8/utf8mb4_unicode_520_ci, as the OP suggested.
utf8/utf8mb4_general_ci and utf8/utf8mb4_bin are both good candidates. Keep in mind that _bin is also case-sensitive, so "Basic" and "basic" would count as different usernames.
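
For the conversion itself, here is a small Python sketch that only builds the ALTER TABLE ... CONVERT TO statement for a table; it does not connect to or change any database. The helper name convert_statement and its defaults are assumptions for illustration, not something MyBB ships. Back up your database and test on a copy first.

```python
import re

# Allow only plain identifiers so nothing unexpected ends up in the SQL text.
_IDENTIFIER = re.compile(r"[A-Za-z0-9_]+")


def convert_statement(table, charset="utf8mb4", collation="utf8mb4_general_ci"):
    """Return a MySQL statement converting one table's charset and collation.

    Hypothetical helper: it only formats a string, running it is up to you.
    """
    for value in (table, charset, collation):
        if not _IDENTIFIER.fullmatch(value):
            raise ValueError(f"unexpected identifier: {value!r}")
    return (
        f"ALTER TABLE `{table}` "
        f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
    )


if __name__ == "__main__":
    print(convert_statement("mybb_users"))
```

Swap the collation argument for utf8mb4_bin if you prefer the binary comparison.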