MyBB Community Forums

Full Version: (aw)Full utf8 support in 1.2?
Pages: 1 2
I admire your efforts to switch to utf8 (having in mind those scanf/printf PHP functions), but sometimes it's better to either test such ideas completely in advance or not include them at all:
  • First, you should have warned people about this (at least in the changelog), as some use languages other than English and this (upgrade.php) could totally screw up their data.
  • Second, you should set collations in the DB schema for the new install (which I had to rely on, as it turned out that updating from 1.1.8 is a hell of a nightmare).
  • And of course, third ... you should set the DB connection's collation to UTF8 too (e.g. in db_mysql.php)! Not all servers are configured to support this by default, so it is imperative to select it...

I suggest modifying db_mysql's connect function like this:
function connect($hostname="localhost", $username="root", $password="", $pconnect=0)
{
	// Open a persistent or a regular connection, as requested
	if($pconnect)
	{
		$this->link = @mysql_pconnect($hostname, $username, $password) or $this->dberror();
	}
	else
	{
		$this->link = @mysql_connect($hostname, $username, $password) or $this->dberror();
	}
	// Force utf8 on this connection instead of relying on the server defaults.
	// SET NAMES sets the client, connection and result character sets.
	mysql_query('SET NAMES utf8;', $this->link);
	mysql_query('SET CHARACTER SET utf8;', $this->link);
	mysql_query('SET CHARACTER_SET_DATABASE = utf8;', $this->link);
	return $this->link;
}

And an example for the DB schema as follows:

CREATE TABLE `mybb_adminlog` (
  `uid` int(10) unsigned NOT NULL default '0',
  `dateline` bigint(30) NOT NULL default '0',
  `scriptname` varchar(50) collate utf8_general_ci NOT NULL default '',
  `action` varchar(50) collate utf8_general_ci NOT NULL default '',
  `querystring` varchar(150) collate utf8_general_ci NOT NULL default '',
  `ipaddress` varchar(50) collate utf8_general_ci NOT NULL default ''
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;

Another little bug is in the encoding of settings.php. If you type in the forum's name in a language other than English (via install.php), you will end up with "??? ???? ??????" when your forum displays it directly in utf8.

BTW, I've noticed there is a slight difference between those functions in db_mysql.php and db_mysqli.php: the latter does not check for persistent connections. Does this have to do with the fact that db_mysqli.php (under Windows) uses ODBC (there goes "improved" MySQL support)?!?

P.S. It is not funny to see reasonable posts like "either reinstall or forget about updating", esp. for people who use your forum on several sites. I hope the next "major" update won't be such a pain in the lime.
I like this forum and its functionality, but if I have to spend several days syncing databases manually each time you release "updates", it will force me to switch to another forum...

Edit: For those who would argue about statically setting the encoding to utf8 - I would like to remind them to check what UTF8 was created for...
Quote: P.S. It is not funny to see reasonable posts like "either reinstall or forget about updating", esp. for people who use your forum on several sites. I hope the next "major" update won't be such a pain in the lime.
I don't see anybody suggesting to either reinstall or not upgrade.

The only reason this is a major upgrade process is that a lot of internal code structures, templates and themes were rewritten to bring them up to a proper standard. The next major release will work perfectly fine when upgraded.

As for the differences in db_mysqli.php - MySQLi does not support persistent connections (well, I can't find a function for it in the PHP documentation).

I'll look into the UTF8 business too, but so far it works everywhere I've tested it.
About those "reasonable posts": it was an example, not a quote, sorry. I was getting the impression (from the MyBB 1.2 Released thread) that there is no other way to update, before I'd even tried it (afterwards I was sure).

I'm not saying you haven't tested it - I just mind such a feature not being tested thoroughly...

Also, it won't hurt to add accept-charset="UTF-8" to all HTML forms, to ensure international text doesn't get messed up in post processing by some misconfigured servers (or maybe browsers), and to remove the charset setting from the language settings.

Another bug found:
Cropping titles in the forums list is a good thing... unless they're UTF8 encoded and you use single-byte functions for the purpose... Guess what happens to text in the Last post field for example...
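To illustrate the cropping problem, here is a minimal sketch, assuming the titles are stored as UTF-8. crop_title_utf8() is a made-up helper name, not MyBB's own code; it relies on PCRE's /u modifier, so it does not need the mbstring extension.

```php
<?php
// Hypothetical helper for illustration only; not a MyBB function.
function crop_title_utf8($title, $max_chars)
{
    $max_chars = max(0, (int)$max_chars);
    // With the /u modifier, "." matches one whole UTF-8 character, not one byte.
    if (preg_match('/^.{0,' . $max_chars . '}/us', $title, $match)) {
        return $match[0];
    }
    // preg_match() fails on invalid UTF-8; leave the input untouched in that case.
    return $title;
}

$title = "Добро пожаловать";            // 16 characters, 31 bytes in UTF-8

// Single-byte cropping: 5 bytes ends in the middle of the third letter.
$byte_cropped = substr($title, 0, 5);

// Character-aware cropping: 5 whole characters.
$char_cropped = crop_title_utf8($title, 5);
```

The byte-cropped string is no longer valid UTF-8, which is what shows up as garbage at the end of a cropped title in the Last post field.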
Give them some time to breathe.. The most annoying thing I find in PHP is internationalization! (and to be frank, I never implemented it completely in my scripts!). Maybe they need more people to test it with different languages.

But it's nice you are trying to help find bugs n' hence, helping make the software better.
If you ask me, the most annoying thing is when you try to update to a "final" version and it turns everything upside-down...

Based on my recent experience I can definitely conclude that MyBB 1.2 is not UTF-8 compatible!!! So, if your forum works without utf, don't even think of switching! Here is a short list of the problems discovered (to help the developers):
  • Database is not collated to utf8
  • Server connections are not set to utf8
  • upgrade.php corrupts the database, for reasons similar to the two previous problems
  • settings.php is not utf8 encoded
  • Threads in forum lists are being cropped using the default charset for the respective functions (resulting in defects in their display)
  • And the most recent one - Search engine (with functions like clean_keywords) corrupts encoding of the keywords and the search itself.
  • UPDATE: forum's data cache functions are not fully compatible with older PHP4/MySQL4 versions and utf8 non-latin text...

In less than a week I found more than 5 serious bugs which, most annoyingly, are being ignored by testers and/or developers. I hope this attitude changes in future, towards better international compatibility, or you risk losing many non-English users...

P.S. About the search problem - the fix is to replace all strtolower calls with mb_strtolower (and, if you haven't set a default encoding, add a 'UTF-8' encoding parameter) in functions_search.php.
q-tech is right! I just updated my forum to 1.2: WHAT A MESS!!!! I mean, look at your own forum: http://community.mybboard.net/showthread...455&page=1 here you can see the extent of the damage that will be done if you have a forum that uses accents everywhere...
q-tech,

I've sent you a private message regarding all of this so we can attempt to get it sorted out.
I use English 99% of the time. On my own forum I already set up UTF-8 to be the default for my database and web pages, so as a beta tester I didn't see any problems with the few accents that I did use.
laie_techie Wrote:I use English 99% of the time.  On my own forum I already set up UTF-8 to be the default for my database and web pages, so as a beta tester I didn't see any problems with the few accents that I did use.
In other words - you've prepared (customized) your forum and used only English to test UTF-8 compatibility. If all tests were done in this fashion - it's a wonder that MyBB works so well...

I've also resolved the thread list bug - it is related to PHP's default mbstring charset setting. All you need is to specify the internal encoding somewhere (preferably in index or the classes) using the translation's settings (e.g. english.php), something like:
mb_internal_encoding($langinfo['charset']);
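As a rough sketch of what that call changes, assuming the language file sets the charset to UTF-8: the $langinfo value below is a stand-in, and the function_exists() guard is an addition for servers without mbstring.

```php
<?php
// Stand-in for the array a language file such as english.php defines; the value is assumed.
$langinfo = array('charset' => 'UTF-8');

$word = "ñandú";                 // 5 characters, 7 bytes in UTF-8
$byte_length = strlen($word);    // strlen() counts bytes

$char_length = null;
if (function_exists('mb_internal_encoding')) {
    // Every mb_* call that omits its encoding argument now uses this charset.
    mb_internal_encoding($langinfo['charset']);
    $char_length = mb_strlen($word); // counts characters, not bytes
}
```

Without the mb_internal_encoding() call, mb_* functions use whatever default the server's PHP configuration provides, which is why the bug shows up on some hosts and not others.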

I think I've provided developers with sufficient info in this thread to fix those bugs and make MyBB fully UTF-8 compatible.

Also they must reconsider (and replace) all strtolower and sprintf functions (as they've done with substr) which handle user data, and add that HTML encoding tag to all forms...
Don't rip on Beta testers like laie_techie. They're just everyday users that were invited to test on their own forums. Keep your beefs polite and with the developers. Sheesh, Chris is even discussing this with you privately, I don't see why you've gotta continue being irate.

q-tech Wrote:...the fix is to replace all strtolower with mb_strtolower...

Lots of people don't have multibyte (mbstring) functionality, so you'd need to check and make sure they do, and if not, then fall back to the original PHP functions.
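One way to do that check, as a sketch only: lower_utf8() is a hypothetical wrapper name, not an existing MyBB function, and the fallback branch is byte-oriented, so it should only be trusted for ASCII letters.

```php
<?php
// Hypothetical wrapper for illustration; the name is not taken from MyBB.
function lower_utf8($string)
{
    if (function_exists('mb_strtolower')) {
        // mbstring is available: lowercase with an explicit encoding.
        return mb_strtolower($string, 'UTF-8');
    }
    // No mbstring: fall back to the original function.
    // Non-ASCII letters will not be lowercased correctly here.
    return strtolower($string);
}

$keyword = lower_utf8("MyBB");
```

On servers without mbstring, searches for non-Latin keywords would still be case-sensitive with this approach, but the forum keeps working instead of failing on an undefined function.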