MyBB Community Forums

Full Version: error merging post from smf 1.1.11
I've got a problem with the merge system...
Posts are cut off at the first emoticon in the text...

example:
http://i37.tinypic.com/2l8i8so.jpg

???

(sorry for my AWFUL English... -.-)
Doc, can you test the latest SVN trunk and see if you're still having this issue?

If so I'll open a new bug report on the tracker.
Hi Dylan, thanks for your reply. I'm working with Doc90 for this conversion.
I just tried with the latest version from SVN, and it still truncates messages when it encounters emoticons.
I've uploaded a mini dump of the SMF board we are converting, with only 2 topics and a bunch of messages, to help the team figure out the problem.
<snip>
(for reference, these are the original topics from the live SMF board: 1st, 2nd )
It seems that the merge system goes into panic with &nbsp; and smileys with double quotes around them such as :asd:
Ok, thanks. I'll look into this as soon as possible. Perhaps this evening if I have time. Also keep an eye on the bug tracker, I may not post here if I insert it as a new issue on the tracker.
we'll keep an eye on the tracker too.
In my tests with the mini dump you can always replicate the problem, but let us know if you need anything else.
Thanks again
Ok guys, I cannot reproduce this with the latest SVN trunk. Everything in your test threads comes across fine.

However, one problem I noticed is that even the sample database you sent doesn't have all tables using the same encoding. Some use utf8_general_ci, others use latin1_swedish_ci. That is going to cause problems somewhere along the line, especially with your table collation set to latin1_swedish_ci.

I did, however, notice another BB Code parser error with this conversion. Embedded quotes aren't coming across properly, and the word "link" is being appended to everyone's names. So I'll get that fixed as soon as possible.

Anyway, even with your database collation problem, it still converted all posts properly for me.
I just noticed that the error occurs when "Automatically convert messages to UTF8?" is set to "No".
I had to set it to No because otherwise letters such as àèìòù, which are common in Italian, are replaced with garbage.
I don't think this is related to the encoding problems you noticed. The only tables set to latin1_swedish_ci are used by some mods such as feedbot, pretty urls, profile comments and visual warning, which do not interfere with the posts table (at least not with the posts included in the mini dump I linked before) and are not used by the merge system. Everything else is correctly set to utf8_general_ci.
Do you think we have some inconsistency with the DB encoding (the table declaration says it is utf8, but the data really is latin1 or something else), or is the merge system failing?

I've also opened two bug reports in the issues tracker, for problems encountered while trying the merge system from trunk with the full DB.
I actually saw the line 115 error in my own testing today. Not sure why converting to utf8 is creating garbage characters if you use it, unless it's because of your overall schema being set to latin1_swedish. I'm not a utf expert so I'll probably bring Ryan in on this one to explain it to me. I didn't notice any corrupted characters either, all looked correct to me. No garbage. Which just creates more confusion :)
So,

* Proper character conversion depends on the right character set on both ends, i.e. even if the table itself is in latin1 format, the characters may have been inserted in utf8 format.
- So therefore:
-- The MySQL table needs to be in the right format
-- The MySQL connection and charset need to be in the right format
-- PHP needs to interpret it in the right format
-- The text needs to be inserted in the right format

If any of these aren't right then it will cause the problems described. It's very, very hard to infer a character encoding from the raw characters themselves, which is why the Merge System sometimes has a difficult time converting it properly.

Thanks,
Ryan
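The point above about matching character sets at every layer can be illustrated with a minimal sketch. It is not taken from the Merge System (which is PHP); it is plain Python, and only the sample string àèìòù comes from this thread. The helper name `decodes_as_utf8` is made up for the example.

```python
# Illustrative sketch: only the sample string comes from the thread.
text = "àèìòù"

utf8_bytes = text.encode("utf-8")      # bytes a UTF-8 writer would store
cp1252_bytes = text.encode("cp1252")   # bytes a Western European writer would store

# UTF-8 bytes read as cp1252 decode without error but come out as garbage.
garbled = utf8_bytes.decode("cp1252")


def decodes_as_utf8(raw: bytes) -> bool:
    """Return True if raw is a valid UTF-8 byte sequence (hypothetical helper)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(repr(garbled))
print(decodes_as_utf8(utf8_bytes), decodes_as_utf8(cp1252_bytes))
```

The garbled value is itself a perfectly valid string, which is why a wrong setting at a single layer tends to produce garbage rather than an error, and why the encoding is hard to guess after the fact.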
Thanks for your support.
It seems we had tables declared as utf8 that actually held data encoded as latin1. Something must have gone wrong during the various updates in this SMF board's history.
Running the merge system with "cp1252 West European" as the table encoding and with the automatic conversion to UTF8 enabled didn't truncate any posts and didn't create any weird characters.
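For anyone hitting the same thing, here is a rough Python sketch of why this setting may have helped. It assumes the stored bytes really are cp1252 and that some step in the pipeline keeps only the valid UTF-8 prefix of a message; that second part is a guess, not something confirmed in this thread. The names `to_utf8` and `keep_valid_utf8_prefix` and the sample message are hypothetical.

```python
def to_utf8(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Re-encode bytes stored in source_encoding as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


def keep_valid_utf8_prefix(raw: bytes) -> str:
    """Mimic a consumer that stops at the first invalid UTF-8 byte (assumed behaviour)."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        return raw[:err.start].decode("utf-8")


# Hypothetical message: a non-breaking space (what &nbsp; stands for) and an accented letter.
message = "Ciao\xa0a tutti, è ok"
stored = message.encode("cp1252")  # data that is "really latin1/cp1252"

# Treated as UTF-8 without conversion, the message is cut at the first high byte.
print(repr(keep_valid_utf8_prefix(stored)))

# Converted from cp1252 first, the whole message survives.
print(repr(keep_valid_utf8_prefix(to_utf8(stored))))
```

Declaring the real source encoding and converting once, as done here with cp1252, avoids both the truncation and the garbage characters in this sketch.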