MyBB Community Forums

Full Version: [suggestion] What he said,
I don't know how to explain it, so he can.

Quote: You mean, does MyBB do the stupid str_replace thing?

I'll check...

[~5 minutes go by]

Why yes, it does handle BBCode badly. Not as badly as str_replace, lol, but it uses preg_replace, meaning you can do:


[i]italic [b]bold[/i] still bold![/b] -> <em>italic <strong>bold</em> still bold!</strong>

When it should be:
<em>italic <strong>bold</strong></em> still bold![/b]
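
To make the nesting problem concrete, here is a minimal sketch in Python, with `re.sub` standing in for PHP's `preg_replace`. The two patterns are illustrative assumptions, not MyBB's actual rules; they only show how independent per-tag replacements can emit overlapping HTML for the input `[i]italic [b]bold[/i] still bold![/b]`.

```python
import re

# Hypothetical per-tag rules: one independent regex per BBCode tag.
RULES = [
    (re.compile(r"\[b\](.*?)\[/b\]", re.S), r"<strong>\1</strong>"),
    (re.compile(r"\[i\](.*?)\[/i\]", re.S), r"<em>\1</em>"),
]

def regex_bbcode(text):
    """Apply each rule in turn; no rule knows anything about the other tags."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text

print(regex_bbcode("[i]italic [b]bold[/i] still bold![/b]"))
# <em>italic <strong>bold</em> still bold!</strong>
```

Each rule pairs its own opening and closing tag correctly, but because neither rule tracks the other, the resulting `<em>` and `<strong>` elements overlap.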


From what I can tell too, for some STUPID reason, they don't use htmlspecialchars on their messages. In fact, each time they parse messages, they do str_replace("<", "&lt;", $str); and the same for >...

Because you know it is more likely that a post will have HTML allowed than not. OH, WAIT! :P
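
For comparison, a small Python sketch of the difference being complained about. `html.escape` is used here as a rough analog of PHP's `htmlspecialchars`, and the helper name is hypothetical; the behaviour attributed to MyBB is only what the post above describes, not something checked against its source.

```python
import html

def escape_angle_brackets_only(s):
    # Mirrors the two str_replace calls described above: only < and > are touched.
    return s.replace("<", "&lt;").replace(">", "&gt;")

sample = '<a href="x">Tom & Jerry</a>'

print(escape_angle_brackets_only(sample))
# &lt;a href="x"&gt;Tom & Jerry&lt;/a&gt;

print(html.escape(sample, quote=True))
# &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```

The bracket-only version leaves `&` and quote characters untouched, which matters whenever the text ends up inside an HTML attribute or already contains entities.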


OT: http://mschat.net/forum/index.php?topic=2067.0

It's what people on the other side think. :P
Hmm.... Does he even know the term for what he's trying to talk about? It's called a lexical parser. Which, btw, still uses regular expressions. Any 21st century parser will use regular expressions no matter how you try to avoid them.

Does he also realize that the MyBB codebase is over 10 years old? PHP was on version 3, 10 years ago. OOP and more advanced concepts like the one he is trying to describe weren't even considered at that time.

Seems like he's having a hard time himself just simply writing his parser and getting it off the ground.
I wouldn't take it seriously enough to justify myself ;) Everyone who needs to know is aware of it. The younger people among us (most likely including the quoted person) can, well, stick to their stuff.
@Ryan
The term is irrelevant; the fact that MyBB uses regular expressions is relevant.

The BBCode parser I created (if you haven't guessed, I am the one who posted what Mark quoted) uses no regular expressions for the parsing itself. The only time the preg_match function is called is for the validation of attributes and whatnot.

Anyways, the way my BBCode parser works is as such:

Say I have this as a message:
[b]Hey everyone![/b]

The parser will go through the message, and convert the above into a structured array, something like this:
Array (
    Array (
        [text] => [b]
        [tag] => b
        [is_tag] => 1
        [closing] => 0
        [bbc] => Array (
            Information about parsing the BBCode, like before and after tags, makes things faster later
        )
    )
    Array (
        [text] => Hey everyone!
        [tag] =>
        [is_tag] => 0
    )
    Array (
        [text] => [/b]
        [tag] => b
        [is_tag] => 1
        [closing] => 1
    )
)

Once that is done, each tag is checked: if the tag is valid (and, if it has attributes, those are valid too, and so on), the HTML is not swapped in by search-and-replace (yes, you see, no regex!) but inserted as the parser goes along. The right output is appended to a buffer token by token: either the HTML, or the original BBCode tag put back as is.
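
A rough Python sketch of the approach described above, for readers who want to see it end to end. This is not the poster's code: the tag table, the function names, and the use of a stack of open tags to keep the HTML well nested are all assumptions made for illustration, and attribute handling and HTML escaping of the text are left out.

```python
# Hypothetical tag table: BBCode tag -> (opening HTML, closing HTML).
TAGS = {"b": ("<strong>", "</strong>"), "i": ("<em>", "</em>")}

def tokenize(message):
    """Split a message into tag and text tokens in one left-to-right pass."""
    tokens, pos = [], 0
    while pos < len(message):
        start = message.find("[", pos)
        end = message.find("]", start + 1) if start != -1 else -1
        if start == -1 or end == -1:
            # No complete [..] left: the rest is plain text.
            tokens.append({"text": message[pos:], "tag": "", "is_tag": False})
            break
        # Use the last "[" before the "]" so that "[not [b]" yields the tag "[b]".
        start = message.rfind("[", start, end)
        if start > pos:
            tokens.append({"text": message[pos:start], "tag": "", "is_tag": False})
        raw = message[start:end + 1]
        closing = raw.startswith("[/")
        name = raw[2:-1] if closing else raw[1:-1]
        if name in TAGS:
            tokens.append({"text": raw, "tag": name, "is_tag": True, "closing": closing})
        else:
            # Unknown tag: keep it as ordinary text.
            tokens.append({"text": raw, "tag": "", "is_tag": False})
        pos = end + 1
    return tokens

def render(tokens):
    """Append output to a buffer token by token, keeping tags properly nested."""
    out, open_tags = [], []
    for tok in tokens:
        if not tok["is_tag"]:
            out.append(tok["text"])
        elif not tok["closing"]:
            open_tags.append(tok["tag"])
            out.append(TAGS[tok["tag"]][0])
        elif tok["tag"] in open_tags:
            # Close any tags opened inside this one first, so the HTML never overlaps.
            while True:
                name = open_tags.pop()
                out.append(TAGS[name][1])
                if name == tok["tag"]:
                    break
        else:
            out.append(tok["text"])  # stray closing tag: put it back as is
    while open_tags:
        # Close anything the author left open.
        out.append(TAGS[open_tags.pop()][1])
    return "".join(out)

print(render(tokenize("[b]Hey everyone![/b]")))
# <strong>Hey everyone!</strong>
print(render(tokenize("[i]italic [b]bold[/i] still bold![/b]")))
# <em>italic <strong>bold</strong></em> still bold![/b]
```

The message is scanned once to build the token list and the token list is walked once to build the output, which is the "done just once" property the post is arguing for. Whether that beats a compiled regex engine in practice is a separate question that the sketch does not settle.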

So yeah...

@ray
In with the new, out with the old, as the old is old, and the new is new. Do you keep an old computer if you can get a new one? Heck no.

Also, why in the world does MyBB's registration CAPTCHA actually tell you if it is valid or not right on the registration page as you type? Do you know how... boy, I don't know how to say it nicely. So I won't say anything other than just bringing it up.
(2010-08-10, 03:40 AM)aldo Wrote: Anyways, the way my BBCode parser works is as such:

Say I have this as a message:
[b]Hey everyone![/b]

The parser will go through the message, and convert the above into a structured array, something like this:

Aka a lexical parser. What you're doing is called tokenizing. Which is exactly what a regular expression does...

All you've done is written a poor man's regular expression engine. Realize that? PHP's regular expression engine is written in C, which is a compiled language, so it'll be faster than anything you try to accomplish, and the regexes we pass through are cached and optimized using opcodes if you have your configuration set up right. Your solution offers none of that.

There's a reason why every piece of software on the entire web uses a form of regular expressions to do these types of things. Yours wouldn't scale.

Quote: Also, why in the world does MyBB's registration CAPTCHA actually tell you if it is valid or not right on the registration page as you type? Do you know how... boy, I don't know how to say it nicely. So I won't say anything other than just bringing it up.

You do realize that there is no difference, right? Either the spam bot presses submit and sees the same thing in essentially the same amount of time (the time difference is negligible), or an XMLHttp request is made, which is the same as pressing submit. It just allows web pages to be dynamic and saves on bandwidth.

You seem like a decent fellow, but you're not correct.
However, when you do it with regex, you repeatedly apply the regex until it is found no more, right? Does that not mean you are repeatedly searching the same thing over and over and over again, when it could be done just once?

Also, as said, unless you have it go through and validate the parsed BBCode, you're spitting out invalid HTML, which browsers still interpret, but isn't very good.

As for the CAPTCHA thing, sure, while spambots can still submit the registration and figure out if it is correct on the CAPTCHA or not, all the bot has to do is send a request to the page which responds with whether or not the CAPTCHA is correct, making it faster for the bots.

Does it have flood protection?
(2010-08-10, 04:28 AM)aldo Wrote: However, when you do it with regex, you repeatedly apply the regex until it is found no more, right?

No. One preg_replace will cover all of that in one go. The only exception is if you need the BBCode to be nestable, and even then it's still faster than your method.
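
A small Python sketch of the distinction drawn here, with `re.subn` standing in for `preg_replace`. The `[quote]` pattern is a hypothetical example rather than MyBB's actual rule, and the sketch says nothing about relative speed: it only counts how many substitution passes are needed.

```python
import re

# Matches only an innermost [quote]...[/quote] pair (no quote tags inside it).
INNERMOST_QUOTE = re.compile(
    r"\[quote\]((?:(?!\[/?quote\]).)*)\[/quote\]", re.S
)

def replace_quotes(text):
    """Return (html, passes): re-run the substitution until nothing changes."""
    passes = 0
    while True:
        text, count = INNERMOST_QUOTE.subn(r"<blockquote>\1</blockquote>", text)
        if count == 0:
            return text, passes
        passes += 1

# Sibling tags: every occurrence is handled by a single call.
print(replace_quotes("[quote]a[/quote] and [quote]b[/quote]"))
# ('<blockquote>a</blockquote> and <blockquote>b</blockquote>', 1)

# Nested tags: one productive pass per nesting level.
print(replace_quotes("[quote]a [quote]b[/quote] c[/quote]"))
# ('<blockquote>a <blockquote>b</blockquote> c</blockquote>', 2)
```

A single call replaces every non-overlapping match, so repeated passes are only needed for tags nested inside themselves, and then the number of productive passes equals the nesting depth.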

(2010-08-10, 04:28 AM)aldo Wrote: Does that not mean you are repeatedly searching the same thing over and over and over again, when it could be done just once?

Even in that case, it's still faster than a non-pre-compiled, non-pre-cached, non-optimized, non-opcode, poor man's tokenizer.

Even if in the end your code takes fewer cycles to run than a regular expression, because it's PHP code, the Zend Engine is going to parse it on the fly, which is by its nature much slower than compiled C code.

(2010-08-10, 04:28 AM)aldo Wrote: Also, as said, unless you have it go through and validate the parsed BBCode, you're spitting out invalid HTML, which browsers still interpret, but isn't very good.

Malformed BBCode is malformed. Making valid HTML out of malformed BBCode isn't my number one priority for the time I spend working on MyBB. I would rather have it simply not parse malformed BBCode, but again, that level of logic is slow and not a high priority based on what our users want.

(2010-08-10, 04:28 AM)aldo Wrote: As for the CAPTCHA thing, sure, while spambots can still submit the registration and figure out if it is correct on the CAPTCHA or not, all the bot has to do is send a request to the page which responds with whether or not the CAPTCHA is correct, making it faster for the bots.

Does it have flood protection?

Again, the speed difference is negligible either way.

(2010-08-10, 04:28 AM)aldo Wrote: Does it have flood protection?

Flood protection is a firewall issue, not a software issue. Any implementation of flood protection would just simulate a firewall. No need to reinvent the wheel.

Do you realize that once a CAPTCHA has been called, it is not possible to use that specific CAPTCHA again for at least about a week (depending on how fast the task that cleans out the CAPTCHA blacklist table is triggered)? With 1.45519152 × 10^25 possible combinations, you're not going to run out of possibilities.
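
As a quick arithmetic check in Python: the figure quoted above is exactly 5^36. That is a different quantity from the number of codes of length 5 over a 36-symbol alphabet, which would be 36^5. The thread does not state the CAPTCHA's length or alphabet, so the second calculation uses assumed values purely for comparison.

```python
# The quoted figure, written out exactly.
quoted = 5 ** 36
print(quoted)           # 14551915228366851806640625
print(f"{quoted:.8e}")  # 1.45519152e+25

# Assumed values for comparison only: 5 characters from a 36-symbol alphabet.
assumed_length = 5
assumed_alphabet = 36
print(assumed_alphabet ** assumed_length)  # 60466176
```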

This prevents spam bots from finding a CAPTCHA that they can read and using it consistently. Even if they queried the CAPTCHA generator page directly, the bot couldn't then use that CAPTCHA again.

You also can't derive the CAPTCHA from the imagehash that is correlated with it. They're made from unique, statistically unpredictable entropy.
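
One way to picture the single-use scheme described above is the in-memory Python sketch below. It is not MyBB's implementation: the class and method names, the 6-character hex code, and the one-week retention default are all assumptions for illustration, and a real forum would keep this state in database tables.

```python
import secrets
import time

class SingleUseCaptchaStore:
    """In-memory stand-in for a CAPTCHA table plus a blacklist of used codes."""

    def __init__(self, retention_seconds=7 * 24 * 3600):
        self.retention = retention_seconds
        self.pending = {}  # image hash -> expected answer
        self.used = {}     # answer -> time it was issued

    def issue(self, now=None):
        now = time.time() if now is None else now
        while True:
            answer = secrets.token_hex(3)  # hypothetical 6-character code
            if answer not in self.used:    # never hand out a blacklisted code
                break
        # The hash is random, so it reveals nothing about the answer.
        image_hash = secrets.token_hex(16)
        self.pending[image_hash] = answer
        self.used[answer] = now
        return image_hash, answer

    def check(self, image_hash, guess):
        # pop() removes the entry, so each issued CAPTCHA can be checked only once.
        answer = self.pending.pop(image_hash, None)
        return answer is not None and answer == guess

    def cleanup(self, now=None):
        """Periodic task: forget blacklisted codes older than the retention window."""
        now = time.time() if now is None else now
        self.used = {a: t for a, t in self.used.items() if now - t < self.retention}

store = SingleUseCaptchaStore()
image_hash, answer = store.issue()
print(store.check(image_hash, answer))  # True
print(store.check(image_hash, answer))  # False: the same CAPTCHA cannot be reused
```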

Enjoy.
(2010-08-10, 03:40 AM)aldo Wrote: @ray
In with the new, out with the old, as the old is old, and the new is new. Do you keep an old computer if you can get a new one? Heck no.

I suppose the answers you received couldn't have been phrased any better.

Your attitude is quite cheeky and unpleasant. Hopefully you learn that prejudging (and in an offensive way on top of that) out of too little knowledge doesn't pay off but makes you look like one of those little script kiddies.
(2010-08-18, 12:34 AM)ray187 Wrote:
(2010-08-10, 03:40 AM)aldo Wrote: @ray
In with the new, out with the old, as the old is old, and the new is new. Do you keep an old computer if you can get a new one? Heck no.

I suppose the answers you received couldn't have been phrased any better.

Your attitude is quite cheeky and unpleasant. Hopefully you learn that prejudging (and in an offensive way on top of that) out of too little knowledge doesn't pay off but makes you look like one of those little script kiddies.

Dude, leave us alone. Like Ryan said. We have our priorities. Go code your own bbsystem. I for one am extremely grateful for all Ryan's hard work, and I am sorry that he has to deal with people who think they know it all.

Thank you, Ryan. I will stick around as long as you offer a great product. I am in charge of another script that is coming along slowly, but even if I finish that I will probably always have a forum with MyBB to play around with.