MyBB Community Forums

Full Version: Text formatting in MyBB 2.0: a proposal
Hi!

I'm the author of s9e\TextFormatter, a text formatting library that supports BBCodes and most other features found in forums, implemented as plugins. My goal is to have it used by forum software and other commenting systems, and I'm currently contacting open source projects like MyBB to see how we can work together.

Even though s9e\TextFormatter's code is fairly mature (years of development and thousands of unit tests) it's a hobby project and I'm still figuring out how to make it more accessible to open source projects like MyBB. I've only started writing a cookbook of short functional examples recently, and there's a lot of material to cover so I think it's more efficient for me to just go ask developers what they're looking for so that I can prioritize it accordingly.

MyBB 2.0 is in the early stages of development, so I see it as an opportunity not only for you to benefit from offloading the development of your text processing but also for me to draw from your feedback and better discern what kind of text formatting the next generation of forums want.

I don't want to post an even bigger wall of text, so I'll summarize the main points below. Every text formatting library out there claims the same qualities ("fast! secure! easy!") and it's hard to highlight what makes s9e\TextFormatter different because everyone has different expectations. If there's something that I should have included but didn't, please feel free to ask. Otherwise, there's more info in this document and a working example of a basic parser there.
  • Plugin-based. Look in the Plugins directory on GitHub; there are a couple of examples in each subdirectory.
  • The BBCodes plugin is very flexible and robust. It doesn't get tripped up by malformed or misnested tags. It supports and extends the same custom BBCode syntax as phpBB's, meaning that users can create their own BBCodes without touching the code. Also, there are safeguards in place to prevent unsafe BBCodes from being created.
  • There's also a plugin (unimaginatively called Generic) that performs generic replacements similar to MyBB's MyCode. As with custom BBCodes, unsafe replacements (e.g. unfiltered content in an onclick attribute, or in a URL) are detected.
  • There's no Markdown plugin yet, but it's on my long-term TODO list and I can reprioritize it on demand.
  • The formatting is done on an internal representation of the text. It doesn't alter the text via preg_replace() or similar, meaning that a bad regexp won't leak unfiltered input or create an XSS vector. Input is always filtered and XSS is taken seriously.
  • Continuously tested on Travis CI.
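
To make the "internal representation" point a bit more concrete, here is a toy sketch in Python. It is not s9e\TextFormatter's code or API; the ALLOWED whitelist, the token format and the function names are all made up for illustration. The idea it shows is simply that the text is first turned into tokens, and HTML is only produced at render time, where every piece of plain text goes through the same escaping step.

```python
import html
import re

# Hypothetical whitelist for this sketch: BBCode name -> HTML element name.
ALLOWED = {"b": "strong", "i": "em"}

TAG_RE = re.compile(r"\[(/?)([a-z]+)\]", re.IGNORECASE)


def parse(text):
    """Turn raw text into ('text', str), ('open', name) and ('close', name) tokens."""
    tokens, pos, stack = [], 0, []
    for m in TAG_RE.finditer(text):
        closing, name = m.group(1) == "/", m.group(2).lower()
        if name not in ALLOWED:
            continue  # unknown tag: it stays part of the plain text
        if closing and (not stack or stack[-1] != name):
            continue  # unmatched or misnested end tag: it stays plain text
        if m.start() > pos:
            tokens.append(("text", text[pos:m.start()]))
        if closing:
            stack.pop()
            tokens.append(("close", name))
        else:
            stack.append(name)
            tokens.append(("open", name))
        pos = m.end()
    if pos < len(text):
        tokens.append(("text", text[pos:]))
    while stack:
        # Close whatever the user left open so the output stays well-formed.
        tokens.append(("close", stack.pop()))
    return tokens


def render(tokens):
    """Render tokens to HTML; plain text is always escaped, tags come from the whitelist."""
    out = []
    for kind, value in tokens:
        if kind == "text":
            out.append(html.escape(value))
        elif kind == "open":
            out.append("<" + ALLOWED[value] + ">")
        else:
            out.append("</" + ALLOWED[value] + ">")
    return "".join(out)
```

With this sketch, `render(parse("[b]<script>[/b]"))` gives `<strong>&lt;script&gt;</strong>`: user input never reaches the output without passing through the escaping step, no matter what the tag regexp does or doesn't match.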

If you have any questions, please feel free to ask.
If you want to get integrated with software such as forums, I would look at getting it into frameworks, as that's the way open source forum software is moving for its next big releases.

But I don't know enough about this to say more than that. :P
I would definitely support this, especially if a standardized BBCode format (think: W3C and HTML) was decided upon by the major forum software packages (MyBB, phpBB, VB, XF, IPB, etc). I know that someone on the team tried starting that a few months ago (Paul?), don't know what became of it though.

AFAIK, the MyCode parser is currently full of hacks and deserves a rewrite. I think Chris mentioned support for Markdown, but don't quote me on that. :)
If you want to see this in MyBB, make a plugin for it, for MyBB 1.6 or 1.8.

When you add emoticons for ":|", "::|", "::", "||", which one does your parser use for "::||"?
(2013-06-30, 02:35 AM) Alex Smith Wrote: If you want to get integrated with software such as forums, I would look at getting it into frameworks, as that's the way open source forum software is moving for its next big releases.

It's already available via Composer (although I'm not advertising it for the time being) so it should be usable as an independent component, but I'll look into frameworks later down the road.

(2013-06-30, 05:07 AM) Seabody Wrote: I would definitely support this, especially if a standardized BBCode format (think: W3C and HTML) was decided upon by the major forum software packages (MyBB, phpBB, VB, XF, IPB, etc).

Some sort of BBCode standardization would be nice, but I don't really see that happening in this life. That's why I made the BBCode parser flexible enough to understand most of the BBCode syntaxes used in those forums (and I'm thinking about supporting WordPress's shortcodes too.) For instance, here in MyBB quote tags start with the author's name, followed by values for pid and dateline, each enclosed in single quotes, whereas in vBulletin it's just the author's name, not in quotes (even if the name contains spaces) and I think it's sometimes followed by a semicolon and a post number. Or perhaps I'm confusing it with another forum system; I've seen a lot of different syntaxes during research. :)

The BBCode parser in s9e\TextFormatter can parse either form, or even both at the same time. So while a bit of standardization (even if it's only the syntax) would be welcome, until that happens I'll have the parser bend over backwards to accommodate the syntaxes that are already in use.
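
As a rough illustration of accepting both quote syntaxes at once, here is a small Python sketch. It is not how s9e\TextFormatter does it, and it only looks at the opening tag. The two patterns follow the descriptions above (MyBB: quoted author plus quoted pid and dateline; vBulletin: unquoted author, optionally followed by a semicolon and a post number), and the "post" key name is my own invention for the example.

```python
import re

# MyBB-style: [quote='Author' pid='123' dateline='1372634700']
MYBB_STYLE = re.compile(r"\[quote='(?P<author>[^']*)'(?P<rest>[^\]]*)\]", re.IGNORECASE)
# name='value' pairs that may follow the author in the MyBB-style form.
PAIR = re.compile(r"(\w+)='([^']*)'")
# vBulletin-style as described above: [quote=Author Name] or [quote=Author Name;456]
VB_STYLE = re.compile(r"\[quote=(?P<author>[^;\]']+)(?:;(?P<post>\d+))?\]", re.IGNORECASE)


def parse_quote_tag(tag):
    """Return a dict of attributes for an opening quote tag, or None if it isn't one."""
    m = MYBB_STYLE.fullmatch(tag)
    if m:
        attrs = {"author": m.group("author")}
        attrs.update(PAIR.findall(m.group("rest")))
        return attrs
    m = VB_STYLE.fullmatch(tag)
    if m:
        attrs = {"author": m.group("author").strip()}
        if m.group("post"):
            attrs["post"] = m.group("post")  # hypothetical key name
        return attrs
    return None
```

Both `parse_quote_tag("[quote='JoshyPHP' pid='123' dateline='1372634700']")` and `parse_quote_tag("[quote=Alex Smith;456]")` return a dict with an "author" entry, so whatever renders the quote doesn't need to care which syntax the user typed.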

(2013-06-30, 06:44 PM) frostschutz Wrote: If you want to see this in MyBB, make a plugin for it, for MyBB 1.6 or 1.8.

When you add emoticons for ":|", "::|", "::", "||", which one does your parser use for "::||"?

s9e\TextFormatter requires PHP 5.3+ so I don't know if it's a good fit for a 1.x version. If anybody wants to write a plugin for 1.x, I'll give them all the support they need from my side of the code as long as they take care of the MyBB side of it.

I've just tested those emoticons and the longest match "::|" wins, regardless of other factors such as the order of addition.
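
For anyone curious how "longest match wins regardless of the order of addition" can be achieved, one common approach (sketched here in Python; I'm not claiming this is the exact regexp the Emoticons plugin builds) is to sort the codes by length before joining them into a single alternation. The function names are made up for the example.

```python
import re


def build_emoticon_regex(codes):
    """Build one regexp that tries longer emoticon codes before shorter ones."""
    # Sorting by length (longest first) makes the result independent of the
    # order in which the codes were added.
    ordered = sorted(set(codes), key=len, reverse=True)
    return re.compile("|".join(re.escape(code) for code in ordered))


def find_emoticons(text, codes):
    """Return the emoticon codes matched in text, scanning left to right."""
    return [m.group(0) for m in build_emoticon_regex(codes).finditer(text)]
```

With the four codes from the question, `find_emoticons("::||", [":|", "::|", "::", "||"])` returns `["::|"]`, and shuffling the list of codes gives the same result. The trailing "|" is left alone because no code matches it on its own.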
(2013-06-30, 11:25 PM) JoshyPHP Wrote: s9e\TextFormatter requires PHP 5.3+

5.4+ actually (for the shortened array syntax), but that's probably easily amended...

(2013-06-30, 11:25 PM) JoshyPHP Wrote: I've just tested those emoticons and the longest match "::|" wins

Sounds good. (I asked because of http://dev.mybb.com/issues/2099, which is a typical issue hereabouts.) I tried reading your code and didn't find the relevant parts; over 200 files for a bbcode parser is a bit too much for me.

Quote: I don't know if it's a good fit for a 1.x version

I don't know either, but there is no other version and won't be in the foreseeable future (I was told to wait for 2.0 back in 2008, and it's not any closer now, so you're going to have a long wait yourself). So it's either 1.x or nothing for quite some time.

And you'd have to do it yourself, as I don't see anyone else caring enough to make the effort, or caring enough to want to use something else at all. You actually find people defending MyBB's parser for the bugs it has, even for things that could be trivially fixed.

Apart from solid parsing, your code would have to offer some features that aren't already there if you want it to be appealing to regular MyBB users. I'd probably not use it myself, even if you made a plugin for it. I know MyBB's code too well by now to switch to something I don't know how to hack when it ends up doing something differently from what I want it to do.

That's the problem with new bbcode parsers... every forum already has one.
If you're on PHP 5.3 you need to run a script that will replace a few things such as short arrays, traits and a couple of others. I keep an eye on compatibility thanks to Travis CI. I'm looking for a better solution though; at some point I intend to make an automatically maintained branch for PHP 5.3 that won't require you to run that kind of thing.

Thanks for the link to that bug, it's useful for me to know what kind of bugs/edge cases people encounter so that I can avoid potential design pitfalls.

(2013-07-01, 02:16 AM) frostschutz Wrote: I tried reading into your code and didn't find the relevant parts; over 200 files for a bbcode parser is a bit too much for me.

Yeah, the codebase is relatively big (code coverage tells me 6999 lines of code) and even though the Emoticons plugin is only 60 lines, you may not find its regexp immediately.

It's not just "a bbcode parser" though. ~8.5% of the codebase (excluding tests) deals with BBCodes, the rest is a framework for handling text formatting in general. A typical BBCode parser will focus on transforming BBCodes. Then on top of that, maybe it'll handle emoticons, and automatically linkify URLs. But then, it has to order them the right way so that a :/ emoticon won't eat parts of an http:// URL. And also that it doesn't double-linkify inside of a [url] BBCode, or try to linkify an image's URL. All of that is handled globally and in a coherent way, outside of plugins.
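
Here is a toy Python sketch of what "handled globally, outside of plugins" can look like. It is only an illustration under my own assumptions, not the library's actual algorithm: each "plugin" merely reports the spans it would like to claim, and one shared step decides which spans survive. The two regexps and the function names are made up for the example.

```python
import re

# Two stand-in "plugins": each one only reports where it would like to act.
URL_RE = re.compile(r"https?://[^\s\[\]]+")
EMOTICON_RE = re.compile(re.escape(":/"))


def collect_spans(text):
    """Gather (start, end, kind) candidates from every plugin, with no coordination."""
    spans = [(m.start(), m.end(), "url") for m in URL_RE.finditer(text)]
    spans += [(m.start(), m.end(), "emoticon") for m in EMOTICON_RE.finditer(text)]
    return spans


def resolve_overlaps(spans):
    """Keep the earliest span (the longest on ties) and drop anything overlapping it."""
    kept, last_end = [], 0
    for start, end, kind in sorted(spans, key=lambda s: (s[0], -(s[1] - s[0]))):
        if start >= last_end:
            kept.append((start, end, kind))
            last_end = end
    return kept
```

For the text `see http://example.com :/`, the emoticon "plugin" reports two candidates, one of them inside the URL. After `resolve_overlaps(collect_spans(text))` only the URL and the final ":/" remain, and neither plugin had to know about the other.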

(2013-07-01, 02:16 AM) frostschutz Wrote: And you'd have to do it yourself, as I don't see anyone else caring enough to make the effort, or caring enough to want to use something else at all.

If nobody cares about better formatting, I'm not going to shove it down their throats! :D

As you said, every forum already has a BBCode parser, or something to that effect. And they're often tightly coupled and buried deep inside their old codebase. At this point, ripping it out and replacing it (even with a shiny brand new BBCode3000++) can cause more pain than the bugs they've grown accustomed to. That's why I'm targeting next-generation software right now. I'm trying to catch developers before they invest their time implementing a different solution. It doesn't matter if it takes a few years before it emerges publicly.