Pre-Parse Post/Signatures & Sphinx Search plugins
#1
Hi Guys,

I am releasing a few of my plugins that I use on ncaabbs.com and other high profile forums. These plugins allow functionality for better performance on identified weak-areas of MyBB on high profile forums. I am not releasing these plugins in my official plugin release thread for several reasons:
  • These plugins require advanced knowledge of website administration
  • They are only needed for high profile forums with a need to push the most out of MyBB for stability and performance reasons
  • I don't have enough time to upkeep these to the amount that would be required for the average-joe and it's unlikely that someone will be running a forum that requires these without having advanced knowledge of website administration

These plugins have been under intense testing (stress testing, usability testing, et cetera) for the past 2 years and I believe they have reached a point to which they can be safely released and therefore operated by those who have enough knowledge to. These plugins are only being released for knowledge to developers and for other high profile forums wishing or requiring the most performance they can pack out of MyBB for stability reasons.

A readme.txt has been included in each zip. Please read it carefully.

1. Pre-Parse Post Plugin
Description: This plugin pre-parses high profile threads (a thread or post created within the last 30 days or a thread which contains 100 or more posts) and puts them into a separate database table. This is used to bypass the normal MyBB procedure to parse "on-the-go" and instead "pre"-parses when the thread or post is created or updated.
Technical Details: A weak-area in MyBB was identified in multiple tests which consistently showed the MyBB post parser was contributing a large portion to the "Server Load" of a high profile MyBB forum. A work-around was created and tested. The results consistently shows this flow-of-procedure versus MyBB's standard flow-of-procedure takes up less processing power in exchange for a relatively low amount of disk space.

Download: preparsepost.zip

2. Pre-Parse Signature Plugin
Description: This plugin pre-parses all users signatures and puts them into a separate database column. This is used to bypass the normal MyBB procedure to parse "on-the-go" and instead "pre"-parses when the signature is created or updated.
Technical Details: A weak-area in MyBB was identified in multiple tests which consistently showed the MyBB signature parser was contributing a large portion to the "Server Load" of a high profile MyBB forum. A work-around was created and tested. The results consistently shows this flow-of-procedure versus MyBB's standard flow-of-procedure takes up less processing power in exchange for a relatively low amount of disk space.

Download: preparsesig.zip

3. Sphinx Plugin
Description: This plugin uses a specialized searching engine to perform extremely fast searches among tables with millions or even billions of rows. This plugin takes advantage of the "Sphinx Search" engine to search the MyBB posts table on high profile forums with a large number of posts.
Technical Details: A weak-area in MyBB was identified in multiple tests which consistently showed the MyBB/MySQL search engines are slow on large forums and therefore "lockup" the MyBB posts table during the search. This in turn could lead to, essentially, a Denial of Service attack. A work-around was created and tested. The results consistently shows this flow-of-procedure versus MyBB's standard flow-of-procedure takes up less processing power in exchange for a relatively low amount of disk space.

Dependencies: Sphinx Search Engine (NOT SphinxSE); A dedicated server & IT skills to install Sphinx Search Engine.
Download: sphinx.zip

Hope you enjoy these.

Ryan


Attached Files
.zip   preparsepost.zip (Size: 4.46 KB / Downloads: 1,137)
.zip   preparsersig.zip (Size: 2.58 KB / Downloads: 973)
.zip   sphinx.zip (Size: 13.35 KB / Downloads: 917)
Reply
#2
thanks you very much, ryan i love all your plugins =]
Reply
#3
Good Job. The only thing is though, I can't see any benefits for using the Sphinx search engine other than moderating the CPU and disk usage. I've developed plenty of full text search engines and they've run smoothly and quickly. I can post up some benchmarks if need-be. My point is, if MyBB's forum search is optimized, it'll definitely fix the problem without needing to outsource it.
Best Regards.
Reply
#4
(01-10-2009, 05:40 PM)Rcpalace Wrote: Good Job. The only thing is though, I can't see any benefits for using the Sphinx search engine other than moderating the CPU and disk usage. I've developed plenty of full text search engines and they've run smoothly and quickly. I can post up some benchmarks if need-be. My point is, if MyBB's forum search is optimized, it'll definitely fix the problem without needing to outsource it.
Best Regards.

You've obviously never ran a 3 million post forum Smile
Reply
#5
(01-10-2009, 05:40 PM)Rcpalace Wrote: I've developed plenty of full text search engines and they've run smoothly and quickly.
You mean use MySQL's full text indexer? I doubt you can write any effective full text search engine in PHP, except perhaps, use a DB table with every single word in it.

As for the MySQL vs Sphinx engine, I have no idea on performance differences - I can't imagine the difference to be large, however Sphinx probably offers more features than the MySQL indexer.
Reply
#6
(01-10-2009, 07:03 PM)Ryan Gordon Wrote: You've obviously never ran a 3 million post forum Smile

No, not in forum base, but I have a quite large db for a user generated content website for a client, so I needed to crunch some benchmarks.


(01-11-2009, 04:58 AM)Yumi Wrote:
(01-10-2009, 05:40 PM)Rcpalace Wrote: I've developed plenty of full text search engines and they've run smoothly and quickly.
You mean use MySQL's full text indexer? I doubt you can write any effective full text search engine in PHP, except perhaps, use a DB table with every single word in it.

As for the MySQL vs Sphinx engine, I have no idea on performance differences - I can't imagine the difference to be large, however Sphinx probably offers more features than the MySQL indexer.

Yes, using the MySQL full text indexer and a few tricks of my own. I haven't tried this yet, but I was thinking of caching the table every 24 - 48 hours, and do searches from there. That way the longest time consuming query will be done once behind the scenes. I know MySQL caches recent queries but not the table data itself. I have read that the cache solution cut the query times by over half, have any of you tried it?
Best Regards.
Reply
#7
(01-11-2009, 02:42 PM)Rcpalace Wrote: Yes, using the MySQL full text indexer and a few tricks of my own. I haven't tried this yet, but I was thinking of caching the table every 24 - 48 hours, and do searches from there. That way the longest time consuming query will be done once behind the scenes. I know MySQL caches recent queries but not the table data itself. I have read that the cache solution cut the query times by over half, have any of you tried it?
Best Regards.

Why would I want to try any of that if I just have sphinx and it works perfectly fine without having to have the "longest time consuming query will be done once behind the scenes". And in fact, we ALWAYS have at least 200-300 users online at any time during the day. If we're doing intensive queries while those users are on, your just asking for trouble.

(01-11-2009, 04:58 AM)Yumi Wrote: As for the MySQL vs Sphinx engine, I have no idea on performance differences - I can't imagine the difference to be large

I think it's clear that when the server starts crashing because of the MySQL search and that it doesn't when using sphinx, that is a pretty big performance difference. It used to take 30 or more seconds to do searches and we're talkin about squeezing everything we could out of indicies and optimizations before we moved to sphinx. Sphinx takes under a second always. In fact, it mentions on it's site that it can search billions of rows in just a few seconds.

And this is a fact, both Chris Boulton and Jeremy Sands, both IT professionals, will say the same.
Reply
#8
Not to be rude, but if two people say the same it doesn't quite make me switch. You've gotten me curious however, so I'm going to run some benchmarks just now on 1 million entries in a table. I'm going to compare it with sphinx, caching it, and MySQL's full text indexer, and I'll post up the results.
Reply
#9
(01-11-2009, 06:35 PM)Rcpalace Wrote: Not to be rude, but if two people say the same it doesn't quite make me switch.

Your right, there are three people saying the same, and not to be rude, but all of which probably have more experience then you in doing this. I also have over 20,000 College Athletic fans that would say the old way sucked because it was constantly crashing and lagging up the server.

(01-11-2009, 06:35 PM)Rcpalace Wrote: You've gotten me curious however, so I'm going to run some benchmarks just now on 1 million entries in a table. I'm going to compare it with sphinx, caching it, and MySQL's full text indexer, and I'll post up the results.

Your results won't be valid. Until your table is bigger then the RAM you have installed on your server/PC then the results won't be valid.

The reason why it's so slow is because with MySQL it can't process the table in memory if the table is bigger then the memory allowed to be used by MySQL, so it has to switch to disk which is much much slower.

We have 2 GB's of RAM installed on each of the two servers that run ncaabbs.com - Our posts table is currently 3.2 GB's. We never had a problem with searching until the table hit around 1.7 GBs which was about two years back. Factor in about 300 MBs for other programs and you'll see why 1.7 GB's is where MySQL starts to be drop dead slow with 2 GBs of RAM with those types of intensive queries.


Like I said however, you will only ever need Sphinx on really really large forums. There is no point to switching until you've pushed MySQL to the limits.
Reply
#10
(01-11-2009, 06:50 PM)Ryan Gordon Wrote:
(01-11-2009, 06:35 PM)Rcpalace Wrote: Not to be rude, but if two people say the same it doesn't quite make me switch.

Your right, there are three people saying the same, and not to be rude, but all of which probably have more experience then you in doing this. I also have over 20,000 College Athletic fans that would say the old way sucked because it was constantly crashing and lagging up the server.

They probably do and probably have more experience then you and I as well. I don't care if you bring 1,000,000 of those people. The key is to solve things without using unnecessary 3rd party equipment.

However, my major covers all of this, as soon as I finish my GE and transfer over to UCD (hopefully), I'll get into such in more depth.

Quote:
(01-11-2009, 06:35 PM)Rcpalace Wrote: You've gotten me curious however, so I'm going to run some benchmarks just now on 1 million entries in a table. I'm going to compare it with sphinx, caching it, and MySQL's full text indexer, and I'll post up the results.

Your results won't be valid. Until your table is bigger then the RAM you have installed on your server/PC then the results won't be valid.

The reason why it's so slow is because with MySQL it can't process the table in memory if the table is bigger then the memory allowed to be used by MySQL, so it has to switch to disk which is much much slower.

That's why I'll have the allowed memory less for the tests: if their's a will, their's a way. Wink If 1 million entries doesn't fill it up, I'll create a loop to fill the table with random content which otta do it.

Quote:Like I said however, you will only ever need Sphinx on really really large forums. There is no point to switching until you've pushed MySQL to the limits.

Which I'll be doing during my tests:

Rcpalace Wrote:If 1 million entries doesn't fill it up, I'll create a loop to fill the table with random content which otta do it.


Their is no reason why I shouldn't conduct my own benchmarks and optimize it accordingly. If you find using 3rd party solutions the best way, go for it, it's your opinion, your website/position, do whatever you think is best.
Best Regards.
Reply


Forum Jump:


Users browsing this thread: 1 Guest(s)