MyBB Community Forums

Full Version: Preparser Cache (beta)
Well, I'd say doing posts would be harder (and in 1.2 it will definitely require at least one small file change, as far as I've looked), but I've made a signature-only preparser without any direct code modifications. (I hope you don't mind. It only runs on showthread, since that's the only place it would have a noticeable effect. I'm not going to release it on the mods site, so you can use the code if you want and call it your own.)

I'm using it at the ncaabbs now and will have a post preparser plugin at the ncaabbs soon too (will work on it during my upcoming school break).
Thanks Tikitiki.

Yours is more of a "full preparser", rather than a "cache".
Basically, if the parser is changed (i.e. removing MyCodes, adding word filters, etc.), you have to re-parse everything.
EDIT: It appears that it will detect non-existent preparsed sigs, so I guess all you need to add is something to set all preparsed sigs to null when updating MyCodes, word filters, etc.

This preparser cache stores parsed signatures and posts in a separate table, and entries are kept for only about 7 days (changeable in the AdminCP), so it can also save disk space [out of interest, I wonder how well PHP cachers work with large amounts of data]. This allows parser updates to simply TRUNCATE the table, so that everything is reparsed and cached on the next thread view (updates are done via shutdown queries).
The issue, however, is that the table needs to be joined into the query; otherwise, we're forced to perform extra queries to retrieve parsed posts (which is obviously going to impact performance).

That is why I don't think there's any neat way to do this without code edits: I don't see any other way of modifying the query.

If there were a few more hooks ideally placed, we could probably get away with just one extra query: retrieve all parsed posts from the PIDs list, store them in a cache, dummy the $parser object (point it to some custom object which will try to predict the PID/UID of the post being parsed) and some other stuff, but, unfortunately, it's not really possible...
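To make the "one extra query" idea concrete, here is a rough sketch, not the plugin's actual code. The table and column names (parsedposts, pid, parsed) and both helper functions are made up for illustration, and strtoupper just stands in for the real parser.

```php
<?php
// Hypothetical helper: build ONE query that fetches the cached output for
// every post on the page. Table/column names are assumptions, not MyBB's schema.
function build_cache_query(array $pids): string
{
    if (empty($pids)) {
        return ''; // nothing to look up, so no query at all
    }
    // Cast to int so the IN() list is safe to embed.
    $ids = implode(',', array_map('intval', $pids));
    return "SELECT pid, parsed FROM parsedposts WHERE pid IN ({$ids})";
}

// Return the cached HTML when present; otherwise parse now and remember it.
function get_message(array $post, array &$cache, callable $parse): string
{
    $pid = (int)$post['pid'];
    if (!isset($cache[$pid])) {
        $cache[$pid] = $parse($post['message']);
    }
    return $cache[$pid];
}

// Stand-in parser that counts how often it is actually called.
$calls = 0;
$parse = function (string $raw) use (&$calls): string {
    $calls++;
    return strtoupper($raw);
};

// Pretend the query above returned one row, for pid 1.
$cache = [1 => '<b>hello</b>'];
$hit   = get_message(['pid' => 1, 'message' => '[b]hello[/b]'], $cache, $parse);
$miss  = get_message(['pid' => 2, 'message' => 'plain'], $cache, $parse);
```

Only the post missing from the cache reaches the parser; the other one costs an array lookup.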


Actually, most of the code edits really are designed to stop the parser from parsing thread subjects (which seems to be done in many places). If we take those out, then the main code edits would have to be in showthread.php and /inc/functions_post.php.
If you simply changed the signature table to a column like in mine it would remove the need to edit the query because it will automatically be selected by the u.*. Personally I wouldn't trim any of the signature table because there's no way to predict how many times it's been viewed and it doesn't take up that much more space anyway seeing as there will usually be a much smaller amount of users than posts and most members won't even have a signature. And yes, my plugin is incomplete on the admin side; I can simply reset the preparsed signatures, as you said, when MyCodes are updated.

Posts, I agree, definitely need to be in a separate table, and I'm going to see if I can change this line in inc/functions_post.php for 1.4:

$post['message'] = $parser->parse_message($post['message'], $parser_options);

to

if($post['already_parsed'] != 1)
{
	$post['message'] = $parser->parse_message($post['message'], $parser_options);
}

removing the need for a code edit (apart from showthread.php).

I'm planning on writing the plugin like this:

Cache only posts in threads that were created in the last 30 days, stickies, or big threads (i.e. over 100 posts, which would be a setting), and then in 1.2 hook onto global_end or use a shutdown function and trim if($rand == 4) or something like that. In 1.4 you can use tasks.
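A minimal sketch of the random-trim idea, under a few assumptions: the function names are invented, the array stands in for the cache table (pid => last access time), and the 1-in-10 chance and 7-day age are placeholder values. In a real plugin the trim would presumably be a single DELETE with a timestamp cutoff rather than an array filter.

```php
<?php
// Decide whether this request should pay for a cleanup pass.
// $chance = 10 means roughly one request in ten (an assumed value).
function should_trim(int $chance = 10): bool
{
    return mt_rand(1, $chance) === 1;
}

// Drop cache rows whose last access is older than $maxAge seconds.
// $rows maps pid => last-access timestamp and stands in for the cache table.
function trim_cache(array $rows, int $now, int $maxAge): array
{
    return array_filter($rows, function (int $lastAccess) use ($now, $maxAge): bool {
        return ($now - $lastAccess) <= $maxAge;
    });
}

$now  = 1000000;                 // fixed "current time" to keep the example repeatable
$week = 7 * 24 * 60 * 60;
$rows = [
    1 => $now - 60,              // viewed a minute ago: kept
    2 => $now - $week - 1,       // just past the limit: dropped
    3 => $now - $week,           // exactly at the limit: kept
];

// What a shutdown function might do on each page load.
if (should_trim()) {
    $rows = trim_cache($rows, $now, $week);
}
```

Spreading the cleanup over random requests keeps any single page view cheap; with tasks in 1.4 the should_trim() check would not be needed.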
Caching threads with over 100 posts would be great. I have to lock threads with 1000 posts as that's around the point things become slow. Errors tend to start popping up.
labrocca Wrote:Caching threads with over 100 posts would be great. I have to lock threads with 1000 posts as that's around the point things become slow. Errors tend to start popping up.
Unless you have your posts per page set to something really high, it shouldn't affect speed at all, should it?
I haven't looked, but it does appear to first check all posts in a thread and then, of course, build at least the pagination. My members complain that posting to a thread with a lot of posts sometimes lags. While I don't think it's a huge problem, there is always some optimizing that can be done.
Tikitiki Wrote:Personally I wouldn't trim any of the signature table because there's no way to predict how many times it's been viewed and it doesn't take up that much more space anyway seeing as there will usually be a much smaller amount of users than posts and most members won't even have a signature.

At least on my forum, 90% of my users have some sort of a signature. ;)
Soshite Wrote:
Tikitiki Wrote:Personally I wouldn't trim any of the signature table because there's no way to predict how many times it's been viewed and it doesn't take up that much more space anyway seeing as there will usually be a much smaller amount of users than posts and most members won't even have a signature.

At least on my forum, 90% of my users have some sort of a signature. ;)

How many users do you have on your forum? Probably not more than a few thousand. It took less than 30 seconds (wasn't actually counting) to go through 20,000 users on my NCAA Board.

I'll have the post preparser plugin up soon.
Tikitiki Wrote:If you simply changed the signature table to a column like in mine it would remove the need to edit the query because it will automatically be selected by the u.*
Yes, it's acceptable for signatures, but probably less so for posts.
Separate tables are easier to handle, i.e. a single REPLACE INTO query can update multiple rows.
If we stored things in the same table, you'd have to perform an update_query for each update (alternatively, select, and build a REPLACE INTO query). That means that in the worst case you'd have to run about 20-40 update queries, whereas with a separate table the worst case would be 2 update queries.
I guess you could argue that 20-40 update queries isn't that much, considering that it's cached, but then, this was aimed more at speed than at ease of installation (well, since we're doing code edits anyway, we may as well...)
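For illustration, a multi-row REPLACE INTO can be assembled roughly like this. This is only a sketch: build_replace() is a made-up helper, the table and column names are assumptions, and addslashes is merely a stand-in for whatever escaping the database layer really provides.

```php
<?php
// Build a single multi-row REPLACE INTO statement for a batch of parsed posts.
// $rows maps pid => parsed HTML; $escape stands in for the DB layer's escaping.
function build_replace(string $table, array $rows, callable $escape): string
{
    if (empty($rows)) {
        return ''; // nothing to write
    }
    $values = [];
    foreach ($rows as $pid => $html) {
        $values[] = '(' . (int)$pid . ",'" . $escape($html) . "')";
    }
    return "REPLACE INTO {$table} (pid, parsed) VALUES " . implode(',', $values);
}

// Two freshly parsed posts become one statement instead of two updates.
$sql = build_replace(
    'parsedposts',
    [10 => '<b>a</b>', 11 => "it's"],
    'addslashes'
);
```

However many posts were parsed on the page, this stays at one statement, which is where the "2 update queries" worst case (one for posts, one for signatures) comes from.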


Tikitiki Wrote:Personally I wouldn't trim any of the signature table because there's no way to predict how many times it's been viewed and it doesn't take up that much more space anyway seeing as there will usually be a much smaller amount of users than posts and most members won't even have a signature.
Yeah, most "ghost" users probably won't have a signature.
My plugin just updates the last accessed time stamp (randomly). It's hard to say whether it's better or worse. I'd side with either method in this case.

Tikitiki Wrote:I'm planning on writing the plugin like this:

Cache only posts in threads that were created in the last 30 days, stickies, or big threads (i.e. over 100 posts, which would be a setting), and then in 1.2 hook onto global_end or use a shutdown function and trim if($rand == 4) or something like that. In 1.4 you can use tasks.
Sounds interesting :P

labrocca Wrote:I haven't looked, but it does appear to first check all posts in a thread and then, of course, build at least the pagination. My members complain that posting to a thread with a lot of posts sometimes lags. While I don't think it's a huge problem, there is always some optimizing that can be done.
The thread has a cached post counter, so no, it won't check all posts.
The lag may be from the subscription side of things (for every subscribed member, a new row is inserted into the DB).
I tend to make it so that a multi-row insert query is used, rather than looping insert_query.
Ok, I've got the post plugin finished. I just need to get the debug code cleaned up and I'll post it tomorrow. It's currently running at the ncaabbs if you want to check it out. Load times seem to be cut roughly in half (on a thread with over 1,000 posts, load was about .42 before and about .2 after).