MyBB Community Forums

Full Version: performance/optimization suggestions
You're currently viewing a stripped down version of our content. View the full version with proper formatting.
Pages: 1 2
I'd like to see the following optimization made for 2.0, or at least discuss it here to see if they would really be a performance improvement, though I believe it to be.

Right now 1.6x the showthread page grabs the user info with each and every post. This is at least one query for every PPP and the more PPP, the more queries are run.

I would like to suggest that at the start of showthread, the distinct list of uids for the tid be used in a single query and the user info loaded into an array keyed on uid.

That is one query for all users participating, regardless of the PPP settings. However it does not take pagination into consideration

A function such as
/**
 * get list of users in active thread(s)
 * @id array of tids or int of single tid to search for
 * @return array of user data
 */
function get_thread_users($tid)
{
    global $mybb, $db, $plugins;
    
    if(is_array($tid))
    {
	    $tid = implode(',', $tid);
	}
	
   	$sql = 'SELECT u.* FROM '.TABLE_PREFIX.'users u INNER JOIN '.TABLE_PREFIX.'posts p 
                      ON u.uid=p.uid WHERE p.tid IN ({$tid}) GROUP BY u.uid';
   	$query = $db->write_query($sql);
   	while($users = $db->fetch_array($query))
   	{
   		$link = str_replace("{uid}", $users['uid'], PROFILE_URL);
		$users['profilelink'] = htmlspecialchars_uni($link);
   		$userlist[$users['uid']] = $users;
   	}
    
    $plugins->run_hook('get_thread_users', $userlist);
    
    return $userlist;
}

The above query could be optimized even further by changing it to work on a list of pids that are specific to the page in being loaded. With pagination, you can determine the list of pids for the page and before hitting the post handler build_post() function, run the function above for those specific pids

You are already grabbing the list of posts specific to the page needs, so run through that result to generate an array of pids and pass it to the following

/**
 * get list of users in active post(s)
 * @id array of pids or int of single pid to search for
 * @return array of user data
 */
function get_post_users($pid)
{
    global $mybb, $db, $plugins;
    
    if(is_array($pid))
    {
	    $pid = implode(',', $pid);
	}
	
   	$sql = 'SELECT u.* FROM '.TABLE_PREFIX.'users u INNER JOIN '.TABLE_PREFIX.'posts p 
                       ON u.uid=p.uid WHERE p.pid IN ({$pid}) GROUP BY u.uid';
   	$query = $db->write_query($sql);
   	while($users = $db->fetch_array($query))
   	{
   		$link = str_replace("{uid}", $users['uid'], PROFILE_URL);
		$users['profilelink'] = htmlspecialchars_uni($link);
   		$userlist[$users['uid']] = $users;
   	}
    
    $plugins->run_hook('get_post_users', $userlist);
    
    return $userlist;
}

This is very similar what I do for my MyShowcase and CPM_Userlink plugins. At showthread_start I just query all users in the thread and grab the data up front.

With the plugin hook there, Google SEO or similar plugins can then add the required info on one set of data, saving even more queries.

Then during the build_post() function, just use $userlist[$pid] to get the specific info for the user of that post
(2011-10-06, 05:11 PM)pavemen Wrote: [ -> ]Right now 1.6x the showthread page grabs the user info with each and every post. This is at least one query for every PPP and the more PPP, the more queries are run.

Sorry pavemen, but I don't believe that's the case. If you have a thread with 100 posts, the number of queries remains exactly the same if your PPP is 1, 50 or 100.

$query = $db->query("
	SELECT u.*, u.username AS userusername, p.*, f.*, eu.username AS editusername
	FROM ".TABLE_PREFIX."posts p
	LEFT JOIN ".TABLE_PREFIX."users u ON (u.uid=p.uid)
	LEFT JOIN ".TABLE_PREFIX."userfields f ON (f.ufid=u.uid)
	LEFT JOIN ".TABLE_PREFIX."users eu ON (eu.uid=p.edituid)
	WHERE $pids
	ORDER BY p.dateline
");

User information is collated in the query itself, not separately - per post - as you've described it.

Unless I'm missing something painfully obvious.
I am just going off the top of my head as to what I recall being an issue for me when i needed to get the list of unique users participating in a thread.

since the above query is not a unique list of users as it contains the post, thread and forum info as well, there is potential for repeated user content for those cases where a user is shown more than once on a page.

plus i see the issue as i run Google SEO, which is querying each user during get_profile_link().

i suppose i can keep this content in my plugins, but then it runs multiple times on my site, once for each plugin that uses it.

apologies for wasting the bandwidth.
Both get_profile_link() and build_profile_link() have no query. get_user() does though.

And Tomm is right. In showthread a query is not run for every user.

View your advanced stats and see which queries are run by hooks.
like I said, its Google SEO that adds the query per post, plus my two plugins that need a unique list of users to get additional user info from other databases/tables.

i was mistaken in the reason for my original post.
I agree and wish the original query created a cache array of all uid's. It would be super helpful. An array like $userpids[] = array($uid, $pid) would be very handy.

Because every postbit action for a user often requires a new query for every loop. Instead a plugin could use one query to create it's own array and then call that for each loop.

So I actually agree with you Pavemen that a change to accommodate a uid array would be very handy.
Now it makes sense. Why not hook into postbit and collect the unique UIDs without having to run a query? I'd have to do testing to see if splitting the showthread query into two is beneficial for performance or whether it's just counting pennies to make pounds.

As for 2.0, this entire area will probably be rewritten anyway.
(2011-10-06, 07:48 PM)Tomm M Wrote: [ -> ]Now it makes sense. Why not hook into postbit and collect the unique UIDs without having to run a query? I'd have to do testing to see if splitting the showthread query into two is beneficial for performance or whether it's just counting pennies to make pounds.

As for 2.0, this entire area will probably be rewritten anyway.

but during build_postbit is when the html output is created for each postbit record, so grabbing the unique IDs at that time will not help when I need to add to the output during build_postbit itself.

example, I want to add user content from another table based on UID. i can query within build_postbit for each user (which my be repeated) to get the extra details and update the template output accordingly. that is at least one query per post per page

or i can get the full list of users and all the extra details for each at the start of the page and then simply pull that user data from the array during build_postbit hooks. this is one query regardless of the number of posts and users
Pointless IMO. Gather the user ID from post bit, alter the template for a placeholder (<!--user_123-->) then do your work after the post bit is built with a _replace function. Quick, simple, requires no more queries and stops faffing around with splitting queries and forcing data through functions.
(2011-10-06, 09:56 PM)Tomm M Wrote: [ -> ]Pointless IMO. Gather the user ID from post bit, alter the template for a placeholder (<!--user_123-->) then do your work after the post bit is built with a _replace function. Quick, simple, requires no more queries and stops faffing around with splitting queries and forcing data through functions.

but when do I grab the extra data from other tables to do the replace with? the plugin would still need to run a query to get the user specific data from the other table(s).

i am not worried about a simple situation where a replace of static data will work. I am talking about being able to query a users information from Coppermine, a blog, a showcase etc in order to link to that users additional content.

if I have a list of users up front, I can do a bulk query to get all the data. if I dont have a list and work from within the postbit I need to query those tables each postbit build.

Pages: 1 2