MyBB Community Forums

Use relevancy in fulltext searches
Attachment updated for 1.6.0

In 1.4.x and 1.6.x, fulltext searches do not use a score or relevancy, so the query results are not ordered by how well they match. By default, MySQL does not sort 'IN BOOLEAN MODE' queries by relevancy, so you need to add these changes to make full use of the fulltext functionality.

A few minor edits to functions_search.php enable this functionality. I am using MySQL 5 and fulltext searches are working a ton better now.

I have attached my version here. There are 4 new variables in perform_search_mysql_ft():

Create 'score' fields to calculate relevancy and sort against
$message_score_fld
$subject_score_fld

Simple var for setting result order
$message_order
$subject_order

These are simply inserted into the 3 fulltext queries that contain the '_lookin' vars in the above function.
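
To make the idea concrete, here is a rough sketch of the kind of SQL the score field and order variables add up to. It is written in Python purely for illustration; the attached file is PHP and its exact code may differ. The helper name build_ft_query, the mybb_posts table name, and the LIMIT value are assumptions, not taken from the attachment.

```python
def build_ft_query(table, id_col, text_col, keywords, limit=500):
    """Assemble a MySQL fulltext query that also returns a relevancy score.

    Hypothetical helper: the real change lives in PHP inside
    perform_search_mysql_ft() and may be structured differently.
    """
    # Escape backslashes and single quotes so the keywords can sit inside
    # a SQL string literal. Real code should use the driver's escaping.
    safe = keywords.replace("\\", "\\\\").replace("'", "\\'")
    match = f"MATCH({text_col}) AGAINST('{safe}' IN BOOLEAN MODE)"
    score_fld = f", {match} AS score"   # plays the role of $message_score_fld
    order = " ORDER BY score DESC"      # plays the role of $message_order
    return (
        f"SELECT {id_col}{score_fld} FROM {table} "
        f"WHERE {match}{order} LIMIT {int(limit)}"
    )


print(build_ft_query("mybb_posts", "pid", "message", "+foo +bar"))
```

The same MATCH ... AGAINST expression appears twice: once in the WHERE clause to filter, and once in the SELECT list so MySQL returns a value that ORDER BY can sort on.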

No guarantee that this new code will work in any DBMS other than MySQL/MySQLi.
I have updated the functions_search.php file for version 1.4.13.
Just curious as to what this actually does?
I ran a search in advanced search with all the default settings, using the functions_search.php shipped with 1.4.13. Then, with that page left open, I uploaded this version and ran the same search. The results were identical.
I have compared the 2 files in Notepad and can see the relevant changes, but it doesn't seem to return any different results.
OS: CentOS 5.5, PHP 5.2.13 and MySQL running.
I have altered the search template to switch the search default from post to title (search in titles by default without having to change it manually).
What I am looking for is a way to add the "" in the code, as the results are perfect for what I need when you add them manually to the search line, e.g.:
search for foo bar
get 11 pages of results
then if you search for "foo bar"
get 3 results
On my forum all the info is in the titles, so I don't need to search every word; it just needs to match as a string in the title search.
Anyone know how to add the "" by default in the code so it's hidden, so that whatever is put into the search box will automatically be wrapped in quotes?
And maybe a way to add a radio button under the default search parameters, like this:

(0) search titles only
( ) search entire post
___________________
(0) turn off quotes

Thanks
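
For what it's worth, the wrapping logic being asked for is small. Below is a minimal sketch in Python, only to show the idea; in MyBB it would have to be done in PHP on the keywords before the fulltext query is built, and the thread does not confirm where that happens. The function name wrap_in_quotes and the use_quotes flag (standing in for the "turn off quotes" option above) are hypothetical.

```python
def wrap_in_quotes(keywords, use_quotes=True):
    """Return the search string as one quoted phrase unless quoting is off."""
    text = keywords.strip()
    if not use_quotes or not text:
        return text
    # Leave the input alone if the user already quoted any part of it.
    if '"' in text:
        return text
    return f'"{text}"'


print(wrap_in_quotes("foo bar"))         # "foo bar"
print(wrap_in_quotes("foo bar", False))  # foo bar
```

In boolean mode, MySQL treats a double-quoted string as an exact phrase, which is why the quoted search returns far fewer results.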
This only applies if you are using fulltext search, not the regular search. It also only matters if you have a large board that would return more matches than the maximum number of results you allow a search to return. Without this, your search results will be out of order or incomplete.
Yes, we are using full text and we have over a million posts and 30,000 members (culled from 58,000). We actually had to switch full text on, as the standard search was maxing the CPU in the mysqld process. And by maxing I mean it was hitting over 200% for 2-3 seconds, dropping the whole site offline. But with full text on there is no load at all. That's why I am looking for some way to return precise results without the users having to know they need to put quotes in. I just added a submission to resources anyway, as I worked out a simpler approach by adding a jQuery text field inside the search box.
We have our board set to return no more than 600, so all's good.
But it makes sense if you want to return more :D
Cheers
I am not familiar with jQuery, but if you want to make full use of a fulltext search, you need a relevancy (aka score) in order to properly rank your search results and output them accordingly. Without it, you do not know which result is a better match for the query keywords.
ismadman, switch to the Sphinx Search Plugin. Your load will drop to almost nothing for search. And I mean nothing. You'll probably think it's not working. I have 3.5 million posts on my forum, btw.
Yes, I use Sphinx on another IPB forum I run. But as soon as we dropped the standard search and switched to full text, the load disappeared completely. I installed mytop on my server and watched top and mytop simultaneously while a friend hammered the search. The standard search was bringing up queries in mytop and the load hit the roof. With full text, mytop wasn't displaying any queries at all and there was absolutely no load. Full text is faster for me, but I run a music site and the titles are all-important. That's why I switched everything over to search titles by default, and it's fantastic. As I said, the server load's gone, so that's fine. I know users can type quotes around their phrase to get a better result but (as I have done on my IPB site) I need to get the quotes in with the query. I'm thinking it's either in search.php or functions_search.php... I will have a look later, but for now I found a neat little script that just puts the instructions right inside the search box. Cheers for the advice, and if the load gets out of hand I will definitely look into it. So the plugin is here? I'll have a look :D
Is this compatible with MyBB 1.6?
Updated the original post with a 1.6.0-compatible file.
Actually, I am finding that on smaller forums the results are not greatly improved without additional code changes, because of how MyBB compiles the search log. It loses the score/relevancy when the total number of matching items is less than the max search limit.

However, at least the top results are supplied; they are just not displayed sorted by relevancy. MyBB will query the database and store the pid and tid values in the proper order (by relevancy) in the searchlog.

So when MyBB does the actual search to display the results to the user, the user-specified results order (date, forum, etc.) is used to order the output, but the pids and tids used in the search are the "top" results, just not in score/rank order.

For smaller forums, where the potential list of pids and tids that the search matches is smaller than the max search limit (typically 500-1000), any relevancy is lost, because the number of potential results is below the max and all results are returned with no consideration of the score.

Example: 2,000 potential tids/pids match your search term, and that set is sorted by relevancy. The hard limit for search is 500, so the "top" 500 tids/pids are saved to the search log. This is better, but not the best.

Example: 200 potential tids/pids match your search term, and those are sorted by relevancy. The hard limit for search is 500, so all 200 tids/pids are used, with no consideration of relevancy when returning results. This is not bad, as you still get results, but it could be better.
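
The two examples can be played out with a toy model. This is a sketch in Python with made-up data, not MyBB's code: stored_ids stands in for saving the top matches to the search log, and displayed stands in for the later display query that re-sorts by the user's chosen order (date here). Both function names are hypothetical.

```python
def stored_ids(matches, limit):
    """matches: list of (pid, score, dateline). Keep the top `limit` by score."""
    ranked = sorted(matches, key=lambda m: m[1], reverse=True)
    return [m[0] for m in ranked[:limit]]


def displayed(matches, kept_ids):
    """Display step: re-sort the kept ids by date (newest first), ignoring score."""
    kept = set(kept_ids)
    rows = [m for m in matches if m[0] in kept]
    return [m[0] for m in sorted(rows, key=lambda m: m[2], reverse=True)]


# (pid, score, dateline): made-up sample data
matches = [(1, 0.9, 100), (2, 0.1, 400), (3, 0.5, 300), (4, 0.7, 200)]

# Limit below the match count: the low-scoring pid 2 is filtered out.
print(displayed(matches, stored_ids(matches, limit=3)))   # [3, 4, 1]
# Limit above the match count: everything is kept, so score has no effect.
print(displayed(matches, stored_ids(matches, limit=10)))  # [2, 3, 4, 1]
```

In the first call the score decides which posts survive the cut, even though the final order is by date. In the second call nothing is cut, so the score changes nothing the user sees.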

Sorry for not realizing this earlier, but I am going to look into the code some more to see if I can make a change that uses relevancy in the final results.