MyBB Community Forums

Full Version: Post Relevancy
Just finished the first prototype of a plugin I'm working on called Post Relevance.

It attempts to compute how relevant a post is to the thread title using tf-idf weightings in a ranked retrieval system. The output is a little message on each post giving a relevance score: the higher the score, the more relevant the post.

The plugin pulls in the thread title and splits it up into keywords, removing stop words like 'the'. Then it fetches synonyms for the keywords using http://words.bighugelabs.com/api.php and removes stop words from the synonyms list as well.
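To make the keyword step concrete, here is a rough sketch in Python (not the plugin's actual code). The stop-word list is a tiny made-up sample, and `synonym_lookup` is a hypothetical dict standing in for whatever the thesaurus API returns, so no real API call or response format is shown.

```python
import re

# Assumption: a tiny stop-word list for illustration only; a real list is much longer.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for", "on", "with"}

def extract_keywords(text, stop_words=STOP_WORDS):
    """Lowercase the text, split it into words, drop stop words, keep first-seen order."""
    keywords = []
    for word in re.findall(r"[a-z']+", text.lower()):
        if word not in stop_words and word not in keywords:
            keywords.append(word)
    return keywords

def expand_with_synonyms(keywords, synonym_lookup, stop_words=STOP_WORDS):
    """Append synonyms for each keyword, filtering stop words out of them too.

    synonym_lookup is a hypothetical stand-in for the thesaurus response:
    a dict mapping a word to a list of synonym strings.
    """
    expanded = list(keywords)
    for keyword in keywords:
        for phrase in synonym_lookup.get(keyword, []):
            for word in extract_keywords(phrase, stop_words):
                if word not in expanded:
                    expanded.append(word)
    return expanded

title_keywords = extract_keywords("The best way to install a plugin")
fake_synonyms = {"install": ["set up", "put in"], "plugin": ["add-on", "the extension"]}
all_keywords = expand_with_synonyms(title_keywords, fake_synonyms)
```

Multi-word synonyms are split into single words here, which is just one possible choice; keeping them as phrases would need phrase matching later on.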

Once it has a complete set of keywords from the thread title and the synonyms, it computes the tf-idf (wiki it) weightings using the term-frequency and the inverse-document-frequency. This ranks each post in a thread according to how relevant the respective post is to the thread title.
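For anyone unfamiliar with tf-idf, the scoring idea can be sketched like this in Python. This is only an illustration under my own assumptions: each post is treated as a document and the thread as the collection, tf is the raw count of a keyword in a post, and idf is log(N / df). The plugin may well use a different variant, and the function names are made up.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def score_posts(keywords, posts):
    """Return one tf-idf score per post, summed over the given keywords."""
    n_posts = len(posts)
    term_counts = [Counter(tokenize(post)) for post in posts]

    # idf: keywords that appear in fewer posts of the thread weigh more.
    idf = {}
    for keyword in keywords:
        doc_freq = sum(1 for counts in term_counts if keyword in counts)
        idf[keyword] = math.log(n_posts / doc_freq) if doc_freq else 0.0

    # tf (raw count in the post) times idf, summed over all keywords.
    return [
        sum(counts[keyword] * idf[keyword] for keyword in keywords)
        for counts in term_counts
    ]

posts = [
    "How do I install this plugin? The install guide is unclear.",
    "Upload the plugin folder, then activate it in the ACP.",
    "Nice weather today.",
]
scores = score_posts(["install", "plugin"], posts)
```

With this variant, a keyword that shows up in every post of the thread gets an idf of zero and contributes nothing, so a smoothed idf might be worth considering for short threads.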

I will put the prototype online this week once I've cleaned it up a bit and added some ACP configuration settings.

For now, how useful do you think this would be as a plugin? Worth me releasing and maintaining it?
Not saying this is irrelevant, but there is an inbuilt function which suggests similar threads.
This plugin doesn't suggest similar threads. It computes how relevant each post within a thread is with regard to the thread title.
If you could pull this off without many false-positives it'd be great.
(2011-11-28, 03:12 AM) pyridine wrote: If you could pull this off without many false-positives it'd be great.

I don't think false positives are much of an issue. The plugin doesn't just classify each post as either relevant or non-relevant; it produces a score for each post. So it's less a question of whether it produces a false positive and more a question of whether the score for each post is accurate enough to be meaningful in context.