Pre-Parse Post/Signatures & Sphinx Search plugins
#11
(01-11-2009, 09:05 PM)Rcpalace Wrote: The key is to solve things without using unnecessary 3rd party equipment.

And you think we haven't tried? Not to be rude, but you're wasting your time. Look at every other major forum software out there and you will find a Sphinx plugin. That is because they all have the same limitations under MySQL.

(01-11-2009, 09:05 PM)Rcpalace Wrote: There is no reason why I shouldn't conduct my own benchmarks and optimize it accordingly.

Go right ahead, but this thread is not here so you can challenge the solutions I have come up with.
Reply
#12
(01-11-2009, 02:42 PM)Rcpalace Wrote: I haven't tried this yet, but I was thinking of caching the table every 24-48 hours and doing searches from there.
How would you cache the table? Load it all into a memory cacher? How would that actually be faster?

(01-11-2009, 05:27 PM)Ryan Gordon Wrote: I think it's clear that when the server starts crashing because of the MySQL search and it doesn't when using Sphinx, that is a pretty big performance difference. It used to take 30 or more seconds to do searches, and we're talking about squeezing everything we could out of indices and optimizations before we moved to Sphinx. Sphinx always takes under a second. In fact, it mentions on its site that it can search billions of rows in just a few seconds.

And this is a fact; Chris Boulton and Jeremy Sands, both IT professionals, will say the same.
Perhaps I over-estimated MySQL's indexer. Full text indexing isn't exactly cutting edge development (unless you operate Google perhaps), and has been fairly well researched, so I would've imagined MySQL's implementation to be fairly efficient.
Haven't actually tested much, so I can't comment further.
Reply
#13
This user has been denied support.
Actually this is the first time I've come across Sphinx (well, not that I'm acquainted with PHP/MySQL in general, but anyhow...). The "solution" I see on many sites is not a solution at all: they either use a search with a datecut (e.g. here it would be search 'from a month ago' instead of 'from any post date'), so they have fewer posts to check, or they just use Google site search. The latter actually works well provided Google really REALLY loves the site.
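
To make the datecut idea concrete, here is a minimal sketch in Python using an in-memory SQLite database. It is only an illustration, not MyBB's actual search code: the `posts` table, its `pid`/`dateline`/`message` columns, the `search_posts` helper and the sample rows are all assumptions made up for the example. The point is that the extra date condition shrinks the set of rows the text match has to look at.

```python
import sqlite3
import time

DAY = 86400  # seconds in a day


def search_posts(conn, term, max_age_days=None, now=None):
    """Substring search over posts, optionally limited to recent posts (the datecut)."""
    now = int(time.time()) if now is None else now
    sql = "SELECT pid FROM posts WHERE message LIKE ?"
    params = ["%" + term + "%"]
    if max_age_days is not None:
        # The datecut: only consider posts newer than the cutoff timestamp.
        sql += " AND dateline >= ?"
        params.append(now - max_age_days * DAY)
    sql += " ORDER BY pid"
    return [row[0] for row in conn.execute(sql, params)]


# Hypothetical schema and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (pid INTEGER PRIMARY KEY, dateline INTEGER, message TEXT)"
)
conn.execute("CREATE INDEX idx_posts_dateline ON posts (dateline)")

now = 1_231_700_000  # fixed "current time" so the example is reproducible
conn.executemany(
    "INSERT INTO posts VALUES (?, ?, ?)",
    [
        (1, now - 400 * DAY, "sphinx search plugin released"),
        (2, now - 10 * DAY, "sphinx search is fast"),
        (3, now - 2 * DAY, "pre-parse signatures plugin"),
    ],
)

all_hits = search_posts(conn, "sphinx", now=now)                      # any post date
recent_hits = search_posts(conn, "sphinx", max_age_days=30, now=now)  # last month only
```

The trade-off is the one described above: the query gets cheaper because old posts are never checked, but those old posts also can never be found.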

To get back on topic, regarding your pre-parsing plugins... if those are optimizations without a catch (such as a dependency on Sphinx), why not include them with stock MyBB?
Reply
#14
The datecut would really depend on your forum and whether you consider old posts useful or not. An issue with Google is that it doesn't necessarily index everything, and isn't always up to date, even on popular forums.
Example: http://www.google.com.au/search?hl=en&cl...earch&meta=
(at time of writing, x264 is at rev 1069 according to Google, but click the link and you'll see it at rev 1074)

Another issue may be that you don't want to redirect to Google.
But it is a cheap, simple thing to do and usually works reasonably well, at least in my case.
Reply
#15
(01-11-2009, 11:27 PM)frostschutz Wrote: To get back on topic, regarding your pre-parsing plugins... if those are optimizations without a catch (such as a dependency on Sphinx), why not include them with stock MyBB?

And we will down the line. Just a matter of time and planning, that's all.
Reply
#16
(01-11-2009, 11:24 PM)Yumi Wrote:
(01-11-2009, 02:42 PM)Rcpalace Wrote: I haven't tried this yet, but I was thinking of caching the table every 24-48 hours and doing searches from there.
How would you cache the table? Load it all into a memory cacher? How would that actually be faster?

Cache it into a regular file, similar to an SQL dump and do searches from there. Proven to be very efficient and cuts down load time by over half. However, it's not something I favor as the first query will really be a killer as Ryan said earlier. I'm working on something currently and I've pulled results from my db of 2 million+ entries (yes, I added to it using a for loop), and results are currently being pulled out at .0018 on average. I'm not going to count my chickens before they hatch though, results are currently irrelevant, so I'd need to finish it before giving the final word.
Best Regards.
Reply
#17
(01-12-2009, 01:56 AM)Rcpalace Wrote:
(01-11-2009, 11:24 PM)Yumi Wrote:
(01-11-2009, 02:42 PM)Rcpalace Wrote: I haven't tried this yet, but I was thinking of caching the table every 24-48 hours and doing searches from there.
How would you cache the table? Load it all into a memory cacher? How would that actually be faster?

Cache it into a regular file, similar to an SQL dump and do searches from there. Proven to be very efficient and cuts down load time by over half. However, it's not something I favor as the first query will really be a killer as Ryan said earlier. I'm working on something currently and I've pulled results from my db of 2 million+ entries (yes, I added to it using a for loop), and results are currently being pulled out at .0018 on average. I'm not going to count my chickens before they hatch though, results are currently irrelevant, so I'd need to finish it before giving the final word.
Best Regards.
That wouldn't help unless your DB is being overloaded with other queries. For one, how do you search the file? Open it and parse the entire thing?
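
To illustrate the difference I'm getting at, here is a rough Python sketch. It is a toy, not how Sphinx or MySQL's full-text indexer is actually implemented, and the sample posts and function names (`scan_search`, `build_index`, `index_search`) are made up for the example. Searching a plain dump means re-reading every post on every query, whereas a search engine builds an inverted index once and then only looks up the words in the query.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def scan_search(posts, word):
    """Linear scan: every query re-reads and re-tokenizes every post."""
    return sorted(pid for pid, text in posts.items() if word in tokenize(text))


def build_index(posts):
    """Inverted index: maps each word to the set of post IDs containing it."""
    index = defaultdict(set)
    for pid, text in posts.items():
        for token in tokenize(text):
            index[token].add(pid)
    return index


def index_search(index, query):
    """AND search: intersect the ID sets of all words in the query."""
    words = tokenize(query)
    if not words:
        return []
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return sorted(result)


# Hypothetical sample data standing in for the cached posts table.
posts = {
    1: "Sphinx search plugin for MyBB",
    2: "Caching the posts table to a file",
    3: "Search the cached file every query",
}

index = build_index(posts)  # built once, reused for every query
```

With a scan, the cost of each query grows with the total amount of text; with the index, it grows with the number of posts that actually contain the query words. A flat file on its own gives you the former.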
Reply
#18
(01-12-2009, 01:56 AM)Rcpalace Wrote: Cache it into a regular file, similar to an SQL dump and do searches from there. Proven to be very efficient and cuts down load time by over half.

Sounds like a poor man's Sphinx...
Reply
#19
(01-12-2009, 03:13 AM)Ryan Gordon Wrote:
(01-12-2009, 01:56 AM)Rcpalace Wrote: Cache it into a regular file, similar to an SQL dump and do searches from there. Proven to be very efficient and cuts down load time by over half.

Sounds like a poor man's Sphinx...

Several people have tested it and confirmed that it works. However, this was just one of my brainstormed ideas that I've seen work elsewhere; as I said in my previous post, I'm working on a different solution which is going great so far. I just need to fix some relevancy issues. I hope you understand that Sphinx isn't the only solution to things, the best solution would be developing your own solution if and when possible. The best thing as a developer, though, is to always be open-minded and open to suggestions/criticism. With that said, I appreciate the heavy criticism given. I think I'm done with this discussion as it's going nowhere. Keep up the good plugins, I'll be installing one of them on a test board to see how they work. I'm definitely interested in your pre-parse plugins.
Best Regards.
Reply
#20
(01-12-2009, 06:03 AM)Rcpalace Wrote: I hope you understand that Sphinx isn't the only solution to things,

I understand that perfectly, yet you seem so adamant about dismissing Sphinx as a perfectly good solution.

(01-12-2009, 06:03 AM)Rcpalace Wrote: the best solution would be developing your own solution if and when possible.

That is absolutely not true. Why should I reinvent a perfectly good wheel?

(01-12-2009, 06:03 AM)Rcpalace Wrote: Keep up the good plugins, I'll be installing one of them on a test board to see how they work. I'm definitely interested in your pre-parse plugins.
Best Regards.

Good luck with your benchmarks.
Reply