MyBB Community Forums

Full Version: New Spider Found
Lately Gigabot has been indexing my site, but this spider shows up as a guest. It has the following UA:
Gigabot/2.0/gigablast.com/spider.html

It should be added to the $bots array in inc/class_session.php.
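For reference, a rough sketch of what I mean, assuming the array maps a lowercase fragment of the user agent to a display name (check your own copy of inc/class_session.php for the exact format and the entries it already ships with):

// Sketch only, not the exact MyBB source:
var $bots = array(
	"googlebot" => "Google",
	"msnbot" => "MSN Search",
	"gigabot" => "Gigabot"   // new entry for Gigabot/2.0
);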
There are literally tens of thousands of different bots out there. Including them all in a comprehensive list in the software would be a MASSIVE endeavor (and also pointless, since you'll *never* have a list of all of them).
They should release a PEAR script for all the bots Tongue
I know there are a ton of spiders, but by including all the ones that index your site, you can save a ton of bandwidth (i.e. by always giving them the same session ID and preventing them from accessing certain pages).
tmhai Wrote:Do bots have some sort of pattern in their IP address?

No, but their hostname usually indicates it. For example, Googlebot's is something like:

crawl-xxx-xxx-xxx-xxx.googlebot.com

The hostname pattern differs for each search engine's bot.
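If you want to check a particular visitor, a quick way (just an illustration in plain PHP, not anything in MyBB) is a forward-confirmed reverse DNS lookup:

// Sketch: verify a visitor that claims to be Googlebot by checking that its
// IP resolves to a googlebot.com hostname and that the hostname resolves back.
$ip = $_SERVER['REMOTE_ADDR'];
$host = gethostbyaddr($ip); // e.g. crawl-66-249-66-1.googlebot.com
if(preg_match('/\.googlebot\.com$/i', $host) && gethostbyname($host) === $ip)
{
	// genuine Googlebot
}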

tmhai Wrote:If this is the case, then Chris could implement a script which would select all the IPs currently viewing the forums, filter them, and label those that match a bot into a special category, i.e. "Bots". The rules you then apply in the Admin CP would also apply to those IP addresses that match a bot.

That's easily said; I don't know how it would be implemented, though.

If you look at class_session.php, you can see how the bots are sorted out. By default the bots that are identified are grouped as "Unregistered" users, but by changing a certain variable in that file you can have them grouped into another custom usergroup instead (I can't name the variable off the top of my head).
var $botgroup = 1;
That one? Wink
Yes precisely.
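From memory, the matching itself is roughly like this (a paraphrase, not the exact MyBB source; every name except $bots and $botgroup is just a placeholder): the user agent is lowercased, the $bots array is looped over, and if any key is found in the string the visitor is put in the bot group instead of being treated as a normal guest.

// Rough paraphrase, not the exact class_session.php code:
$bots = array("googlebot" => "Google", "msnbot" => "MSN Search", "gigabot" => "Gigabot");
$botgroup = 1;
$usergroup = 0;
$useragent = strtolower(isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '');
foreach($bots as $fragment => $botname)
{
	if(strpos($useragent, $fragment) !== false)
	{
		// known spider: file it under the bot usergroup
		$usergroup = $botgroup;
		break;
	}
}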
Referring to the post about bandwidth and the large number of spiders: it would be nice to include a setting that lists all the known web spiders and lets an admin choose whether or not he wants each one to crawl his/her site.

We all know that not all spiders are like Google's or Yahoo's, and you might not want them coming and using your bandwidth for nothing; not every bot will help your site, since some belong to weak engines and so aren't worth it. On the other hand, we can't ignore that some people might still want them regardless of what they do, which is why implementing that new setting would be nice.

A little off the subject: in class_session.php I guess there are more than 8 or 9 different bots listed, but none of them ever show up except Google's, Yahoo's and MSN's Confused so why are they there? (Correct me if I'm wrong.)

Regards
It would be a nice feature to be able to add custom bot agents to the bot groups. Maybe in a future version? I have user agent banning built into the custom ban system for my site, and I've always wondered why CMSes, blogs, forums, etc. don't include such a feature in their software.
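The core of mine is just a substring check against a list of banned agents before the page is built, roughly like this (my own sketch, and the agent fragments are only examples):

// Sketch of a simple user-agent ban check:
$banned_agents = array("emailcollector", "siphon", "badbot"); // example fragments only
$useragent = strtolower(isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '');
foreach($banned_agents as $fragment)
{
	if($fragment != '' && strpos($useragent, $fragment) !== false)
	{
		header("HTTP/1.1 403 Forbidden");
		exit;
	}
}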