MyBB Community Forums

Full Version: Share your spider list
I'm starting this thread so we can share the lists of spiders / bots we use in our MyBB forums.
Here is mine:
name useragent
Ahrefs AhrefsBot
Alexa Internet ia_archiver
AOL AOLBuild
Applebot Applebot
Ask.com Teoma
Baidu Baiduspider
Bing bingbot
CheckHost CheckHost
Cuil twiceler
DataSift TweetmemeBot
Discord Discordbot
DuckDuckGo DuckDuckBot
Facebook facebookexternalhit
Flip FlipboardProxy
Google Googlebot
Internet Archive archive.org_bot
Linkdex linkdexbot
Lycos lycos
Majestic 12 MJ12bot
MetaURI MetaURI
MojeekBot MojeekBot
Paper.li PaperLiBot
Pingdom Pingdom.com_bot
QuerySeeker QuerySeekerSpider
Qwant Qwantify
Semrush SemrushBot
Showyou ShowyouBot
Topsy butterfly
Twitter Twitterbot
UptimeRobot UptimeRobot
Voila/orange VoilaBot
XoviBot XoviBot
Yahoo! Slurp
Yandex YandexBot

I've attached the CSV file (with a .txt extension). It has a "comment" column, which MyBB does not use. Feel free to add to it.
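For anyone who wants to turn the plain list above into CSV themselves, here is a minimal Python sketch. It assumes each line is a display name followed by the user agent string as the last space-separated token (this holds for the list above, since none of those user agent strings contain a space). The function names `parse_spider_lines` and `to_csv` and the column headers are my own choices for illustration, not anything MyBB defines.

```python
import csv
import io


def parse_spider_lines(lines):
    """Split 'Name words useragent' lines into (name, useragent) pairs."""
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Assumption: the user agent string is the last token on the line.
        name, _, useragent = line.rpartition(" ")
        # A single-word line leaves name empty; reuse the token as the name.
        pairs.append((name or useragent, useragent))
    return pairs


def to_csv(pairs):
    """Render the pairs as CSV text with an empty 'comment' column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["name", "useragent", "comment"])
    for name, useragent in pairs:
        writer.writerow([name, useragent, ""])
    return buffer.getvalue()


sample = ["Alexa Internet ia_archiver", "Majestic 12 MJ12bot", "Yahoo! Slurp"]
print(to_csv(parse_spider_lines(sample)))
```

Check the resulting columns against whatever your MyBB version expects before importing anything.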
Hi,

Which bots/spiders do you want? The banned ones? The ones active since last year, or since 200X?

There are a few more in my list...
The active ones (since 2010) are the most relevant for webmasters who want to know which spiders visit their forum frequently and who want to avoid an enormous list of "guest reading thread" entries.
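To illustrate why the list matters: a visitor can only be shown as a named spider, rather than as one more guest, if its User-Agent header matches an entry in the list. Below is a rough Python sketch of that check. It assumes a case-insensitive substring match, which I believe is roughly how such lists are applied, but treat that as an assumption. `identify_spider` and the small `SPIDERS` dictionary are made up for the example.

```python
# A few name -> useragent entries taken from the list above.
SPIDERS = {
    "Google": "Googlebot",
    "Bing": "bingbot",
    "Yandex": "YandexBot",
}


def identify_spider(user_agent_header, spiders=SPIDERS):
    """Return the name of the first spider whose string occurs in the header."""
    header = user_agent_header.lower()
    for name, needle in spiders.items():
        # Assumption: matching is a case-insensitive substring test.
        if needle.lower() in header:
            return name
    return None  # no match: the visitor would be counted as a guest


print(identify_spider("Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"))
print(identify_spider("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/120.0"))
```

With substring matching, a short entry such as "lycos" or "Buck" can also match unrelated user agents, so longer and more specific strings are the safer choice.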
Install iPatrol. That will generate a bot database for you.
(Sorry for spamming :P)
The banned ones:

user-agent: 360Spider
user-agent: A6-Indexer
user-agent: AhrefsBot
user-agent: AlphaSeoBot
user-agent: B-l-i-t-z-B-O-T
user-agent: Barkrowler
user-agent: BoardReader
user-agent: BomboraBot
user-agent: Buck
user-agent: CATExplorador
user-agent: Clickagy
user-agent: Dataprovider
user-agent: DeuSu
user-agent: DnyzBot
user-agent: DomainCrawler
user-agent: DomainSigmaCrawler
user-agent: FBSMTWB
user-agent: Flamingo
user-agent: GarlikCrawler
user-agent: GrapeshotCrawler
user-agent: Jooblebot
user-agent: Keybot Translation-Search-Machine
user-agent: LightspeedSystems
user-agent: LinkpadBot
user-agent: LinqiaMetadataDownloaderBot
user-agent: MJ12bot
user-agent: Mail.RU_Bot
user-agent: MauiBot
user-agent: MegaIndex
user-agent: MetaURI
user-agent: Netcraft
user-agent: OpenLinkProfiler
user-agent: Plukkie
user-agent: PubMatic Crawler Bot
user-agent: SEOkicks
user-agent: SMTBot
user-agent: SemrushBot
user-agent: SirdataBot
user-agent: Siteimprove
user-agent: SurdotlyBot
user-agent: SurveyBot
user-agent: Synthesio
user-agent: TodoExpertosBot
user-agent: TurnitinBot
user-agent: TweetmemeBot
user-agent: Twingly
user-agent: URLAppendBot
user-agent: WebDataStats
user-agent: XoviBot
user-agent: YisouSpider
user-agent: ZoomBot
user-agent: ZoominfoBot
user-agent: cmscrawler
user-agent: datagnion
user-agent: grammarly
user-agent: linkdexbot
user-agent: linkfluence
user-agent: ltx71
user-agent: magpie-crawler
user-agent: moreover
user-agent: oBot
user-agent: omgili
user-agent: panscient
user-agent: proximic
user-agent: rogerbot
user-agent: scrapinghub
user-agent: sistrix
user-agent: ubermetrics
user-agent: vebidoobot
user-agent: woorankreview
user-agent: Aboundex
user-agent: AdnormCrawler
user-agent: BLEXBot
user-agent: Buzzbot
user-agent: CheckMarkNetwork
user-agent: CrazyWebCrawler-Spider
user-agent: Dataprovider
user-agent: DotBot
user-agent: ElectricMonk
user-agent: Ezooms
user-agent: GroupHigh
user-agent: HubSpot
user-agent: JamesBOT
user-agent: KomodiaBot
user-agent: MixrankBot
user-agent: NextGenSearchBot
user-agent: R6_FeedFetcher
user-agent: RankurBot
user-agent: RavenCrawler
user-agent: Riddler
user-agent: SEOdiver
user-agent: SafeDNSBot
user-agent: Scopia
user-agent: ScoutJet
user-agent: SentiBot
user-agent: SiteExplorer
user-agent: WeSEE
user-agent: adbeat_bot
user-agent: asafaweb
user-agent: dlvr.it
user-agent: probethenet
user-agent: trendictionbot
user-agent: voltron

Reasons for being banned:
- They don't obey robots.txt
- Known spammers
- Unidentified bots (no URL in their user agent, or a fake URL)
- Bots/spiders from a company that sells paid services built on what they collect from my web pages. If they don't pay me, they get banned. Servers and bandwidth cost me money ;)
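The banned list above only names the user agents. For robots.txt to actually turn them away, each group also needs a `Disallow: /` rule. Here is a small Python sketch that builds such a file from a list of agents. Since bots that ignore robots.txt have to be refused at the server instead, a second helper builds a case-insensitive regular expression that could serve as the basis for such a rule. Both function names are hypothetical, and how you hook the pattern into your web server is left open.

```python
import re


def build_robots_txt(banned_agents):
    """Return robots.txt text that disallows the whole site for each agent."""
    seen = set()
    groups = []
    for agent in banned_agents:
        key = agent.lower()
        # Skip duplicates (compared case-insensitively), keep original order.
        if key in seen:
            continue
        seen.add(key)
        groups.append(f"User-agent: {agent}\nDisallow: /\n")
    # Groups are separated by a blank line.
    return "\n".join(groups)


def build_block_pattern(banned_agents):
    """Compile a case-insensitive regex that matches any banned agent string."""
    if not banned_agents:
        # An empty alternation would match every user agent.
        raise ValueError("need at least one banned agent")
    escaped = sorted({re.escape(agent) for agent in banned_agents})
    return re.compile("|".join(escaped), re.IGNORECASE)


banned = ["AhrefsBot", "MJ12bot", "Dataprovider", "Dataprovider", "dlvr.it"]
print(build_robots_txt(banned))

pattern = build_block_pattern(banned)
print(bool(pattern.search("Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)")))
```

The entries are escaped before joining because some of them, such as dlvr.it, contain characters that have a special meaning in regular expressions.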
There is a huge, updated list of web spiders that may be crawling your website. You can find it here:

tiny.cc/zqym6y