MyBB Community Forums

Full Version: Indexing issue
Pages: 1 2 3 4
How do I stop spiders going to URLs such as:

myforum?datecut=0&sortby=subject&order=asc
myforum?pid=10&mode=threaded

I only want them to crawl myforum itself, and nothing with a "?" after it, such as myforum?pid=10&mode=threaded
Make a file in the root of your site called "robots.txt" and add the following as is:


User-Agent: *
Disallow: /(the address after your forum's URL, e.g. /index.php)
Disallow: /(additional addresses, added like this)
Allow: /
(2009-11-22, 10:25 PM)victor12 Wrote: [ -> ]Make a file in the root of your site called "robots.txt" and add the following as is:


User-Agent: *
Disallow: /(the address after your forum's URL, e.g. /index.php)
Disallow: /(additional addresses, added like this)
Allow: /

To be more specific:

User-Agent: *
Disallow: /index.php?pid=10&mode=threaded
Disallow: /index.php?pid=12&mode=threaded
Disallow: /other/directory/to/block

You don't need "Allow: /" at the end.
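
If you want to check rules like these before uploading them, here is a minimal sketch using Python's standard urllib.robotparser. The host example.com and the helper name is_allowed are placeholders for illustration, not anything from MyBB. As far as I know this parser only does plain prefix matching (no wildcards), which shows why each pid would need its own Disallow line with this approach.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post above, pasted in as a string.
ROBOTS_TXT = """\
User-Agent: *
Disallow: /index.php?pid=10&mode=threaded
Disallow: /index.php?pid=12&mode=threaded
Disallow: /other/directory/to/block
"""


def is_allowed(url, robots_txt=ROBOTS_TXT, agent="*"):
    """Return True if `agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    base = "http://example.com"  # placeholder host (assumption)
    for path in (
        "/index.php",
        "/index.php?pid=10&mode=threaded",
        "/index.php?pid=11&mode=threaded",  # not listed, so not blocked
        "/other/directory/to/block/page.html",
    ):
        print(path, "allowed" if is_allowed(base + path) else "blocked")
```

Note that pid=11 is not covered by either rule, so listing individual URLs like this does not scale to a whole forum.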
Add nofollow to the link tags in forumdisplay templates.
Is there another way?

I would like spiders to ignore everything after my forum topic URLs, for example:

forum-topic1/?datecut=0&sortby=subject&order=desc

forum-topic2/?datecut=0&sortby=subject&order=desc

forum-topic3/?datecut=0&sortby=subject&order=desc

etc.
ignore the whole forum
As labrocca said "Add nofollow to the link tags in forumdisplay templates."
(2009-11-23, 01:15 AM)Zomaian Wrote: [ -> ]As labrocca said "Add nofollow to the link tags in forumdisplay templates."

But then my forum's main URLs will be ignored as well.
(2009-11-23, 01:14 AM)victor12 Wrote: [ -> ]ignore the whole forum

That's not what I want to do.
Google understands wildcards, so you can block any URL that contains sortby= or mode=

See the example robots.txt in the Google SEO package.

nofollow is okay too, but does not prevent indexing. If someone else copies the link elsewhere (like how you did it in this thread), the link will not be nofollow there and thus be followed and indexed. Another alternative would be to write a plugin that adds a noindex meta tag to pages that should not be indexed, but the spider would still follow and download these pages. robots.txt is the only way to completely block a spider from indexing or even accessing a page.
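
To get a feel for how wildcard rules behave, here is a rough sketch in Python that models Google-style matching: "*" matches any run of characters and a trailing "$" anchors the end of the URL. The two patterns in DISALLOW are hypothetical examples written for this thread's URLs, not a copy of the robots.txt from the Google SEO package, and the function names are made up. It ignores Allow rules and rule precedence, so treat it as an approximation only.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any sequence of characters; a trailing '$' means the
    URL must end there. Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path, disallow_patterns):
    """Return True if any Disallow pattern matches from the start of path."""
    return any(pattern_to_regex(p).match(path) for p in disallow_patterns)


# Hypothetical rules: block any URL containing sortby= or mode=
DISALLOW = ["/*sortby=", "/*mode="]

if __name__ == "__main__":
    for path in (
        "/forum-topic1/",
        "/forum-topic1/?datecut=0&sortby=subject&order=desc",
        "/index.php?pid=10&mode=threaded",
    ):
        print(path, "blocked" if is_blocked(path, DISALLOW) else "allowed")
```

With patterns like these, the plain forum-topic URLs stay crawlable while the sorted and threaded variants are blocked, without listing every pid by hand.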
(2009-11-23, 07:03 AM)frostschutz Wrote: [ -> ]Google understands wildcards, so you can block any URL that contains sortby= or mode=

See the example robots.txt in the Google SEO package.

nofollow is okay too, but does not prevent indexing. If someone else copies the link elsewhere (like how you did it in this thread), the link will not be nofollow there and thus be followed and indexed. Another alternative would be to write a plugin that adds a noindex meta tag to pages that should not be indexed, but the spider would still follow and download these pages. robots.txt is the only way to completely block a spider from indexing or even accessing a page.

I did use the exact example in the SEO pack, but when I run my sitemap generator the pages still get indexed in the sitemap. I also tried "Disallow: /?*" but again, all "?" pages get indexed in the sitemap.