MyBB Community Forums

Full Version: Google excluded url
Hello friends, does anyone here know why Google is excluding URLs? -> https://prnt.sc/cNcJJKJH2Whx

This is the robots.txt:
Sitemap: https://ddwarez.com/sitemap-index.xml

User-agent: *
Disallow: /captcha.php
Disallow: /editpost.php
Disallow: /misc.php
Disallow: /modcp.php
Disallow: /moderation.php
Disallow: /newreply.php
Disallow: /newthread.php
Disallow: /online.php
Disallow: /printthread.php
Disallow: /private.php
Disallow: /ratethread.php
Disallow: /report.php
Disallow: /reputation.php
Disallow: /search.php
Disallow: /sendthread.php
Disallow: /task.php
Disallow: /usercp.php
Disallow: /usercp2.php
Disallow: /calendar.php
Disallow: /*action=emailuser*
Disallow: /*action=nextnewest*
Disallow: /*action=nextoldest*
Disallow: /*year=*
Disallow: /*action=weekview*
Disallow: /*sort=*
Disallow: /*order=*
Disallow: /*mode=*


User-agent: MediaPartners-Google
Allow: / 
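
For anyone debugging wildcard rules like these: a minimal sketch in Python 3 that approximates Google's pattern matching (the standard urllib.robotparser only does prefix matching and ignores *); the test URLs below are made up for illustration:

import re

def blocked_by(pattern, path):
    # Translate a robots.txt pattern to a regex: '*' matches any
    # run of characters; matching is anchored at the start of the path.
    regex = re.escape(pattern).replace(r'\*', '.*')
    return re.match(regex, path) is not None

rules = ['/printthread.php', '/*action=nextnewest*', '/*sort=*']
tests = [
    '/printthread.php?tid=42',           # blocked: prefix match
    '/showthread.php?tid=42&sort=asc',   # blocked: wildcard rule
    '/showthread.php?tid=42',            # allowed: no rule matches
]
for url in tests:
    hits = [r for r in rules if blocked_by(r, url)]
    print(url, '->', 'blocked by ' + ', '.join(hits) if hits else 'allowed')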
"warez", probably that google doesn't accept that.
And in the console, you can have explanation about the exclusion reason.
(2022-07-26, 08:03 AM)Crazycat Wrote: [ -> ]"warez": Google probably doesn't accept that.
And in Search Console, you can find an explanation of the exclusion reason.

I don't think so; I have seen warez pages that got indexed within minutes or hours. If that were the case, they would exclude all the URLs, not almost all of them, or they would not accept the site at all.
So, look at the explanations in Search Console, under the "Excluded pages" (or similarly named) menu.

You can also get more details about the excluded pages; perhaps they are a consequence of your own rules. Excluding the printthread.php file will create at least one exclusion per thread.
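For example, every MyBB thread has its own print view, so that one rule fans out into one blocked URL per thread (the tid values here are hypothetical):

/printthread.php?tid=1
/printthread.php?tid=2
/printthread.php?tid=315

Each of those will appear in Search Console as "Blocked by robots.txt", which is expected and harmless.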
(2022-07-26, 01:48 PM)Crazycat Wrote: [ -> ]So, look at the explanations in Search Console, under the "Excluded pages" (or similarly named) menu.

You can also get more details about the excluded pages; perhaps they are a consequence of your own rules. Excluding the printthread.php file will create at least one exclusion per thread.
I don't see that it says anything in particular; I'll leave you a screenshot. Do you recommend removing printthread.php from the robots.txt? Or how can I improve the robots.txt? Maybe that's it.
You'll need to look at which specific pages are being excluded. That graph tells us nothing. Does it not list them lower down on that page? If the excluded pages are ones you've blocked via robots.txt, then it's nothing to worry about; they're excluded because you've told Google to exclude them. However, if they're pages that should be indexed, you'd need to find out why. But right now, if you don't know which pages are actually being excluded, nobody can tell what the issue is.
(2022-07-26, 03:56 PM)Matt Wrote: [ -> ]You'll need to look at which specific pages are being excluded. That graph tells us nothing. Does it not list them lower down on that page? If the excluded pages are ones you've blocked via robots.txt, then it's nothing to worry about; they're excluded because you've told Google to exclude them. However, if they're pages that should be indexed, you'd need to find out why. But right now, if you don't know which pages are actually being excluded, nobody can tell what the issue is.

I hope it is this; check the attachments.
It seems that it just hasn't indexed them yet, then. It says "Discovered - currently not indexed", but that doesn't sound like they're being explicitly excluded for any particular reason.
(2022-07-26, 04:17 PM)Matt Wrote: [ -> ]It seems that it just hasn't indexed them yet, then. It says "Discovered - currently not indexed", but that doesn't sound like they're being explicitly excluded for any particular reason.

That is what I don't understand: there are links from more than 30 days ago that are still not indexed. The sitemap.xml is correct. What I am doing is submitting the URLs manually when a new post is published. Any idea what I can do?
You can't do anything to force Google to index the page; you'd need to either wait or contact them about it.
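
To automate the manual submission step mentioned above, one option at the time of this thread was Google's sitemap ping endpoint (since deprecated); it only tells the crawler that the sitemap has changed and does not force indexing. A minimal sketch, assuming Python 3 with the standard library and the sitemap URL from the robots.txt above:

import urllib.parse
import urllib.request

SITEMAP = 'https://ddwarez.com/sitemap-index.xml'  # from the robots.txt above

# Google's sitemap ping endpoint (deprecated in 2023); a 200 response
# only means the ping was received, not that anything was indexed.
ping = 'https://www.google.com/ping?sitemap=' + urllib.parse.quote(SITEMAP, safe='')
with urllib.request.urlopen(ping) as resp:
    print(resp.status)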