Saturday, October 10, 2015

What People Say About Content Blocking

Based on Jim's and Marie's clarifications below, Gary seems to have revealed a very significant loophole. Basically, where a page is gone, never return a 404 or 410 status code; instead, change the page in some way to show users it's no longer available (e.g. a rich 404 page) BUT leave the status code as 200, add a 'noindex, follow' meta robots tag, and continue to include the page in the sitemap.
Think about this. If an ecommerce site has a discontinued product, the recommended approach has been to either (1) 301 to an alternative product if a very similar one exists, or (2) show a rich 404/410 page (with a 404/410 status code) offering alternatives if a similar product doesn't exist. But the problem with (2) is that you lose the external and internal link equity, and any internal links to 404 pages that you accidentally leave behind count as a technical issue. And your GSC crawl errors report ends up filled with 404s/410s for discontinued products for months, making it difficult to spot genuine 404 errors.
Now imagine that instead of (2) you do what Gary says and offer the user a page that looks like a rich 404 but actually returns a 200 status code with noindex, follow. Your link equity is retained and passed to all the alternative products linked from your rich 404 page. You don't have to worry that Google will penalise you if you've accidentally left some internal links to that page on your site. Your GSC crawl errors won't fill up with crap. And that page will actually be removed from Google's index quicker than if you had 404'd or 410'd it!
Am I missing something here because this seems too good to be true?

https://disqus.com/by/simonlhill/
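
To make the mechanics of that loophole concrete, here's a minimal sketch of how a discontinued-product page could be served this way. It assumes a hypothetical Flask app with made-up product slugs and markup; none of the names come from Gary or the commenters.

```python
# Minimal sketch: a "rich 404"-style page for a discontinued product that
# still returns HTTP 200 and carries a noindex, follow meta robots tag.
# All slugs, product names and markup are hypothetical.
from flask import Flask

app = Flask(__name__)

PRODUCTS = {"blue-widget": "Blue Widget"}   # still on sale
DISCONTINUED = {"red-widget"}               # gone, but the URL still earns links

@app.route("/products/<slug>")
def product(slug):
    if slug in PRODUCTS:
        return f"<html><body><h1>{PRODUCTS[slug]}</h1></body></html>", 200

    if slug in DISCONTINUED:
        # Looks like a rich 404 to the user, but the status code stays 200,
        # the page is noindexed, and its links to alternatives are followed,
        # so internal and external link equity keeps flowing.
        html = (
            "<html><head>"
            '<meta name="robots" content="noindex, follow"/>'
            "</head><body>"
            "<h1>Sorry, this product is no longer available</h1>"
            '<p>You might like the <a href="/products/blue-widget">Blue Widget</a> instead.</p>'
            "</body></html>"
        )
        return html, 200

    # URLs that never existed still get a genuine 404.
    return "<html><body><h1>Page not found</h1></body></html>", 404
```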

I was there when this occurred yesterday. What Gary told the lady who asked the question about de-indexing thin content after that content led to a Panda filter was that he thought de-indexing the thin content that triggered Panda was NOT the best solution. He said her time would be better spent writing great (thick? LOL) content for those pages previously containing thin content.
And I have to agree as not only would this eventually lead to the Panda filter being lifted, but now they would have those pages to drive additional traffic for their targeted keywords and the site would continue to benefit from any inbound links to those previously thin pages.
He went on to say that IF they decided to go ahead and de-index them anyway (against his advice), then rather than trying to 404 or 410 the thin pages, the best way to get those pages de-indexed was:
a) to add a <meta name="robots" content="noindex"/> tag to each thin page, AND
b) to submit a new sitemap.xml containing all of the URLs that you want de-indexed.
He stated that as a result of submitting the sitemap.xml (I took him as meaning 
https://disqus.com/by/jimhodson/
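
Step (b) in Jim's account is simple to script. Here's a minimal sketch, using only Python's standard library and made-up example URLs, that writes a sitemap.xml listing the freshly noindexed pages so Google is prompted to recrawl them and pick up the tag.

```python
# Minimal sketch: build a sitemap.xml containing the URLs you want
# de-indexed (they carry the new noindex tag) and save it for submission
# in Search Console. URLs and filename are assumptions for illustration.
from xml.etree import ElementTree as ET

urls_to_deindex = [
    "https://www.example.com/thin-page-1",
    "https://www.example.com/thin-page-2",
]

urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
for url in urls_to_deindex:
    entry = ET.SubElement(urlset, "url")
    ET.SubElement(entry, "loc").text = url

ET.ElementTree(urlset).write("deindex-sitemap.xml", encoding="utf-8", xml_declaration=True)
```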

If you have a massive number of pages with super thin content (such as pagination pages) and you noindex them, once they are removed from Google's index (and if these pages aren't viewable to the user and/or don't get any traffic), is it smart to completely remove them (404?), or is there any valid reason they should be kept?
If you noindex them, should you keep all the URLs in the sitemap so that Google will recrawl them and notice the noindex tag?
If you noindex them and then remove the sitemap, can Google still recrawl and recognize the noindex tag on its own?

https://disqus.com/by/sammy_hall/

- I'd worry less about the technicalities here - do what's best for your users and your website.
- Having URLs in a sitemap isn't a requirement for crawling. Google will likely still find these URLs and tags regardless.
- Note that noindex isn't a 'hard block' for Google's crawler in the same way as a nofollow or a robots.txt directive. Google will still re-crawl URLs with noindex tags, though it's likely to do so less frequently.

https://disqus.com/by/MartinOddy/
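
Martin's last point is worth illustrating: a robots.txt Disallow is a crawl block, so a noindex tag on a blocked page would never even be seen, whereas a meta noindex only works because the page can still be crawled. A minimal sketch, using only Python's standard library and a made-up robots.txt rule:

```python
# Contrast a crawl block (robots.txt) with an index block (meta noindex).
# The domain, paths and rules below are hypothetical.
from urllib import robotparser

# robots.txt Disallow: the crawler never fetches the URL at all, so any
# noindex tag on that page would go unseen (and the URL can stay indexed).
ROBOTS_TXT = """\
User-agent: *
Disallow: /discontinued/
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())
print(rp.can_fetch("Googlebot", "https://www.example.com/discontinued/old-product"))  # False

# meta noindex: the crawler must fetch and parse the page to see the tag,
# then drops the URL from the index while still following its links.
NOINDEX_PAGE = '<html><head><meta name="robots" content="noindex, follow"/></head><body></body></html>'
print("noindex" in NOINDEX_PAGE)  # True, but only after a successful crawl
```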

Source: https://www.seroundtable.com/google-block-thin-content-use-noindex-over-404s-21011.html


