Pawel Szulencki Search Engine Optimization/Marketing blog.
As you know, Google and other search engines try to index everything they find while crawling the web. Sometimes, though, you may not want part of your website to be visible to search engines or to other people. Several techniques let you hide content from Google. Here they are:
1. Block the entire website, or parts of it, from Google using a robots.txt file.
To block the entire website from Googlebot:
User-agent: Googlebot
Disallow: /
To block the entire website from all search engine crawlers:
User-agent: *
Disallow: /
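To allow crawling of the http version of the site but block the https version, serve a separate robots.txt file for each protocol.
For the http version (http://www.domainname.com/robots.txt):
User-agent: *
Allow: /
For the https version (https://www.domainname.com/robots.txt):
User-agent: *
Disallow: /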
To block a specific folder and everything in it from Googlebot:
User-agent: Googlebot
Disallow: /foldername/
To block all files of a specific type (here, PDF files) from Googlebot:
User-agent: Googlebot
Disallow: /*.pdf$
To block a specific image from Google Images:
User-agent: Googlebot-Image
Disallow: /image.jpg
To block all images on your site from Google Images:
User-agent: Googlebot-Image
Disallow: /
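To get a feel for which URLs the Disallow patterns above would cover, here is a minimal Python sketch. It is only an approximation of Google-style pattern matching (a `*` wildcard, a trailing `$` end anchor, plain prefix matching otherwise), not Google's actual implementation. The function names `rule_to_regex` and `is_blocked` are made up for this example, and the sketch ignores Allow rules and User-agent grouping.

```python
import re
from typing import Iterable


def rule_to_regex(pattern: str) -> "re.Pattern":
    """Translate one robots.txt path pattern into a compiled regex.

    Assumed semantics (simplified): '*' matches any run of characters,
    a trailing '$' anchors the end of the URL path, and anything else
    is matched as a literal prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(path: str, disallow_patterns: Iterable[str]) -> bool:
    """Return True if any non-empty Disallow pattern matches the URL path."""
    return any(
        rule_to_regex(p).match(path) is not None
        for p in disallow_patterns
        if p  # an empty Disallow value blocks nothing
    )


if __name__ == "__main__":
    print(is_blocked("/foldername/page.html", ["/foldername/"]))
    print(is_blocked("/docs/report.pdf", ["/*.pdf$"]))
    print(is_blocked("/docs/report.pdf?download=1", ["/*.pdf$"]))
```

Note that Python's built-in urllib.robotparser does simple prefix matching and does not understand the `*` and `$` extensions, which is why this sketch uses its own small matcher.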
2. Remove or block pages from being indexed by Google using meta tags.
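To prevent Google from indexing a site, place this meta tag in the <head> section of that page:
<meta name="robots" content="noindex, nofollow">
This meta tag is page-specific. If you want to exclude the whole site from being indexed, place it on every page of your website.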
To prevent the images on a page from being indexed:
<meta name="robots" content="noimageindex">
To prevent Google from keeping a cached copy of a page:
<meta name="Googlebot" content="noarchive">
NOTICE: This only prevents Google from storing and showing a cached version of the page. Google will continue to index the page and show it in result pages.
To prevent Google from showing a snippet for a page in search results:
<meta name="googlebot" content="nosnippet">
NOTICE: Removing snippets will also remove the cached version of your page.
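If you want to double-check which of these meta tags a page actually carries, a short Python script can list them. This is just a sketch using the standard library's html.parser. The class name `RobotsMetaParser` and the helper `robots_directives` are made up for this example, and it only looks at meta tags named "robots" or "googlebot".

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self):
        super().__init__()
        # Maps the lowercased meta name to the set of directives found for it.
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {key.lower(): (value or "") for key, value in attrs}
        name = attr.get("name", "").lower()
        if name in ("robots", "googlebot"):
            # The content attribute is a comma-separated list of directives.
            values = {
                d.strip().lower()
                for d in attr.get("content", "").split(",")
                if d.strip()
            }
            self.directives.setdefault(name, set()).update(values)


def robots_directives(html: str) -> dict:
    """Return the robots-related meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    page = (
        '<html><head>'
        '<meta name="robots" content="noindex, nofollow">'
        '<meta name="Googlebot" content="noarchive">'
        '</head><body>Hello</body></html>'
    )
    print(robots_directives(page))
```

Running it over every page of a site is one way to confirm that a site-wide noindex tag really is present everywhere.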
Pawel Szulencki is an SEO (Search Engine Optimization) and marketing certified specialist who is interested in organic SEO, paid campaigns (PPC) and social media marketing channels.
travs (1 comments.)
August 5th, 2008 at 9:00 pm
nice tips, thanks for this..
Stephan Miller (1 comments.)
August 5th, 2008 at 11:37 pm
Google tends to be picky about duplicate content and blocking their spiders from pages that have them can help. That or using dofollow. I think there are a few Wordpress plugins that do this for you. One is HeadSpace2. It uses the meta-tags and you can set them on a post by post or page by page basis.
Pawel Szulencki (18 comments.)
August 6th, 2008 at 12:51 pm
@travs no problem
@Stephan Miller I think you meant nofollow or noindex tags. And yes, for WordPress there are plugins that make it easier. With the Meta Robots plugin you can simply select noindex or nofollow (or both) on each post or page you have written or are going to write in the future. It also helps prevent duplicate content.
When using WordPress you may also choose to select “Keep this post private”, which will stop search engines from indexing it, as it will require a password to enter and read that post/page.
But if you don't use WordPress or another CMS platform, the tips above still apply and make it easy to prevent indexing.
enkz (1 comments.)
August 7th, 2008 at 12:14 am
great post. thanks
Pawel Szulencki (18 comments.)
August 7th, 2008 at 12:09 pm
@enkz You're welcome
B Desire
March 20th, 2009 at 6:27 pm
“To prevent Google from indexing a site, place this meta tag in your <head> section of that page:
<meta name="robots" content="noindex, nofollow">”
The statement above says that if you put this line into a page, it will prevent Google from indexing the site. Is that true? It is confusing.
I guess the effect of that meta tag is specific to that page only, not to the whole site. If you have that meta tag on the home page, it might be different.
What do you say?
Pawel Szulencki (169 comments.)
March 20th, 2009 at 8:06 pm
@B Desire: That is a page-specific meta tag and it works locally. I assumed that it is used site-wide, meaning that you place it on every page of your website. I can add that to the description of that meta tag to make sure everyone understands its meaning.