We found that Google had indexed a site that shouldn’t be indexed, so I set up a robots.txt file to deny all crawlers and locked the site down with HTTP auth. I also put in a request to have the URLs removed from Google’s index and cache.
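For reference, the deny-all robots.txt mentioned above is just two lines:

```
User-agent: *
Disallow: /
```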
When I did this, Google returned ~2,400 results for a “site:www.site.com” query. A few days later it was returning ~54,000. Today it is returning ~133,000.
I’m not sure how Google managed to mix up “remove my site” with “index it more”. Maybe this is just part of the removal process?
Update: Google is now up to 217,000 results for this site. Maybe removing your site from the index is good for SEO?
I had a problem where Googlebot was indexing a development site, so we locked it down using Apache basic HTTP auth. Googlebot was now served a 401 whenever it accessed the site, but because it couldn’t retrieve a robots.txt it kept persistently trying to crawl.
Using the following allows anyone to access robots.txt but denies access to the rest of the site:
AuthName "Client Access"
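The AuthName line is only a fragment of the required setup; a fuller configuration along these lines should do the job (the AuthUserFile path is a placeholder, and Satisfy Any is Apache 2.2-style syntax):

```apache
# Password-protect the whole site...
AuthType Basic
AuthName "Client Access"
AuthUserFile /path/to/.htpasswd
Require valid-user

# ...but let anyone (including Googlebot) fetch robots.txt
<Files "robots.txt">
    Satisfy Any
    Allow from all
</Files>
```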
Eventually Googlebot will take the hint and stop crawling the site, and we can remove the existing content using Webmaster Tools.
With more and more people moving towards public DNS services (like Google Public DNS and OpenDNS), the speed advantages of a CDN may be cancelled out.
Most of the major CDNs geotarget based on where the DNS query is being resolved from. For example, Facebook’s CDN solution (using Akamai) resolves static.ak.fbcdn.net to a nearby edge server when queried via my ISP’s resolver, giving a ~5ms response time. Using Google’s public DNS server (8.8.8.8), the same domain resolves to edge servers much further away, giving a ~200ms response time.
So while using Google’s DNS or OpenDNS may save a few milliseconds resolving a domain, it may slow a site down by steering users to CDN PoPs further away. Until CDNs can work with these public DNS providers, the internet may actually become slower for those using these services.
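You can see this effect yourself by comparing the answers different resolvers hand back. A quick sketch with dig (static.ak.fbcdn.net is the hostname from the example above and may no longer resolve; substitute any CDN-backed hostname):

```shell
# Ask the system's default resolver (usually the ISP's, close to you)
dig +short static.ak.fbcdn.net

# Ask Google Public DNS directly; a geotargeting CDN may return
# an edge server near Google's resolver instead of near you
dig +short static.ak.fbcdn.net @8.8.8.8

# Then ping each returned address to compare round-trip times
```

If the two queries return different addresses with noticeably different ping times, the CDN is geotargeting on resolver location.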
I’m doing an experiment with Google Knol, so I’ve written a Knol about back pain.
I will post the results of this test later on.