
Archive for February, 2011

Remove site from Google

February 22nd, 2011

We found that Google had indexed a site that shouldn’t have been indexed, so I set up a robots.txt file to deny all crawlers and locked the site down with HTTP auth. I also put in a request to have the URLs removed from the index and cache.
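For reference, the deny-all robots.txt is just two lines; a quick way to drop it in place looks something like this (the document root path is only an example):

cat > /home/username/www/robots.txt <<'EOF'
User-agent: *
Disallow: /
EOF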

When I did this, a “site:www.site.com” search was returning ~2,400 results. A few days later it was returning ~54,000. Today it is returning ~133,000.

I’m not sure how Google managed to mix up “remove my site” with “index it more”. Maybe this is just part of the removal process?

Update: Google is now up to 217,000 results for this site. Maybe removing your site from the index is good for SEO?


Change CVS path

February 16th, 2011

Here’s a small command that runs through code directories and changes the CVS path in each CVS/Root file. It’s handy if you’re changing your CVS username or repository path.

find . -name 'Root' -exec perl -pi -e 's/OLD_URL/NEW_URL/' {} \;
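For example, with hypothetical old and new roots (the server and usernames are placeholders, and | is used as the delimiter so the slashes in the paths don’t need escaping):

# Point every checkout at the new account on the same CVS server
find . -name 'Root' -exec perl -pi -e 's|:ext:olduser@cvs.example.com:/cvsroot|:ext:newuser@cvs.example.com:/cvsroot|' {} \;

# Spot-check one of the rewritten files
cat CVS/Root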


unzip 6 for RHEL 5.6

February 15th, 2011

Red Hat likes to ship old packages with backported patches, but sometimes it’s nice to have the latest version with new features. Unfortunately, the unzip shipped in the RHEL repositories can’t extract files larger than 4GB. I found the source RPM and compiled binaries that work with RHEL 5.6 (they may work with older releases too).

unzip-6.0-1.i386.rpm
unzip-6.0-1.x86_64.rpm
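If you’d rather rebuild it yourself, a rough sketch looks like this (assuming the unzip 6.0 source RPM filename; on RHEL 5 the build output lands under /usr/src/redhat):

# Rebuild the source RPM into a binary package
rpmbuild --rebuild unzip-6.0-1.src.rpm

# Install the resulting package
rpm -Uvh /usr/src/redhat/RPMS/x86_64/unzip-6.0-1.x86_64.rpm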


Deny access to website, but allow robots.txt

February 15th, 2011

I had a problem where Googlebot was indexing a development site, so we locked it down with Apache basic HTTP auth. Googlebot was then served a 401 when accessing the site, but because it couldn’t fetch a robots.txt it kept trying to crawl the site.

Using the following allows anyone to access robots.txt but denies access to the rest of the site:
<Directory "/home/username/www">
    # Password-protect the whole site
    AuthUserFile /home/username/.htpasswd
    AuthName "Client Access"
    AuthType Basic
    require valid-user

    # Let robots.txt through without authentication
    <Files "robots.txt">
        AuthType Basic
        Satisfy any
    </Files>
</Directory>
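A quick way to check it’s behaving (the hostname here is just an example) is to confirm that robots.txt comes back without auth while everything else is challenged:

# Should return 200 OK
curl -I http://dev.example.com/robots.txt

# Should return 401 Authorization Required
curl -I http://dev.example.com/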

Eventually Googlebot will get the hint and stop crawling the site, and we can remove the existing content using Webmaster Tools.


Overheating Macbook Pro

February 14th, 2011

I’ve been having problems recently with my MacBook Pro overheating. It’s not much fun trying to code on the couch with a boiling hot Mac on your lap.

I found a program called smcFanControl which seems to take care of that. It’s now under 50°C and coding is now more fun!
