Thursday, 8 May 2014

Watching Google crawl the TeamMentor site (10m after blog post)

This is really interesting, and a telling sign of how fast Google crawls and picks up updates.

I posted What are the main TeamMentor use cases? (and "Don't copy and paste from Google, copy and paste from TeamMentor") 10 minutes ago, and while looking at the new 'TM 3.4.1 real-time TeamMentor Activity' viewer, I noticed a number of 404s:

These 404s are for the articles mentioned (and linked) in that blog post, which made me think straight away 'this is a bot, since a human would have stopped after the first couple of 404s (and most likely gotten a free eval account to see the content)'.
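That reasoning (a human gives up after a couple of 404s, a bot keeps going) can be sketched as a simple heuristic. This is only an illustration: the `(ip, status)` log format, the `suspected_bots` name and the threshold of 3 are my assumptions, not how the TeamMentor activity viewer actually stores or exposes its data.

```python
from collections import defaultdict


def suspected_bots(requests, threshold=3):
    """Return the IPs that hit `threshold` or more 404s in a row.

    `requests` is an iterable of (ip, status) tuples in chronological
    order (a hypothetical log format, used here for illustration).
    """
    streak = defaultdict(int)  # current run of consecutive 404s per IP
    flagged = set()
    for ip, status in requests:
        if status == 404:
            streak[ip] += 1
            if streak[ip] >= threshold:
                flagged.add(ip)
        else:
            # Any non-404 response breaks the run for that IP.
            streak[ip] = 0
    return flagged
```

Calling `suspected_bots(log)` on a list of such tuples returns the set of IPs worth a closer look. A streak of 404s is only a hint, not proof, so the source IP still needs checking.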

So who was it?

Well, looking at the source IP:

I think it is a safe bet to say that this was Google's crawler in action (here is more info on Googlebot).
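To go from 'safe bet' to something closer to confirmation, Google's own guidance is to do a reverse DNS lookup on the IP, check that the hostname is under googlebot.com or google.com, and then do a forward lookup to confirm that the hostname resolves back to the same IP. Here is a minimal sketch of that check. The `is_googlebot` name and the injectable lookup parameters are my own choices (they make it testable without network access), and the forward lookup used here only handles IPv4.

```python
import socket

# Domains Google documents for its crawler hostnames.
GOOGLE_CRAWLER_SUFFIXES = (".googlebot.com", ".google.com")


def is_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-then-forward DNS check for a claimed Googlebot IP.

    The two lookup parameters are hypothetical hooks for testing;
    by default the standard socket functions are used.
    """
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        host = reverse_lookup(ip)
        # The leading dot in each suffix rejects names like 'fakegooglebot.com'.
        if not host.lower().rstrip(".").endswith(GOOGLE_CRAWLER_SUFFIXES):
            return False
        # The hostname must resolve back to the IP we started with.
        return ip in forward_lookup(host)
    except OSError:
        # Covers socket.herror / socket.gaierror (no PTR record, lookup failure).
        return False
```

Usage is just `is_googlebot(source_ip)` with the IP shown in the activity viewer. The forward lookup matters because anyone who controls the reverse DNS for their own IP range can make it claim to be googlebot.com.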

