Blogging and tags

Posted by Nick Loadholtes on 11/1/2005 filed in Blogging, Fun, Python, Thinking, Web

I’ve really gotten into the idea of tagging blog posts. It makes your data more accessible to engines like Technorati. Plus, in the future, as more people tag and more engines understand and index the tags, it’s going to be possible to look at pages in a more accurate light, and I think that is exciting.

Along those lines the other day I was wondering how my tags stacked up against what I was writing about. Put simply, according to my banner at the top of the page this blog is about python, games, AI, programming, and other topics. How close to those topics are my postings?

I whipped up a Python script and had it pore over a backup of my posts (of which there are a little over 150). The script basically counted the number of times individual words appeared. My thinking was that the words that show up most frequently are probably what I’m writing about the most, and it would be interesting to see if they matched up with the focus of this site.
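A minimal sketch of that kind of word counter might look like the following. This is not the original script: the function name count_words, the sample posts, and the rule for splitting text into words are all assumptions made for illustration.

```python
import re
from collections import Counter


def count_words(posts):
    """Count how often each word appears across all posts (case-insensitive)."""
    counts = Counter()
    for text in posts:
        # Lowercase the text and pull out runs of letters, digits, and apostrophes.
        counts.update(re.findall(r"[a-z0-9']+", text.lower()))
    return counts


# Hypothetical sample data standing in for the real backup of posts.
sample_posts = [
    "Python is fun. I think Python games are fun too.",
    "I think programming in Python beats Java.",
]

# Show the three most frequent words with their counts.
top_words = count_words(sample_posts).most_common(3)
print(top_words)
```

A counter this simple treats "program" and "programming" as different words and counts filler words like "the" alongside everything else, so the raw output still needs some hand filtering before it says much.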

Below are some of the top words. The list isn’t complete, but these are some of the more interesting ones, along with their frequencies:

  1. 109 – Python
  2. 97 – Think
  3. 65 – Program (and Programming)
  4. 44 – Google
  5. 24 – Java
  6. 23 – Mac
  7. 22 – Games
  8. 19 – Tags
  9. 13 – Coding
  10. 12 – RPG
  11. 11 – Motivation
  12. 10 – Windows
  13. 10 – Wasteland
  14. 8 – XEmacs
  15. 15 – A9
  16. 6 – Dokken
  17. 3 – Sudoku

The last item in that list, Sudoku, really surprises me. I’ve written about it, I think, twice, yet this month I’ve gotten more hits from search engines on that one word than any other! It accounts for 6.1% of the terms that lead to my site. Crazy….

But back to the list: I was happy to see that Python and other programming-related topics were high on the list. That’s why I started this blog, as a place for me to talk turkey about programming in general (and Python in particular). For the most part, the list I generated fits with what I thought I was writing about, which in turn fits the site description, so I guess I’m on target. The only phrase I didn’t really see was GTD or “Getting Things Done”, and I think that’s because I usually mention it by the long name, which gets split into separate words in this count.
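One way around the phrase problem would be to search for the whole phrase instead of single words. Here is a rough sketch of that idea; the helper name count_phrase and the sample posts are hypothetical, and this is not something the original script did.

```python
import re


def count_phrase(posts, phrase):
    """Count case-insensitive occurrences of a whole phrase across all posts."""
    # Escape each word, then allow any run of whitespace between the words.
    words = [re.escape(word) for word in phrase.split()]
    pattern = re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)
    return sum(len(pattern.findall(text)) for text in posts)


# Hypothetical sample data standing in for the real backup of posts.
sample_posts = [
    "I have been reading Getting Things Done this week.",
    "Getting things done is hard; GTD helps with getting  things done.",
]

print(count_phrase(sample_posts, "Getting Things Done"))
print(count_phrase(sample_posts, "GTD"))
```

Adding the two counts together would give a fairer picture of how often the topic comes up, whichever name it goes by.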

So overall, for this experiment, I think my tags are pretty close to the focus of my site. I think the next step would be to look at each post individually and try to determine if the tags I used match the content. But that’s another fun task for another fun day….
