URL Checker

I wrote a quick Python script designed to search a file / remote address for URLs and return the HTTP status codes for each one. It's quick and dirty, and the regex needs some tweaking, but for the most part it works. The reason I didn't just use a link checker is that I was actually testing RSS feeds, so this was designed to grab URLs throughout the feed as opposed to just A tags. It lists anything in the 40x range. Here's the source code:

...and here's an example of usage:

$ python checkurls.py http://www.google.com
GETTING REMOTE FILE.
SEARCHING FOR URLS.
CHECKING 21 URLS...
RESULTS:
========
There were 14 200s.
There were 2 302s.
There were 1 404s.
* http://www.google.com/ig%3Fhl%3Den%26source%3Diglk&usg=AFQjCNFA18XPfgb7dKnXfKz7x7g1GDH1tg
There were 1 405s.
* http://www.google.com/reader/view/?hl=en&tab=wy There were 3 301s.

You can also pass in a local file as the first parameter:

$ python checkurls.py file.htm

If you have any thoughts, improvements, etc. just post them in the comments and I'll update the script. I may make a "recursive" one eventually, so that it actually could function as a link checker, but I don't feel like adding that right now. :)