Sean’s Obsessions

Sean Walberg’s blog

Monitoring Changed Web Pages With RSS

I’d like to keep track of a few pages and be notified when they change. Since they aren’t news sites and don’t offer RSS, I figured it would be pretty easy to build an RSS feed myself. Code here.

Create a file called “checknewpages.cfg” containing the list of sites, and point the script at it, either by passing the path on the command line or by editing the script. The script writes to ~/public_html/new.xml (again, edit as needed). Point your RSS aggregator at the URL of that output file.
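For anyone who wants to see the shape of the config-in, feed-out flow without reading the Perl, here is a rough Python sketch. It is not the script linked above. The one-URL-per-line config format, the skipping of blank lines and "#" comments, and the names parse_sites and build_feed are all assumptions made up for illustration, and the channel link is a placeholder.

```python
import xml.etree.ElementTree as ET
from email.utils import formatdate


def parse_sites(lines):
    """Return the URLs found in an iterable of config lines (e.g. an open file).

    Assumed format: one URL per line. Skipping blank lines and '#' comments
    is an assumption, not something the original script is known to do.
    """
    stripped = (line.strip() for line in lines)
    return [line for line in stripped if line and not line.startswith("#")]


def build_feed(changed_urls, title="Changed pages"):
    """Build a minimal RSS 2.0 document with one item per changed URL."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = "http://localhost/"  # placeholder
    ET.SubElement(channel, "description").text = (
        "Pages that changed since the last run"
    )
    now = formatdate(usegmt=True)  # RFC 2822 style date, as RSS expects
    for url in changed_urls:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = "Changed: " + url
        ET.SubElement(item, "link").text = url
        ET.SubElement(item, "pubDate").text = now
    return ET.tostring(rss, encoding="unicode")
```

Writing the string returned by build_feed to a file under ~/public_html would give the aggregator something to poll.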

It’s fairly rough. I wanted to get If-Modified-Since working to save each site’s bandwidth, but I ended up settling for a hash method instead.
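The hash idea can be sketched roughly like this in Python. Again, this is not the linked Perl script: the choice of SHA-256, the function names, and the dictionary used to remember digests between runs are assumptions for illustration. Fetching is left out so the sketch stays self-contained; the page bodies are passed in as bytes.

```python
import hashlib


def page_digest(body: bytes) -> str:
    """Hash the raw page body; any byte-level difference changes the digest."""
    return hashlib.sha256(body).hexdigest()


def find_changed(pages, seen):
    """Compare current page bodies against digests remembered from the last run.

    pages: dict mapping URL -> page body (bytes) fetched this run
    seen:  dict mapping URL -> digest stored by the previous run
    Returns (list of changed URLs, updated digest dict).
    A URL with no stored digest is reported as changed on its first run.
    """
    changed = []
    updated = dict(seen)  # copy so the caller's dict is left untouched
    for url, body in pages.items():
        digest = page_digest(body)
        if seen.get(url) != digest:
            changed.append(url)
        updated[url] = digest
    return changed, updated
```

The updated dictionary would need to be saved somewhere between runs (a small file is enough). Unlike If-Modified-Since, this still downloads every page on every run, and a page that embeds a timestamp or rotating content will look changed each time.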

Run it from cron with something like the following, which runs at 12 minutes past every third hour:

12 */3 * * * /export/home/sean/bin/checknewpages.pl > /dev/null 2>&1

Updated: Follow the link above for the code
