I’d like to keep track of a few pages and be notified when they are changed. Since they aren’t news sites themselves, nor do they offer RSS, I figured it would be pretty easy to build an RSS feed myself. Code here.
Create a file called “checknewpages.cfg” with a list of sites and point the script to it (either pass the path on the command line or edit the script). It writes to ~/public_html/new.xml (again, edit as needed). Point your RSS aggregator to the URL of the output.
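To make the config-in, feed-out flow concrete, here is a rough Python sketch. It is not the Perl script itself. The config format (one URL per line, with `#` comments), the function names `read_sites` and `build_feed`, and the placeholder channel link are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from email.utils import formatdate


def read_sites(cfg_text):
    """Return URLs from config text, skipping blank lines and # comments.

    Assumed format: one URL per line (the real checknewpages.cfg may differ).
    """
    sites = []
    for line in cfg_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            sites.append(line)
    return sites


def build_feed(changed_urls, title="Changed pages"):
    """Build a minimal RSS 2.0 document with one <item> per changed URL."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    # Placeholder link; replace with wherever new.xml is actually served.
    ET.SubElement(channel, "link").text = "http://example.com/new.xml"
    ET.SubElement(channel, "description").text = (
        "Pages that changed since the last check"
    )
    now = formatdate(usegmt=True)  # RFC 822 style date, as RSS expects
    for url in changed_urls:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = "Changed: " + url
        ET.SubElement(item, "link").text = url
        ET.SubElement(item, "pubDate").text = now
    return ET.tostring(rss, encoding="unicode")
```

Writing the string returned by `build_feed` to ~/public_html/new.xml would give the aggregator something to poll.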
It’s fairly rough. I wanted to get If-Modified-Since working to save the site’s bandwidth, but I ended up settling for a hash method instead.
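The hash approach can be sketched roughly like this in Python. It is an illustration, not the Perl script's actual code. The choice of SHA-256, the JSON state file, and the names `page_hash` and `find_changes` are assumptions. Fetching is left out, so the function takes page bodies that have already been downloaded.

```python
import hashlib
import json
import os


def page_hash(body):
    """Return a hex digest of the page body (bytes)."""
    return hashlib.sha256(body).hexdigest()


def find_changes(pages, state_path):
    """Compare current page hashes with those saved by the previous run.

    pages: dict mapping URL -> page body as bytes.
    state_path: hypothetical JSON file holding URL -> digest from last time.
    Returns URLs whose digest differs from the stored one. A URL seen for
    the first time is recorded but not reported as changed.
    """
    old = {}
    if os.path.exists(state_path):
        with open(state_path) as fh:
            old = json.load(fh)
    new = {url: page_hash(body) for url, body in pages.items()}
    changed = [
        url for url, digest in new.items()
        if url in old and old[url] != digest
    ]
    # Save the fresh digests so the next run compares against them.
    with open(state_path, "w") as fh:
        json.dump(new, fh)
    return changed
```

Unlike If-Modified-Since, this still downloads every page on every run, so it saves no bandwidth. It also flags any page whose body contains something that changes on each request, such as a timestamp.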
Run it from cron with something like:
12 */3 * * * /export/home/sean/bin/checknewpages.pl > /dev/null 2>&1
Updated: Follow the link above for the code