Sean’s Obsessions

08 Dec

Monitoring Changed Web Pages with RSS

I’d like to keep track of a few pages and be notified when they change. Since they aren’t news sites and don’t offer RSS themselves, I figured it would be pretty easy to build an RSS feed myself. Code here.

Create a file called “checknewpages.cfg” with a list of sites and point the script to it (pass the path on the command line or edit the script). The script writes to ~/public_html/new.xml (again, edit as needed). Point your RSS aggregator at the URL of the output.
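As a rough illustration of that flow (not the Perl script itself), here is a minimal Python sketch that reads a list of URLs and builds an RSS 2.0 document. The one-URL-per-line config format, the '#' comment convention, the function names, and the feed title and link are all assumptions made for the example; the real script's format may differ.

```python
import xml.etree.ElementTree as ET
from email.utils import formatdate


def read_sites(cfg_text):
    """Return the URLs in a config file, assuming one URL per line."""
    sites = []
    for line in cfg_text.splitlines():
        line = line.strip()
        # Skip blank lines and '#' comments (an assumed convention).
        if line and not line.startswith("#"):
            sites.append(line)
    return sites


def build_feed(changed_urls, feed_link="http://example.com/new.xml"):
    """Build an RSS 2.0 document with one item per changed page."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # Title, link and description are placeholder values for the sketch.
    ET.SubElement(channel, "title").text = "Changed pages"
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = (
        "Pages that changed since the last check"
    )
    for url in changed_urls:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = "Changed: " + url
        ET.SubElement(item, "link").text = url
        # RFC 822 style date, as RSS 2.0 expects for pubDate.
        ET.SubElement(item, "pubDate").text = formatdate(usegmt=True)
    return ET.tostring(rss, encoding="unicode")
```

To mirror the setup described here, the returned string would be written to os.path.expanduser("~/public_html/new.xml"), and the aggregator pointed at wherever the web server exposes that file.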

It’s fairly rough. I wanted to get If-Modified-Since working to save the sites’ bandwidth, but I ended up settling for a hash method instead.
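The hash approach can be sketched like this: hash each page body and compare it with the hash saved on the previous run. This is a Python illustration under assumptions, not the script's actual code; SHA-256, the dict used as saved state, and the function names are choices made for the example.

```python
import hashlib


def page_digest(body):
    """Return a hex digest of the page body (bytes)."""
    return hashlib.sha256(body).hexdigest()


def find_changed(pages, seen):
    """Compare each page with the digest stored from the previous run.

    pages: dict mapping URL -> body bytes fetched on this run
    seen:  dict mapping URL -> digest from the last run (updated in place)
    Returns the URLs whose content is new or different.
    """
    changed = []
    for url, body in pages.items():
        digest = page_digest(body)
        if seen.get(url) != digest:
            changed.append(url)
            seen[url] = digest
    return changed
```

The trade-off is that every page is still downloaded in full on each run, which is exactly the bandwidth cost If-Modified-Since would have avoided. With this sketch, the first run reports every page as changed, and a page that embeds a timestamp or other per-request content will look changed every time.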

Run it from cron with something like this, which fires at 12 minutes past every third hour:

12 */3 * * * /export/home/sean/bin/checknewpages.pl > /dev/null 2>&1

Updated: Follow the link above for the code

3 Responses to “Monitoring Changed Web Pages with RSS”

  1. rss Says:

    This is failing me with:
    Use of uninitialized value in scalar chomp at rss.pl line 40.
    Use of uninitialized value in split at rss.pl line 41.
    Use of uninitialized value in subroutine entry at rss.pl line 44.
    Use of uninitialized value in concatenation (.) or string at rss.pl line 49.
    I deleted the 15 items comment.
    Any help?
    I was trying to make an rss feed out of
    http://securitytracker.com/archives/summary/9000.html

  2. Sean Says:

    Argh, some stuff was lost in the HTML translation. The while() should be while (<CFG>)

    I’ll post a link to the script itself later today, that might help.

    Sean

  3. rss Says:

    Neat. Thanks :)

© 2008 Sean’s Obsessions