Introduction

The Guardian and The Observer are available digitally via Newspaper Direct in an "as printed" form. Although this isn't very useful for on-line reading, it is a good format for an e-reader. Ideally, the paper would download automatically and synchronise to a reader overnight, so the latest edition is ready to go in the morning.

Unfortunately, Newspaper Direct don't make it easy to get the latest copy in PDF or ebook format. You have to connect to the site and download each section manually, which pretty much kills the idea of a daily e-reader load.

Guardian Grab interacts with the Newspaper Direct site to log on, identify the available sections and download them in PDF, mobi (Kindle) or ePub format. Downloads are arranged by paper, date and format under a specified directory. Guardian Grab also maintains a directory holding the latest copy of each paper, for easy syncing.

NB: You will need a Newspaper Direct subscription to use this software.

Guardian Grab is an enhancement of work done by Ladislav Snizek in guardianpdf.

Install

Guardian Grab requires Perl and the following modules to be installed. Some are likely to already be present in your Perl distribution. For Windows I use ActivePerl, but <bbc>other distributions are available</bbc>.

Existing user? See the changelog.

UNIX

For BSD/Linux/other UNIX-like systems:

  1. Download guardiangrab-2.0.tar.gz
  2. Unpack
  3. Run make install
  4. Create a configuration file named .guardiangrab in your home directory

Windows

For Microsoft Windows systems:

  1. Download guardiangrab-2.0.zip
  2. Unpack
  3. Place guardiangrab.pl in a directory of your choice
  4. Create a configuration file named guardiangrab.ini in your AppData directory
    • Under Windows 7 this is c:\users\name\AppData
    • If in doubt, run guardiangrab.pl; its output tells you where it is looking

Configuration

Your configuration file should look like:

# Your Newspaper Direct login ID
login=myself@me.com

# Your Newspaper Direct password
password=

# Base directory for stored files
destdir=/media/paper

# Base directory for stored files (Windows)
;destdir=c:\users\name\Documents\Paper

# Formats to retrieve
format=pdf
format=kindle

# Publication details
<Publication guardian>
  domain=guardian
  signinHost=users.guardian.co.uk
</Publication>

The format parameter can be either pdf or one of the ebook formats, currently: cooler, edge, irex, kindle, kindledx, libra, nook, nuut, sonyreader. Specify format multiple times to download in multiple formats.
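
Because the configuration file holds your Newspaper Direct password, you may want it readable only by you. The following is a minimal sketch for UNIX-like systems, not part of Guardian Grab itself. It assumes the .guardiangrab location from the install steps, and the CONFIG_FILE variable is just an illustrative name. It writes the sample configuration above if no file exists yet, then restricts the permissions.

```shell
#!/bin/sh
# Create a starter configuration file readable only by its owner.
# CONFIG_FILE is a hypothetical override; the default path comes from
# the UNIX install steps. All values are placeholders to edit.

CONFIG_FILE="${CONFIG_FILE:-$HOME/.guardiangrab}"

if [ ! -e "$CONFIG_FILE" ]; then
    # Subshell so the stricter umask applies only while writing the file
    (
        umask 077
        cat >"$CONFIG_FILE" <<'EOF'
# Your Newspaper Direct login ID
login=myself@me.com

# Your Newspaper Direct password
password=

# Base directory for stored files
destdir=/media/paper

# Formats to retrieve
format=pdf
format=kindle

# Publication details
<Publication guardian>
  domain=guardian
  signinHost=users.guardian.co.uk
</Publication>
EOF
    )
fi

# Tighten permissions even if the file already existed
chmod 600 "$CONFIG_FILE"
```

An existing file is never overwritten; only its permissions are changed.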

Using

Just run guardiangrab or guardiangrab.pl.
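
For the overnight download described in the introduction, one option on UNIX-like systems is a small wrapper script run from cron. This is only a sketch under assumptions: the GRAB_CMD and GRAB_LOG variables, the log location and the "run" argument are made up for illustration, and it assumes the command exits non-zero when a download fails.

```shell
#!/bin/sh
# Nightly wrapper: run Guardian Grab and append its output to a log file.
# GRAB_CMD, GRAB_LOG and the "run" argument are illustrative names,
# not options of Guardian Grab itself.

GRAB_CMD="${GRAB_CMD:-guardiangrab}"            # or the full path to guardiangrab.pl
GRAB_LOG="${GRAB_LOG:-$HOME/guardiangrab.log}"  # hypothetical log location

nightly_grab() {
    # Record a clear message if the command is not installed or not on PATH
    if ! command -v "$GRAB_CMD" >/dev/null 2>&1; then
        echo "$(date): $GRAB_CMD not found" >>"$GRAB_LOG"
        return 1
    fi

    echo "$(date): starting download" >>"$GRAB_LOG"
    if "$GRAB_CMD" >>"$GRAB_LOG" 2>&1; then
        echo "$(date): finished" >>"$GRAB_LOG"
    else
        status=$?
        echo "$(date): failed with exit status $status" >>"$GRAB_LOG"
        return "$status"
    fi
}

# Only download when called with "run", e.g. from cron
case "${1:-}" in
    run) nightly_grab ;;
esac
```

Saved as, say, /home/name/bin/nightly-grab.sh and made executable, a crontab entry such as `30 5 * * * /home/name/bin/nightly-grab.sh run` would fetch the paper at 05:30 each day. Syncing the reader afterwards depends on your device and is left out here.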

And finally...

TODO

These are just ideas; they may or may not happen.

Feedback

Feedback is always appreciated, even if just to say you found this useful.