[NTLUG:Discuss] copying web documents

Thu May 18 12:11:04 CDT 2006

Is this a one-time job or an example of a general need?  Do you
want a ready-made utility, or do you want to do some programming?

I have written Python code in the past to grab web pages - it is
pretty quick.  I have been intending to extend the code for the
kind of issue you bring up, but have not gotten around to doing
it yet.  Collaboration could be a good way to get me moving on
this again.
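
A minimal sketch of the idea, not the code mentioned above: start at
the table of contents, collect its links, and keep only the pages
that live under the same directory, so the rest of the site is left
alone.  It uses only the Python standard library.  The names
LinkCollector, document_links and copy_document are made up for
illustration, and the assumption that every page of the document
sits in the same directory as the contents page may not hold for a
given site.

```python
import time
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def document_links(toc_url, toc_html):
    """Return absolute URLs linked from the contents page that sit
    under the same directory as the contents page, in page order."""
    # Assumption: the document's pages share the contents page's directory.
    base = toc_url.rsplit("/", 1)[0] + "/"
    parser = LinkCollector()
    parser.feed(toc_html)
    seen, pages = set(), []
    for href in parser.hrefs:
        # Resolve relative links and drop any "#fragment" part.
        url = urldefrag(urljoin(toc_url, href)).url
        if url.startswith(base) and url != toc_url and url not in seen:
            seen.add(url)
            pages.append(url)
    return pages


def copy_document(toc_url, delay=1.0):
    """Fetch the contents page and each linked page, one level deep.
    Returns a dict mapping URL to raw page bytes."""
    with urlopen(toc_url) as response:
        toc_bytes = response.read()
    fetched = {toc_url: toc_bytes}
    for url in document_links(toc_url, toc_bytes.decode("utf-8", "replace")):
        time.sleep(delay)  # pause between requests to go easy on the server
        with urlopen(url) as response:
            fetched[url] = response.read()
    return fetched
```

Calling copy_document() with the URL of the contents page would pull
down that page plus everything it links to in the same directory.
It only goes one level deep, so a document whose chapters link on to
further sub-pages would need the same step repeated.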

Walter Johnston

---- On Thu, 18 May 2006, Fred (fredstevens at yahoo.com) wrote:

> How does one copy a many paged online html document? I tried wget
> but it tries to do the whole website (and is told to buzz off by
> the server). If the document was available in pdf form it would be
> moot, but someone stuck it on their web site in html. Y'know, link
> after bloody link... God only knows how many pages.
> Something like wget which is able to start at the table of
> contents and retrieve all pages.
> 
> Fred
> 