[NTLUG:Discuss] copying web documents

Thu May 18 09:35:28 CDT 2006

Fred wrote:
> How does one copy a many paged online html document? I tried wget but it tries
> to do the whole website (and is told to buzz off by the server). If the
> document was available in pdf form it would be moot, but someone stuck it on
> their web site in html. Y'know, link after bloody link... God only knows how
> many pages.
> Something like wget which is able to start at the table of contents and
> retrieve all pages.

There are a lot of wget options, like -np (no parent), -I (include
directories) and -X (exclude directories), that can be used to control
the recursion a bit.  No guarantees, but they might help.
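
Something along these lines might be a starting point. It's only a
sketch: the URL is a made-up placeholder, the depth of 5 and the 2 second
wait are guesses, and mirror_cmd is just a helper name invented here. It
prints the wget command instead of running it, so you can look it over
before pointing it at a real server.

```shell
#!/bin/sh
# Hypothetical start page; substitute the document's table of contents URL.
TOC_URL="http://www.example.com/docs/manual/index.html"

# Print the wget command rather than running it, so it can be reviewed first.
mirror_cmd() {
    # -r    follow links recursively
    # -l 5  stop after five levels of links (an arbitrary guess)
    # -np   never climb above the starting directory
    # -k    rewrite links so the local copy can be browsed offline
    # -p    also fetch the images and stylesheets the pages need
    # -w 2  wait two seconds between requests (an assumed polite delay)
    echo wget -r -l 5 -np -k -p -w 2 "$1"
}

mirror_cmd "$TOC_URL"
```

If the printed command looks right, paste it into the shell to run it.
If -np alone still wanders too far, adding -I with a comma-separated
list of directories to allow (or -X with a list to skip) may narrow
things down further.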