[NTLUG:Discuss] wget

Leroy Tennison leroy_tennison at prodigy.net
Thu Mar 26 01:12:16 CDT 2009


Anybody have experience using wget to:

- download a whole site which contains references to other sites, and
- get it to download only the site specified?

I ran wget without the -H switch and it started downloading other sites.
I tried "--exclude-domains ...", which was ignored.  Then I tried "-D
...", and all wget would download was index.html, even though I had
specified -r -p -l inf and a couple of other switches.
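For reference, the invocation I've been trying looks roughly like this
(example.com is a stand-in for the real site, and the exact switch order
may have varied between attempts):

```shell
# Recursive mirror, unlimited depth, with page requisites,
# attempting to restrict the crawl to one domain:
wget -r -p -l inf \
     --domains=example.com \
     http://example.com/
```

If I read the manual right, -D/--domains is meant to limit host
*spanning*, so it may only take effect together with -H, and -p can
still pull page requisites (images, stylesheets) from other hosts.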

Alternatively, can anyone tell me an easy way to determine which files
on a web site aren't being referenced?  I maintain a web site I
inherited, and there's a lot of "history" which needs to be addressed.
I also need to find out whether a file I want to add to the site is in
fact already being referenced (and where).  Since I have to use Windows
at work, I tried FrontPage's unlinked-files report, but it listed a page
which is referenced in index.html - so much for that approach.  Another
program I found on the Web did the same thing, which is when I turned to
wget, only to encounter this problem.  Any help would be much appreciated.
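In case it helps anyone suggest a fix: once a copy of the site is on
disk, a rough script along these lines is what I had in mind.  It only
sees plain href/src attributes in same-site relative form (links built
by JavaScript or absolute URLs would need more work), and the ./site
directory and file names here are made up for illustration:

```shell
# Sketch: list files in a local copy of a site that no page links to.
set -e

# Build a tiny mock site so the pipeline below has something to chew on.
mkdir -p site
cat > site/index.html <<'EOF'
<a href="about.html">About</a>
<img src="logo.png">
EOF
cat > site/about.html <<'EOF'
<a href="index.html">Home</a>
EOF
: > site/logo.png
: > site/orphan.html    # present on disk, never referenced anywhere

# Every href/src target mentioned in any file, one per line, sorted.
grep -rhoE '(href|src)="[^"]+"' site \
  | sed -E 's/^(href|src)="([^"]+)"$/\2/' \
  | sort -u > referenced.txt

# Every file actually present, as site-relative names, sorted.
( cd site && find . -type f | sed 's|^\./||' | sort ) > present.txt

# Files present but never referenced (comm -23 = only-in-first-file):
comm -23 present.txt referenced.txt
# -> orphan.html
```

The other half of my question - "is this file already referenced, and
where?" - would just be `grep -rl 'orphan.html' site/` against the
local copy.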


