[NTLUG:Discuss] wget

terry trryhend at gmail.com
Thu Mar 26 08:50:07 CDT 2009


I've done OK with just -m:
   wget -m mysite.com

From man wget:
       -m
       --mirror
           Turn on options suitable for mirroring.  This option turns on
           recursion and time-stamping, sets infinite recursion depth and
           keeps FTP directory listings.  It is currently equivalent to -r -N
           -l inf --no-remove-listing.
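For the single-site case in your question, the combination the manual documents is -H together with -D: during recursive retrieval wget normally stays on the starting host, and when spanning is in play (-H, or -p pulling page requisites from elsewhere), -D/--domains restricts which hosts get followed. A sketch, with mysite.com as a placeholder for your site:

```shell
# Mirror only mysite.com.  --no-parent keeps wget from ascending above
# the start URL; -H -D mysite.com allows host-spanning but only to the
# listed domain, so off-site links are not followed.
# (mysite.com is a placeholder.)
wget -m --no-parent -H -D mysite.com http://mysite.com/
```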


On Thu, Mar 26, 2009 at 1:12 AM, Leroy Tennison
<leroy_tennison at prodigy.net> wrote:
> Anybody have experience using wget to:
>
> download a whole site which contains references to other sites
> get it to download only the site specified?
>
> I ran wget without the -H switch and it started downloading other sites.
>  Tried using "--exclude-domain ..." which was ignored.  Tried using -D
> ... and all wget would download was index.html even though I had
> specified -r -p -l inf and a couple of other switches.
>
> Another option is if anyone can tell me of an easy way to determine
> which files on a web site aren't being referenced.  I maintain a web
> site I inherited and there's a lot of "history" which needs to be
> addressed.  I also need to find out if a file I need to add to the site
> is in fact already being referenced (and where).  Since I have to use
> Windows at work I tried Frontpage for unlinked files, it included a page
> which is referenced in index.html - so much for that approach.  Another
> program I found on the Web did the same thing which is when I turned to
> wget only to encounter this problem.  Any help would be much appreciated.
>
> _______________________________________________
> http://www.ntlug.org/mailman/listinfo/discuss
>
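For the other question — finding files on the site that nothing links to — one sketch, assuming you also have shell access to the site's document root (the domain and paths below are placeholders): mirror what wget can actually reach by following links, then diff that against everything on disk.

```shell
# Sketch: compare what is link-reachable with what exists on disk.
# mysite.com and /var/www/mysite are placeholders.

# 1. Mirror everything reachable by following links from the front page.
wget -m --no-parent http://mysite.com/

# 2. List what the crawl fetched (i.e. what is linked, directly or not).
(cd mysite.com && find . -type f | sort) > /tmp/linked.txt

# 3. List every file actually present in the document root.
(cd /var/www/mysite && find . -type f | sort) > /tmp/all.txt

# 4. comm -13 prints lines only in the second file: present but unlinked.
comm -13 /tmp/linked.txt /tmp/all.txt
```

To check whether (and where) a given file is already referenced, grep the mirrored copy instead, e.g. grep -rl 'somefile.html' mysite.com/ (somefile.html being whatever file you're about to add).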



-- 
<><
