[NTLUG:Discuss] copying web documents
David Stanaway
david at stanaway.net
Thu May 18 16:12:16 CDT 2006
Fred wrote:
> I may be trying Jay's suggestion about a Windoze prog since wget has resisted
> my puny efforts to make it work. Here's a thought: y'all try to get something
> to copy the manual at the following URL and tell me how you did it. That way we
> are on the (no pun intended) same page.
>
> http://www.globalsecurity.org/military/library/policy/army/fm/3-19-40/
>
> Thanks,
> Fred
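Okay, this is what I did. I patched a private copy of wget so that it asks for
"nobots.txt" instead of "robots.txt" (same length, so the binary stays intact),
then ran it recursively with a browser user agent: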
$ cp `which wget` .
$ sed -i 's/robots.txt/nobots.txt/g' wget
$ ./wget -r -np \
    -U 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20060205 Epiphany/1.8.3 (Debian)' \
    http://www.globalsecurity.org/military/
Alternatively, you could get the wget source and take out the robots.txt
compliance before building it.
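
For what it's worth, a stock wget can probably do the same job without any
patching, since it accepts wgetrc commands on the command line via -e. The
sketch below is not what I ran above, just an untested-against-that-site
variant. It assumes the target is the manual URL Fred posted, and the -k and -p
options (convert links, fetch page requisites) are my additions for offline
reading. DRY_RUN is a made-up variable for this example only; it defaults to
printing the command instead of running it.

```shell
#!/bin/sh
# Assumption: an unmodified wget is on the PATH.
UA='Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20060205 Epiphany/1.8.3 (Debian)'
# Assumption: we want only the manual Fred asked about, not all of /military/.
URL='http://www.globalsecurity.org/military/library/policy/army/fm/3-19-40/'

# -e robots=off : execute the wgetrc command "robots = off" (ignore robots.txt)
# -r -np        : recurse, but never climb above the starting directory
# -k -p         : rewrite links for local viewing and fetch images/CSS
# -U            : send a browser-like User-Agent header
set -- -e robots=off -r -np -k -p -U "$UA" "$URL"

# DRY_RUN is hypothetical, for this example only: 1 prints, 0 downloads.
if [ "${DRY_RUN:-1}" = 1 ]; then
    printf '%s\n' "wget $*"
else
    wget "$@"
fi
```

Run it with DRY_RUN=0 to actually start the download. Keeping the options in
the positional parameters means the user agent string, spaces and all, is
passed to wget as a single argument.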