[NTLUG:Discuss] copying web documents

David Stanaway david at stanaway.net
Thu May 18 16:12:16 CDT 2006


Fred wrote:
> I may be trying Jay's suggestion about a Windoze prog since wget has resisted
> my puny efforts to make it work. Here's a thought: y'all try to get something
> to copy the manual at the following URL and tell me how you did it. That way we
> are on the (no pun intended) same page.
> 
> http://www.globalsecurity.org/military/library/policy/army/fm/3-19-40/
> 
> Thanks,
> Fred
> 



Okay, this is what I did.

$ cp `which wget` .
$ sed -i 's/robots.txt/nobots.txt/g' wget
$ ./wget -r -np \
    -U 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20060205 Epiphany/1.8.3 (Debian)' \
    http://www.globalsecurity.org/military/
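
The sed step is only safe on a binary because "nobots.txt" is exactly as long as "robots.txt", so no offsets inside the file move. A minimal sketch of that check on a throwaway file, assuming GNU sed (for -i) and using a made-up stand-in for the real wget binary:

```shell
# Hypothetical stand-in for a binary: the target string sits between NUL bytes.
tmp=$(mktemp)
printf 'abc\0/robots.txt\0def' > "$tmp"

before=$(wc -c < "$tmp")                    # size before patching
sed -i 's/robots.txt/nobots.txt/g' "$tmp"   # same substitution as on wget
after=$(wc -c < "$tmp")                     # size after patching

# Count lines containing the patched string (-a treats binary data as text).
hits=$(grep -a -c 'nobots.txt' "$tmp")
rm -f "$tmp"

echo "size before: $before, size after: $after, patched lines: $hits"
```

If the two sizes differed, the replacement would have shifted the rest of the file and the patched program would most likely not run.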

Alternatively, you could get the wget source and take out the robots.txt compliance.
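
A third route that avoids patching anything may be wget's own wgetrc-style switch, "-e robots=off". This is a sketch under the assumption that the installed wget supports that setting; it was not part of what I ran above. It only builds and prints the command (a dry run), with the URL taken from Fred's message:

```shell
# Assumption: this wget honours "-e robots=off" (not tested in this thread).
url='http://www.globalsecurity.org/military/library/policy/army/fm/3-19-40/'
ua='Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20060205 Epiphany/1.8.3 (Debian)'

# Store the command in the positional parameters so quoting is preserved.
set -- wget -r -np -e robots=off -U "$ua" "$url"

# Dry run: show what would be executed.
echo "$@"

# To actually start the download, run:  "$@"
```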


