[NTLUG:Discuss] pulling tables out of web pages.
Greg Edwards
greg at nas-inet.com
Thu Apr 8 15:46:15 CDT 2004
Bobby Wrenn wrote:
> I have tried some html2txt tools and have had no success.
>
> I need to convert a web page into a tab delimited file (preferably
> keeping only the data table). My goal is to do several of these pages
> and cat them into a big table and delete duplicates.
>
> I think I can handle most of the problem if I can just convert the html
> to a tab delimited text file.
>
> Anyone know of a reliable tool?
>
> Here is a sample of the web pages I am working on:
> http://partsurfer.hp.com/cgi-bin/spi/main?sel_flg=partlist&model=KAYAK+XU+6%2F266MT&HP_model=&modname=Kayak+XU+6%2F266MT&template=secondary&plist_sval=ALL&plist_styp=flag&dealer_id=&callingsite=&keysel=X&catsel=X&ptypsel=X&strsrch=&pictype=I&picture=X&uniqpic=
>
>
> TIA
> Bobby
If this is a one-time deal, read the file in with StarOffice Calc, then
save it as a comma-delimited file (Text CSV). Some of the other
spreadsheet programs can do this as well.
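
If it turns out to be more than a handful of pages, a small script may be
less tedious than the spreadsheet route. What follows is only a rough
sketch in Python using the standard library's html.parser, not a tested
tool for the HP pages. The names TableRows and html_to_tsv are made up for
this example. It assumes each page has already been saved to disk and that
the data sits in plain <tr>/<td> rows. Pages that nest layout tables
inside each other will probably need extra filtering to keep only the
data table.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._close_row()       # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th"):
            self._close_cell()      # tolerate a missing </td>
            if self._row is None:   # cell with no enclosing <tr>
                self._row = []
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def close(self):
        super().close()
        self._close_row()           # flush a row left open at end of input

    def _close_cell(self):
        if self._cell is not None:
            # Collapse whitespace so stray tabs or newlines inside a cell
            # cannot break the tab-delimited output.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None


def html_to_tsv(html):
    """Return the table rows in html as tab-delimited lines, duplicates dropped."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    seen = set()
    lines = []
    for row in parser.rows:
        line = "\t".join(row)
        if line not in seen:        # keep the first copy of each row only
            seen.add(line)
            lines.append(line)
    return "\n".join(lines)
```

To build the one big table, read all of the saved pages, join their text
into a single string, and pass that to html_to_tsv(). The duplicate check
then applies across pages, much like the cat-and-dedupe step described
above.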
HTH
--
Greg Edwards
Hosted Websites from New Age Software - http://www.nas-inet.com
Anime, Manga, Lady Amaya - http://roseofcreation.nas-inet.com
Coppell Texas - http://coppell.nas-inet.com
Software Engineering - http://consult.nas-inet.com