[NTLUG:Discuss] pulling tables out of web pages.

Greg Edwards greg at nas-inet.com
Thu Apr 8 15:46:15 CDT 2004


Bobby Wrenn wrote:
> I have tried some html2txt tools and have had no success.
> 
> I need to convert a web page into a tab delimited file (preferably 
> keeping only the data table). My goal is to do several of these pages 
> and cat them into a big table and delete duplicates.
> 
> I think I can handle most of the problem if I can just convert the html 
> to a tab delimited text file.
> 
> Anyone know of a reliable tool?
> 
> Here is a sample of the web pages I am working on:
> http://partsurfer.hp.com/cgi-bin/spi/main?sel_flg=partlist&model=KAYAK+XU+6%2F266MT&HP_model=&modname=Kayak+XU+6%2F266MT&template=secondary&plist_sval=ALL&plist_styp=flag&dealer_id=&callingsite=&keysel=X&catsel=X&ptypsel=X&strsrch=&pictype=I&picture=X&uniqpic= 
> 
> 
> TIA
> Bobby

If this is a one-time deal, read the file in with StarOffice Calc, then
save it as a comma delimited file (Text CSV).  Some of the other
spreadsheet progs can do this as well.
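
If there are more than a few pages, a short script may be less work than
the spreadsheet route.  Here is a minimal sketch in Python, using only the
standard library.  It is not a polished tool: the names TableRows and
html_to_tsv are made up for illustration, it assumes each page has already
been saved locally and read into a string, and it collects the rows of every
table on the page, so layout tables would need filtering out by hand.  It
writes tab delimited rows and skips rows it has already seen, which should
cover the "cat them together and delete duplicates" step too.

```python
import csv
import io
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # row being built, or None when outside a <tr>
        self._cell = None  # text chunks of the current cell, or None

    def _end_cell(self):
        if self._cell is not None:
            # Join the chunks and collapse runs of whitespace to one space.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None

    def _end_row(self):
        self._end_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        # Older pages often omit </td> and </tr>, so a new start tag
        # also closes whatever cell or row was still open.
        if tag == "tr":
            self._end_row()
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._end_cell()
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._end_cell()
        elif tag in ("tr", "table"):
            self._end_row()

    def close(self):
        super().close()
        self._end_row()  # flush a row left open at end of input


def html_to_tsv(pages):
    """Return tab delimited text for all rows in `pages`, duplicates dropped."""
    seen = set()
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for html in pages:
        parser = TableRows()
        parser.feed(html)
        parser.close()
        for row in parser.rows:
            key = tuple(row)
            if key not in seen:  # keep only the first copy of each row
                seen.add(key)
                writer.writerow(row)
    return out.getvalue()
```

To use it, read each saved page into a string, pass the list of strings to
html_to_tsv(), and write the result to a file.  Repeated header rows are
dropped along with any other duplicate rows, since they compare equal.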

HTH
-- 
Greg Edwards

Hosted Websites from New Age Software - http://www.nas-inet.com
   Anime, Manga, Lady Amaya - http://roseofcreation.nas-inet.com
   Coppell Texas            - http://coppell.nas-inet.com
   Software Engineering     - http://consult.nas-inet.com



