[NTLUG:Discuss] Short script question

Bobby Wrenn bobby at wrennest.com
Tue May 4 08:15:36 CDT 2010


I know this will be trivial to someone who deals with this sort of thing 
every day. However, I do not fall into that category.

I have been looking on the web for pointers on doing this and have come 
up dry. Usually you want to delete duplicate lines, but I need to do the 
opposite. I need to find lines in a tab-delimited file that partially 
match each other and save the matches to a new file, something like this:

read a line into buffer 1
find another line that matches the regex of the line in buffer 1 and put 
it in buffer 2
find another line that matches the regex of the line in buffer 1 and put 
it in buffer 3
repeat to the end of the file
append all the buffered lines to another file
clear the buffers
go to the next line and do it again until the end of the file
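
For illustration, the steps above could be sketched in Python roughly as 
follows. This is only a sketch under some assumptions: instead of 
rescanning the file once per line, it makes a single pass and collects 
lines in a dictionary keyed by the part that has to match, and it keeps 
only groups with at least two lines. The names group_partial_matches and 
key_of are made up for this example; key_of is whatever function pulls 
the matching part out of a line.

```python
from collections import defaultdict

def group_partial_matches(lines, key_of):
    """Group lines that share a key; keep only groups with 2+ lines.

    key_of is a caller-supplied (hypothetical) function that returns the
    part of a line to compare, or None if the line should be skipped.
    """
    groups = defaultdict(list)  # key -> lines, in first-seen order
    for line in lines:
        key = key_of(line)
        if key is not None:
            groups[key].append(line)
    # Assumption: a line with no partner is not a "match" and is dropped.
    return [group for group in groups.values() if len(group) > 1]
```

As a stand-in, calling it with key_of=lambda line: line.split("\t")[0] 
groups lines by their first tab-separated field. Each returned group 
could then be appended to the output file in turn.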

The file is tab-delimited. The regex should match the first word, the 
first tab, the next word, a space, and the first three characters 
(letters or numbers) of the word after that. The rest of the line can be 
anything. In other words, the part to match is everything up to and 
including the first three characters of the second word after the first 
tab.
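
One possible reading of that description as a Python regular expression 
is sketched below. It assumes that the first "word" is everything before 
the first tab, that the next word contains no tabs or spaces, and that 
"character/numbers" means ASCII letters and digits. KEY_RE and match_key 
are names invented for this example.

```python
import re

# first field, a tab, one word, a space, then three letters/digits
KEY_RE = re.compile(r"^[^\t]+\t[^\t ]+ [A-Za-z0-9]{3}")

def match_key(line):
    """Return the part of the line to compare, or None if it does not fit."""
    m = KEY_RE.match(line)
    return m.group(0) if m else None
```

Two lines would then count as partial matches when match_key returns the 
same string for both of them.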

Can someone point me in the right direction? Perhaps there is an online 
tutorial that covers something like this. I've looked at sed and awk, but 
all the examples I can find assume that you want to remove duplicates.

Thanks in advance
Bobby Wrenn


