[NTLUG:Discuss] Short script question
Fred James
fredjame at fredjame.cnc.net
Tue May 4 12:07:29 CDT 2010
Bobby Wrenn wrote:
> Bobby Wrenn wrote:
>> I know this will be trivial to someone who deals with this sort of
>> thing every day. However, I do not fall into that category.
>>
>> I have been looking on the web for pointers on doing this and have
>> come up dry. Usually you want to delete duplicate lines. But I need
>> to do the opposite. I need to find lines in a tab delimited file
>> which are partial matches and save the matches to a new file
>> something like this;
>>
>> read a line into buffer 1
>> find another line that matches the regex of the line in buffer 1,
>> put it in buffer 2
>> find another line that matches the regex of the line in buffer 1,
>> put it in buffer 3
>> repeat to the end of the file
>> append all the buffered lines to another file
>> clear the buffers
>> go to the next line and do it again until the end of the file
>>
>> The file is tab delimited. The part to match is everything from the
>> start of the line through the first word, the first tab, the next
>> word, a space, and the first three characters (letters or digits) of
>> the word after that. The rest of the line can be any characters.
>>
>> Can someone point me in the right direction? Perhaps an on line
>> tutorial that might cover something like this. I've looked at sed and
>> awk but all the examples I can find expect that you want to remove
>> duplicates.
>>
>> Thanks in advance
>> Bobby Wrenn
> Starting to answer my own question. I have the regex that will select
> the line
> ^([A-Z0-9]+\t)([A-Z0-9]+ [A-Z0-9][A-Z0-9][A-Z0-9]).*
> So I can search for a match to \1, but then I have to copy the rest
> of the line that does not match \2, append both lines to a file, and
> repeat.
Bobby Wrenn
'grep' should do what you want, in that it writes out every complete
line in which a match is found. So maybe you could:
(1) read the part(s) of the lines in the original file that you want
to match into a "pattern_file"
(2) use grep with the -f option to read the patterns from the
pattern_file, and maybe -n to get line numbers as well
???
Hope this helps, or did I miss the point altogether?
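A minimal sketch of steps (1) and (2), under some assumptions: the file name parts.tsv and its sample rows are made up for illustration, and the key is taken to be the first word, the tab, the next word, a space, and three more characters. Since every line matches its own key, the sketch keeps only keys that occur more than once (uniq -d); otherwise grep would simply return the whole file.

```shell
#!/bin/sh
# Hypothetical sample data; the file name and rows are assumptions.
printf 'AB12\tWIDGET 123X red\n'   >  parts.tsv
printf 'AB12\tWIDGET 123Y blue\n'  >> parts.tsv
printf 'CD34\tGADGET 999Z green\n' >> parts.tsv
printf 'AB12\tWIDGET 456A black\n' >> parts.tsv

# (1) Build pattern_file: cut the leading key out of each line,
#     anchor it with ^, and keep only keys seen more than once.
awk 'match($0, /^[A-Z0-9]+\t[A-Z0-9]+ [A-Z0-9][A-Z0-9][A-Z0-9]/) {
         print "^" substr($0, RSTART, RLENGTH)
     }' parts.tsv | sort | uniq -d > pattern_file

# (2) Print every complete line matching any pattern, with line numbers.
grep -n -f pattern_file parts.tsv > matches.txt
cat matches.txt
```

The keys here contain only letters, digits, a space, and a tab, so they are safe to use as grep patterns as-is; keys with regex metacharacters would need escaping first.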
Regards
Fred James
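The buffer-by-buffer procedure in the original question can be sketched as a single awk pass that collects lines under their key and prints only the groups with more than one member. This is a sketch under assumptions, not a tested solution for the real data: parts.tsv, grouped.txt, and the sample rows are hypothetical, and the key is the same prefix the regex in the question selects.

```shell
#!/bin/sh
# Hypothetical sample data; the file name and rows are assumptions.
printf 'AB12\tWIDGET 123X red\n'   >  parts.tsv
printf 'CD34\tGADGET 999Z green\n' >> parts.tsv
printf 'AB12\tWIDGET 123Y blue\n'  >> parts.tsv
printf 'CD34\tGADGET 999Q white\n' >> parts.tsv
printf 'EF56\tGIZMO 777A grey\n'   >> parts.tsv

awk '
# Lines that start with a key are buffered under that key.
match($0, /^[A-Z0-9]+\t[A-Z0-9]+ [A-Z0-9][A-Z0-9][A-Z0-9]/) {
    key = substr($0, RSTART, RLENGTH)
    if (++count[key] == 1)
        order[++n] = key              # remember first-seen order
    group[key] = group[key] $0 "\n"   # append the whole line to the buffer
}
# At the end, write out only the keys shared by two or more lines.
END {
    for (i = 1; i <= n; i++)
        if (count[order[i]] > 1)
            printf "%s", group[order[i]]
}' parts.tsv > grouped.txt
cat grouped.txt
```

Matching lines come out next to each other, grouped by key in the order each key first appears. Use >> instead of > to append to an existing file.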