[NTLUG:Discuss] sed awk question
Chris Cox
cjcox at acm.org
Mon Aug 12 11:34:35 CDT 2002
Bobby Wrenn wrote:
>My reading led me to believe that awk was best for working with fields rather
>than whole lines.
>
>My question last week made me think perl was better and more versatile
>than either.
>
Usually true.  The cases where perl won't be better are those where
you cannot be sure of perl's availability... which is becoming quite
a short list, btw.  As more and more Unix vendors ship perl as a
standard part of their base OS install, there will be fewer and fewer
reasons to use things like sed and awk.  There might still be some
cases where memory and space are involved, though... or for old folks
like myself who think naturally in terms of sed and awk.  Since perl
is still evolving (esp. v6)... I'll probably hold off on doing much
perl.  The radical changes to perl REs, while certainly technically
better, are a headache and a half.
>
>
>So, here is my question this week. I need to sort a comma-delimited file and
>return only those lines where the first field is unique. Currently I am
>pulling the file into Access and eliminating duplicates there. There has to
>be a better way.
>
Not as easy as some would like to think...
(This actually eliminates duplicates... technically you asked for data
whose first field appears ONLY once; that could be done too.)  It outputs
the first occurrence of the duplicate-key lines when duplicates are found.
# Sort so lines sharing a first field end up adjacent, then let awk
# print only the first line seen for each value of that field.
sort -t, -k1 csg.data | awk -F, '
{
    if ($1 != lastsaved) {      # first field differs from the last one kept
        print;                  # keep this line
        lastsaved = $1;         # remember the key just printed
    }
}'
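For comparison (and only as a sketch, assuming the same csg.data file),
perl can do the whole job in one pass with a hash, no external sort needed:

perl -F, -ane 'print unless $seen{$F[0]}++' csg.data

That keeps the first occurrence of each first-field value, in the file's
original order.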
The uniq program does not provide the flexibility to deal with field
delimited files.
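If your sort supports it, something like

sort -t, -k1,1 -u csg.data

may also collapse the file to one line per first-field value, since -u
with a key compares only the key... though which of the duplicate lines
survives is left to the implementation.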