[NTLUG:Discuss] Need help debugging simple script commands
Rick Matthews
RedHat.Linux at verizon.net
Wed Jul 31 22:37:13 CDT 2002
> Toss a '-i' in there - but I don't know whether that only affects
> the sort order or whether it also matters for the duplicated
> record thing.
Thank you, sir. I believe you've hit on something: adding -i to the
sort command eliminated about 95,000 duplicates.
That confirms for me that the problem is trash in the data. As I
mentioned earlier, each time I've found trash I've added a step to
clean out that problem. So far it's blank lines, trailing spaces and
trailing tabs. Can anyone tell me how to clean the file of all
characters that are invalid in a domain name?
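
Before cleaning, it may help to see exactly what the trash is. The
following is only a sketch: show_bad_lines is a hypothetical helper
name, and it assumes the valid set is letters, digits, hyphen and dot.
It lists each offending line with its line number and makes invisible
characters visible (a tab shows as \t, and $ marks the end of the line,
so trailing spaces stand out). Blank lines are not reported because
they contain no characters at all.

```shell
#!/bin/sh
# Hypothetical helper: list lines that contain any byte outside the
# assumed domain-name character set (letters, digits, '.', '-').
# "sed -n l" prints each line unambiguously: tabs appear as \t and
# every line ends with a visible $.
show_bad_lines() {
    LC_ALL=C grep -n '[^A-Za-z0-9.-]' "$1" | sed -n l
}

# Usage (file name taken from the sort command quoted below):
#   show_bad_lines domains | head
```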
Adding -i lets sort find and remove the duplicates, but it leaves the
garbage in the output. The resulting domain list is loaded into a b-tree
structure, so I need to keep the garbage out.
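
One possible way to strip the garbage itself, offered as a sketch and
not a tested fix for this data: delete every byte that is not a letter,
digit, hyphen, dot or newline, fold to lowercase, drop empty lines, and
only then remove duplicates. clean_domains is a hypothetical helper
name, and the character set is an assumption about what counts as
valid. Note that this only filters characters; it does not check label
length or placement of hyphens, and a line with junk in the middle is
squeezed together, not rejected.

```shell
#!/bin/sh
# Hypothetical helper: write a cleaned, de-duplicated domain list to
# standard output. Assumes one domain per line in the input file.
clean_domains() {
    # 1. delete every byte outside A-Z a-z 0-9 . - and newline
    # 2. fold upper case to lower case
    # 3. drop lines that are now empty
    # 4. sort and keep one copy of each line
    LC_ALL=C tr -cd 'A-Za-z0-9.\n-' < "$1" |
        LC_ALL=C tr 'A-Z' 'a-z' |
        grep -v '^$' |
        LC_ALL=C sort -u
}

# Usage (file names taken from the sort command quoted below):
#   clean_domains domains > domains.uniq
```

Because the trailing spaces, tabs and blank lines are gone before
sort -u runs, the duplicates collapse without needing -i.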
Thanks a bunch for your help!
Rick
> -----Original Message-----
> From: discuss-admin at ntlug.org [mailto:discuss-admin at ntlug.org]On Behalf
> Of Steve Baker
> Sent: Monday, July 22, 2002 3:21 AM
> To: discuss at ntlug.org
> Subject: Re: [NTLUG:Discuss] Need help debugging simple script commands
>
>
> Rick Matthews wrote:
> >>Don't know if you realize it or not, but there is an option within
> >>sort itself that will suppress output of duplicate lines:
> >
> >
> > Yes, I know it's there, and I've used it before, but I had not tried
> > it here.
> >
> > I just ran a test using:
> >
> > sort -u -o domains.uniq domains
> >
> > and got the same exact results as the other method. The dups are
> > still there.
>
> Duplicates usually result from differences in the white space (e.g. a tab in one
> record and a bunch of spaces in the other - or trailing white space on one
> and not on the other).
>
> Toss a '-i' in there - but I don't know whether that only affects the sort order
> or whether it also matters for the duplicated record thing.
>
> ----------------------------- Steve Baker -------------------------------
> Mail : <sjbaker1 at airmail.net> WorkMail: <sjbaker at link.com>
> URLs : http://www.sjbaker.org
> http://plib.sf.net http://tuxaqfh.sf.net http://tuxkart.sf.net
> http://prettypoly.sf.net http://freeglut.sf.net
> http://toobular.sf.net http://lodestone.sf.net
>
>
>
> _______________________________________________
> http://www.ntlug.org/mailman/listinfo/discuss
>