How to Remove Lines if Identical up to Second Comma?

jorgejulio
Posts: 2
Joined: Tue Jan 10, 2006 6:28 am

Post by jorgejulio » Sat Jan 28, 2006 7:34 am

I have Perl code that removes lines from a .CSV (comma-separated values) file when they are identical up to the first comma (i.e. when the first "key" is identical), keeping only the first occurrence.

EXAMPLE INPUT (the line numbers are only for identification; they are not part of the input):
1: 123,abc,XYZ
2: 123,def,UVW
3: 456,abc,XYZ
4: 456,def,UVW
5: 123,abc,QRS
6: 789,abc,XYZ

OUTPUT:
1: 123,abc,XYZ
3: 456,abc,XYZ
6: 789,abc,XYZ

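use strict;
use warnings;

# Keep only the first line seen for each value of the first field.
open(my $fh, '<', 'mycsv.csv') or die "Cannot open mycsv.csv: $!";
my (%seen, @newfile);
while (my $line = <$fh>) {
    my ($first) = split /,/, $line;   # text before the first comma
    next if $seen{$first}++;          # skip keys already encountered
    push @newfile, $line;
}
close $fh;
print @newfile;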

How can I skip a line only if it is identical to an earlier line up to the second comma (i.e. if the first two keys are identical)?

"TASK 2" EXAMPLE INPUT (the line numbers are only for identification; they are not part of the input):
1: 123,abc,XYZ
2: 123,def,UVW
3: 456,abc,XYZ
4: 456,def,UVW
5: 123,abc,QRS
6: 789,abc,XYZ

NEEDED OUTPUT:
1: 123,abc,XYZ
2: 123,def,UVW
3: 456,abc,XYZ
4: 456,def,UVW
6: 789,abc,XYZ

Only line 5 should be skipped, because its first two keys ("123,abc") are identical to those of line 1.

Thanks for any help!
j2

DataMystic Support
Site Admin
Posts: 2206
Joined: Mon Jun 30, 2003 12:32 pm
Location: Melbourne, Australia

Post by DataMystic Support » Fri Feb 03, 2006 10:33 am

Use a CSV field restriction to pad the first field out to the maximum width (say 30 characters), then use a Remove Duplicate Lines filter, comparing from character 1 to 30.
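
For anyone who would rather stay in Perl, here is a minimal sketch of the same "compare only the leading part of each line" idea. It is not a TextPipe feature. It builds a composite key from the first N fields and keeps the first line seen for each key. The sub name dedupe_by_leading_fields and its arguments are hypothetical, and the plain split on commas assumes no field contains a quoted comma (a CSV parser such as the Text::CSV module would be the safer choice for that case).

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the first line seen for each
# combination of the first $key_count comma-separated fields.
sub dedupe_by_leading_fields {
    my ($key_count, @lines) = @_;
    my (%seen, @kept);
    for my $line (@lines) {
        (my $text = $line) =~ s/\r?\n\z//;    # compare without the line ending
        my @fields = split /,/, $text, -1;    # -1 keeps trailing empty fields
        my $last = $key_count - 1;
        $last = $#fields if $last > $#fields; # short lines: use every field
        my $key = join ',', @fields[0 .. $last];
        push @kept, $line unless $seen{$key}++;
    }
    return @kept;
}

# Sample data from the question (without the identifying line numbers).
my @input = (
    "123,abc,XYZ\n",
    "123,def,UVW\n",
    "456,abc,XYZ\n",
    "456,def,UVW\n",
    "123,abc,QRS\n",
    "789,abc,XYZ\n",
);

# Two leading fields form the key, so only "123,abc,QRS" is dropped.
print dedupe_by_leading_fields(2, @input);
```

Passing 1 instead of 2 gives the first-key behaviour from the original script. To run it on a file, read the lines of mycsv.csv into an array and pass that array in place of @input.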
Regards,

Simon Carter, http://DataMystic.com/forums/index.php
