Excel question

Slap Shot · Mar 11, 2010

If I have a spreadsheet (Office version 2003) with say 10,000 records, many of which have dupcliated data in every column, other than sorting and manually deleting them, is there a systematic way to delete duplicate rows?

[edit] And if not via Excel what if I converted it to Access?

Thanks.

CTSioux · Mar 11, 2010

Re: Excel question

Not sure of your exact situation but I have used this proceedure in the past.

http://support.microsoft.com/kb/262277

Slap Shot · Mar 11, 2010

Re: Excel question

nm - I figured it out as long as it's sorted and the duplicates are next to each other: =IF(B2=B3,1,0)

All unique records are assinged a '0' and all the dupes are assinged a '1' as long as I drag the formula throughout the unique column. I then sorted that column and was able to delete all rows with a '1'.

JF_Gophers · Mar 11, 2010

Re: Excel question

Slap Shot said:
If I have a spreadsheet (Office version 2003) with say 10,000 records, many of which have dupcliated data in every column, other than sorting and manually deleting them, is there a systematic way to delete duplicate rows?

[edit] And if not via Excel what if I converted it to Access?

Thanks.

There is simple formula's you can use to de-dup a spreadsheet as well.

First you would want to sort your data so that the duplicates end up next to each other.

So maybe you sort by the first 3 columns: First name, Last name, Account Number (example).

Then you can add a new column to the end of your data called DUPS that will compare the rows to see if that data matches.

=IF(CONCATENATE(A2,B2,C2)=CONCATENATE(A1,B1,C1),"dup","")

That formula only checks in a single direction, but would catch all of them in the example situation.

Once you have all the DUPS identified you can simply filter on the DUP row to only show ones that say "dup". Then delete all of those rows. Once you unfilter your data you should only have the unique rows you want.

Edit: Yep you figured out the same basic principle.

Jupiter · Mar 11, 2010

Re: Excel question

Doing it that way will filter the data.... But the data will still be there unless you delete it.

The old school way of doing it would be to add a column and create an if statement. In that new column you would create an if statement in the 2nd to the top row. Something like this:

=IF(B7=B8,"yes","no")

B7 & B8 are the cell you are comparing, so it would obviously be different on your spreadsheet. But in this sample you testing to see if the value in B8 equals the value above it. If the value is the same it puts "yes" in the cell, if it does not match it puts "no" in that cell.

Once you have the first formula created, copy it down your entire list. Then you will have a bunch of yes and no's listed in that column. I would then highlight the entire column, copy it, then go to your edit menu and select past special. Then only select values. It will paste the yes and no's over themselves but the formulas will be gone. Then just sort that column and you can organize all of the yes's and no's and delete which ever you want in bulk delete. In this case you want to remove the duplicates, so you would delete all of the yes's.

Make sense?

That is the complete old school way of how I would do it.

Slap Shot · Mar 11, 2010

Re: Excel question

Jup and JF - thanks for the posts and that is exactly what I did, although it took me some trial and error because I first sorted without pasting the special values and the formula obviously no longer worked since the records by row had changed.

Thanks guys!

Jupiter · Mar 11, 2010

Re: Excel question

my replies are too slow....

St. Clown · Mar 11, 2010

Re: Excel question

Jupiter said:
my replies are too slow....

...like your skating?

Search

Search

Excel question

Slap Shot

I got nothing

CTSioux

mmmmm.... cupcakes

Slap Shot

I got nothing

JF_Gophers

2147483647

Jupiter

Supreme God of the Roman Pantheon

Slap Shot

I got nothing

Jupiter

Supreme God of the Roman Pantheon

St. Clown

Ideas Posted are Likely Not My Own