Register

If this is your first visit, please click the Sign Up now button to begin the process of creating your account so you can begin posting on our forums! The Sign Up process will only take up about a minute of two of your time.

Page 1 of 2 1 2 LastLast
Results 1 to 10 of 13
  1. #1
    Senior Member mossoi's Avatar
    Join Date
    Apr 2003
    Location
    Englandshire, United Kingdomsville y'all!
    Posts
    1,111
    Member #
    1206
    Liked
    1 times
    I've got a database with about 40,000 rows in it. I need to delete any duplicated rows. What would be the best way to do this using a mySQL query?

    I can do this through PHP but would rather something I can execute straight on the server.

    Cheers

  2.  

  3. #2
    WDF Staff smoseley's Avatar
    Join Date
    Mar 2003
    Location
    Boston, MA
    Posts
    9,729
    Member #
    819
    Liked
    205 times
    if the entire row is duplicated, there's no unique identifier to delete it by - the only way to go back and do it would be to add a unique id column and then run a complex query (w/ subqueries) to delete duplicated data. Assuming you already have the unique id column, it should look something like:

    Code:
    DELETE FROM table_name 
    WHERE unique_id IN (SELECT unique_id 
            FROM table_name 
            WHERE duplicated_column IN (SELECT duplicated_column, COUNT(duplicated_column)
                FROM table_name
                GROUP BY duplicated_column
                HAVING COUNT(duplicated_column) > 2))
    AND unique_id NOT IN (SELECT TOP 1 unique_id 
            FROM table_name 
            WHERE duplicated_column IN (SELECT duplicated_column, COUNT(duplicated_column)
                FROM table_name
                GROUP BY duplicated_column
                HAVING COUNT(duplicated_column) > 2))
    I think that's it, more or less... untested of course, and probably an easier and more elegant way to do it out there, but in 5 minutes this is all i could think of.

  4. #3
    Senior Member mossoi's Avatar
    Join Date
    Apr 2003
    Location
    Englandshire, United Kingdomsville y'all!
    Posts
    1,111
    Member #
    1206
    Liked
    1 times
    I've got the unique id so I'll give that a go today. Cheers.

  5. #4
    Senior Member Brak's Avatar
    Join Date
    Apr 2003
    Location
    San Francisco, CA
    Posts
    3,413
    Member #
    1217
    Liked
    2 times
    The much easier solution is to do a self-join:
    Code:
    DELETE FROM tableName
    USING tableName AS t1, tableName AS t2
    WHERE t1.fieldName = t2.fieldName
    Kyle Neath: Rockstar extraordinare
    The blog | The poetry site | The Spore site

  6. #5
    WDF Staff smoseley's Avatar
    Join Date
    Mar 2003
    Location
    Boston, MA
    Posts
    9,729
    Member #
    819
    Liked
    205 times
    brak,

    That will delete ALL rows, though!!!

  7. #6
    WDF Staff smoseley's Avatar
    Join Date
    Mar 2003
    Location
    Boston, MA
    Posts
    9,729
    Member #
    819
    Liked
    205 times
    Using Brak's self-join example, I think you could delete duplicates like this:
    Code:
    DELETE FROM tableName
    USING tableName AS t1, (SELECT duplicated_column FROM table_name GROUP BY duplicated_column HAVING COUNT(duplicated_column) > 1)AS t2
    WHERE t1.fieldName = t2.fieldName
    AND unique_id NOT IN (SELECT TOP 1 unique_id FROM table_name)
    Not tested, of course, and of course you should always do a select with this type of query first to test the results

  8. #7
    Senior Member Brak's Avatar
    Join Date
    Apr 2003
    Location
    San Francisco, CA
    Posts
    3,413
    Member #
    1217
    Liked
    2 times
    That will delete all duplicated rows... if you wanted to leave one of each duplicated row, that's different
    Kyle Neath: Rockstar extraordinare
    The blog | The poetry site | The Spore site

  9. #8
    Senior Member mossoi's Avatar
    Join Date
    Apr 2003
    Location
    Englandshire, United Kingdomsville y'all!
    Posts
    1,111
    Member #
    1206
    Liked
    1 times
    brak,

    That will delete ALL rows, though!!!
    How right you are transio!

    I ran in on a test copy of the db and that's just what it did.

    I'm going to give your alternative a go tomorrow and post the outcome.

  10. #9
    Senior Member Brak's Avatar
    Join Date
    Apr 2003
    Location
    San Francisco, CA
    Posts
    3,413
    Member #
    1217
    Liked
    2 times
    Ah! Another idea:

    Code:
    CREATE TABLE new_tablename
    SELECT DISTINCT [rows you want to keep] FROM currentable WHERE 1
    You'll have to exclude the ID as it won't return anything. Then just delete old table, rename new table to odl table's name..
    Kyle Neath: Rockstar extraordinare
    The blog | The poetry site | The Spore site

  11. #10
    Senior Member mossoi's Avatar
    Join Date
    Apr 2003
    Location
    Englandshire, United Kingdomsville y'all!
    Posts
    1,111
    Member #
    1206
    Liked
    1 times
    Good thinking. What's the "WHERE 1" bit up to though?


Page 1 of 2 1 2 LastLast

Remove Ads

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  

Search tags for this page

delete duplicate data in php mysql

,

mysql delete repeated datas

Click on a term to search for related topics.
All times are GMT -6. The time now is 11:12 PM.
Powered by vBulletin® Version 4.2.3
Copyright © 2019 vBulletin Solutions, Inc. All rights reserved.
vBulletin Skin By: PurevB.com