| 
				 
Answer: 
Delete
				Duplicate Records using SQL
				By duplicate record means that every field in one record is
				identical to every field in a different record, i.e. a duplicate
				is where there is no way of telling two or more records apart. If
				you just need to remove records which are similar (i.e. one or
				more fields are identical but there are one or more fields which
				are different) then instead refer to how to delete similar
				records. 
				To check that you have duplicate records in your table do the
				following: 
select
				count(*) from MyTable 
and 
select
				distinct * from MyTable 
				unfortunately SQL does not permit a select count(distinct). You
				have duplicate records if the number of records returned by the
				second query is less than the number of records returned by the
				first. 
				Unfortunately there is no way in SQL to delete one of these
				duplicates without deleting all of them. They are identical after
				all, so there is no SQL query that you could put together which
				could distinguish between them. 
				What you can do is to copy all the distinct records into a new
				table: 
select
				distinct * 
				from
				MyTable 
				This query will create a new table (NewTable in my example)
				containing all the records in the original table but without any
				records being duplicated. It will therefore preserve a single
				copy of those records which were duplicated. 
				Because this query creates (and populates) a new table, it will
				fail if the table already exists. 
				Delete Similar Records
				To delete similar records, i.e. where the records are not the
				same but one field is the same and only one copy needs to be
				preserved, try the following SQL: 
delete
				T1 from MyTable
				T1, MyTable
				T2 
where
				T1.dupField
				= T2.dupField 
and
				T1.uniqueField
				> T2.uniqueField 
				This will delete all records from the table MyTable which have
				the same value for the field dupField, leaving that record which
				has the lowest value in uniqueField. 
				The above SQL works on SQL-Server and MySQL but not in MS-Access.
				For MS-Access try: 
delete
				from MyTable
				 where uniqueField
				not in 
(select
				min(uniqueField)
				from MyTable
				T2  where T2.dupField=MyTable.dupField) 
				This syntax also works in SQLServer (but not in MySQL) but may
				not be as efficient for large tables. 
Using
				Index to eliminate duplicate row 
It's easy to introduce duplicate
				rows of data into Oracle tables by running a data load twice
				without the primary key or unique indexes created or
				enabled.
  Here's how you remove the duplicate rows before
				the primary key or unique indexes can be created: 
 
				 
DELETE
				FROM our_table 
WHERE
				rowid not in (SELECT MIN(rowid) 
FROM
				our_table 
GROUP
				BY column1, column2, column3... ; 
Here column1, column2, column3
				constitute the identifying key for each record.  Be sure to
				replace our_table with the table name from which you want
				to remove the duplicate rows. The GROUP BY is used on the columns
				that make the primary key for the table. This script deletes each
				row in the group after the first row. 
 | 
		
No comments:
Post a Comment