Oracle Database: find duplicate records on certain columns in a table.

This is a beginner-level article, but it can also serve as a quick reference for anyone who uses Oracle Database moderately.

You can identify duplicate records in an Oracle table by using a group by clause. Group by the column or columns in which you want to find duplicates, then use the count() function in a having clause to keep only the groups that occur more than once. The expression is self-explanatory: having count(*) > 1.


Ex 1 : Identify duplicate first names in the employees table.
 Note: in this context the employee records are not necessarily duplicates; we are only trying to find employees with the same first name.


select empl_first_name  , count(empl_first_name)
from employees
group by empl_first_name 
having count(empl_first_name) > 1

Ex 2 : Identify duplicates or repeats based on 2 or more columns.

select empl_first_name, empl_last_name, count(*)
from employees
group by empl_first_name, empl_last_name
having count(*) > 1

Ex 3. Use of Analytic function to identify duplicates.

select * from 
(select count(1) over (partition by empl_first_name ) fname_count , e.*   
from employees e)
where fname_count > 1

The advantage of this approach is that you can select all columns in the query result set, unlike the pure group by queries above.
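The analytic approach can also be combined with the multi-column idea from Ex 2. The following is only a sketch, assuming the same employees table and the empl_first_name and empl_last_name columns used above; the name_count alias and the final order by are illustrative choices, not part of the original examples.

```sql
-- Sketch: show every column of rows whose first and last name pair repeats.
-- name_count is a hypothetical alias for the size of each name group.
select *
from (select count(*) over (partition by empl_first_name, empl_last_name) name_count,
             e.*
      from employees e)
where name_count > 1
order by empl_first_name, empl_last_name
```

Ordering by the partition columns places the repeated rows next to each other, which makes them easier to compare by eye.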

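Ex 4. Delete duplicate rows. Be careful: as written, this treats every employee sharing a first name as a duplicate and keeps only one row per name, so change the partition by columns to whatever actually defines a duplicate in your table.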

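-- Keeps one row per first name (the one with the lowest rowid) and deletes the rest.
-- rowid is aliased as rid so the outer query can select it from the inline view.
delete from employees
where rowid in
  (select rid
   from (select rowid rid,
                row_number() over (partition by empl_first_name order by rowid) dup
         from employees)
   where dup > 1)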
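Before running a delete like this, it can help to preview the rows that would be removed. The query below is a minimal sketch under one assumption: rows within each group are ordered by rowid, so the row that is kept is predictable. The delete needs the same partition by and order by clauses to remove exactly these rows.

```sql
-- Preview only: lists the rows numbered 2 and above within each first-name group,
-- i.e. the rows a delete with the same partition and ordering would remove.
select *
from (select e.*,
             row_number() over (partition by empl_first_name order by rowid) dup
      from employees e)
where dup > 1
```

If the preview looks wrong, adjust the partition by columns first. After the delete itself, check the affected row count before issuing commit, and use rollback if it is not what you expected.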
