Oracle Database: find duplicate records on certain columns in a table.

This is a beginner-level article, but it can also serve as a quick reference for anyone who uses Oracle Database moderately.

You can identify duplicate records in an Oracle table by using a group by clause. Group by the column or columns in which you are looking for duplicates, and use the count() function to identify them. The filtering expression is self-explanatory: having count(*) > 1.
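To try the queries in this article, a small test table can help. The following is a minimal sketch: the column types, the empl_id column, and the sample rows are assumptions made for illustration, not taken from any real schema.

```sql
-- Hypothetical test table; only the two name columns appear in the examples below
create table employees (
  empl_id         number primary key,
  empl_first_name varchar2(50),
  empl_last_name  varchar2(50)
);

insert into employees values (1, 'John', 'Smith');
insert into employees values (2, 'John', 'Doe');
insert into employees values (3, 'Jane', 'Roe');
insert into employees values (4, 'John', 'Smith');

-- Count rows per first name: John appears 3 times, Jane once
select empl_first_name, count(*) as name_count
from employees
group by empl_first_name
order by name_count desc;
```

Adding having count(*) > 1 to the last query keeps only the John row, which is the pattern used in the examples that follow.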


Ex 1 : Identify duplicate first names in the employees table.
 Note: in this context the employee records are not necessarily duplicates; we are only trying to find employees with the same first name.


select empl_first_name  , count(empl_first_name)
from employees
group by empl_first_name 
having count(empl_first_name) > 1

Ex 2 : Identify duplicates or repeats based on 2 or more columns.

select empl_first_name  , empl_last_name  ,  count(1)
from employees 
group by empl_first_name , empl_last_name
having count(1) > 1


Ex 3 : Use of an analytic function to identify duplicates.

select * from 
(select count(1) over (partition by empl_first_name ) fname_count , e.*   
from employees e)
where fname_count > 1

The advantage of this example is that you can select all columns in the query result set, unlike the pure group by queries above.
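The same analytic approach can be extended to two or more columns, as in Ex 2. The sketch below assumes that a repeat means the same first and last name; the alias name_count and the final order by are illustrative choices.

```sql
-- Show full rows for every first/last name pair that occurs more than once
select *
from (select count(1) over (partition by empl_first_name, empl_last_name) name_count,
             e.*
      from employees e)
where name_count > 1
order by empl_first_name, empl_last_name;
```

Ordering by the partition columns places the repeated rows next to each other, which makes them easier to compare by eye.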

Ex 4 : Delete duplicate rows. (Be careful and wary of what you are doing: as written, this keeps one row per first name and deletes every other row with that first name.)

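delete from employees where rowid in
 (select rid from
   (select
     rowid rid,  -- alias rowid so the outer query can select it by name
     row_number()
    over
     -- order by rowid so the kept row (dup = 1) is the one with the lowest rowid
     (partition by empl_first_name order by rowid) dup
    from employees)
  where dup > 1)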
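Before running a delete like this, it may be worth previewing the rows it would remove. The following is a sketch under stated assumptions: it treats rows as duplicates only when both first and last name match (a stricter definition than the one in Ex 4), and the alias rid is a hypothetical name.

```sql
-- 1. Preview: rows that the delete below would remove (dup > 1 within each name pair)
select *
from (select e.*,
             row_number() over (partition by empl_first_name, empl_last_name
                                order by e.rowid) dup
      from employees e)
where dup > 1;

-- 2. Delete the same rows, keeping the row with the lowest rowid in each group
delete from employees
where rowid in
  (select rid
   from (select rowid rid,
                row_number() over (partition by empl_first_name, empl_last_name
                                   order by rowid) dup
         from employees)
   where dup > 1);

-- 3. Undo the change if the result is not what you expected
rollback;   -- or: commit;
```

Because the delete is not committed until you say so, comparing the reported number of deleted rows with the preview gives a chance to roll back first.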
