जगदीश खोलिया: UNION vs UNION ALL?

Tuesday, November 22, 2011

UNION vs UNION ALL?

Why is UNION ALL faster than UNION?

 

UNION and UNION ALL combine the result sets of two or more SELECT statements. Like JOINs, they produce a single result set, but they combine data vertically by appending rows from another query, whereas JOINs combine data horizontally by adding columns from another table.
UNION ensures you get only DISTINCT rows from the combined result.
UNION ALL returns all rows from both inputs, including duplicates.
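The difference in results is easy to check on a tiny data set. The following is a minimal sketch that assumes SQLite (through Python's standard `sqlite3` module) rather than SQL Server; the tables `t1` and `t2` and their rows are made up for illustration, with one name present in both tables.

```python
import sqlite3

# In-memory SQLite database; t1/t2 and their rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (FirstName TEXT, LastName TEXT);
    CREATE TABLE t2 (FirstName TEXT, LastName TEXT);
    INSERT INTO t1 VALUES ('Ann', 'Lee'), ('Bob', 'Ray');
    INSERT INTO t2 VALUES ('Bob', 'Ray'), ('Cid', 'Fox');
""")

# UNION removes the duplicate ('Bob', 'Ray') row.
union_rows = conn.execute(
    "SELECT * FROM t1 UNION SELECT * FROM t2"
).fetchall()

# UNION ALL keeps every row from both tables.
union_all_rows = conn.execute(
    "SELECT * FROM t1 UNION ALL SELECT * FROM t2"
).fetchall()

print(len(union_rows))      # 3
print(len(union_all_rows))  # 4
```

Both queries return the same two columns; only the number of rows differs, which is the "vertical" combination described above.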
The SQL script below shows why UNION ALL is faster than UNION:
-- Create first table: #tempTable1
select FirstName, LastName into #tempTable1
from Person.Contact
where ContactID <= 100

-- Create second table: #tempTable2
select FirstName, LastName into #tempTable2
from Person.Contact
where ContactID > 100 and ContactID <= 200
Comparison:
SET STATISTICS PROFILE ON

-- Using UNION - Query Cost: 35%
select * from #tempTable1
UNION
select * from #tempTable2
Query Plan for UNION:
  |--Sort(DISTINCT ORDER BY:([Union1008] ASC, [Union1009] ASC))
       |--Concatenation
            |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable1]))
            |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable2]))
-- Using UNION ALL - Query Cost: 13%
select * from #tempTable1
UNION ALL
select * from #tempTable2
Query Plan for UNION ALL:
  |--Concatenation
       |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable1]))
       |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable2]))
-- Using UNION ALL with DISTINCT - Query Cost: 52%
select DISTINCT * from #tempTable1
UNION ALL
select DISTINCT * from #tempTable2
Query Plan for UNION ALL with DISTINCT:
  |--Concatenation
       |--Sort(DISTINCT ORDER BY:([tempdb].[dbo].[#tempTable1].[FirstName] ASC, [tempdb].[dbo].[#tempTable1].[LastName] ASC))
       |    |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable1]))
       |--Sort(DISTINCT ORDER BY:([tempdb].[dbo].[#tempTable2].[FirstName] ASC, [tempdb].[dbo].[#tempTable2].[LastName] ASC))
            |--Table Scan(OBJECT:([tempdb].[dbo].[#tempTable2]))
SET STATISTICS PROFILE OFF

-- Final cleanup
drop table #tempTable1
drop table #tempTable2
The query plans above show that:
- UNION ALL just concatenates the two table scans.
- UNION performs a distinct SORT on the final concatenation.
- UNION ALL with DISTINCT performs a distinct SORT on each table individually, then does the final concatenation.
The query cost of each statement, as a share of the whole batch:
- UNION ALL = 13%
- UNION = 35%
- UNION ALL with DISTINCT = 52%
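This shows that:
  • UNION ALL is cheaper than UNION because it skips the duplicate-removing sort. This does not mean you should use UNION ALL in every scenario: when duplicates must be removed, UNION is still the right choice.
  • UNION is not equivalent to “UNION ALL with DISTINCT”. The latter removes duplicates only within each table, so a row that appears in both tables still shows up twice.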
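The second point can be checked by comparing results rather than plans. This is a sketch under the same assumptions as any SQLite stand-in for SQL Server: the tables `t1` and `t2`, their rows, and the helper `run` are hypothetical. One name is duplicated inside `t1` and also appears in `t2`.

```python
import sqlite3

# In-memory SQLite database with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (FirstName TEXT, LastName TEXT);
    CREATE TABLE t2 (FirstName TEXT, LastName TEXT);
    -- ('Bob', 'Ray') appears twice in t1 and once more in t2
    INSERT INTO t1 VALUES ('Ann', 'Lee'), ('Bob', 'Ray'), ('Bob', 'Ray');
    INSERT INTO t2 VALUES ('Bob', 'Ray'), ('Cid', 'Fox');
""")

def run(sql):
    """Run a query and return its rows sorted, so results compare reliably."""
    return sorted(conn.execute(sql).fetchall())

# UNION: duplicates are removed across the combined result.
union = run("SELECT * FROM t1 UNION SELECT * FROM t2")

# DISTINCT on each branch: duplicates are removed only within each table.
per_branch = run(
    "SELECT DISTINCT * FROM t1 UNION ALL SELECT DISTINCT * FROM t2"
)

# DISTINCT applied once, outside the UNION ALL: matches UNION.
outer_distinct = run(
    "SELECT DISTINCT * FROM (SELECT * FROM t1 UNION ALL SELECT * FROM t2)"
)

print(len(union), len(per_branch), len(outer_distinct))  # 3 4 3
```

With this data the per-branch DISTINCT form returns one extra row, the copy of ('Bob', 'Ray') that exists in both tables. Applying DISTINCT once over the whole UNION ALL gives the same rows as UNION.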
referenced from http://sqlwithmanoj.wordpress.com/2010/12/
