Comman table expression CTE
Defining and Using CTE's
Microsoft introduces a different way
of writing a statement called CTE , using
CTE you can write statement and nameed a TSQL SELECT statement and then
reference same named statement later on similar
to referencing a table or views .
Syntax for defining a CTE:
WITH <expression_name>
(Column1, Column2, …) AS (CTE Definition)
Where:
"<expression_name>"
is the name of the CTE
"Column 1, Column2,…" are
the names of the columns returned from the CTE (only required if column names
are not provided, or are not unique)
"CTE Definition" is a
SELECT that defines the result set of the CTE.
A CTE definition needs to be followed
by a single INSERT, UPDATE, DELETE, or SELECT statement that references the
CTE. If the CTE is part of a batch then the statement prior to the WITH clause
must end with a semi-colon.
When you define multiple CTE's, a CTE that references another CTE needs to be
define after the CTE it is referencing.
why and when should use cte
When should you use a CTE? We should use
CTE for various reason but some important reason is
ü If your query needs a recursive solution
ü If your query has multiple subqueries, or you have
repeated the same subquery code in a single statement
ü If your query is large and complex
ü When your TSQL code is complex and/or contains one or more
subqueries it become very hard to read. By breaking up your complex code into
one of more CTEs you will make your code much easier to follow, maintain and
debug.
what is the limitation of cte
These are the clauses which can’t be
used with CTE
ü ORDER BY (unless used with TOP clause)
ü INTO
ü OPTION clause with query hints
ü FOR XML
ü FOR BROWSE
Recursive CTE's
A recursive process is a process that
called itself. In each iteration returns a subset of the result. Recursive process
keeps calling itself until a condition is found that indicate it should stop or
exit. In last result set from a recursive CTE in SQL server is the joined of
all the result sets from all the calls of the recursive query. A CTE can
reference itself. When this is done it is called a recursive CTE. A recursive
CTE contains at least two query definitions, which are SELECT statements. One
query definition is known as the anchor member and the other query definition is known as the recursive member.
An anchor member query definition will not
reference the CTE whereas the recursive member will reference the CTE.
Additionally the anchor member queries need to appear in the CTE prior to the
recursive member queries. The number of columns returned from the anchor member
must be the same number of columns returned from the recursive member. There
can be multiple anchor query definitions where each one is combined with the others
using one of these operators: UNION ALL, UNION, INTERSECT, or EXCEPT. There can
also be multiple recursive query definitions. Each recursive query definition
must be combined using a UNION ALL operation. The UNION ALL operation is also
needed to connect the last anchor query definition with the first recursive
query definition.
Let's review a few examples of CTE's.
CTE's provide you
with a method to more easily document your complex code for readability.
Listing 1.
USE AdventureWorks2012;
GO
SELECT YearMonth, ProductID, SumLineTotal
FROM (
SELECT
CONVERT(CHAR(7),ModifiedDate,120) AS YearMonth
, ProductID
, SUM(LineTotal) AS SumLineTotal
FROM
Sales.SalesOrderDetail
GROUP
BY ProductId, CONVERT(CHAR(7),ModifiedDate,120)
) MonthlyProductSales
WHERE YearMonth = '2008-06';
Listing
1 has a SELECT statement that has a subquery submerged within the FROM clause.
The subquery is a derived table with the alias MonthlyProductSales, which summarizes the LineTotal amount for each Year/Month
combination of the ModifiedDate. From the results of myMonthlyProductSales subquery I constrained the results to
only return those rows that have a year and month of June, 2008.
Although
the code in listing 1 is fairly simple we can improve on its readability by
rewriting it using a CTE
Listing
2.
USE AdventureWorks2012;
GO
-- CTE Definition
WITH MonthlyProductSales AS
(
SELECT CONVERT(CHAR(7),ModifiedDate,120) AS YearMonth
, ProductID
, SUM(LineTotal) AS SumLineTotal
FROM Sales.SalesOrderDetail
GROUP BY ProductId, CONVERT(CHAR(7),ModifiedDate,120)
)
-- SELECT statement referencing CTE
SELECT * FROM MonthlyProductSales
WHERE YearMonth = '2008-06';
In
Listing 2 I moved to derived table subquery in Listing 1 up into a CTE named MonthlyProductSales.
I then replaced the subquery in my SELECT statement with the name of the CTE,
in this case MonthlyProductSales. By moving the
subquery in Listing 1, into a CTE definition, many people will find the code in
Listing 2 easier to read and maintain.
Example
of Using Multiple CTE's
If
your code is more complex and contains multiple subqueries you could consider
rewriting it to simplify maintenance and to make it more readable. One of those
way to rewrite it would be to take all the subqueries and rewrite them as CTEs.
To
demonstrate using more than one CTE in a single SELECT statement, suppose I had
the non-CTE query in Listing 3:
USE AdventureWorks2012;
GO
SELECT SalesPersonID
, SalesYear
, TotalSales
, SalesQuotaYear
, SalesQuota
FROM ( -- First Subquery
SELECT SalesPersonID
, SUM(TotalDue) AS TotalSales
, YEAR(OrderDate) AS SalesYear
FROM Sales.SalesOrderHeader
WHERE SalesPersonID IS NOT NULL
GROUP BY SalesPersonID, YEAR(OrderDate)
) AS Sales
JOIN ( -- Second Subquery
SELECT BusinessEntityID
, SUM(SalesQuota)AS SalesQuota
, YEAR(QuotaDate) AS SalesQuotaYear
FROM Sales.SalesPersonQuotaHistory
GROUP BY BusinessEntityID, YEAR(QuotaDate)
) AS Sales_Quota
ON Sales_Quota.BusinessEntityID = Sales.SalesPersonID
AND Sales_Quota.SalesQuotaYear = Sales.SalesYear
ORDER BY SalesPersonID, SalesYear;
The
query in Listing 3 contains two different subqueries, both implemented as
derived tables. The first subquery calculates the TotalSales bySalesPersonID and SalesYear. The SalesQuota aggregated value, based on BusinessEntityID and SalesQuotaYear, is calculated in
second subquery. To simplify reading the query in Listing 3, we can rewrite the
subqueries as two different CTEs and then reference those CTEs in a simple
SELECT statement, as I have done in Listing 4.
USE AdventureWorks2012;
GO
WITH
-- First Subquery rewritten as CTE
WITH Sales AS
(
SELECT SalesPersonID
, SUM(TotalDue) AS TotalSales
, YEAR(OrderDate) AS SalesYear
FROM Sales.SalesOrderHeader
WHERE SalesPersonID IS NOT NULL
GROUP BY SalesPersonID, YEAR(OrderDate)
),
-- Second Subquery rewritten as CTE
Sales_Quota AS
(
SELECT BusinessEntityID
, SUM(SalesQuota)AS SalesQuota
, YEAR(QuotaDate) AS SalesQuotaYear
FROM Sales.SalesPersonQuotaHistory
GROUP BY BusinessEntityID, YEAR(QuotaDate)
)
-- SELECT using multiple CTEs
SELECT SalesPersonID
, SalesYear
, TotalSales
, SalesQuotaYear
, SalesQuota
FROM Sales
JOIN Sales_Quota
ON Sales_Quota.BusinessEntityID = Sales.SalesPersonID
AND Sales_Quota.SalesQuotaYear = Sales.SalesYear
ORDER BY SalesPersonID, SalesYear;
In
Listing 4 I moved the two subqueries used in Listing 3 into two different CTEs.
The first CTE, defined with a name of Sales, contains the subquery that
produced the TotalSales amount by SalesPersonID and SalesYear. I defined the second
CTE, which is named SalesQuota, by placing a comma after the first CTE
defined. This second CTE contained the subquery to calculate the SalesQuota amount. After defining both CTEs I
then referenced them in my final SELECT statement.
Being
able to define multiple CTEs with a single WITH clause, and then referencing
these CTEs in my final TSQL statement allowed me to make my complicated TSQL
code in Listing 3 easier for me to read, develop and debug. Using multiple CTEs
for complicated TSQL logic allows you to break your code into manageable pieces
and parts.
CTE
referencing another CTE
A CTE
can reference another CTE. In order for a CTE to reference another CTE it needs
to:
·
be
defined within the same WITH clause as the CTE being referenced
·
be
defined after the CTE being referenced
To
demonstrate a CTE referencing another CTE let's review the code in Listing 5.
USE AdventureWorks2012;
GO
WITH
-- First Subquery rewritten as CTE
Sales AS
(
SELECT SalesPersonID
, SUM(TotalDue) AS TotalSales
, YEAR(OrderDate) AS SalesYear
FROM Sales.SalesOrderHeader
WHERE SalesPersonID IS NOT NULL
GROUP BY SalesPersonID, YEAR(OrderDate)
),
-- Second Subquery that references first CTE
TotalSales AS
(
SELECT SUM(TotalSales)AS TotalSales
, SalesYear
FROM Sales
GROUP BY SalesYear
)
-- Calling Second CTE that uses first CTE
SELECT *
FROM TotalSales
ORDER BY SalesYear;
In
Listing 5 I define a CTE named Sales that is referenced by the second CTE
definition. In the second CTE, named TotalSales, I reference the Sales
CTE. In this CTE I summarized the TotalSales column returned from the Sales CTE by SalesYear. At the end I used a SELECT statement
to reference just the second CTE. When this TotalSales CTE is executed it will then call the Sales CTE.
Example
of a Recursive CTE
Another
feature of a CTE is that it can call itself. When you do this the CTE is known
as a recursive CTE.
A
recursive CTE has two different parts called the anchor member and the
recursive member. The anchor member proceeds the recursive member in the CTE.
You can identify the anchor member queries as the queries without a reference
to the CTE. The recursive members will have a reference to the CTE in their
queries. The anchor member identifies that initial record set for the CTE. This
initial record set is then used by the recursive member. The recursive member
is the part that is connected to the anchor member using the UNION ALL clause,
and references the CTE in its query. The recursive member returns the same
number of columns as the anchor member and references the CTE. A recursive CTE
continues calling itself until the recursive part of the CTE doesn't return any
rows.
To
better understand a recursive CTE I will be creating some sample data for use
with a recursive CTE. You can find the code to create my sample data in Listing
6.
USE tempdb;
GO
-- Create an Employee table
CREATE TABLE dbo.Employee
(
EmpID smallint NOT NULL,
EmpName nvarchar(100) NOT NULL,
Position nvarchar(50) NOT NULL,
MgrId int NULL
);
-- Populate Employee Table with values
INSERT INTO dbo.Employee VALUES
(1, N'Joe Steel', N'President',NULL)
,(2, N'John Smith', N'VP Western Region Sales',1)
,(3, N'Sue Jones', N'VP Easter Region',1)
,(4, N'Lynn Holland', N'Sales Person',2)
,(5, N'Linda Thomas', N'Sales Person',3 )
,(6, N'Kathy Johnson', N'Admin Assistant',1)
,(7, N'Rich Little', N'Sales Person',3)
,(8, N'David Nelson', N'Sales Person', 2)
,(9, N'Mary Jackson', N'Sales Person', 3);
In
Listing 6 I created a table named Employee to contain information about
employees. This table contains 9 different employees. The MgrIDcolumn
identifies the manager to which an employee reports. There is one Employee record where the MgrID is NULL. This employee is the
president of the company and he reports to no one.
To
see how I can use the data created in Listing 6 in a recursive CTE, look at the
code in Listing 7.
USE tempdb;
GO
WITH ReportingStructure(MgrID, EmpID, EmpName, Position, OrgLevel) AS
(
-- Anchor part
SELECT MgrID, EmpID, EmpName, Position, 0 AS OrgLevel
FROM dbo.Employee
WHERE MgrID IS NULL
UNION ALL
-- Recursive part
SELECT e.MgrID, e.EmpID, e.EmpName
, e.Position, r.OrgLevel + 1
FROM dbo.Employee AS e
INNER JOIN ReportingStructure AS r
ON e.MgrID = r.EmpID
)
SELECT MgrID, EmpID, EmpName, Position, OrgLevel
FROM ReportingStructure;
When
I run the code in Listing 7 I get the output in Report 1.
MgrID EmpID EmpName Position OrgLevel
----- ------ -------------- -------------------------- -----------
NULL 1 Joe Steel President 0
1 2 John Smith VP Western Region Sales 1
1 3 Sue Jones VP Easter Region 1
1 6 Kathy Johnson Admin Assistant 1
2 4 Lynn Holland Sales Person 2
2 8 David Nelson Sales Person 2
3 5 Linda Thomas Sales Person 2
3 7 Rich Little Sales Person 2
3 9 Mary Jackson Sales Person 2
By
reviewing the output in Report 1 you can see the reporting structure of all
employees. Note that the OrgLevel column identifies the reporting
structure hierarchy. As you can see the employee that holds the “President”
position has an OrgLevel of 0. Each employee that
reports to the president has an OrgLevel of 1. If you look at the other
employees you can see they each have an OrgLevel of 2 because they report to
one of the Vice Presidents.
To
better understand how a recursive CTE works let me step you through the process
SQL Server uses when executing my recursive CTE in Listing 7. The first pass
through my recursive CTE SQL Server runs the anchor statement. When the anchor
statement is run it returns the employees that have a NULL MgrID value. In this example that
would be the employee with an EmpID of 1. This first pass through the CTE
also sets the Orglevel to 0. The second pass through my CTE
runs the recursive part of the CTE. In the recursive part of the CTE all
employees that have a MgrId equal to the employees found in the
first pass, or MgrId of 1 are returned. This second pass returns EmpID 2, 3 and 6 and increments the OrgLevel by 1. The third pass through my CTE
runs the recursive part of the CTE again. This time my CTE returns the
employees that have MgrID's equal to one of the EmpIDs returned from the second pass
through the CTE, which means the employees that have a MgrId of 2,3 or 6. The third pass
returns EmpIds 5, 7, 8 and 9. The forth pass
through my CTE the recursive part of the CTE looks forEmployee records that have a MgrID of 5, 7, 8, and 9. This pass returns
no values because there are no employees with MgrId value of 5, 7, 8 and 9. Since the
forth pass returns no records this ends the processing of my recursive CTE.
Controlling
Recursion
It is
possible to write a recursive CTE that is an infinite loop. SQL Server has a
default MAXRECURSION setting that will keep you from running a CTE that might
be an infinite loop. The default MAXRECURSION setting is 100.
The
code in Listing 8 is an example of a poorly written CTE that creates an
infinite loop.
USE tempdb;
GO
WITH InfiniteLoopCTE as
(
-- Anchor Part
SELECT EmpID, MgrID, Position
FROM dbo.Employee
WHERE MgrID = 1
UNION ALL
-- Recursive Part
SELECT InfiniteLoopCTE.EmpID
, InfiniteLoopCTE.MgrID
, InfiniteLoopCTE.Position
FROM InfiniteLoopCTE
JOIN dbo.Employee AS e
ON e.EmpID = InfiniteLoopCTE.MgrID
)
SELECT * FROM InfiniteLoopCTE;
The
code is Listing 8 causes an infinite loop because the recursive part of this
CTE will always return one or more rows. The recursive part of the query always
returns MrgID value of 1 because the recursive part of the query returns
InfiniteLoopCTE.MgrId, instead of the Employee.MgrID value. When I run the code
in Listing 8 it will run until the CTE has hit the MAXRECUSION default setting,
which is 100. If a CTE hits the MAXRECURSIVE setting SQL Server will return an
error when running the statement.
If
your CTE needs to process more than 100 recursive call then you can override
the default MAXRECUSION setting by using the MAXRECURSION query option, like I
have done in the code in Listing 9.
USE tempdb;
GO
--Creates an infinite loop
WITH InfiniteLoopCTE as
(
-- Anchor Part
SELECT EmpID, MgrID, Position
FROM dbo.Employee
WHERE MgrID = 1
UNION ALL
-- Recursive Part
SELECT InfiniteLoopCTE.EmpID
, InfiniteLoopCTE.MgrID
, InfiniteLoopCTE.Position
FROM InfiniteLoopCTE
JOIN dbo.Employee AS e
ON e.EmpID = InfiniteLoopCTE.MgrID
)
SELECT * FROM InfiniteLoopCTE OPTION (MAXRECURSION 150);
In
Listing 9 I overrode the default MAXRECUSION setting by adding the query option
“MAXRECURSION 150”. This allows my infinite loop CTE to process 150 recursive
calls before the SQL Server engine stops my recursive CTE with an error.
The
maximum value you can set the MAXRECURSION option is 32,767. If for some reason
you have a recursive CTE that needs to process to more than 32,767 recursive
CTE calls then you can do this by setting the MAXRECURSION value to 0. By using
0 for a MAXRECURSION setting, you are allowing the database engine to run as
many iteration as it needs to complete the recursive part of the CTE. If your
code is not well tested, you could end up with a truly infinity loop that will
never stop.
--Reference Microsoft
No comments:
Post a Comment