Wednesday, September 2, 2015

CTE (Comman table expression)

Comman table expression CTE


Defining and Using CTE's


Microsoft introduces a different way of writing a statement called CTE , using  CTE you can write statement and nameed a TSQL SELECT statement and then reference same  named statement later on similar to referencing a table or views .
Syntax for defining a CTE:
WITH <expression_name> (Column1, Column2, …) AS (CTE Definition)
Where:
"<expression_name>" is the name of the CTE
"Column 1, Column2,…" are the names of the columns returned from the CTE (only required if column names are not provided, or are not unique)
"CTE Definition" is a SELECT that defines the result set of the CTE.
A CTE definition needs to be followed by a single INSERT, UPDATE, DELETE, or SELECT statement that references the CTE. If the CTE is part of a batch then the statement prior to the WITH clause must end with a semi-colon. When you define multiple CTE's, a CTE that references another CTE needs to be define after the CTE it is referencing.

why and when should use cte

When should you use a CTE? We should use CTE for various reason but some important reason is
ü  If your query needs a recursive solution
ü  If your query has multiple subqueries, or you have repeated the same subquery code in a single statement
ü  If your query is large and complex
ü  When your TSQL code is complex and/or contains one or more subqueries it become very hard to read. By breaking up your complex code into one of more CTEs you will make your code much easier to follow, maintain and debug.

what is the limitation of cte

These are the clauses which can’t be used with CTE
ü  ORDER BY (unless used with TOP clause)
ü  INTO
ü  OPTION clause with query hints
ü  FOR XML
ü  FOR BROWSE

Recursive CTE's

A recursive process is a process that called itself. In each iteration returns a subset of the result. Recursive process keeps calling itself until a condition is found that indicate it should stop or exit. In last result set from a recursive CTE in SQL server is the joined of all the result sets from all the calls of the recursive query. A CTE can reference itself. When this is done it is called a recursive CTE. A recursive CTE contains at least two query definitions, which are SELECT statements. One query definition is known as the anchor member and the other query definition is known as the recursive member.
 An anchor member query definition will not reference the CTE whereas the recursive member will reference the CTE. Additionally the anchor member queries need to appear in the CTE prior to the recursive member queries. The number of columns returned from the anchor member must be the same number of columns returned from the recursive member. There can be multiple anchor query definitions where each one is combined with the others using one of these operators: UNION ALL, UNION, INTERSECT, or EXCEPT. There can also be multiple recursive query definitions. Each recursive query definition must be combined using a UNION ALL operation. The UNION ALL operation is also needed to connect the last anchor query definition with the first recursive query definition.
Let's review a few examples of CTE's.
CTE's provide you with a method to more easily document your complex code for readability.
Listing 1.
USE AdventureWorks2012;
GO
SELECT YearMonth, ProductID, SumLineTotal
FROM (
                    SELECT CONVERT(CHAR(7),ModifiedDate,120) AS YearMonth
                                          , ProductID
                                          , SUM(LineTotal) AS SumLineTotal
                    FROM Sales.SalesOrderDetail
                    GROUP BY ProductId, CONVERT(CHAR(7),ModifiedDate,120)
                    ) MonthlyProductSales
WHERE YearMonth = '2008-06';
Listing 1: Code with SubQuery
Listing 1 has a SELECT statement that has a subquery submerged within the FROM clause. The subquery is a derived table with the alias MonthlyProductSales, which summarizes the LineTotal amount for each Year/Month combination of the ModifiedDate. From the results of myMonthlyProductSales subquery I constrained the results to only return those rows that have a year and month of June, 2008.
Although the code in listing 1 is fairly simple we can improve on its readability by rewriting it using a CTE
Listing 2.
USE AdventureWorks2012;
GO
-- CTE Definition
WITH MonthlyProductSales AS
(
SELECT CONVERT(CHAR(7),ModifiedDate,120) AS YearMonth
, ProductID
, SUM(LineTotal) AS SumLineTotal
FROM Sales.SalesOrderDetail
GROUP BY ProductId, CONVERT(CHAR(7),ModifiedDate,120)
)
-- SELECT statement referencing CTE
SELECT * FROM MonthlyProductSales
WHERE YearMonth = '2008-06';
Listing 2: Rewrite of Listing 1 code using a CTE
In Listing 2 I moved to derived table subquery in Listing 1 up into a CTE named MonthlyProductSales. I then replaced the subquery in my SELECT statement with the name of the CTE, in this case MonthlyProductSales. By moving the subquery in Listing 1, into a CTE definition, many people will find the code in Listing 2 easier to read and maintain.

Example of Using Multiple CTE's

If your code is more complex and contains multiple subqueries you could consider rewriting it to simplify maintenance and to make it more readable. One of those way to rewrite it would be to take all the subqueries and rewrite them as CTEs.
To demonstrate using more than one CTE in a single SELECT statement, suppose I had the non-CTE query in Listing 3:
USE AdventureWorks2012;
GO
SELECT SalesPersonID
  , SalesYear
  , TotalSales
  , SalesQuotaYear
  , SalesQuota
FROM ( -- First Subquery
       SELECT SalesPersonID
            , SUM(TotalDue) AS TotalSales
            , YEAR(OrderDate) AS SalesYear
       FROM Sales.SalesOrderHeader
       WHERE SalesPersonID IS NOT NULL
       GROUP BY SalesPersonID, YEAR(OrderDate)
                    ) AS Sales 
JOIN ( -- Second Subquery
       SELECT BusinessEntityID
            , SUM(SalesQuota)AS SalesQuota
            , YEAR(QuotaDate) AS SalesQuotaYear
       FROM Sales.SalesPersonQuotaHistory
       GROUP BY BusinessEntityID, YEAR(QuotaDate)
                    ) AS Sales_Quota 
ON Sales_Quota.BusinessEntityID = Sales.SalesPersonID
AND Sales_Quota.SalesQuotaYear = Sales.SalesYear   
ORDER BY SalesPersonID, SalesYear;
Listing 3: Using multiple Subqueries in a single SELECT statement
The query in Listing 3 contains two different subqueries, both implemented as derived tables. The first subquery calculates the TotalSales bySalesPersonID and SalesYear. The SalesQuota aggregated value, based on BusinessEntityID and SalesQuotaYear, is calculated in second subquery. To simplify reading the query in Listing 3, we can rewrite the subqueries as two different CTEs and then reference those CTEs in a simple SELECT statement, as I have done in Listing 4.
USE AdventureWorks2012;
GO
WITH 
-- First Subquery rewritten as CTE
WITH Sales AS
(
    SELECT SalesPersonID
         , SUM(TotalDue) AS TotalSales
         , YEAR(OrderDate) AS SalesYear
    FROM Sales.SalesOrderHeader
    WHERE SalesPersonID IS NOT NULL
    GROUP BY SalesPersonID, YEAR(OrderDate)
),
-- Second Subquery rewritten as CTE
Sales_Quota AS
( 
    SELECT BusinessEntityID
         , SUM(SalesQuota)AS SalesQuota
         , YEAR(QuotaDate) AS SalesQuotaYear
    FROM Sales.SalesPersonQuotaHistory
    GROUP BY BusinessEntityID, YEAR(QuotaDate)
)  
-- SELECT using multiple CTEs
SELECT SalesPersonID
     , SalesYear
     , TotalSales
     , SalesQuotaYear
     , SalesQuota
FROM Sales
  JOIN Sales_Quota 
    ON Sales_Quota.BusinessEntityID = Sales.SalesPersonID
        AND Sales_Quota.SalesQuotaYear = Sales.SalesYear  
ORDER BY SalesPersonID, SalesYear;
Listing 4: Using two CTEs instead of subqueries
In Listing 4 I moved the two subqueries used in Listing 3 into two different CTEs. The first CTE, defined with a name of Sales, contains the subquery that produced the TotalSales amount by SalesPersonID and SalesYear. I defined the second CTE, which is named SalesQuota, by placing a comma after the first CTE defined. This second CTE contained the subquery to calculate the SalesQuota amount. After defining both CTEs I then referenced them in my final SELECT statement.
Being able to define multiple CTEs with a single WITH clause, and then referencing these CTEs in my final TSQL statement allowed me to make my complicated TSQL code in Listing 3 easier for me to read, develop and debug. Using multiple CTEs for complicated TSQL logic allows you to break your code into manageable pieces and parts.

CTE referencing another CTE

A CTE can reference another CTE. In order for a CTE to reference another CTE it needs to:
·         be defined within the same WITH clause as the CTE being referenced
·         be defined after the CTE being referenced
To demonstrate a CTE referencing another CTE let's review the code in Listing 5.
USE AdventureWorks2012;
GO
WITH 
-- First Subquery rewritten as CTE
Sales AS
(
    SELECT SalesPersonID
         , SUM(TotalDue) AS TotalSales
         , YEAR(OrderDate) AS SalesYear
    FROM Sales.SalesOrderHeader
    WHERE SalesPersonID IS NOT NULL
    GROUP BY SalesPersonID, YEAR(OrderDate)
),
-- Second Subquery that references first CTE
TotalSales AS
( 
    SELECT SUM(TotalSales)AS TotalSales
         , SalesYear
    FROM Sales
    GROUP BY SalesYear
)  
-- Calling Second CTE that uses first CTE
SELECT *
FROM TotalSales
ORDER BY SalesYear;
Listing 5: CTE referencing another CTE
In Listing 5 I define a CTE named Sales that is referenced by the second CTE definition. In the second CTE, named TotalSales, I reference the Sales CTE. In this CTE I summarized the TotalSales column returned from the Sales CTE by SalesYear. At the end I used a SELECT statement to reference just the second CTE. When this TotalSales CTE is executed it will then call the Sales CTE.

Example of a Recursive CTE

Another feature of a CTE is that it can call itself. When you do this the CTE is known as a recursive CTE.
A recursive CTE has two different parts called the anchor member and the recursive member. The anchor member proceeds the recursive member in the CTE. You can identify the anchor member queries as the queries without a reference to the CTE. The recursive members will have a reference to the CTE in their queries. The anchor member identifies that initial record set for the CTE. This initial record set is then used by the recursive member. The recursive member is the part that is connected to the anchor member using the UNION ALL clause, and references the CTE in its query. The recursive member returns the same number of columns as the anchor member and references the CTE. A recursive CTE continues calling itself until the recursive part of the CTE doesn't return any rows.
To better understand a recursive CTE I will be creating some sample data for use with a recursive CTE. You can find the code to create my sample data in Listing 6.
USE tempdb;
GO
-- Create an Employee table 
CREATE TABLE dbo.Employee
(
EmpID smallint NOT NULL,
EmpName  nvarchar(100) NOT NULL,
Position nvarchar(50) NOT NULL,
MgrId int NULL
);
-- Populate Employee Table with values
INSERT INTO dbo.Employee VALUES 
 (1, N'Joe Steel', N'President',NULL)
,(2, N'John Smith', N'VP Western Region Sales',1)
,(3, N'Sue Jones', N'VP Easter Region',1)
,(4, N'Lynn Holland', N'Sales Person',2)
,(5, N'Linda Thomas', N'Sales Person',3 )
,(6, N'Kathy Johnson', N'Admin Assistant',1)
,(7, N'Rich Little', N'Sales Person',3)
,(8,  N'David Nelson', N'Sales Person', 2)
,(9,  N'Mary Jackson', N'Sales Person', 3);
Listing 6: Script to create sample data for recursive CTE example
In Listing 6 I created a table named Employee to contain information about employees. This table contains 9 different employees. The MgrIDcolumn identifies the manager to which an employee reports. There is one Employee record where the MgrID is NULL. This employee is the president of the company and he reports to no one.
To see how I can use the data created in Listing 6 in a recursive CTE, look at the code in Listing 7.
USE tempdb;
GO
WITH ReportingStructure(MgrID, EmpID, EmpName, Position, OrgLevel) AS 
(
    -- Anchor part 
    SELECT MgrID, EmpID, EmpName, Position, 0 AS OrgLevel
    FROM dbo.Employee 
    WHERE MgrID IS NULL
    UNION ALL
    -- Recursive part
    SELECT e.MgrID, e.EmpID, e.EmpName
                         , e.Position, r.OrgLevel + 1
    FROM dbo.Employee AS e
    INNER JOIN ReportingStructure AS r
    ON e.MgrID = r.EmpID 
)
SELECT MgrID, EmpID, EmpName, Position, OrgLevel 
FROM ReportingStructure;
Listing 7: Recursive CTE
When I run the code in Listing 7 I get the output in Report 1.
MgrID EmpID  EmpName        Position                   OrgLevel
----- ------ -------------- -------------------------- -----------
NULL  1      Joe Steel      President                  0
1     2      John Smith     VP Western Region Sales    1
1     3      Sue Jones      VP Easter Region           1
1     6      Kathy Johnson  Admin Assistant            1
2     4      Lynn Holland   Sales Person               2
2     8      David Nelson   Sales Person               2
3     5      Linda Thomas   Sales Person               2
3     7      Rich Little    Sales Person               2
3     9      Mary Jackson   Sales Person               2
Report 1: Output from running code in Listing 7
By reviewing the output in Report 1 you can see the reporting structure of all employees. Note that the OrgLevel column identifies the reporting structure hierarchy. As you can see the employee that holds the “President” position has an OrgLevel of 0. Each employee that reports to the president has an OrgLevel of 1. If you look at the other employees you can see they each have an OrgLevel of 2 because they report to one of the Vice Presidents.
To better understand how a recursive CTE works let me step you through the process SQL Server uses when executing my recursive CTE in Listing 7. The first pass through my recursive CTE SQL Server runs the anchor statement. When the anchor statement is run it returns the employees that have a NULL MgrID value. In this example that would be the employee with an EmpID of 1. This first pass through the CTE also sets the Orglevel to 0. The second pass through my CTE runs the recursive part of the CTE. In the recursive part of the CTE all employees that have a MgrId equal to the employees found in the first pass, or MgrId of 1 are returned. This second pass returns EmpID 2, 3 and 6 and increments the OrgLevel by 1. The third pass through my CTE runs the recursive part of the CTE again. This time my CTE returns the employees that have MgrID's equal to one of the EmpIDs returned from the second pass through the CTE, which means the employees that have a MgrId of 2,3 or 6. The third pass returns EmpIds 5, 7, 8 and 9. The forth pass through my CTE the recursive part of the CTE looks forEmployee records that have a MgrID of 5, 7, 8, and 9. This pass returns no values because there are no employees with MgrId value of 5, 7, 8 and 9. Since the forth pass returns no records this ends the processing of my recursive CTE.

Controlling Recursion

It is possible to write a recursive CTE that is an infinite loop. SQL Server has a default MAXRECURSION setting that will keep you from running a CTE that might be an infinite loop. The default MAXRECURSION setting is 100.
The code in Listing 8 is an example of a poorly written CTE that creates an infinite loop.
USE tempdb;
GO
 
WITH InfiniteLoopCTE as
(
    -- Anchor Part
    SELECT EmpID, MgrID, Position
    FROM dbo.Employee
    WHERE MgrID = 1  
  UNION ALL
    -- Recursive Part
    SELECT InfiniteLoopCTE.EmpID
                         , InfiniteLoopCTE.MgrID
           , InfiniteLoopCTE.Position
    FROM InfiniteLoopCTE  
    JOIN  dbo.Employee AS e 
        ON e.EmpID = InfiniteLoopCTE.MgrID 
)
SELECT * FROM InfiniteLoopCTE;
Listing 8: Infinite loop CTE
The code is Listing 8 causes an infinite loop because the recursive part of this CTE will always return one or more rows. The recursive part of the query always returns MrgID value of 1 because the recursive part of the query returns InfiniteLoopCTE.MgrId, instead of the Employee.MgrID value. When I run the code in Listing 8 it will run until the CTE has hit the MAXRECUSION default setting, which is 100. If a CTE hits the MAXRECURSIVE setting SQL Server will return an error when running the statement.
If your CTE needs to process more than 100 recursive call then you can override the default MAXRECUSION setting by using the MAXRECURSION query option, like I have done in the code in Listing 9.
USE tempdb;
GO
--Creates an infinite loop
WITH InfiniteLoopCTE as
(
    -- Anchor Part
    SELECT EmpID, MgrID, Position
    FROM dbo.Employee
    WHERE MgrID = 1  
  UNION ALL
    -- Recursive Part
    SELECT InfiniteLoopCTE.EmpID
                         , InfiniteLoopCTE.MgrID
           , InfiniteLoopCTE.Position
    FROM InfiniteLoopCTE  
                    
                        JOIN  dbo.Employee AS e 
        ON e.EmpID = InfiniteLoopCTE.MgrID 
)
SELECT * FROM InfiniteLoopCTE OPTION (MAXRECURSION 150);
Listing 9: Setting the MAXRECURSION option
In Listing 9 I overrode the default MAXRECUSION setting by adding the query option “MAXRECURSION 150”. This allows my infinite loop CTE to process 150 recursive calls before the SQL Server engine stops my recursive CTE with an error.
The maximum value you can set the MAXRECURSION option is 32,767. If for some reason you have a recursive CTE that needs to process to more than 32,767 recursive CTE calls then you can do this by setting the MAXRECURSION value to 0. By using 0 for a MAXRECURSION setting, you are allowing the database engine to run as many iteration as it needs to complete the recursive part of the CTE. If your code is not well tested, you could end up with a truly infinity loop that will never stop.

--Reference Microsoft
                                                                     

No comments:

Post a Comment