Comman table expression CTE

Defining and Using CTE's

Microsoft introduces a different way of writing a statement called CTE , using CTE you can write statement and nameed a TSQL SELECT statement and then reference same named statement later on similar to referencing a table or views .

Syntax for defining a CTE:

WITH <expression_name> (Column1, Column2, …) AS (CTE Definition)

Where:

"<expression_name>" is the name of the CTE

"Column 1, Column2,…" are the names of the columns returned from the CTE (only required if column names are not provided, or are not unique)

"CTE Definition" is a SELECT that defines the result set of the CTE.

A CTE definition needs to be followed by a single INSERT, UPDATE, DELETE, or SELECT statement that references the CTE. If the CTE is part of a batch then the statement prior to the WITH clause must end with a semi-colon. When you define multiple CTE's, a CTE that references another CTE needs to be define after the CTE it is referencing.

why and when should use cte

When should you use a CTE? We should use CTE for various reason but some important reason is

ü If your query needs a recursive solution

ü If your query has multiple subqueries, or you have repeated the same subquery code in a single statement

ü If your query is large and complex

ü When your TSQL code is complex and/or contains one or more subqueries it become very hard to read. By breaking up your complex code into one of more CTEs you will make your code much easier to follow, maintain and debug.

what is the limitation of cte

These are the clauses which can’t be used with CTE

ü ORDER BY (unless used with TOP clause)

ü INTO

ü OPTION clause with query hints

ü FOR XML

ü FOR BROWSE

Recursive CTE's

A recursive process is a process that called itself. In each iteration returns a subset of the result. Recursive process keeps calling itself until a condition is found that indicate it should stop or exit. In last result set from a recursive CTE in SQL server is the joined of all the result sets from all the calls of the recursive query. A CTE can reference itself. When this is done it is called a recursive CTE. A recursive CTE contains at least two query definitions, which are SELECT statements. One query definition is known as the anchor member and the other query definition is known as the recursive member.

An anchor member query definition will not reference the CTE whereas the recursive member will reference the CTE. Additionally the anchor member queries need to appear in the CTE prior to the recursive member queries. The number of columns returned from the anchor member must be the same number of columns returned from the recursive member. There can be multiple anchor query definitions where each one is combined with the others using one of these operators: UNION ALL, UNION, INTERSECT, or EXCEPT. There can also be multiple recursive query definitions. Each recursive query definition must be combined using a UNION ALL operation. The UNION ALL operation is also needed to connect the last anchor query definition with the first recursive query definition.

Let's review a few examples of CTE's.

CTE's provide you with a method to more easily document your complex code for readability.

Listing 1.

USE AdventureWorks2012;

SELECT YearMonth, ProductID, SumLineTotal

FROM (

SELECT CONVERT(CHAR(7),ModifiedDate,120) AS YearMonth

, ProductID

, SUM(LineTotal) AS SumLineTotal

FROM Sales.SalesOrderDetail

GROUP BY ProductId, CONVERT(CHAR(7),ModifiedDate,120)

) MonthlyProductSales

WHERE YearMonth = '2008-06';

Listing 1: Code with SubQuery

Listing 1 has a SELECT statement that has a subquery submerged within the FROM clause. The subquery is a derived table with the alias MonthlyProductSales, which summarizes the LineTotal amount for each Year/Month combination of the ModifiedDate. From the results of myMonthlyProductSales subquery I constrained the results to only return those rows that have a year and month of June, 2008.

Although the code in listing 1 is fairly simple we can improve on its readability by rewriting it using a CTE

Listing 2.

USE AdventureWorks2012;

GO

-- CTE Definition

WITH MonthlyProductSales AS

SELECT CONVERT(CHAR(7),ModifiedDate,120) AS YearMonth

, ProductID

, SUM(LineTotal) AS SumLineTotal

FROM Sales.SalesOrderDetail

GROUP BY ProductId, CONVERT(CHAR(7),ModifiedDate,120)

-- SELECT statement referencing CTE

SELECT * FROM MonthlyProductSales

WHERE YearMonth = '2008-06';

Listing 2: Rewrite of Listing 1 code using a CTE

In Listing 2 I moved to derived table subquery in Listing 1 up into a CTE named MonthlyProductSales. I then replaced the subquery in my SELECT statement with the name of the CTE, in this case MonthlyProductSales. By moving the subquery in Listing 1, into a CTE definition, many people will find the code in Listing 2 easier to read and maintain.

Example of Using Multiple CTE's

If your code is more complex and contains multiple subqueries you could consider rewriting it to simplify maintenance and to make it more readable. One of those way to rewrite it would be to take all the subqueries and rewrite them as CTEs.

To demonstrate using more than one CTE in a single SELECT statement, suppose I had the non-CTE query in Listing 3:

USE AdventureWorks2012;

GO

SELECT SalesPersonID

  , SalesYear

  , TotalSales

  , SalesQuotaYear

  , SalesQuota

FROM ( -- First Subquery

       SELECT SalesPersonID

            , SUM(TotalDue) AS TotalSales

            , YEAR(OrderDate) AS SalesYear

       FROM Sales.SalesOrderHeader

       WHERE SalesPersonID IS NOT NULL

       GROUP BY SalesPersonID, YEAR(OrderDate)

                    ) AS Sales

JOIN ( -- Second Subquery

       SELECT BusinessEntityID

            , SUM(SalesQuota)AS SalesQuota

            , YEAR(QuotaDate) AS SalesQuotaYear

       FROM Sales.SalesPersonQuotaHistory

       GROUP BY BusinessEntityID, YEAR(QuotaDate)

                    ) AS Sales_Quota

ON Sales_Quota.BusinessEntityID = Sales.SalesPersonID

AND Sales_Quota.SalesQuotaYear = Sales.SalesYear

ORDER BY SalesPersonID, SalesYear;

Listing 3: Using multiple Subqueries in a single SELECT statement

The query in Listing 3 contains two different subqueries, both implemented as derived tables. The first subquery calculates the TotalSales bySalesPersonID and SalesYear. The SalesQuota aggregated value, based on BusinessEntityID and SalesQuotaYear, is calculated in second subquery. To simplify reading the query in Listing 3, we can rewrite the subqueries as two different CTEs and then reference those CTEs in a simple SELECT statement, as I have done in Listing 4.

USE AdventureWorks2012;

GO

WITH

-- First Subquery rewritten as CTE

WITH Sales AS

    SELECT SalesPersonID

         , SUM(TotalDue) AS TotalSales

         , YEAR(OrderDate) AS SalesYear

    FROM Sales.SalesOrderHeader

    WHERE SalesPersonID IS NOT NULL

    GROUP BY SalesPersonID, YEAR(OrderDate)

),

-- Second Subquery rewritten as CTE

Sales_Quota AS

    SELECT BusinessEntityID

         , SUM(SalesQuota)AS SalesQuota

         , YEAR(QuotaDate) AS SalesQuotaYear

    FROM Sales.SalesPersonQuotaHistory

    GROUP BY BusinessEntityID, YEAR(QuotaDate)

-- SELECT using multiple CTEs

SELECT SalesPersonID

     , SalesYear

     , TotalSales

     , SalesQuotaYear

     , SalesQuota

FROM Sales

  JOIN Sales_Quota

    ON Sales_Quota.BusinessEntityID = Sales.SalesPersonID

        AND Sales_Quota.SalesQuotaYear = Sales.SalesYear

ORDER BY SalesPersonID, SalesYear;

Listing 4: Using two CTEs instead of subqueries

In Listing 4 I moved the two subqueries used in Listing 3 into two different CTEs. The first CTE, defined with a name of Sales, contains the subquery that produced the TotalSales amount by SalesPersonID and SalesYear. I defined the second CTE, which is named SalesQuota, by placing a comma after the first CTE defined. This second CTE contained the subquery to calculate the SalesQuota amount. After defining both CTEs I then referenced them in my final SELECT statement.

Being able to define multiple CTEs with a single WITH clause, and then referencing these CTEs in my final TSQL statement allowed me to make my complicated TSQL code in Listing 3 easier for me to read, develop and debug. Using multiple CTEs for complicated TSQL logic allows you to break your code into manageable pieces and parts.

CTE referencing another CTE

A CTE can reference another CTE. In order for a CTE to reference another CTE it needs to:

· be defined within the same WITH clause as the CTE being referenced

· be defined after the CTE being referenced

To demonstrate a CTE referencing another CTE let's review the code in Listing 5.

USE AdventureWorks2012;

GO

WITH

-- First Subquery rewritten as CTE

Sales AS

    SELECT SalesPersonID

         , SUM(TotalDue) AS TotalSales

         , YEAR(OrderDate) AS SalesYear

    FROM Sales.SalesOrderHeader

    WHERE SalesPersonID IS NOT NULL

    GROUP BY SalesPersonID, YEAR(OrderDate)

),

-- Second Subquery that references first CTE

TotalSales AS

    SELECT SUM(TotalSales)AS TotalSales

         , SalesYear

    FROM Sales

    GROUP BY SalesYear

-- Calling Second CTE that uses first CTE

SELECT *

FROM TotalSales

ORDER BY SalesYear;

Listing 5: CTE referencing another CTE

In Listing 5 I define a CTE named Sales that is referenced by the second CTE definition. In the second CTE, named TotalSales, I reference the Sales CTE. In this CTE I summarized the TotalSales column returned from the Sales CTE by SalesYear. At the end I used a SELECT statement to reference just the second CTE. When this TotalSales CTE is executed it will then call the Sales CTE.

Example of a Recursive CTE

Another feature of a CTE is that it can call itself. When you do this the CTE is known as a recursive CTE.

A recursive CTE has two different parts called the anchor member and the recursive member. The anchor member proceeds the recursive member in the CTE. You can identify the anchor member queries as the queries without a reference to the CTE. The recursive members will have a reference to the CTE in their queries. The anchor member identifies that initial record set for the CTE. This initial record set is then used by the recursive member. The recursive member is the part that is connected to the anchor member using the UNION ALL clause, and references the CTE in its query. The recursive member returns the same number of columns as the anchor member and references the CTE. A recursive CTE continues calling itself until the recursive part of the CTE doesn't return any rows.

To better understand a recursive CTE I will be creating some sample data for use with a recursive CTE. You can find the code to create my sample data in Listing 6.

USE tempdb;

GO

-- Create an Employee table

CREATE TABLE dbo.Employee

EmpID smallint NOT NULL,

EmpName  nvarchar(100) NOT NULL,

Position nvarchar(50) NOT NULL,

MgrId int NULL

);

-- Populate Employee Table with values

INSERT INTO dbo.Employee VALUES

 (1, N'Joe Steel', N'President',NULL)

,(2, N'John Smith', N'VP Western Region Sales',1)

,(3, N'Sue Jones', N'VP Easter Region',1)

,(4, N'Lynn Holland', N'Sales Person',2)

,(5, N'Linda Thomas', N'Sales Person',3 )

,(6, N'Kathy Johnson', N'Admin Assistant',1)

,(7, N'Rich Little', N'Sales Person',3)

,(8,  N'David Nelson', N'Sales Person', 2)

,(9,  N'Mary Jackson', N'Sales Person', 3);

Listing 6: Script to create sample data for recursive CTE example

In Listing 6 I created a table named Employee to contain information about employees. This table contains 9 different employees. The MgrIDcolumn identifies the manager to which an employee reports. There is one Employee record where the MgrID is NULL. This employee is the president of the company and he reports to no one.

To see how I can use the data created in Listing 6 in a recursive CTE, look at the code in Listing 7.

USE tempdb;

GO

WITH ReportingStructure(MgrID, EmpID, EmpName, Position, OrgLevel) AS

    -- Anchor part

    SELECT MgrID, EmpID, EmpName, Position, 0 AS OrgLevel

    FROM dbo.Employee

    WHERE MgrID IS NULL

    UNION ALL

    -- Recursive part

    SELECT e.MgrID, e.EmpID, e.EmpName

                         , e.Position, r.OrgLevel + 1

    FROM dbo.Employee AS e

    INNER JOIN ReportingStructure AS r

    ON e.MgrID = r.EmpID

SELECT MgrID, EmpID, EmpName, Position, OrgLevel

FROM ReportingStructure;

Listing 7: Recursive CTE

When I run the code in Listing 7 I get the output in Report 1.

MgrID EmpID  EmpName        Position                   OrgLevel

----- ------ -------------- -------------------------- -----------

NULL  1      Joe Steel      President                  0

1     2      John Smith     VP Western Region Sales    1

1     3      Sue Jones      VP Easter Region           1

1     6      Kathy Johnson  Admin Assistant            1

2     4      Lynn Holland   Sales Person               2

2     8      David Nelson   Sales Person               2

3     5      Linda Thomas   Sales Person               2

3     7      Rich Little    Sales Person               2

3     9      Mary Jackson   Sales Person               2

Report 1: Output from running code in Listing 7

By reviewing the output in Report 1 you can see the reporting structure of all employees. Note that the OrgLevel column identifies the reporting structure hierarchy. As you can see the employee that holds the “President” position has an OrgLevel of 0. Each employee that reports to the president has an OrgLevel of 1. If you look at the other employees you can see they each have an OrgLevel of 2 because they report to one of the Vice Presidents.

To better understand how a recursive CTE works let me step you through the process SQL Server uses when executing my recursive CTE in Listing 7. The first pass through my recursive CTE SQL Server runs the anchor statement. When the anchor statement is run it returns the employees that have a NULL MgrID value. In this example that would be the employee with an EmpID of 1. This first pass through the CTE also sets the Orglevel to 0. The second pass through my CTE runs the recursive part of the CTE. In the recursive part of the CTE all employees that have a MgrId equal to the employees found in the first pass, or MgrId of 1 are returned. This second pass returns EmpID 2, 3 and 6 and increments the OrgLevel by 1. The third pass through my CTE runs the recursive part of the CTE again. This time my CTE returns the employees that have MgrID's equal to one of the EmpIDs returned from the second pass through the CTE, which means the employees that have a MgrId of 2,3 or 6. The third pass returns EmpIds 5, 7, 8 and 9. The forth pass through my CTE the recursive part of the CTE looks forEmployee records that have a MgrID of 5, 7, 8, and 9. This pass returns no values because there are no employees with MgrId value of 5, 7, 8 and 9. Since the forth pass returns no records this ends the processing of my recursive CTE.

Controlling Recursion

It is possible to write a recursive CTE that is an infinite loop. SQL Server has a default MAXRECURSION setting that will keep you from running a CTE that might be an infinite loop. The default MAXRECURSION setting is 100.

The code in Listing 8 is an example of a poorly written CTE that creates an infinite loop.

USE tempdb;

GO

WITH InfiniteLoopCTE as

    -- Anchor Part

    SELECT EmpID, MgrID, Position

    FROM dbo.Employee

    WHERE MgrID = 1

  UNION ALL

    -- Recursive Part

    SELECT InfiniteLoopCTE.EmpID

                         , InfiniteLoopCTE.MgrID

           , InfiniteLoopCTE.Position

    FROM InfiniteLoopCTE

    JOIN  dbo.Employee AS e

        ON e.EmpID = InfiniteLoopCTE.MgrID

SELECT * FROM InfiniteLoopCTE;

Listing 8: Infinite loop CTE

The code is Listing 8 causes an infinite loop because the recursive part of this CTE will always return one or more rows. The recursive part of the query always returns MrgID value of 1 because the recursive part of the query returns InfiniteLoopCTE.MgrId, instead of the Employee.MgrID value. When I run the code in Listing 8 it will run until the CTE has hit the MAXRECUSION default setting, which is 100. If a CTE hits the MAXRECURSIVE setting SQL Server will return an error when running the statement.

If your CTE needs to process more than 100 recursive call then you can override the default MAXRECUSION setting by using the MAXRECURSION query option, like I have done in the code in Listing 9.

USE tempdb;

GO

--Creates an infinite loop

WITH InfiniteLoopCTE as

    -- Anchor Part

    SELECT EmpID, MgrID, Position

    FROM dbo.Employee

    WHERE MgrID = 1

  UNION ALL

    -- Recursive Part

    SELECT InfiniteLoopCTE.EmpID

                         , InfiniteLoopCTE.MgrID

           , InfiniteLoopCTE.Position

    FROM InfiniteLoopCTE

                        JOIN  dbo.Employee AS e

        ON e.EmpID = InfiniteLoopCTE.MgrID

SELECT * FROM InfiniteLoopCTE OPTION (MAXRECURSION 150);

Listing 9: Setting the MAXRECURSION option

In Listing 9 I overrode the default MAXRECUSION setting by adding the query option “MAXRECURSION 150”. This allows my infinite loop CTE to process 150 recursive calls before the SQL Server engine stops my recursive CTE with an error.

The maximum value you can set the MAXRECURSION option is 32,767. If for some reason you have a recursive CTE that needs to process to more than 32,767 recursive CTE calls then you can do this by setting the MAXRECURSION value to 0. By using 0 for a MAXRECURSION setting, you are allowing the database engine to run as many iteration as it needs to complete the recursive part of the CTE. If your code is not well tested, you could end up with a truly infinity loop that will never stop.

--Reference Microsoft

Pages

Wednesday, September 2, 2015

CTE (Comman table expression)