What happens when a Query execute against a table with a covering index and another query with the same index structure but without the included columns?
“The covering index gives you the
knack to avoid the trip back to the table for satisfying the request, as all of
the columns requested already included in non-clustered index. It means you are
able to escape the logical or physical reads to trip back to read the table for
that information.”
But we can’t ensure this is not enough avoidance of logical or physical
reads. There is also the consideration of the necessary task between the
columns in the non-clustered index and the columns that need to be looked-up in
the table, you can say putting data together.
To explain this let's create two
identical tables; same schema, same data:
Use TSQL2012
CREATE TABLE TSQL2012.dbo.EmplyeeDetails1
(
pkid INT
IDENTITY(1,1) NOT NULL,
EmployeeId VARCHAR(20)
NOT NULL,
firstname VARCHAR(50) NOT NULL,
lastname VARCHAR(50) NOT NULL,
address1 VARCHAR(100) NULL,
address2 VARCHAR(100) NULL,
city VARCHAR(100) NULL,
stateregion VARCHAR(100) NULL,
Areapostal_code VARCHAR(10) NULL,
Empcompanyid VARCHAR(20) NULL,
notes VARCHAR(200) NULL,
registration_notes VARCHAR(200) NULL
);
GO
ALTER TABLE dbo.EmplyeeDetails1 ADD CONSTRAINT
PK_EmplyeeDetails1 PRIMARY KEY CLUSTERED
(
pkid
)
WITH
(
PAD_INDEX
= OFF
,
FILLFACTOR =
100
,
STATISTICS_NORECOMPUTE =
OFF
,
IGNORE_DUP_KEY =
OFF
,
ALLOW_ROW_LOCKS =
ON
,
ALLOW_PAGE_LOCKS =
ON
) ON [PRIMARY];
GO
CREATE TABLE TSQL2012.dbo.EmplyeeDetails2
(
pkid INT
IDENTITY(1,1) NOT NULL,
EmployeeId VARCHAR(20)
NOT NULL,
firstname VARCHAR(50) NOT NULL,
lastname VARCHAR(50) NOT NULL,
address1 VARCHAR(100) NULL,
address2 VARCHAR(100) NULL,
city VARCHAR(100) NULL,
stateregion VARCHAR(100) NULL,
Areapostal_code VARCHAR(10) NULL,
Empcompanyid VARCHAR(20) NULL,
notes VARCHAR(200) NULL,
registration_notes VARCHAR(200) NULL
);
ALTER
TABLE dbo.EmplyeeDetails2
ADD CONSTRAINT
PK_EmplyeeDetails2 PRIMARY KEY CLUSTERED
(
pkid
)
WITH
(
PAD_INDEX
= OFF
,
FILLFACTOR =
100
,
STATISTICS_NORECOMPUTE =
OFF
,
IGNORE_DUP_KEY =
OFF
,
ALLOW_ROW_LOCKS =
ON
,
ALLOW_PAGE_LOCKS =
ON
) ON [PRIMARY];
GO
Data in both the table is
identical, also added identical clustered index. Due to auto incremented
identity column added fill factor =100%.
Now making some differences
between tables.
EmplyeeDetails1
Adding non clustered index on firstname
and lastname with an included column (EmployeeId)
CREATE NONCLUSTERED INDEX [IX_EmplyeeDetails1_lastname] ON [dbo].EmplyeeDetails1
(
[lastname] ASC,
[firstname] ASC
)
INCLUDE
(
EmployeeId
)
WITH
(
PAD_INDEX = OFF
, STATISTICS_NORECOMPUTE = OFF
, SORT_IN_TEMPDB = OFF
, DROP_EXISTING = OFF
, ONLINE = OFF
, ALLOW_ROW_LOCKS = ON
, ALLOW_PAGE_LOCKS = ON
, FILLFACTOR = 80
)
GO
EmplyeeDetails2
Adding non clustered index on firstname
and lastname but without the inclusion of column.
CREATE NONCLUSTERED INDEX [IX_EmplyeeDetails2_lastname] ON [dbo].EmplyeeDetails2
(
[lastname] ASC,
[firstname] ASC
)
WITH
(
PAD_INDEX = OFF
, STATISTICS_NORECOMPUTE = OFF
, SORT_IN_TEMPDB = OFF
, DROP_EXISTING = OFF
, ONLINE = OFF
, ALLOW_ROW_LOCKS = ON
, ALLOW_PAGE_LOCKS = ON
, FILLFACTOR = 80
)
GO
We have two identical tables, each having 5000
records. The only difference is in their single non-clustered
indexes; one has an included column, the other does not. They have
identical primary keys and clustered indexes.
If we execute the following query
against the table with the included column in the clustered index we see the
following execution plan:
SELECT employeeid, lastname, firstname
FROM EmplyeeDetails1
WHERE lastname = 'samsamada43032' AND firstname = 'samada43032';
go
SELECT employeeid, lastname, firstname
FROM EmplyeeDetails2
WHERE lastname = 'samsamada43032' AND firstname = 'samada43032';
The whole action occurs within the
non-clustered index itself. We never have to travel back to the table for more
data. If we execute the identical query against the table with the same index
structure, minus the included column in the non-clustered index, we get
different behavior.
Since the non-clustered index
doesn't contain all the information we need to satisfy the query, it's necessary
to hit the index for the base information, and then using the row pointer in
the non-clustered index, pull in the employeeid from the table and run a Nested
Loop Join in order to marry the two working sets of data together before
returning them to the end user. We can see the difference in expense for reads
when looking at the I/O statistics from the two queries:
Note that these are all logical
reads since the pages were already in the buffer. If this was a situation where
it was a much larger data set you were contending with on an active server with
the pages still on disk, this could be a very expensive operation.
No comments:
Post a Comment