Thursday, February 4, 2016

Memo Structure in SQL Server

Query optimizer generate alternate execution plans through transformation rules for optimal plan, these alternate execution plans are stored in a structure called MEMO for the duration of optimization process.
MEMO provides efficient way to store alternate plan through groups, each group in memo initially contain one entry.

Optimizer initially copy the query tree logical expression in memo structure based on the operator it copies into equivalent groups and optimization process start through transformation.

Transformation process starts from initial logical expression and generate alternative plan, if required it add new logical expression and generate new group and new alternative plan and adding them to equivalent group.



We have some undocumented trace flag which help us to see the memo structure, please do not try these in production environment.

Scenarios/Examples in test environment only:
 CREATE TABLE TempMemo
( id INT )

CREATE TABLE TempMemo1
( id INT )


INSERT INTO TempMemo
SELECT 1
UNION
SELECT 2

INSERT INTO TempMemo1
SELECT 1
UNION
SELECT 3


-- Check the data in tables
SELECT *  FROM   TempMemo1
SELECT *  FROM   TempMemo

-- start trace flag 3604 to return output info to client
DBCC traceon(3604);

-- clear buffer/cache
DBCC freeproccache
DBCC dropcleanbuffers


-- Start Trace flag 8608, which will show Initial memo strucute, marked in color Yellow
SELECT     *
FROM       TempMemo tm  INNER JOIN TempMemo tm1
       ON         tm.id =tm1.id
OPTION(QueryTraceON 8608);

/*
--- Initial Memo Structure ---

Root Group 5: Card=2 (Max=1e+008, Min=0)

   0 LogOp_Join 3 4 2 (Distance = 0)



Group 4: Card=2 (Max=10000, Min=0)

   0 LogOp_Get (Distance = 0)



Group 3: Card=2 (Max=10000, Min=0)

   0 LogOp_Get (Distance = 0)



Group 2:
   0 ScaOp_Comp  0 1 (Distance = 0)



Group 1:
   0 ScaOp_Identifier  (Distance = 0)



Group 0:
   0 ScaOp_Identifier  (Distance = 0)



------------------------------


(2 row(s) affected)

*/


DBCC freeproccache
DBCC dropcleanbuffers

-- Start Trace flag 8615, which will show Final memo strucute, marked in color Green
SELECT     * FROM       TempMemo tm
       INNER JOIN TempMemo tm1 ON         tm.id =tm1.id
OPTION (QueryTraceON 8615);

/*
--- Final Memo Structure ---

Group 7: Card=2 (Max=10000, Min=0)

   1 PhyOp_Range 0 ASC   Cost(RowGoal 0,ReW 0,ReB 0,Dist 0,Total 0)= 0.0032842(Distance = 2)


   0 LogOp_GetIdx (Distance = 1)



Group 6: Card=2 (Max=10000, Min=0)

   1 PhyOp_Range 0 ASC   Cost(RowGoal 0,ReW 0,ReB 0,Dist 0,Total 0)= 0.0032842(Distance = 2)


   0 LogOp_GetIdx (Distance = 1)



Root Group 5: Card=2 (Max=1e+008, Min=0)

   3 PhyOp_HashJoinx_jtInner 4.2 3.2 2.0  Cost(RowGoal 0,ReW 0,ReB 0,Dist 0,Total 0)= 0.0243645(Distance = 2)


   1 LogOp_Join 4 3 2 (Distance = 1)


   0 LogOp_Join 3 4 2 (Distance = 0)



Group 4: Card=2 (Max=10000, Min=0)

   6 PhyOp_Sort 4.2  Cost(RowGoal 0,ReW 0,ReB 0,Dist 0,Total 0)= 0.0146486(Distance = 0)


   2 PhyOp_Range 0 ASC   Cost(RowGoal 0,ReW 0,ReB 0,Dist 0,Total 0)= 0.0032842(Distance = 1)


   0 LogOp_Get (Distance = 0)



Group 3: Card=2 (Max=10000, Min=0)

   4 PhyOp_Sort 3.2  Cost(RowGoal 0,ReW 0,ReB 0,Dist 0,Total 0)= 0.0146486(Distance = 0)


   2 PhyOp_Range 0 ASC   Cost(RowGoal 0,ReW 0,ReB 0,Dist 0,Total 0)= 0.0032842(Distance = 1)


   0 LogOp_Get (Distance = 0)



Group 2:
   0 ScaOp_Comp  0.0 1.0  Cost(RowGoal 0,ReW 0,ReB 0,Dist 0,Total 0)= 3(Distance = 0)



Group 1:
   0 ScaOp_Identifier   Cost(RowGoal 0,ReW 0,ReB 0,Dist 0,Total 0)= 1(Distance = 0)



Group 0:
   0 ScaOp_Identifier   Cost(RowGoal 0,ReW 0,ReB 0,Dist 0,Total 0)= 1(Distance = 0)



----------------------------

DBCC execution completed. If DBCC printed error messages, contact your system administrator.
DBCC execution completed. If DBCC printed error messages, contact your system administrator.

(2 row(s) affected)

*/



Wednesday, January 27, 2016

HASHING & HASH COLLISSION in SQL SERVER

HASHING:
Hashing is the mechanism/algorithm to generate hash key values for provided inputs, there is different SQL server algorithms/in build functions to generate hash values.
HASHBYTES is the function to generate hash values using algorithms (MD2 | MD4 | MD5 | SHA | SHA1 | SHA2_256 | SHA2_512)
 
We also use CHECKSUM & BINARY_CHECKSUM to generate hash values.

 

HASH COLLISSION:  When hashing mechanism generate same Hash values for not very similar inputs.

Execute below queries for examples and watch result marked in particular fashion.
Examples:
DECLARE @V Varchar(88) ='AAADB'
DECLARE @V1 Varchar(88) = 'AAAAAAAAAAAAAAAAAAADB'
SELECT CHECKSUM(@V) AS HashValue,@V AS InputString
UNION ALL
SELECT CHECKSUM(@V1) AS HashValue,@V1 AS ChangedInputString

SET @V = 'DBAAADB'
SET @V1 = 'DBAAAAAAAAAAAAAAAAAAADB'
SELECT  CHECKSUM(@V) AS CHECKSUM_HashValue, @V AS  InputString
UNION ALL
SELECT  CHECKSUM(@V1) AS CHECKSUM_HashValue, @V1 AS ChangedInputString

SET @V = 'DBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADB'
SET @V1 = 'DBAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAADB'

SELECT  HASHBYTES('SHA', @V) AS SHA_HASHBYTES_HashValue, @V AS InputString
UNION ALL
SELECT  HASHBYTES('SHA', @V1) AS SHA_HASHBYTES_HashValue, @V1 AS ChangedInputString

SELECT  HASHBYTES('SHA1', @V) AS SHA1_HASHBYTES_HashValue, @V AS InputString
UNION ALL
SELECT  HASHBYTES('SHA1', @V1) AS SHA1_HASHBYTES_HashValue, @V1 AS ChangedInputString

SELECT HASHBYTES('SHA2_256', @V) AS SHA2_HASHBYTES_HashValue, @V AS InputString
UNION ALL
SELECT HASHBYTES('SHA2_256', @V1) AS SHA2_HASHBYTES_HashValue, @V1 AS ChangedInputString

SELECT HASHBYTES('SHA2_512', @V) AS SHA2_512_HASHBYTES_HashValue, @V AS InputString
UNION ALL
SELECT HASHBYTES('SHA2_512', @V1) AS SHA2_512_HASHBYTES_HashValue , @V1 AS ChangedInputString 

Queries Output:

From above result set it is very much clear hashing functions/algorithms generate same hash values for different input strings which result to hash collision.

Wednesday, December 23, 2015

Query to retrieve Schema/Script of a user table


/*In below SQL Script enter user defined table name i.e. table Employee*/

DECLARE @table_name SYSNAME
SELECT @table_name = 'dbo.Employee'

DECLARE
      @object_name SYSNAME
    , @object_id INT


SELECT
      @object_name = '[' + s.name + '].[' + o.name + ']'
    , @object_id = o.[object_id]
FROM sys.objects o WITH (NOWAIT)
JOIN sys.schemas s WITH (NOWAIT) ON o.[schema_id] = s.[schema_id]
WHERE s.name + '.' + o.name = @table_name
    AND o.[type] = 'U'
    AND o.is_ms_shipped = 0

DECLARE @SQL NVARCHAR(MAX) = ''

;WITH index_column AS
(
    SELECT
          ic.[object_id]
        , ic.index_id
        , ic.is_descending_key
        , ic.is_included_column
        , c.name
    FROM sys.index_columns ic WITH (NOWAIT)
    JOIN sys.columns c WITH (NOWAIT) ON ic.[object_id] = c.[object_id] AND ic.column_id = c.column_id
    WHERE ic.[object_id] = @object_id
),
fk_columns AS
(
     SELECT
          k.constraint_object_id
        , cname = c.name
        , rcname = rc.name
    FROM sys.foreign_key_columns k WITH (NOWAIT)
    JOIN sys.columns rc WITH (NOWAIT) ON rc.[object_id] = k.referenced_object_id AND rc.column_id = k.referenced_column_id
    JOIN sys.columns c WITH (NOWAIT) ON c.[object_id] = k.parent_object_id AND c.column_id = k.parent_column_id
    WHERE k.parent_object_id = @object_id
)
SELECT @SQL = 'CREATE TABLE ' + @object_name + CHAR(13) + '(' + CHAR(13) + STUFF((
    SELECT CHAR(9) + ', [' + c.name + '] ' +
        CASE WHEN c.is_computed = 1
            THEN 'AS ' + cc.[definition]
            ELSE UPPER(tp.name) +
                CASE WHEN tp.name IN ('varchar', 'char', 'varbinary', 'binary', 'text')
                       THEN '(' + CASE WHEN c.max_length = -1 THEN 'MAX' ELSE CAST(c.max_length AS VARCHAR(5)) END + ')'
                     WHEN tp.name IN ('nvarchar', 'nchar', 'ntext')
                       THEN '(' + CASE WHEN c.max_length = -1 THEN 'MAX' ELSE CAST(c.max_length / 2 AS VARCHAR(5)) END + ')'
                     WHEN tp.name IN ('datetime2', 'time2', 'datetimeoffset')
                       THEN '(' + CAST(c.scale AS VARCHAR(5)) + ')'
                     WHEN tp.name = 'decimal'
                       THEN '(' + CAST(c.[precision] AS VARCHAR(5)) + ',' + CAST(c.scale AS VARCHAR(5)) + ')'
                    ELSE ''
                END +
                CASE WHEN c.collation_name IS NOT NULL THEN ' COLLATE ' + c.collation_name ELSE '' END +
                CASE WHEN c.is_nullable = 1 THEN ' NULL' ELSE ' NOT NULL' END +
                CASE WHEN dc.[definition] IS NOT NULL THEN ' DEFAULT' + dc.[definition] ELSE '' END +
                CASE WHEN ic.is_identity = 1 THEN ' IDENTITY(' + CAST(ISNULL(ic.seed_value, '0') AS CHAR(1)) + ',' + CAST(ISNULL(ic.increment_value, '1') AS CHAR(1)) + ')' ELSE '' END
        END + CHAR(13)
    FROM sys.columns c WITH (NOWAIT)
    JOIN sys.types tp WITH (NOWAIT) ON c.user_type_id = tp.user_type_id
    LEFT JOIN sys.computed_columns cc WITH (NOWAIT) ON c.[object_id] = cc.[object_id] AND c.column_id = cc.column_id
    LEFT JOIN sys.default_constraints dc WITH (NOWAIT) ON c.default_object_id != 0 AND c.[object_id] = dc.parent_object_id AND c.column_id = dc.parent_column_id
    LEFT JOIN sys.identity_columns ic WITH (NOWAIT) ON c.is_identity = 1 AND c.[object_id] = ic.[object_id] AND c.column_id = ic.column_id
    WHERE c.[object_id] = @object_id
    ORDER BY c.column_id
    FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, CHAR(9) + ' ')
    + ISNULL((SELECT CHAR(9) + ', CONSTRAINT [' + k.name + '] PRIMARY KEY (' +
                    (SELECT STUFF((
                         SELECT ', [' + c.name + '] ' + CASE WHEN ic.is_descending_key = 1 THEN 'DESC' ELSE 'ASC' END
                         FROM sys.index_columns ic WITH (NOWAIT)
                         JOIN sys.columns c WITH (NOWAIT) ON c.[object_id] = ic.[object_id] AND c.column_id = ic.column_id
                         WHERE ic.is_included_column = 0
                             AND ic.[object_id] = k.parent_object_id
                             AND ic.index_id = k.unique_index_id  
                         FOR XML PATH(N''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, ''))
            + ')' + CHAR(13)
            FROM sys.key_constraints k WITH (NOWAIT)
            WHERE k.parent_object_id = @object_id
                AND k.[type] = 'PK'), '') + ')'  + CHAR(13)
    + ISNULL((SELECT (
        SELECT CHAR(13) +
             'ALTER TABLE ' + @object_name + ' WITH'
            + CASE WHEN fk.is_not_trusted = 1
                THEN ' NOCHECK'
                ELSE ' CHECK'
              END +
              ' ADD CONSTRAINT [' + fk.name  + '] FOREIGN KEY('
              + STUFF((
                SELECT ', [' + k.cname + ']'
                FROM fk_columns k
                WHERE k.constraint_object_id = fk.[object_id]
                FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '')
               + ')' +
              ' REFERENCES [' + SCHEMA_NAME(ro.[schema_id]) + '].[' + ro.name + '] ('
              + STUFF((
                SELECT ', [' + k.rcname + ']'
                FROM fk_columns k
                WHERE k.constraint_object_id = fk.[object_id]
                FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '')
               + ')'
            + CASE
                WHEN fk.delete_referential_action = 1 THEN ' ON DELETE CASCADE'
                WHEN fk.delete_referential_action = 2 THEN ' ON DELETE SET NULL'
                WHEN fk.delete_referential_action = 3 THEN ' ON DELETE SET DEFAULT'
                ELSE ''
              END
            + CASE
                WHEN fk.update_referential_action = 1 THEN ' ON UPDATE CASCADE'
                WHEN fk.update_referential_action = 2 THEN ' ON UPDATE SET NULL'
                WHEN fk.update_referential_action = 3 THEN ' ON UPDATE SET DEFAULT'
                ELSE ''
              END
            + CHAR(13) + 'ALTER TABLE ' + @object_name + ' CHECK CONSTRAINT [' + fk.name  + ']' + CHAR(13)
        FROM sys.foreign_keys fk WITH (NOWAIT)
        JOIN sys.objects ro WITH (NOWAIT) ON ro.[object_id] = fk.referenced_object_id
        WHERE fk.parent_object_id = @object_id
        FOR XML PATH(N''), TYPE).value('.', 'NVARCHAR(MAX)')), '')
    + ISNULL(((SELECT
         CHAR(13) + 'CREATE' + CASE WHEN i.is_unique = 1 THEN ' UNIQUE' ELSE '' END
                + ' NONCLUSTERED INDEX [' + i.name + '] ON ' + @object_name + ' (' +
                STUFF((
                SELECT ', [' + c.name + ']' + CASE WHEN c.is_descending_key = 1 THEN ' DESC' ELSE ' ASC' END
                FROM index_column c
                WHERE c.is_included_column = 0
                    AND c.index_id = i.index_id
                FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '') + ')'
                + ISNULL(CHAR(13) + 'INCLUDE (' +
                    STUFF((
                    SELECT ', [' + c.name + ']'
                    FROM index_column c
                    WHERE c.is_included_column = 1
                        AND c.index_id = i.index_id
                    FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)'), 1, 2, '') + ')', '')  + CHAR(13)
        FROM sys.indexes i WITH (NOWAIT)
        WHERE i.[object_id] = @object_id
            AND i.is_primary_key = 0
            AND i.[type] = 2
        FOR XML PATH(''), TYPE).value('.', 'NVARCHAR(MAX)')
    ), '')

PRINT @SQL

Wednesday, December 9, 2015

Query to get User tables Information

Query to retrieve Rows, Pages and Size of Tables in SQL Server



SELECT
ObjectId= p.[Object_id]
,TableName = OBJECT_NAME(p.[object_Id])
, TotalRows = MAX(p.[rows])
, TotalPages = SUM(a.total_pages)
, UsedPages = SUM(a.used_pages)
, Pages = SUM(
CASE
WHEN it.internal_type IN (202, 204, 207, 211, 212, 213, 214, 215, 216, 221, 222) THEN 0
WHEN a.[type]! = 1 AND p.index_id < 2 THEN a.used_pages
WHEN p.index_id < 2 THEN a.data_pages ELSE 0
END
)

, SizeInMB = (SUM(
CASE
WHEN it.internal_type IN (202, 204, 207, 211, 212, 213, 214, 215, 216, 221, 222) THEN 0
WHEN a.[type]! = 1 AND p.index_id < 2 THEN a.used_pages
WHEN p.index_id < 2 THEN a.data_pages ELSE 0
END
)*8.)/1024.0

FROM sys.partitions p
JOIN sys.allocation_units a ON p.[partition_id] = a.container_id
LEFT JOIN sys.internal_tables it ON p.[object_id] = it.[object_id]
GROUP BY p.[object_id]
HAVING OBJECTPROPERTY(p.[object_id],'IsUserTable') = 1

Please suggest your valuable inputs.


Friday, December 4, 2015

Page and Extent in SQL Server




SQL server stores everything inside pages in different types of pages.

Disk space is allocated to data files (.MDF/.LDF/.NDF) is logically organized in contiguous pages (0-n)
Disk operation is perform at page level to read /write

1 Page                     = 8 KB
1 Extent                   = 8 Pages/ 64 KB
1 MB                       =128 Pages



Each page has 96 byte header which contains system information which is Page number, Page type, free space on page and allocation unit id of object which contain page and Row offset which contain row number info growing from end of the page

                                                    Page: 
 




Pages are following types:
S#
Page type
Description
1
Data page
Data records, Leaf of Clustered Index
2
Index page
Non Clustered index, Non leaf of Clustered Index
3
Text/Image Page
for large object data types(text,ntext,images,varbinary etc.)
4
GAM/SGAM
info for extent tracking
5
Page Free Space
Info about Page allocation and free space
6
IAM
info for extent usage for table or index
7
Bulk Changed map
info for extent modification by bulk operation since last log bkp per allocation unit
8
Differential Changed map
info for extent modification since last db bkp per allocation unit
9
Sort Page
Temp page for sorting
10
Boot Page
Info about Page
11
File Header Page
info about file
12
Server Config Page
page for sp_configure

Extent
Extent is used to manage space, extent are formed by eight contiguous pages


A new table or index generally consist mixed extent once its grows to accommodate eight pages then it switches to uniform extent. If we create index on large table which can accommodate eight contiguous pages then it allocate pages to uniform extents.


Friday, November 27, 2015

PAD_INDEX in SQL Server



PAD_INDEX is the option we specify while index creation/Rebuild as below.
Pad_Index {ON|OFF}
Pad_Index is only useful when fill factor is specified in index definition, if PAD_Index is ON, it means fill factor value will be applied to intermediate nodes of balance tree formed by index.

If PAD_Index is OFF, it means fill factor value will not be applied to intermediate nodes of balance tree.


/* PRACTICAL SCENARIOS
 CREATE TEST TABLE EMPLOYEE_PADINDEX  */

CREATE TABLE Employee_PadIndex
(Id INT NOT NULL,
Name VARCHAR(20)
)

/*INSERT SOME TEST RECORDS*/
INSERT INTO Employee_PadIndex
SELECT 1,'Ram'
UNION ALL
SELECT 2,'Milan'
UNION ALL
SELECT 3,'Sandy'
UNION ALL
SELECT 4,'Babu'
UNION ALL
SELECT 5,'Allen'


/* HERE FILL FACTOR WILL NOT BE APPLICABLE FOR CLUSTERED INDEX INTERMEDIATE NODES AS PAD_INDEX  = OFF */
CREATE CLUSTERED INDEX  Idx_Employees_ID on Employee_PadIndex(id)
WITH (FILLFACTOR = 80, PAD_INDEX  = OFF) ON [PRIMARY]

/* HERE FILL FACTOR WILL BE APPLICABLE FOR NONCLUSTERED INDEX INTERMEDIATE NODES AS PAD_INDEX  = ON */
CREATE NONCLUSTERED INDEX  Idx_Employees_NAME on Employee_PadIndex(NAME)
WITH (FILLFACTOR = 80, PAD_INDEX  = ON) ON [PRIMARY]

/* HERE FILL FACTOR WILL BE APPLICABLE FOR CLUSTERED INDEX INTERMEDIATE NODES AS PAD_INDEX  = ON */
ALTER INDEX  Idx_Employees_ID on Employee_PadIndex
REBUILD
WITH (FILLFACTOR = 80, PAD_INDEX  = ON ) 

/* DROP ABOVE CREATED CLUSTERED INDEX */
DROP INDEX  Idx_Employees_ID on Employee_PadIndex

/* CREATE PRIMARY KEY WITH CLUSTERED INDEX WITH OPTIONS
HERE FILL FACTOR WILL NOT BE APPLICABLE FOR INTERMEDIATE NODES AS PAD_INDEX  = OFF */
ALTER TABLE Employee_PadIndex
ADD CONSTRAINT [PK_Employees_Id] PRIMARY KEY CLUSTERED 
(
    [Id] ASC
) WITH (FILLFACTOR = 80,PAD_INDEX  = OFF) ON [PRIMARY]
    
/* DROP TEST TABLE EMPLOYEE_PADINDEX */
DROP TABLE Employee_PadIndex

Note:
Please try above scenarios in Test/Development Environment only.