Exam 70-464 Passed!

I’ve just taken the 70-464 – Developing Microsoft SQL Server 2012 Databases exam.  I did the 3-hour exam in 1 hour 15 minutes and got 876 out of 1000 so I’m quite pleased with myself.  70-463 or 70-465 next.


Query Execution Phases

Here’s how SQL Server executes a query…

Relational Engine
  • T-SQL: the statement that needs to be executed.
  • Parse: check the T-SQL syntax and create a tree of logical operators called the parse tree.
  • Bind: check whether the objects exist and bind them to the logical operators from the parse phase, creating a tree of logical operators bound to the actual objects called the algebrised tree.
  • Optimise: generate candidate plans, select the optimal one and create the execution plan with the logical operators mapped to physical operators.

Storage Engine
  • Execute: execute the query and cache the plan.

Exam 70-464 – Updated 24/09/2013

I’ve decided to skip exam 70-463 (Implementing a Data Warehouse with SQL Server 2012) and move onto 70-464. (See my previous post for the full 2012 certification path). Microsoft have reintroduced the second shot vouchers so I thought I’d take the plunge and book the exam for 26th September. If I do fail I have until May 2014 to resit for free.

I’ve had a look through the list of skills measured and have highlighted the following areas that I need to recap or improve my knowledge in.

I’m going to use this page to add links to various sources of information as I work through the different areas. I’ll also add references to the 70-461 and 70-462 exam guide books where appropriate; these are the comments that begin with 70-461 or 70-462.

For information here is the full list of skills measured: http://www.microsoft.com/learning/en-us/exam-70-464.aspx

In no particular order:

Columnstore Indexes

Syntax:

CREATE NONCLUSTERED COLUMNSTORE INDEX
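
Something like the following creates one; the table and column names here are made up for illustration:

CREATE NONCLUSTERED COLUMNSTORE INDEX IX_FactSales_Columnstore
ON dbo.FactSales (OrderDateKey, ProductKey, SalesAmount, Quantity);

Bear in mind that in SQL Server 2012 the table becomes read-only while a nonclustered columnstore index exists on it, so the index has to be dropped or disabled before any data changes.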

Links:
My Blog
70-461: Chapter 15 Lesson 1
70-461: Chapter 17 Lesson 2
70-462: Chapter 10 Lesson 1
http://msdn.microsoft.com/en-us/library/gg492088.aspx
http://blog.sqlauthority.com/2011/10/29/sql-server-fundamentals-of-columnstore-index/

CLR

Introduced in SQL Server 2005, the Common Language Runtime (CLR) integration allows you to use the power of the .NET Framework within your T-SQL code. You can create CLR triggers, stored procedures, functions, data types and aggregates. You compile the .NET code into an assembly (DLL) that you register in your database and then create the appropriate SQL object that references the method in the assembly. Whenever you use the SQL object in your T-SQL code, the method in the .NET code is executed. This is quite a big topic and the majority of the work is in understanding what you can do with the .NET code and what properties you need to set in the methods you create.

To get started you need to enable CLR on your instance by running

EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;

You then register your assembly using something like

CREATE ASSEMBLY [SimonBlog]
FROM path
WITH PERMISSION_SET = SAFE

Here the PERMISSION_SET can be SAFE, UNSAFE or EXTERNAL_ACCESS. See the links below for more details.

Finally you create your SQL object using something like this.

CREATE AGGREGATE [dbo].[ConcatMagic]
(@value [nvarchar](4000), @delimiter [nvarchar](4000))
RETURNS [nvarchar](4000)
EXTERNAL NAME [SimonBlog].[ConcatMagic]
GO

The EXTERNAL NAME clause specifies the assembly and class whose code will be run whenever dbo.ConcatMagic is used.

Links:
My Blog – CLR Aggregates
CREATE ASSEMBLY
http://msdn.microsoft.com/en-us/library/ms254498.aspx

Sparse columns

When a column is marked as SPARSE the null values in that column do not use any storage on disk. However, non-null values require more storage, so SPARSE columns are only useful if most of the rows in that column contain null values. There is a threshold for each data type giving the percentage of null values at which a SPARSE column starts to save space. The link below has full details.
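
A minimal sketch (the table and column names are hypothetical):

CREATE TABLE dbo.Products
(
 ProductId INT NOT NULL PRIMARY KEY,
 ProductName NVARCHAR(100) NOT NULL,
 DiscontinuedDate DATETIME SPARSE NULL -- mostly NULL, so the NULLs take no space on disk
);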

Links:
http://technet.microsoft.com/en-us/library/cc280604.aspx

Implicit and explicit conversions

An explicit conversion is where you use a CAST or CONVERT function to explicitly convert a value from one data type to another. NB. CAST is a SQL standard function whereas CONVERT is an addition to the T-SQL dialect. It’s considered good practice to prefer SQL standard functionality over language-specific functionality, so you should use CAST unless you need CONVERT’s formatting options.
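
For example, both of the following convert the current date and time to a character string; the CONVERT version uses a style code (103 is the dd/mm/yyyy style):

SELECT CAST(GETDATE() AS VARCHAR(30));        -- SQL standard
SELECT CONVERT(VARCHAR(30), GETDATE(), 103);  -- T-SQL only, formatted as dd/mm/yyyy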

An implicit conversion is where SQL Server is asked to compare data of two different types. Before it can do the comparison it implicitly converts the data type of one of the values to the data type of the other. There is a precedence that SQL Server uses to determine what data types get implicitly converted and there is a table on the CAST and CONVERT link below that shows what data types can be converted to what.

For example, if you have the column numberValue of INT data type in the table SimonTest, the following code will run successfully.

SELECT * FROM dbo.SimonTest WHERE numberValue = '2';

In this case the string value ‘2’ gets implicitly converted into an INT.

Implicit conversions can be costly and they should be avoided if possible.

Jes Schultz Borland has an article on the Brent Ozar blog that discusses
Identifying and Correcting SQL Server Implicit Conversion

Links:
CAST and CONVERT
Identifying and Correcting SQL Server Implicit Conversion

Work with XML data

This is obviously a very big topic.  The “stairway” on SQL Server Central is very good.

Links:
70-461: Chapter 7
SQL Server Central – Stairway to XML

Data version control and management

SQL Server 2012 has two pieces of functionality to handle data version controlling: Change Tracking and Change Data Capture.

Links:
My Blog – Change Tracking

Recursive CTEs

Links:
My Blog

Understand the difference between @Table and #Table

Links:
My Blog – Some Differences Between Temporary Tables and Table Variables
70-461: Chapter 16 Lesson 2

Partitioned tables

A table (or index) can be partitioned so that subsets of rows of data can be spread across multiple file groups.  This is handy if your database has to handle lots of range queries.  For example, if you have an application that runs queries for data for last month, you could partition your data by a date column and then the optimiser will know that it only has to read the data for that particular partition.

There are four basic steps to create a partitioned table (see the sketch after this list)…

  1. Create the file groups that you want to use.
  2. Create a partition function that maps subsets of rows of data into the different partitions based on the values in one of the columns.
  3. Create a partition scheme that maps the individual partitions to the various file groups.
  4. Create or modify a table to use the partition scheme.
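
Here’s the sketch of those four steps, using a hypothetical dbo.Orders table partitioned by order date (the filegroup and object names are made up, and step 1 assumes the filegroups FG2012 and FG2013 have already been added to the database):

-- 2. Partition function: rows with OrderDate before 2013 go to partition 1, the rest to partition 2
CREATE PARTITION FUNCTION pfOrderDate (DATETIME)
AS RANGE RIGHT FOR VALUES ('20130101');

-- 3. Partition scheme: map the individual partitions to the filegroups
CREATE PARTITION SCHEME psOrderDate
AS PARTITION pfOrderDate TO (FG2012, FG2013);

-- 4. Create the table on the partition scheme
CREATE TABLE dbo.Orders
(
 OrderId INT IDENTITY(1,1) NOT NULL,
 OrderDate DATETIME NOT NULL,
 Amount MONEY NOT NULL
) ON psOrderDate (OrderDate);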

Links:
70-462: Chapter 3 Lesson 2
Partitioned Tables and Indexes

Filestream

Links:
70-462: Chapter 3 Lesson 1
http://technet.microsoft.com/en-us/library/gg471497.aspx
https://www.simple-talk.com/sql/learn-sql-server/an-introduction-to-sql-server-filestream/

Filetable

Links:
70-462: Chapter 3 Lesson 1
http://technet.microsoft.com/en-us/library/ff929144.aspx

Statistics

Whenever an index is created on a table, statistics are generated on how the values in the index columns are distributed across the table. The number of rows in which a given value appears is called the cardinality of that value. When a query runs the optimiser can use these statistics to determine how selective the query is and whether it would be more efficient to use an index seek rather than an index scan.

In the vast majority of cases you just let SQL Server maintain the statistics.

Depending on how they were created or updated and on the numbers of rows in the table, statistics will either be based on every row in the table or on a sample of the rows.

Statistics will automatically be updated if

  • an empty table has a row added
  • a table with fewer than 500 rows has more than 500 rows added
  • a table with more than 500 rows has more than 500 rows added and the number of rows added is more than 20% of the total number of rows in the table.
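
You can look at what SQL Server holds for a particular statistics object with DBCC SHOW_STATISTICS. For example, for Person.Address in AdventureWorks2012 (I’m assuming the primary key index has its standard name here):

-- List the statistics objects that exist on the table
SELECT * FROM sys.stats WHERE object_id = OBJECT_ID('Person.Address');

-- Show the header, density vector and histogram for one of them
DBCC SHOW_STATISTICS ('Person.Address', 'PK_Address_AddressID');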

Links:
70-462: Chapter 10 Lesson 1
http://technet.microsoft.com/en-us/library/ms190397.aspx

Index Maintenance

  • Identify fragmented indexes
  • Identify unused indexes

There are several handy dynamic management objects (DMOs) that you can use to maintain indexes.  Any cumulative counts that are returned are reset whenever the SQL Server service is restarted.

sys.dm_db_index_physical_stats is a UDF that returns details about the physical structure of the heap or index.  You use the parameters of the function to filter the objects you want to look at and you must supply at least a database_id value.  Here’s an example of running this for the Person.Address table in AdventureWorks2012.

SELECT I.name AS IndexName, PS.*
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('Person.Address'), NULL, NULL, 'DETAILED') PS
INNER JOIN sys.indexes I ON I.index_id = PS.index_id AND I.object_id = PS.object_id
ORDER BY PS.index_id, PS.index_level;

This returns details such as the type of index (or whether it’s a heap), level within the index and page counts per level, number of records at each level.  It also gives details of any fragmentation including a percentage.

For more details see sys.dm_db_index_physical_stats

sys.dm_db_index_operational_stats is a function that returns details of how the index has been used.  You use the parameters of the function to filter the objects you want to look at and you must supply at least a database_id value.  Here’s an example of running this for the Person.Address table in AdventureWorks2012.

SELECT I.name AS IndexName, OS.*
 FROM sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID('Person.Address'), NULL, NULL) OS
 INNER JOIN sys.indexes I ON I.index_id = OS.index_id AND I.object_id = OS.object_id;

This will return cumulative counts of details such as inserts, updates and deletes at the leaf and non-leaf levels of the index, range scans and singleton lookups, locks and waits and lock escalation.

For more details see sys.dm_db_index_operational_stats

sys.dm_db_index_usage_stats is a view that returns details on the usage of indexes.  Here’s an example of running this for the Person.Address table in AdventureWorks2012.

SELECT *
FROM sys.dm_db_index_usage_stats 
WHERE database_id = DB_ID()
AND object_id = OBJECT_ID('Person.Address');

This returns cumulative counts of index scans, seeks and lookups and also gives the date of the last time each occurred.  This is good for checking whether your indexes are being used.

Design, implement and troubleshoot security

Links:
70-462: Chapter 6

Describe advantages/disadvantages of using a GUID as a clustered index

You can generate nonsequential globally unique identifiers (GUIDs) by using the NEWID() T-SQL function and you can store these in a column with the UNIQUEIDENTIFIER data type. An advantage is that GUIDs can also easily be generated from application code and are guaranteed to be unique across space and time.
You can generate sequential GUIDs by using the NEWSEQUENTIALID() T-SQL function.
A disadvantage is that the UNIQUEIDENTIFIER data type is 16 bytes so it uses four times as much space as an INT. If you use a UNIQUEIDENTIFIER as a clustered index key it will be included in every nonclustered index that you create, and nonsequential GUIDs cause inserts all over the clustered index, leading to page splits and fragmentation.
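
A quick sketch of the two approaches (the table names are made up); note that NEWSEQUENTIALID() can only be used in a DEFAULT constraint, not called directly in a query:

-- Nonsequential GUIDs: easy to generate anywhere but inserts land randomly in the clustered index
CREATE TABLE dbo.GuidRandom
(
 Id UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID(),
 SomeData VARCHAR(10) NOT NULL,
 PRIMARY KEY CLUSTERED (Id)
);

-- Sequential GUIDs: still 16 bytes but new rows go to the end of the clustered index
CREATE TABLE dbo.GuidSequential
(
 Id UNIQUEIDENTIFIER NOT NULL DEFAULT NEWSEQUENTIALID(),
 SomeData VARCHAR(10) NOT NULL,
 PRIMARY KEY CLUSTERED (Id)
);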

Use Spatial Data

Links:
Spatial Data Types Overview

Extended Events

Extended Events are a lightweight replacement for the existing SQL Trace functionality, which is deprecated along with its GUI, SQL Profiler.
There are two GUIs, the New Session Wizard and the New Session UI, which replace SQL Profiler.
There are many events that you can monitor, including data from the operating system and other applications.
SQL Server Audit is based on Extended Events.
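
A minimal sketch of creating and starting a session from T-SQL (the session name, filter value and target file path are all made up):

CREATE EVENT SESSION [LongRunningQueries] ON SERVER
ADD EVENT sqlserver.sql_statement_completed
 (WHERE (duration > 1000000)) -- duration is in microseconds
ADD TARGET package0.event_file (SET filename = N'C:\Temp\LongRunningQueries.xel');

ALTER EVENT SESSION [LongRunningQueries] ON SERVER STATE = START;
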
Links:
Extended Events

Develop a CLR data type

Links:
http://technet.microsoft.com/en-us/library/ms131120.aspx

To do…

  • Partitioned views
  • Isolation levels and how they affect locking and blocking
  • Locks and lock escalation
  • Identify bottlenecks
  • When to rebuild versus reorganise an index
  • LOB data types
  • Shrink file
  • Run an SSIS job
  • Check disk space
  • Automate backups
  • Encryption
  • Impact of read committed snapshot/snapshot isolation
  • Distributed transaction escalation
  • Database Engine Tuning Advisor
  • Query hints
  • Full text search
  • Impact of recovery model on database size
  • How to clean up if .mdf and .ldf files get too large
  • Identify and fix transactional replication problems
  • Profiler

Some Differences between Temporary Tables and Table Variables

There are three temporary table object types (although two are quite similar): a table variable, prefixed with an @; a local temporary table, prefixed with a #; and a global temporary table prefixed with a ##.

Local temporary tables are visible within the session that created them, across batches, i.e. a GO batch terminator does not remove them. They are also available in inner levels of the call stack, so if one stored proc creates a local temporary table and then calls a second stored proc, the local temporary table is visible in the called stored proc.

Global temporary tables are visible across all sessions and are dropped once the session that created them has closed and there are no remaining active references to them.

You use a CREATE TABLE statement to create both types of temporary table.

Table variables are declared in the same way as any other variable. They are just declared to be of type TABLE, e.g. DECLARE @SimonTemp TABLE. They are visible only in the batch that created them and they get destroyed as soon as the batch terminates. They are not available within the inner call stack. The following code will fail…

DECLARE @SimonTemp TABLE (Id INT);
GO
SELECT * FROM @SimonTemp;

When you create a temporary table it is created in the dbo schema of tempdb. You can create temporary tables with the same name in different sessions. This is because SQL Server actually appends a unique ID to the end of the name when it is created in tempdb. Any objects referencing the table can just use the name without the unique identifier. For example, if I create two connections and run the following in both…

CREATE TABLE #Simon (Id INT);

I can then view the objects in the tempdb and can see the following…

[Screenshot: TempTables1]

You can create constraints on temporary tables but beware if you give these constraints a name in code that could be run across multiple sessions. Any constraints created on the temporary table do not have a unique suffix added and are created exactly as named. If another session tries to create the same named constraint on another temporary table it will fail as SQL Server enforces its rule that object names must be unique within a schema. The best thing to do is not to name the constraints and let SQL Server generate a unique internal name for them.

You can create indexes and run DDL commands once a temporary table is created, for example, to add a new column.

Table variables don’t allow you to create any named constraints or indexes, or to perform any DDL. However, you can create unnamed PRIMARY KEY and UNIQUE constraints and these will create a unique clustered and a unique nonclustered index respectively.
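
For example, the following is allowed because the constraints are unnamed (the variable name is just for illustration):

DECLARE @SimonTable TABLE
(
 Id INT NOT NULL PRIMARY KEY,          -- creates a unique clustered index
 SomeData VARCHAR(10) NOT NULL UNIQUE  -- creates a unique nonclustered index
);
INSERT INTO @SimonTable (Id, SomeData) VALUES (1, 'Simon');
SELECT * FROM @SimonTable;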

Table variables are created in tempdb the same as temporary tables and are assigned a hex value name, e.g. #BD042834.

Like any other variable, table variables are not affected by rolling back a transaction, whereas temporary tables are.

There are lots of blogs and comments on whether you should use table variables or temporary tables in your queries.  But as with most things in SQL Server it depends on the situation.  The only real difference in terms of performance is that temporary tables have statistics whereas table variables do not.

In general, if you have a small amount of data in your table or the query plan is very simple then it’s probably better to use table variables as there is no extra overhead in creating the statistics.  If you have a fair bit of data or your query is selective enough for the optimiser to benefit from using an index then temporary tables are probably better.

Recursive Common Table Expressions

Common table expressions (CTEs) are pretty simple to understand but there is a recursive version that is a little more complicated.  They are good if you’ve got hierarchical data stored in a self-referencing table.  For example, you may have an EmployeeId column and a ManagerId column in the same table.  The ManagerId value of a single row references an EmployeeId in another row.  You may have a management hierarchy consisting of many levels.  Recursive CTEs are a good way to get this hierarchical data out in one query and it doesn’t matter how many levels there are in the hierarchy.  There is a maximum recursion level value that you can set that stops your recursive CTEs from entering an infinite loop and I’ll show you this later along with an example of a more complicated CTE.  Firstly, here’s a very basic example which shows the structure of a recursive CTE.

This is from a comment by elvedrano on the following link: http://technet.microsoft.com/en-us/library/ms186243(v=sql.105).aspx

WITH cte
AS ( 
 SELECT -- anchor member
 1 AS n
UNION ALL
SELECT -- recursive member
 n + 1
 FROM cte
 WHERE n < 50 -- terminator
)
SELECT
*
FROM cte;

The comments in this T-SQL highlight the different parts of the recursive CTE.  The three parts you need to include are the anchor member, the recursive member and the terminator.

The anchor member is usually a select statement that gets the first level in the hierarchy.  You then add a second select statement, separated by a UNION ALL, that references the CTE and the column that controls the hierarchy.  You can then add an optional terminator that will limit the recursion to a certain number of levels.  If you only have a set number of levels you can omit the terminator.  The anchor member, recursive member and terminator all exist within the CTE declaration and you can then just select straight from the CTE to see the hierarchical results.

Here’s an example using employees and managers.

CREATE TABLE dbo.Employees
(
 EmployeeId INT NOT NULL PRIMARY KEY,
 EmployeeName NVARCHAR(50) NOT NULL,
 EmployeeRole NVARCHAR(50) NOT NULL,
 ManagerId INT NULL
);
INSERT INTO dbo.Employees (EmployeeId, EmployeeName, EmployeeRole, ManagerId)
VALUES
(1, 'Mickey Mouse', 'CEO', 1),
(2, 'Minnie Mouse', 'Chairman', 1),
(3, 'Donald Duck', 'CTO', 1),
(4, 'Tom the Cat', 'Head of Sales', 2),
(5, 'Jerry the Mouse', 'Head of Development', 3),
(6, 'Spot the Dog', 'Head of Testing', 3),
(7, 'Officer Dibble', 'Sales Manager', 4),
(8, 'Fred Flintstone', 'Sales', 7),
(9, 'Wilma Flintstone', 'Sales', 7),
(10, 'Barney Rubble', 'Sales', 7),
(11, 'Betty Rubble', 'Sales', 7),
(12, 'Captain Blackadder', 'Developer', 5),
(13, 'Baldrick', 'Developer', 5),
(14, 'Captain Darling', 'Tester', 6);
WITH EmployeeHierarchy AS (
 SELECT EmployeeId, EmployeeName, EmployeeRole, ManagerId, 1 AS Level
 FROM dbo.Employees
 WHERE EmployeeId = ManagerId
UNION ALL
SELECT E.EmployeeId, E.EmployeeName, E.EmployeeRole, E.ManagerId, EH.Level + 1 AS Level
 FROM dbo.Employees E
 INNER JOIN EmployeeHierarchy EH ON EH.EmployeeId = E.ManagerId
 WHERE E.EmployeeId <> E.ManagerId
)
SELECT *
FROM EmployeeHierarchy;
DROP TABLE dbo.Employees;

[Screenshot: RecursiveCTE]

With recursive CTEs it’s possible to enter an endless loop.  You can prevent this from happening by using the MAXRECURSION option in the query.  We can modify the original simple CTE by removing the terminator and adding the MAXRECURSION option.  This query returns the same results, although it does return an error once the max recursion level is reached.

WITH cte
AS ( 
 SELECT -- anchor member
 1 AS n
UNION ALL
SELECT -- recursive member
 n + 1
 FROM cte
)
SELECT
*
FROM cte
OPTION (MAXRECURSION 49)

Transactions – the basics

I’m writing this blog because I’m currently working toward the 70-464 exam and transactions are bound to be covered. I wanted to consolidate my understanding and this is what I think I know…

A transaction is a logical unit of work that either completes as a whole or doesn’t complete at all. There is no way for a transaction to partially complete with only some data being updated. Anything that updates anything in SQL Server is handled in a transaction. Reading data out of SQL Server is also handled in a transaction.

SQL Server adheres to the ACID properties of transactions which ensure data integrity.

The ACID properties stand for Atomicity, Consistency, Isolation and Durability.

The atomicity property ensures that each transaction is handled as a single piece of work that either succeeds in its entirety or not at all. If you’re trying to update 100 rows and one of them causes an error, none of the 100 rows will be updated.

The consistency property ensures that the DB is left in a consistent state after the transaction completes. If not, the transaction is rolled back. SQL Server uses constraints such as foreign keys to enforce this.

The isolation property ensures that transactions are isolated from each other during processing. There are various levels of isolation that make the locking of data behave in different ways, including a level that avoids locking by using row versioning, but this will be the subject of another blog.

The durability property ensures that even if there is a service interruption the data is in a usable state. SQL Server uses the transaction log to enforce durability.

By default each individual T-SQL statement is automatically committed (autocommit mode). If you run two update statements in the same batch SQL Server will treat them as two separate transactions: the first will either completely commit or not, and then the second will either completely commit or not.

You can configure SQL Server so that it does not automatically commit each statement. This forces you to issue a COMMIT or ROLLBACK command to finish the transaction. This is called implicit transaction mode. I have never used this mode but I think this is default behaviour in Oracle databases.
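
As a rough sketch of implicit transaction mode (dbo.SomeTable is a hypothetical table):

SET IMPLICIT_TRANSACTIONS ON;

INSERT INTO dbo.SomeTable (SomeData) VALUES ('Simon'); -- implicitly opens a transaction
SELECT @@TRANCOUNT; -- returns 1; the transaction is still open

COMMIT; -- nothing is permanent until you explicitly COMMIT (or ROLLBACK)
SET IMPLICIT_TRANSACTIONS OFF;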

There is also explicit transaction mode, where you issue a BEGIN TRAN command to start a transaction. You then execute your statement or batch of statements within this explicit transaction and either COMMIT or ROLLBACK all statements within that transaction. It’s a good idea to have error handling in your explicit transactions so that you can roll back the entire transaction if any statement fails.

Whenever a transaction starts, @@TRANCOUNT is incremented by 1. If there are no transactions open @@TRANCOUNT is 0; when a transaction starts it goes up to 1. You can use this in your error handling to determine if there are any open transactions that need to be closed.

There is also a function XACT_STATE() that indicates the current state of any transaction. This is another useful function in error handling. A state of 0 means there are no open transactions. 1 means there is an uncommitted open transaction that can be committed. -1 means there is an open transaction that has caused a fatal error and cannot be committed. This must be rolled back.

You can turn on the XACT_ABORT property by running SET XACT_ABORT ON before starting a transaction. This will automatically roll back the whole transaction if any statement in it raises a run-time error. This property is enabled by default in triggers and causes not only the statement(s) in the trigger to roll back but also the statement that caused the trigger to fire.
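
A small sketch of the difference XACT_ABORT makes, assuming a hypothetical dbo.XactAbortTest table with an INT primary key:

SET XACT_ABORT ON;
BEGIN TRAN;
 INSERT INTO dbo.XactAbortTest (Id) VALUES (1);
 INSERT INTO dbo.XactAbortTest (Id) VALUES (1); -- PK violation: the whole transaction is rolled back and the rest of the batch is skipped
 INSERT INTO dbo.XactAbortTest (Id) VALUES (2);
COMMIT;
GO
SELECT @@TRANCOUNT; -- 0 and the table is empty; with XACT_ABORT OFF the first and third rows would have been committed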

You can nest transactions within each other, i.e. issue a BEGIN TRAN statement within another BEGIN TRAN statement. However, be aware that nested transactions do not actually work in a completely nested way.

When you issue a BEGIN TRAN command the @@TRANCOUNT is upped to 1 (assuming it was originally 0). You can execute some statements within this transaction. If you then create a nested transaction by issuing another BEGIN TRAN command the @@TRANCOUNT is upped to 2. You can execute some more statements and then issue a COMMIT command. All that actually happens here is the @@TRANCOUNT value is lowered to 1. Nothing is actually committed to the DB as the initial transaction is still open. If you now issue a ROLLBACK command the entire set of statements since the first BEGIN TRAN are rolled back. If you issue a COMMIT command the entire set of statements are committed. Nested transactions just force you to issue multiple COMMIT commands and it’s only the COMMIT that is executed when the @@TRANCOUNT is 1 that commits all the statements to the DB.

Rolling back statements in a nested transaction is handled slightly differently. If you’ve got five transactions nested within each other you do not have to issue five ROLLBACK commands to rollback the statements. As soon as you issue a ROLLBACK command the entire transaction is rolled back regardless of the nesting level.
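
A minimal sketch of the nesting behaviour described above:

BEGIN TRAN;    -- @@TRANCOUNT = 1
BEGIN TRAN;    -- @@TRANCOUNT = 2
 -- do some work here
COMMIT;        -- @@TRANCOUNT back to 1; nothing is actually committed yet
ROLLBACK;      -- rolls back everything since the first BEGIN TRAN; @@TRANCOUNT = 0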

You can rollback part of a transaction by using save points.  Here’s a little demo to show how save points work.

Create a table.

CREATE TABLE dbo.SavePointTest
(
 Id INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
 SomeData VARCHAR(10) NOT NULL
);

Check the @@TRANCOUNT and XACT_STATE.  Both should be 0 as we’ve not started a transaction yet.

SELECT @@TRANCOUNT;
SELECT XACT_STATE();

Start a transaction.

BEGIN TRAN;

Recheck the @@TRANCOUNT and XACT_STATE.  Both should now be 1.

Insert a row into the table.

INSERT INTO dbo.SavePointTest (SomeData)
VALUES ('Simon');

If you run a SELECT statement within the transaction you will be able to see the new row in the table.

SELECT *
FROM dbo.SavePointTest;

[Screenshot: Transactions1]

Now we can create a named save point by running the following.

SAVE TRANSACTION t1;

@@TRANCOUNT and XACT_STATE are still set to 1.  We can now insert a second row and it gets returned.

INSERT INTO dbo.SavePointTest (SomeData)
VALUES ('Peacock');
SELECT *
FROM dbo.SavePointTest;

[Screenshot: Transactions2]

Now if we decide we don’t want the second row we can roll back to the point in the code where we issued our save point.  We just need to include the name of the save point in the ROLLBACK command as follows.

ROLLBACK TRAN t1;
SELECT *
FROM dbo.SavePointTest;

[Screenshot: Transactions3]

The transaction is still open and @@TRANCOUNT and XACT_STATE are still set to 1.  At this point we can either commit or rollback the open transaction.  If we commit the row is added to the table, if we rollback the table remains empty.

COMMIT TRAN;

@@TRANCOUNT and XACT_STATE will now be set to zero.

This is a good time to point out that not everything gets completely rolled back when we issue a ROLLBACK command.  All the DDL and DML statements will be completely rolled back but one thing that isn’t is identity values.  During the transaction above we added a row “Simon” to the table.  This upped the identity value on the Id column to 1.  We then added a second row “Peacock”.  This upped the identity value to 2.  Even though we then issued a ROLLBACK only the data change was rolled back and the identity value remained at 2.  So if we now try and add the row “Peacock” again we see the new row is added but this time it’s been given an Id of 3.

INSERT INTO dbo.SavePointTest (SomeData)
VALUES ('Peacock');
SELECT *
FROM dbo.SavePointTest;

[Screenshot: Transactions4]

Change Tracking

First we need to enable change tracking on the database.  This can be done using T-SQL or in SSMS.

ALTER DATABASE SimonBlog
SET CHANGE_TRACKING = ON
(CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

[Screenshot: ChangeTracking1]

Now we can create a test table to track our changes against.

CREATE TABLE dbo.ChangeTrackingTest
(
 Id INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
 SomeData VARCHAR(100) NOT NULL
);

Then we also need to enable change tracking on the table.  Again this can be done with either T-SQL or SSMS.

ALTER TABLE dbo.ChangeTrackingTest
ENABLE CHANGE_TRACKING
WITH (TRACK_COLUMNS_UPDATED = ON);

[Screenshot: ChangeTracking2]

There are some change tracking functions that we can use to check the changes.  Firstly we can run the following to get the current version of the tracked database.

SELECT CHANGE_TRACKING_CURRENT_VERSION()

As we’ve not made any updates to the database this will return 0.  If change tracking was not enabled this would return NULL.

Now let’s insert some data

INSERT INTO dbo.ChangeTrackingTest (SomeData)
VALUES ('Simon'),('Peacock');

We’ve now got two rows in the table.  If we check the version again we get 1 returned.  Now we can use another function to view what changes have occurred since a given version.  We use the CHANGETABLE function as shown below.  The second parameter with a value of 0 tells the function to return the changes made since version 0.  This seems to be contrary to what BOL says: their examples pass the current version into the function, but if you do that you get no rows returned.  We’re currently at version 1 so if we pass in version 0 it will return the changes made between version 0 and the current version.

SELECT * 
FROM CHANGETABLE(CHANGES dbo.ChangeTrackingTest, 0) AS CT

This gives us the following results.

[Screenshot: ChangeTracking3]

This shows our two inserts for Ids 1 and 2.  The SYS_CHANGE_VERSION and SYS_CHANGE_CREATION_VERSION columns show that these changes were made as part of version 1.  The SYS_CHANGE_OPERATION value “I” shows that these were inserts.

Now we add a third row.

INSERT INTO dbo.ChangeTrackingTest (SomeData)
VALUES ('Eleanor')

The version number has increased to 2.  We now check CHANGETABLE for the changes made since version 1.

SELECT * 
FROM CHANGETABLE(CHANGES dbo.ChangeTrackingTest, 1) AS CT

[Screenshot: ChangeTracking4]

This shows the new row with an Id of 3 and that it was inserted as part of version 2.

We can also check for all changes since version 0, i.e. all the changes so far by passing in 0 as follows.

SELECT * 
FROM CHANGETABLE(CHANGES dbo.ChangeTrackingTest, 0) AS CT

This returns the following which shows all three inserts.

[Screenshot: ChangeTracking5]

Now let’s update a row.

UPDATE dbo.ChangeTrackingTest
SET SomeData = 'Nomis'
WHERE Id = 1

The version is now up to 3.  Let’s check for all changes since the beginning (version 0).

[Screenshot: ChangeTracking6]

You can see that the row with Id = 1 was last updated in version 3.  The creation version is still version 1.  When I first saw these results I was quite surprised that the SYS_CHANGE_OPERATION value still had “I” for insert.  However, if you just check for changes since the last version, i.e. since version 2, we get the following results.

[Screenshot: ChangeTracking7]

This shows us the update made in version 3.

You can view the current version of a particular row by using the VERSION command within the CHANGETABLE function as follows:

SELECT * 
FROM CHANGETABLE(VERSION dbo.ChangeTrackingTest, (Id), (1)) AS CT

As before the first parameter is the table name, the second is the name of the primary key column and the third is the value from the primary key column.  So this statement is asking for the current version of Id = 1 in dbo.ChangeTrackingTest and returns the following results as you would expect.

[Screenshot: ChangeTracking8]

The function CHANGE_TRACKING_MIN_VALID_VERSION returns the minimum valid version that we should use for tracking.

SELECT CHANGE_TRACKING_MIN_VALID_VERSION(OBJECT_ID('dbo.ChangeTrackingTest'));

This returns a value of 1 and if we use that in the CHANGETABLE function we can see only the insert of row 3 and the update to row 1.

You may want to filter changes on the context in which they were made. You may have changes that are made by an application and you may have changes that are made by your DBA. You can use the WITH CHANGE_TRACKING_CONTEXT function when you make the change as in the following example. NB. the parameter supplied is a VARBINARY(128) data type.

DECLARE @originator_id VARBINARY(128);
SET @originator_id = CAST('SimonDBA' AS VARBINARY(128));
WITH CHANGE_TRACKING_CONTEXT (@originator_id)
	INSERT INTO dbo.ChangeTrackingTest (SomeData)
	VALUES ('Simon');

If we now run the following call to CHANGETABLE to view the changes since the last version we get the following results.

SELECT *, CAST(SYS_CHANGE_CONTEXT AS VARCHAR(100)) 
FROM CHANGETABLE(CHANGES dbo.ChangeTrackingTest, 3) AS CT

[Screenshot: ChangeTracking9]

The context value supplied is held in the SYS_CHANGE_CONTEXT column and can be cast back to a VARCHAR to make it human readable.