Error when trying to create SSISDB catalog: An error occurred during Service Master Key decryption

I’ve just installed SSIS on my PC and tried to create the SSISDB catalog in SSMS, as shown below.

[screenshot: creating the SSISDB catalog in SSMS]

When I entered a password and clicked OK I got the following error:

An error occurred during Service Master Key decryption. Changed database context to ‘SSISDB’. (.Net SqlClient Data Provider).

To fix this I had to run the following

ALTER SERVICE MASTER KEY FORCE REGENERATE

SSIS Checkpoints

There is a very handy feature in SSIS that allows you to restart a package from the point of failure and skip any previously successful tasks. This is particularly useful if you have a few long-running tasks that have completed successfully and a task further down the process chain fails: you don’t want to have to rerun the long tasks again if there is no need to. Adding checkpoints to a package does this. You just need to enable your package to allow checkpoints and then set which tasks (it could be all of them) are points in the package that you can restart from if the package fails further down the line. Here’s an example of a simple process using checkpoints…

Here we have a simple little package that truncates a table, inserts some data and then runs an update on that data. I’m going to force the update task to fail and get it to restart without having to truncate the data and load it in again. If the insert of the data took several minutes or even hours then this would save a lot of time.

First of all just create a test table

USE SimonBlog
GO
CREATE TABLE dbo.SSISCheckPointTest
(
 ID INT NOT NULL IDENTITY(1,1),
 SomeNumber INT NOT NULL
);

Now create a new SSIS package with three Execute SQL Tasks.

[screenshot: control flow with three Execute SQL Tasks]

The first one just truncates the table dbo.SSISCheckPointTest

TRUNCATE TABLE dbo.SSISCheckPointTest

The second one runs the following to populate the table with some data and then returns a single value, which is the number of rows inserted into the table.  This gets written to the SSIS variable SomeNumber.

INSERT INTO dbo.SSISCheckPointTest
SELECT ROW_NUMBER() OVER (ORDER BY SalesOrderId)
FROM [AdventureWorks2012].Sales.SalesOrderDetail
CROSS JOIN [AdventureWorks2012].[Sales].[SalesPerson];

SELECT @@ROWCOUNT AS NumRows;

The third step runs an update statement that attempts to set the SomeNumber integer value to a character value as follows…

UPDATE dbo.SSISCheckPointTest
SET SomeNumber = 'A'
WHERE ID = 100;

Now let’s run the package and the third task will fail.

[screenshot: package execution with the update task failed]

As we can see from the Progress tab the insert took 40 secs which was by far the longest part of the whole package.  If this was 40 minutes rather than 40 seconds we wouldn’t want to run the insert part again.  We’d just want to fix the failed step and then run it from there.

Just to confirm we can see that the insert was successful and we have some rows.

[screenshot: query results confirming the inserted rows]

Now let’s set the package up so that it creates a checkpoint after doing the insert.  Firstly enable it in the package properties by setting the SaveCheckpoints property to True.  Then specify an XML file in the CheckpointFileName property.  This will record when each checkpoint is hit and which ones were successful.  It also stores the state of each variable.  Finally set the CheckpointUsage property to IfExists.  This means that if the checkpoint file exists the package will restart from the last successful checkpoint, and if it doesn’t the package will start from the beginning.  If the package completes successfully then the checkpoint file is deleted, so the next time it runs it will start from the beginning.

[screenshot: package properties showing the checkpoint settings]

Next we need to specify which tasks will create a checkpoint.  We want to be able to restart the package after the insert task so we will set a checkpoint on that task.  I’m also going to set a checkpoint on the truncate task.  To do this set the FailPackageOnFailure property to True on those tasks.

[screenshot: FailPackageOnFailure set to True on the tasks]

Now let’s run the package again.  As there were previously no checkpoints enabled, the package starts from the beginning and fails again on the update task.

[screenshot: package failing again on the update task]

We can now see that the checkpoint XML file was created as follows. We can see that the SomeNumber variable has a value of 2062389.

[screenshot: contents of the checkpoint XML file]

Now let’s fix the update statement and change it to use the value of the SomeNumber variable in the update.  As you can see below the default value of the SomeNumber variable is 0, so if it gets set to 2062389 this will prove that the checkpoints also remember the values of variables.

[screenshot: the updated update statement]

[screenshot: the SomeNumber variable with a default value of 0]

Now let’s rerun the package.

[screenshot: only the update task executing, this time successfully]

As you can see, only the update task ran and this time it was successful.  We can also see that the SomeNumber column for the row with ID = 100 has been set to the value in the SomeNumber variable.

[screenshot: the row with ID = 100 updated to the variable value]

The checkpoint file has also now been deleted so the next time the package executes it will start from the beginning.

SQL Server 2012 Exam Progress

I’ve managed to pass the following exams so far:

  • 461 – Querying Microsoft SQL Server 2012
  • 462 – Administering Microsoft SQL Server 2012 Databases
  • 464 – Developing Microsoft SQL Server 2012 Databases

Since then I’ve had a break and have been going through Kalen Delaney’s excellent Microsoft SQL Server 2012 Internals book but now I’m going to start looking at

  • 463 – Implementing a Data Warehouse with Microsoft SQL Server 2012

This one along with 461 and 462 will give me the Microsoft Certified Solutions Associate (MCSA) qualification.

Then I’ve got

  • 465 – Designing Database Solutions for SQL Server 2012

which along with 464 upgrades the MCSA to a Microsoft Certified Solutions Expert (MCSE).  There are two flavours of MCSE; this one would be the Data Platform one.  There is also the Business Intelligence one which has the following exams instead of 464 and 465.

  • 466 – Implementing Data Models and Reports with Microsoft SQL Server 2012
  • 467 – Designing Business Intelligence Solutions with Microsoft SQL Server 2012

I haven’t decided if I’ll go for the BI ones as well but we’ll see.

If you’re looking for a good resource to help with the exams then I’d recommend the official guide books and practice tests. The first 2012 exam I did was the DBA one (462) and I read that cover to cover before even looking at the practice tests.  For 461 I found it much better to read the book not in chapter order but in objective order.  The front couple of pages of the book list the exam objectives along with the relevant chapters for each one. I read all the chapters for the first objective and kept at the practice tests for that objective until I was getting 80–90%, then moved on to the next one.  You can set up the practice tests to only include questions for a particular objective, or multiple objectives, which is very handy. Once I was passing the questions for the second objective I ran through a few tests containing questions for both the first and second objectives and then moved on to the third.  I carried on until all objectives were complete and I was getting high scores on the entire test.  I’m planning on doing the same for the data warehousing book (463).

Query Execution Phases

Here’s how SQL Server executes a query…

Performed By      | Phase    | Description
Relational Engine | T-SQL    | Statement that needs to be executed
Relational Engine | Parse    | Check the T-SQL syntax; create a tree of logical operators called the parse tree
Relational Engine | Bind     | Check whether the objects exist; bind the objects to the logical operators from the parse phase and create a tree of logical operators bound to the actual objects called the algebrised tree
Relational Engine | Optimise | Generate candidate plans and select the optimal one; create the execution plan with the logical operators mapped to the physical operators
Storage Engine    | Execute  | Execute the query; cache the plan

Exam 70-464 – Updated 24/09/2013

I’ve decided to skip exam 70-463 (Implementing a Data Warehouse with SQL Server 2012) and move on to 70-464 (see my previous post for the full 2012 certification path). Microsoft have reintroduced the second shot vouchers so I thought I’d take the plunge and book the exam for 26th September. If I do fail I have until May 2014 to resit for free.

I’ve had a look through the list of skills measured and have highlighted the following areas that I need to recap or improve my knowledge in.

I’m going to use this page to add links to various sources of information as I work through the different areas, and will also add references to the 70-461 and 70-462 exam guide books where appropriate; these will be the comments that begin with 70-461 or 70-462.

For information here is the full list of skills measured: http://www.microsoft.com/learning/en-us/exam-70-464.aspx

In no particular order:

Columnstore Indexes

Syntax:

CREATE NONCLUSTERED COLUMNSTORE INDEX
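
Here’s a minimal sketch of creating one (the table and column names are hypothetical). Note that in SQL Server 2012 a table becomes read-only while a nonclustered columnstore index exists on it.

CREATE NONCLUSTERED COLUMNSTORE INDEX IX_FactSales_ColumnStore
ON dbo.FactSales (OrderDateKey, ProductKey, SalesAmount);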

Links:
My Blog
70-461: Chapter 15 Lesson 1
70-461: Chapter 17 Lesson 2
70-462: Chapter 10 Lesson 1
http://msdn.microsoft.com/en-us/library/gg492088.aspx
http://blog.sqlauthority.com/2011/10/29/sql-server-fundamentals-of-columnstore-index/

CLR

Introduced in SQL Server 2005, the Common Language Runtime (CLR) integration allows you to use the power of the .NET framework within your T-SQL code. You can create CLR triggers, stored procedures, functions, data types and aggregates. You compile the .NET code into an assembly (DLL) that you register in your database and then create the appropriate SQL object that references the method in the assembly. Whenever you use the SQL object in your T-SQL code, the method in the .NET code is executed. This is quite a big topic and the majority of the work is in understanding what you can do with the .NET code and what properties you need to set in the methods you create.

To get started you need to enable CLR on your instance by running

EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;

You then register your assembly using something like

CREATE ASSEMBLY [SimonBlog]
FROM path
WITH PERMISSION_SET = SAFE;

Here the PERMISSION_SET can be SAFE, UNSAFE or EXTERNAL_ACCESS. See the links below for more details.

Finally you create your SQL object using something like this.

CREATE AGGREGATE [dbo].[ConcatMagic]
(@value [nvarchar](4000), @delimiter [nvarchar](4000))
RETURNS [nvarchar](4000)
EXTERNAL NAME [SimonBlog].[ConcatMagic]
GO

The EXTERNAL NAME bit specifies the method in the assembly that will be run whenever dbo.ConcatMagic is run.
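For example, assuming this aggregate concatenates values with the given delimiter (a hypothetical usage against a hypothetical table), it can then be called like any built-in aggregate:

SELECT EmployeeRole, dbo.ConcatMagic(EmployeeName, N', ') AS Names
FROM dbo.Employees
GROUP BY EmployeeRole;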

Links:
My Blog – CLR Aggregates
CREATE ASSEMBLY
http://msdn.microsoft.com/en-us/library/ms254498.aspx

Sparse columns

When a column is marked as SPARSE the null values in that column do not use any storage on disk. However non-null values require more storage so SPARSE columns are only useful if most of the rows in that column contain null values. There is a threshold for each data type where the percentage of null values in the column could benefit from the use of SPARSE columns. The link below has full details.
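A minimal sketch (the table is hypothetical):

CREATE TABLE dbo.SparseDemo
(
 ID INT NOT NULL IDENTITY(1,1),
 -- NULLs in the SPARSE column use no disk space; non-null values need extra bytes
 MiddleName NVARCHAR(50) SPARSE NULL
);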

Links:
http://technet.microsoft.com/en-us/library/cc280604.aspx

Implicit and explicit conversions

An explicit conversion is where you use a CAST or CONVERT function to explicitly convert a value from one data type to another. NB. CAST is a SQL standard whereas CONVERT is an addition to the T-SQL dialect. It’s considered good practice to use the SQL standard ahead of language-specific functionality, so you should use CAST unless you need to make use of the CONVERT function’s formatting functionality.
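For example (style code 103 in the CONVERT call formats the date as dd/mm/yyyy):

SELECT CAST('2' AS INT);
SELECT CONVERT(VARCHAR(10), GETDATE(), 103);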

An implicit conversion is where SQL Server is asked to compare data of two different types. Before it can do the comparison it implicitly converts the data type of one of the values to the data type of the other. There is a precedence that SQL Server uses to determine what data types get implicitly converted and there is a table on the CAST and CONVERT link below that shows what data types can be converted to what.

For example, if you have the column numberValue of INT data type in the table SimonTest, the following code will run successfully.

SELECT * FROM dbo.SimonTest WHERE numberValue = '2';

In this case the string value ‘2’ gets implicitly converted into an INT.

Implicit conversions can be costly and they should be avoided if possible.
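A classic costly case: NVARCHAR has a higher precedence than VARCHAR, so comparing a VARCHAR column to an NVARCHAR literal implicitly converts the column side row by row, which can prevent an index seek. A sketch against a hypothetical table:

-- AccountNumber is VARCHAR; the N'...' literal forces the column to be converted
SELECT * FROM dbo.Customers WHERE AccountNumber = N'AW00000001';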

Jes Schultz Borland has an article on the Brent Ozar blog that discusses
Identifying and Correcting SQL Server Implicit Conversion

Links:
CAST and CONVERT
Identifying and Correcting SQL Server Implicit Conversion

Work with XML data

This is obviously a very big topic.  The “stairway” on SQL Server Central is very good.

Links:
70-461: Chapter 7
SQL Server Central – Stairway to XML

Data version control and management

SQL Server 2012 has two pieces of functionality to handle data version control: Change Tracking and Change Data Capture.
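A minimal sketch of enabling Change Tracking (the table name is hypothetical):

ALTER DATABASE SimonBlog
SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

ALTER TABLE dbo.SomeTable
ENABLE CHANGE_TRACKING WITH (TRACK_COLUMNS_UPDATED = ON);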

Links:
My Blog – Change Tracking

Recursive CTEs

Links:
My Blog

Understand the difference between @Table and #Table

Links:
My Blog – Some Differences Between Temporary Tables and Table Variables
70-461: Chapter 16 Lesson 2

Partitioned tables

A table (or index) can be partitioned so that subsets of rows of data can be spread across multiple file groups.  This is handy if your database has to handle lots of range queries.  For example, if you have an application that runs queries for data for last month, you could partition your data by a date column and then the optimiser will know that it only has to read the data for that particular partition.

There are four basic steps to create a partitioned table, sketched in the example after this list…

  1. Create the file groups that you want to use.
  2. Create a partition function that maps subsets of rows of data into the different partitions based on the values in one of the columns.
  3. Create a partition scheme that maps the individual partitions to the various file groups.
  4. Create or modify a table to use the partition scheme.
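
A minimal sketch of the four steps, assuming two filegroups named FG2012 and FG2013 have already been created:

CREATE PARTITION FUNCTION pfOrderDate (DATE)
AS RANGE RIGHT FOR VALUES ('2013-01-01');

CREATE PARTITION SCHEME psOrderDate
AS PARTITION pfOrderDate TO (FG2012, FG2013);

CREATE TABLE dbo.Orders
(
 OrderId INT NOT NULL,
 OrderDate DATE NOT NULL
) ON psOrderDate (OrderDate);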

Links:
70-462: Chapter 3 Lesson 2
Partitioned Tables and Indexes

Filestream

Links:
70-462: Chapter 3 Lesson 1
http://technet.microsoft.com/en-us/library/gg471497.aspx
https://www.simple-talk.com/sql/learn-sql-server/an-introduction-to-sql-server-filestream/

Filetable

Links:
70-462: Chapter 3 Lesson 1
http://technet.microsoft.com/en-us/library/ff929144.aspx

Statistics

Whenever an index is created on a table the optimiser generates statistics on how the values in the columns in the index are distributed across the table. The number of rows in which a given value exists is called the cardinality of the value. When a query runs the optimiser can then use these statistics to determine how selective the query is and where it would be more efficient to use an index seek rather than an index scan.

In the vast majority of cases you just let SQL Server maintain the statistics.

Depending on how they were created or updated and on the numbers of rows in the table, statistics will either be based on every row in the table or on a sample of the rows.

Statistics will automatically be updated if

  • an empty table has a row added
  • a table with fewer than 500 rows has more than 500 rows added
  • a table with more than 500 rows has more than 500 rows added and the number of rows added is more than 20% of the total number of rows in the table.
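
If you do need to inspect or refresh statistics manually you can use something like the following (the index name is assumed):

-- rebuild the statistics based on every row rather than a sample
UPDATE STATISTICS Person.Address WITH FULLSCAN;

-- view the header, density and histogram for one statistics object
DBCC SHOW_STATISTICS ('Person.Address', 'PK_Address_AddressID');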

Links:
70-462: Chapter 10 Lesson 1
http://technet.microsoft.com/en-us/library/ms190397.aspx

Index Maintenance

  • Identify fragmented indexes
  • Identify unused indexes

There are several handy dynamic management objects (DMOs) that you can use to maintain indexes.  Any cumulative counts that are returned are reset whenever the SQL Server service is restarted.

sys.dm_db_index_physical_stats is a UDF that returns details about the physical structure of the heap or index.  You use the parameters of the function to filter the objects you want to look at and you must supply at least a database_id value.  Here’s an example of running this for the Person.Address table in AdventureWorks2012.

SELECT I.name AS IndexName, PS.*
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('Person.Address'), NULL, NULL, 'DETAILED') PS
INNER JOIN sys.indexes I ON I.index_id = PS.index_id AND I.object_id = PS.object_id
ORDER BY PS.index_id, PS.index_level;

This returns details such as the type of index (or whether it’s a heap), level within the index and page counts per level, number of records at each level.  It also gives details of any fragmentation including a percentage.
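
Once you’ve identified fragmentation, a commonly quoted guideline is to reorganise an index when fragmentation is roughly between 5% and 30% and to rebuild it above that:

ALTER INDEX ALL ON Person.Address REORGANIZE;
ALTER INDEX ALL ON Person.Address REBUILD;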

For more details see sys.dm_db_index_physical_stats

sys.dm_db_index_operational_stats is a function that returns details of how the index has been used.  You use the parameters of the function to filter the objects you want to look at and you must supply at least a database_id value.  Here’s an example of running this for the Person.Address table in AdventureWorks2012.

SELECT I.name AS IndexName, OS.*
 FROM sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID('Person.Address'), NULL, NULL) OS
 INNER JOIN sys.indexes I ON I.index_id = OS.index_id AND I.object_id = OS.object_id;

This will return cumulative counts of details such as inserts, updates and deletes at the leaf and non-leaf levels of the index, range scans and singleton lookups, locks and waits and lock escalation.

For more details see sys.dm_db_index_operational_stats

sys.dm_db_index_usage_stats is a view that returns details on the usage of indexes.  Here’s an example of running this for the Person.Address table in AdventureWorks2012.

SELECT *
FROM sys.dm_db_index_usage_stats 
WHERE database_id = DB_ID()
AND object_id = OBJECT_ID('Person.Address');

This returns cumulative counts of index scans, seeks and lookups and also gives the date of the last time each occurred.  This is good for checking whether your indexes are being used.

Design, implement and troubleshoot security

Links:
70-462: Chapter 6

Describe advantages/disadvantages of using a GUID as a clustered index

You can generate nonsequential globally unique identifiers by using the NEWID() T-SQL function and you can store these in a column with the UNIQUEIDENTIFIER data type. An advantage is that GUIDs can easily be generated from application code and are guaranteed to be unique across space and time.
A disadvantage is that nonsequential GUIDs used as a clustered index key cause page splits and fragmentation because rows are inserted in random key order; you can mitigate this by generating sequential GUIDs with the NEWSEQUENTIALID() T-SQL function.
Another disadvantage is that the UNIQUEIDENTIFIER data type is 16 bytes so it uses four times as much space as an INT data type. If you use a UNIQUEIDENTIFIER as a clustered index key then it will be included in any nonclustered indexes that you create.
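
A minimal sketch (the table is hypothetical):

CREATE TABLE dbo.GuidDemo
(
 -- NEWSEQUENTIALID() generates ever-increasing GUIDs, avoiding the random
 -- insert positions in the clustered index that NEWID() values would cause
 ID UNIQUEIDENTIFIER NOT NULL DEFAULT NEWSEQUENTIALID() PRIMARY KEY CLUSTERED,
 SomeValue INT NOT NULL
);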

Use Spatial Data

Links:
Spatial Data Types Overview

Extended Events

Extended Events is a lightweight replacement for the existing SQL Trace functionality, which is deprecated along with its GUI, SQL Profiler.
It provides two GUIs, the New Session Wizard and the New Session UI, which replace SQL Profiler.
There are many events that you can monitor, and event data can be correlated with data from the operating system and other applications.
SQL Server Audit is based on Extended Events.
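
Here’s a minimal sketch of creating and starting a session in T-SQL (the session name and file path are my own assumptions; duration is measured in microseconds):

CREATE EVENT SESSION LongQueries ON SERVER
ADD EVENT sqlserver.sql_statement_completed
 (WHERE duration > 1000000) -- statements taking longer than one second
ADD TARGET package0.event_file (SET filename = N'C:\Temp\LongQueries.xel');

ALTER EVENT SESSION LongQueries ON SERVER STATE = START;
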
Links:
Extended Events

Develop a CLR data type

Links:
http://technet.microsoft.com/en-us/library/ms131120.aspx

To do…

Partitioned views

isolation levels and how they affect locking and blocking

locks and lock escalation

identify bottlenecks

when to rebuild versus reorganise an index

LOB data types

shrink file

run an SSIS job

check disk space

automate backups

encryption

impact of read committed snapshot/snapshot isolation

distributed transaction escalation

database tuning advisor

query hints

full text search

impact of recovery model on database size

how to clean up if .mdf and .ldf files get too large

identify and fix transactional replication problems

Profiler

Recursive Common Table Expressions

Common table expressions (CTEs) are pretty simple to understand but there is a recursive version that is a little more complicated.  They are good if you’ve got hierarchical data stored in a self-referencing table.  For example, you may have an EmployeeId column and a ManagerId column in the same table.  The ManagerId value of a single row references an EmployeeId in another row.  You may have a management hierarchy consisting of many levels.  Recursive CTEs are a good way to get this hierarchical data out in one query and it doesn’t matter how many levels there are in the hierarchy.  There is a maximum recursion level value that you can set that stops your recursive CTEs from entering an infinite loop and I’ll show you this later along with an example of a more complicated CTE.  Firstly, here’s a very basic example which shows the structure of a recursive CTE.

This is from a comment by elvedrano on the following link: http://technet.microsoft.com/en-us/library/ms186243(v=sql.105).aspx

WITH cte
AS ( 
 SELECT -- anchor member
 1 AS n
UNION ALL
SELECT -- recursive member
 n + 1
 FROM cte
 WHERE n < 50 -- terminator
)
SELECT
*
FROM cte;

The comments in this T-SQL highlight the different parts of the recursive CTE.  The three parts you need to include are the anchor member, the recursive member and the terminator.

The anchor member is usually a select statement that gets the first level in the hierarchy.  You then add a second select statement, separated by a UNION ALL, that references the CTE and the column that controls the hierarchy.  You can then add an optional terminator that will limit the recursion to a certain number of levels.  If you only have a set number of levels you can omit the terminator.  The anchor member, recursive member and terminator all exist within the CTE declaration and you can then just select straight from the CTE to see the hierarchical results.

Here’s an example using employees and managers.

CREATE TABLE dbo.Employees
(
 EmployeeId INT NOT NULL PRIMARY KEY,
 EmployeeName NVARCHAR(50) NOT NULL,
 EmployeeRole NVARCHAR(50) NOT NULL,
 ManagerId INT NULL
);
INSERT INTO dbo.Employees (EmployeeId, EmployeeName, EmployeeRole, ManagerId)
VALUES
(1, 'Mickey Mouse', 'CEO', 1),
(2, 'Minnie Mouse', 'Chairman', 1),
(3, 'Donald Duck', 'CTO', 1),
(4, 'Tom the Cat', 'Head of Sales', 2),
(5, 'Jerry the Mouse', 'Head of Development', 3),
(6, 'Spot the Dog', 'Head of Testing', 3),
(7, 'Officer Dibble', 'Sales Manager', 4),
(8, 'Fred Flintstone', 'Sales', 7),
(9, 'Wilma Flintstone', 'Sales', 7),
(10, 'Barney Rubble', 'Sales', 7),
(11, 'Betty Rubble', 'Sales', 7),
(12, 'Captain Blackadder', 'Developer', 5),
(13, 'Baldrick', 'Developer', 5),
(14, 'Captain Darling', 'Tester', 6);
WITH EmployeeHierarchy AS (
 SELECT EmployeeId, EmployeeName, EmployeeRole, ManagerId, 1 AS Level
 FROM dbo.Employees
 WHERE EmployeeId = ManagerId
UNION ALL
SELECT E.EmployeeId, E.EmployeeName, E.EmployeeRole, E.ManagerId, EH.Level + 1 AS Level
 FROM dbo.Employees E
 INNER JOIN EmployeeHierarchy EH ON EH.EmployeeId = E.ManagerId
 WHERE E.EmployeeId <> E.ManagerId
)
SELECT *
FROM EmployeeHierarchy;
DROP TABLE dbo.Employees;

[screenshot: results of the recursive CTE showing the employee hierarchy with levels]

With recursive CTEs it’s possible to enter an endless loop.  You can prevent this from happening by using the MAXRECURSION option in the query.  We can modify the original simple CTE by removing the terminator and adding the MAXRECURSION option.  This query returns the same results, although it does return an error once the max recursion level is reached.

WITH cte
AS ( 
 SELECT -- anchor member
 1 AS n
UNION ALL
SELECT -- recursive member
 n + 1
 FROM cte
)
SELECT
*
FROM cte
OPTION (MAXRECURSION 49);