SSIS Checkpoints

There is a very handy feature available in SSIS 2012 onwards that allows you to restart a package from the point of failure and skip any previously successful tasks. This is particularly useful if you have a few tasks that take a very long time to run that have been successful and a task further down the process chain fails. You don’t want to have to rerun the long tasks again if there is no need to. Adding checkpoints to package can do this. You just need to enable your package to allow checkpoints and then set which tasks (it could be all of them) are points in the package that you can restart from if the package fails further down the line. Here’s an example of a simple process using checkpoints…

Here we have a simple little package that truncates a table, inserts some data and then runs an update on that data. I’m going to force the update task to fail and get it to restart without having to truncate the data and load it in again. If the insert of the data took several minutes or even hours then this would save a lot of time.

First of all just create a test table

USE SimonBlog
GO
CREATE TABLE dbo.SSISCheckPointTest
(
 ID INT NOT NULL IDENTITY(1,1),
 SomeNumber INT NOT NULL
);

Now create a new SSIS package with three Execute SQL Tasks.

1

The first one just truncates the table dbo.SSISCheckPointTest

TRUNCATE TABLE dbo.SSISCheckPointTest

The second one runs the following just to populate the table with some data and then returns a single value which is the number of rows inserted into the table.  This gets written to the SSIS variable @SomeNumber.

INSERT INTO dbo.SSISCheckPointTest
SELECT ROW_NUMBER() OVER (ORDER BY SalesOrderId)
FROM [AdventureWorks2012].Sales.SalesOrderDetail
CROSS JOIN [AdventureWorks2012].[Sales].[SalesPerson]
SELECT @@ROWCOUNT AS NumRows;

The third step runs an update statement that attempts to set the SomeNumber integer value to a character value as follows…

UPDATE dbo.SSISCheckPointTest
SET SomeNumber = 'A'
WHERE ID = 100;

Now lets run the package and the third task will fail

2

As we can see from the Progress tab the insert took 40 secs which was by far the longest part of the whole package.  If this was 40 minutes rather than 40 seconds we wouldn’t want to run the insert part again.  We’d just want to fix the failed step and then run it from there.

Just to confirm we can see that the insert was successful and we have some rows.

3

Now let’s set the package up so that it creates a checkpoint after doing the insert.  Firstly enable it in the package properties by setting the SaveCheckpoints property to True.  Then specify an XML file in the CheckpointFileName property.  This will record when each checkpoint is hit and which ones were successful.  It also stores the state of each variable.  Finally set the CheckpointUsage to IfExists.  This means that if the checkpoint file doesn’t exist then it will create it.  If the package completes successfully then the checkpoint file is deleted so the next time it runs it will start from the beginning.

4

Next we need to specify which tasks will create a checkpoint.  We want to be able to restart the package after the insert task so we will set a checkpoint on that task.  I’m also going to get a checkpoint on the truncate task.  To do this set the FailPackageOnFailure to True.

5

Now let’s run the package again.  As there was previously no checkpoints enabled the package starts from the beginning and fails again on the update task.

6

We can now see that the checkpoint XML file was created as follows. We can see that the SomeNumber variable has a value of 2062389.

7

Now let’s fix the update statement and change it to use the value of the SomeNumber variable in the update.  As you can see below the default value of the SomeNumber variable is 0 so if it get’s set to 2062389 this will prove that the checkpoints also remember the values of variables.

8

9

Now lets rerun the package.

10

As you can see only the update task ran and this time it was successful.  We can also see that the SomeNumber value for the row with ID = 100 has been set to the value in the SomeNumber variable.

11

The checkpoint file has also now been deleted so the next time the package executes it will start from the beginning.

Advertisements

JSON Support in SQL Server 2016

Just seen a blog from Aaron Bertrand about JSON support in SQL SERVER 2016. In my last job we wanted to export some data stored in SQL SERVER into Elasticsearch and converting large amounts of data into JSON documents was quite tricky to get working at first. There are lots of other database applications that store their data as JSON, for example, MongoDB and this looks like it will be helpful loading data from SQL SERVER into these if needed. Microsoft have got their own rival for the likes of MongoDB named DocumentDB and this stores its documents in JSON so I guess this JSON support is something they needed to add.

MongoDB University Courses Completed!

For the last few weeks I’ve been doing two of the free online courses at https://university.mongodb.com/.  If you’re interested in learning about MongoDB then they are definitely worth doing.  So long as you followed through the short videos the weekly homework and final exams are pretty straight forward.  I’ve just found out my scores and I got the following.

M101N – MongoDB for .Net Developers – 95%

M102 – MongoDB for DBAs – 95%

I’m now thinking about doing M202 and then maybe the certification but for the meantime back to SQL Server and exam 70-463 which is scheduled for 2nd July 2015.