I’ve decided to skip exam 70-463 (Implementing a Data Warehouse with SQL Server 2012) and move onto 70-464. (See my previous post for the full 2012 certification path). Microsoft have reintroduced the second shot vouchers so I thought I take the plunge and book the exam for 26th September. If I do fail I have until May 2014 to resit for free.
I’ve had a look through the list of skills measured and have highlighted the following areas that I need to recap or improve my knowledge in.
I’m going to use this page to add links to various sources of information as I work through the different areas and will also add references to the 70-461 and 70-462 exam guide books if appropriate which will be the comments that begin with 70-461 or 70-462.
For information here is the full list of skills measured: http://www.microsoft.com/learning/en-us/exam-70-464.aspx
In no particular order:
CREATE NONCLUSTERED COLUMNSTORE INDEX
70-461: Chapter 15 Lesson 1
70-461: Chapter 17 Lesson 2
70-462: Chapter 10 Lesson 1
Introduced in 2005 (I think) the Common Language Runtime allows you to use the power of the .NET framework within your T-SQL code. You can create CLR triggers, stored procedures, functions, data types and aggregates and you compile the .NET code into an assembly (DLL) that you register in your database. You then create the appropriate SQL object that references the method in the assembly. Whenever you use the SQL object in your T-SQL code, the method in the .NET code is executed. This is quite a big topic and the majority of the work is in understanding what you can do with the .NET code and what properties you need to set in the methods you create.
To get started you need to enable CLR on your instance by running
EXEC sp_configure 'clr enabled', 1; RECONFIGURE;
You then register your assembly using something like
CREATE ASSEMBLY [SimonBlog] FROM path WITH PERMISSION_SET = SAFE
Here the PERMISSION_SET can be SAFE, UNSAFE or EXTERNAL_ACCESS. See links below for more details
Finally you create your SQL object using something like this.
CREATE AGGREGATE [dbo].[ConcatMagic] (@value [nvarchar](4000), @delimiter [nvarchar](4000)) RETURNS[nvarchar](4000) EXTERNAL NAME [SimonBlog].[ConcatMagic] GO
The EXTERNAL NAME bit specifies the method in the assembly that will be run whenever dbo.ConcatMagic is run.
When a column is marked as SPARSE the null values in that column do not use any storage on disk. However non-null values require more storage so SPARSE columns are only useful if most of the rows in that column contain null values. There is a threshold for each data type where the percentage of null values in the column could benefit from the use of SPARSE columns. The link below has full details.
Implicit and explicit conversions
An explicit conversion is where you use a CAST or CONVERT function to emplicitly convert a value from one data type to another. NB. CAST is a SQL standard whereas CONVERT is an addition to the T-SQL dialect. It’s considered good practise to use SQL standard ahead of language specific functionality so you should use CAST unless you need to make use of the CONVERT function’s formatting functionality.
An implicit conversion is where SQL Server is asked to compare data of two different types. Before it can do the comparison it implicitly converts the data type of one of the values to the data type of the other. There is a precedence that SQL Server uses to determine what data types get implicitly converted and there is a table on the CAST and CONVERT link below that shows what data types can be converted to what.
For example, if you have the column numberValue of INT data type in the table SimonTest, the following code will run successfully.
SELECT * FROM dbo.SimonTest WHERE numberValue = ‘2’
In this case the string value ‘2’ gets implicitly converted into an INT.
Implicit conversions can be costly and they should be avoided if possible.
Jes Schultz Borland has an article on the Brent Ozar blog that discusses
Identifying and Correcting SQL Server Implicit Conversion
Work with XML data
This is obviously a very big topic. The “stairway” on SQL Server Central is very good.
70-461: Chapter 7
SQL Server Central – Stairway to XML
Data version control and management
SQL Server 2012 has two pieces of functionality to handle data version controlling: Change Tracking and Change Data Capture.
My Blog – Change Tracking
Understand the difference between @Table and #Table
My Blog – Some Differences Between Temporary Tables and Table Variables
70-461: Chapter 16 Lesson 2
A table (or index) can be partitioned so that subsets of rows of data can be spread across multiple file groups. This is handy if your database has to handle lots of range queries. For example, if you have an application that runs queries for data for last month, you could partition your data by a date column and then the optimiser will know that it only has to read the data for that particular partition.
There are four basic steps to create a partitioned table…
- Create the file groups that you want to use.
- Create a partition function that maps subsets of rows of data into the different partitions based on the values in one of the columns.
- Create a partition schema that maps the individual partitions to the various file groups.
- Create or modify a table to use the partition schema.
70-462: Chapter 3 Lesson 2
Partitioned Tables and Indexes
70-462: Chapter 3 Lesson 1
Whenever an index is created on a table the optimiser generates statistics on how the values in the columns in the index are distributed across the table. The number of rows that a given value exists in is called the cardinality of the value. When a query runs the optimiser can then use these statistics do determine how selective the query is and we’re it would be more efficient to use an index seek rather than an index scan.
In the vast majority of cases you just let SQL Server maintain the statistics.
Depending on how they were created or updated and on the numbers of rows in the table, statistics will either be based on every row in the table or on a sample of the rows.
Statistics will automatically be updated if
- an empty table has a row added
- a table with fewer than 500 rows has more than 500 rows added
- a table with more than 500 rows has more than 500 rows added and the number of rows added is more than 20% of the total number of rows in the table.
70-462: Chapter 10 Lesson 1
- Identify fragmented indexes
- Identify unused indexes
There are several handy dynamic management objects (DMOs) that you can use to maintain indexes. Any cumulative counts that are returned are reset whenever the SQL Server service is reset.
sys.dm_db_index_physical_stats is a UDF that returns details about the physical structure of the heap or index. You use the parameters of the function to filter the objects you want to look at and you must supply at least a database_id value. Here’s an example of running this for the Person.Address table in AdventureWorks2012.
SELECT I.name AS IndexName, PS.* FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('Person.Address'), NULL, NULL, 'DETAILED') PS INNER JOIN sys.indexes I ON I.index_id = PS.index_id AND I.object_id = PS.object_id ORDER BY PS.index_id, PS.index_level;
This returns details such as the type of index (or whether it’s a heap), level within the index and page counts per level, number of records at each level. It also gives details of any fragmentation including a percentage.
For more details see sys.dm_db_index_physical_stats
sys.dm_db_index_operational_stats is function that returns details of how the index has been used. You use the parameters of the function to filter the objects you want to look at and you must supply at least a database_id value. Here’s an example of running this for the Person.Address table in AdventureWorks2012.
SELECT I.name AS IndexName, OS.* FROM sys.dm_db_index_operational_stats(DB_ID(), OBJECT_ID('Person.Address'), NULL, NULL) OS INNER JOIN sys.indexes I ON I.index_id = OS.index_id AND I.object_id = OS.object_id;
This will return cumulative counts of details such as inserts, updates and deletes at the leaf and non-leaf levels of the index, range scans and singleton lookups, locks and waits and lock escalation.
For more details see sys.dm_db_index_operational_stats
sys.dm_db_index_usage_stats is a view that returns details on the usage of indexes. Here’s an example of running this for the Person.Address table in AdventureWorks2012.
SELECT * FROM sys.dm_db_index_usage_stats WHERE database_id = DB_ID() AND object_id = OBJECT_ID('Person.Address');
This returns cumulative counts of index scans, seeks and lookups and also gives a date of the last time these occurred. This is good for checking whether you’re indexes are being used.
Design, implement and troubleshoot security
70-462: Chapter 6
Describe advantages/disadvantages of using a GUID as a clustered index
You can generate nonsequential global unique identifiers by using the NEWID() T-SQl function and you can store these in a column with the UNIQUEIDENTIFIER data type. An advantage is that GUIDs can also easily be generated from application code and are guaranteed to be unique across space and time.
You can generate sequential GUIDs by usnig the NEWSEQUENTIALID() T-SQL function.
A disadvantage is that UNIQUEIDENTIFER data types are 16 bytes so they use four times as much space as an INT data type. If you use a UNIQUEIDENTIFER type as a clustered index key then this will be included in any non clustered indexes that you create.
Use Spatial Data
Spatial Data Types Overview
extended events are a lightweight replacement for the existing SQL Trace functionality which is deprecated along with it’s GUI SQL Profiler.
It provides two GUIs, the New Session Wizard and the New Session UI which replace SQL Profiler.
There are several events that you can monitor including data from the operating system and other applications.
SQL Server Audit is based on extended events.