As part of my preparation for the 70-761 exam I’ve been trying to get a full understanding of the different types of window functions which are as follows:
I know the ranking functions pretty well but haven’t used the aggregate and offset functions at all really.
This third and final post demonstrates the offset functions.
I’ve been going through the official exam guidebook by Itzik Ben-Gan and he explains how the window functions work very well and the whole book is certainly worth reading if you use T-SQL at all.
With every window function you use the OVER clause. This is the part that specifies the “window” you are ranking or aggregating over.
The Offset Functions
For the demonstrations I’m going to use the same data set I used in the ranking function demo and create a table with 11 rows that contain purchases for 3 customers as follows…
CREATE TABLE dbo.WindowFunctionDemo ( Id INT NOT NULL IDENTITY(1,1), CustomerID INT NOT NULL, PurchaseDate DATETIME2(0) NOT NULL, PurchaseItem VARCHAR(100) NOT NULL, Amount DECIMAL(10,2) NOT NULL, ); GO INSERT INTO dbo.WindowFunctionDemo ( CustomerID, PurchaseDate, PurchaseItem, Amount ) VALUES (1, '20180306 12:34', 'Rugby Ball', 14.99), (1, '20180307 09:54', 'Frying Pan', 21.50), (1, '20180307 10:02', 'Vacuum Cleaner', 259.990), (2, '20180225 15:26', 'Orchid', 13.62), (2, '20180228 18:47', 'Book', 7.99), (2, '20180301 08:02', 'Snakes and Ladders', 8.00), (2, '20180306 14:32', 'Pillows', 20.00), (2, '20180306 14:32', 'Washing Machine', 359.99), (2, '20180307 23:15', 'Jeans', 59.99), (3, '20180301 09:27', 'Rucksack', 13.99), (3, '20180301 09:27', 'Scarf', 12.50); GO SELECT * FROM dbo.WindowFunctionDemo; GO
The offset window functions allow you to return data from other rows based on their distance away from the current row. For example, we might want to return all of the purchases made today but in the same row return the value of the previous purchase that the customer made.
The data set must be ordered so the functions can find previous and following rows correctly.
The following example returns all the data in the table along with the amount of the previous and next purchases each customer has made…
SELECT *, LAG(Amount) OVER (PARTITION BY CustomerID ORDER BY PurchaseDate) AS PreviousAmount, LEAD(Amount) OVER (PARTITION BY CustomerID ORDER BY PurchaseDate) AS NextAmount FROM dbo.WindowFunctionDemo; GO
You can pass a second parameter into the LAG and LEAD functions to access other rows and not just the ones immediately either side of the current row. This example returns the Amount value from the two previous orders…
SELECT *, LAG(Amount) OVER (PARTITION BY CustomerID ORDER BY PurchaseDate) AS PreviousAmount, LAG(Amount, 2) OVER (PARTITION BY CustomerID ORDER BY PurchaseDate) AS SecondPreviousAmount FROM dbo.WindowFunctionDemo; GO
If not supplied, a default value of 1 is used as this parameter so
is the same as
There is also a third parameter that can be passed in that will convert NULLs to a default value…
SELECT *, LAG(Amount, 1, -100) OVER (PARTITION BY CustomerID ORDER BY PurchaseDate) AS PreviousAmount, LAG(Amount, 2, -200) OVER (PARTITION BY CustomerID ORDER BY PurchaseDate) AS SecondPreviousAmount FROM dbo.WindowFunctionDemo; GO
There are also FIRST_VALUE and LAST_VALUE offset functions that return the first and last values in the window frame.
This example returns the Amount value of the first purchase for each customer…
SELECT *, FIRST_VALUE(Amount) OVER (PARTITION BY CustomerID ORDER BY PurchaseDate) AS FirstPurchaseAmount FROM dbo.WindowFunctionDemo; GO
We can do something similar to find the last value in the window frame… (however there is something you need to be careful with here!!!)
SELECT *, LAST_VALUE(Amount) OVER (PARTITION BY CustomerID ORDER BY PurchaseDate) AS LatestPurchaseAmount FROM dbo.WindowFunctionDemo; GO
As you can see this is not showing the results we might expect. We might expect to see 259.99 in the LatestPurchaseAmount for all rows for CustomerID = 1. If you read my post on the aggregate window functions you might remember that the default window frame is…
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
This means that for each row the window frame, i.e. the data available to the offset function, is all previous rows up to and including the current row. As we’re ordering by PurchaseDate, the window frame for the first purchase by customer 1 of the Rugby Ball on 2018-03-06 12:34:00 only contains that single row. The window frame is restricting access to the rows for later purchases.
All we need to do to get the required result is make sure we’re using the correct window frame, i.e. make sure the frame includes the latest purchase for each customer. We can do this as follows…
SELECT *, LAST_VALUE(Amount) OVER ( PARTITION BY CustomerID ORDER BY PurchaseDate ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ) AS LatestPurchaseAmount FROM dbo.WindowFunctionDemo; GO
We could also use ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING which spreads the window frame across the whole partition.