Thursday, September 1, 2016

TempDB usage per active session

From time to time you find yourself needing to shrink some space out of TempDB. Shrinking database files is never my first choice but sometimes it is the best I have. Many people think that you cannot shrink TempDB in SQL 2005, but I am going to show you how.
Why would I need to shrink TempDB?
Yesterday afternoon my pager started going crazy because an ad hoc query that needed some tuning filled TempDB on a server. Luckily, the user only impacted their own query, so it was easy to identify them quickly and work with the right people to get the query rewritten.
Once the immediate problem was resolved there had to be some cleanup. On this server, TempDB has 32 files (1 per processor) all on the same disk. The full database condition caused all kinds of alerts in our monitoring tools, from drive space alerts to too few growths remaining. There were 3 possible solutions to quiet the alerts:
1. Reboot – There is never a good time to reboot a production server
2. Turn off the Alerts – Not really an option. My preference would be for increasing the sensitivity
3. Shrink TempDB – Not a great option, but the best of the 3
Shrinking TempDB
Once we had decided that we would go ahead and shrink the files in TempDB it seemed like the hard part was done, but after running the following command:
USE [tempdb]
DBCC SHRINKFILE (N'tempdev' , 5000)
I got back the following:
DBCC SHRINKFILE: Page 1:878039 could not be moved because it is a work file page.
DbId   FileId   CurrentSize   MinimumSize   UsedPages   EstimatedPages
------ -------- ------------- ------------- ----------- --------------
2      1        878040        640000        4672        4672

(1 row(s) affected)

DBCC execution completed. If DBCC printed error messages, contact your system administrator.
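
The sizes in that result set are counts of 8 KB pages, not megabytes, so dividing by 128 converts them. As a quick sanity check on the numbers above (the column aliases are just illustrative names):

```sql
-- DBCC SHRINKFILE reports sizes in 8 KB pages; 128 pages = 1 MB.
SELECT 878040 / 128.0 AS current_size_mb,  -- about 6,860 MB still allocated
       640000 / 128.0 AS minimum_size_mb,  -- exactly 5,000 MB
       4672   / 128.0 AS used_mb;          -- 36.5 MB actually in use
```

In other words, the file was almost entirely empty and the shrink still stopped right at the end of the file, on that one work file page.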

“Page could not be moved because it is a work file page.” ...grrr. This is new behavior in SQL 2005, caused by the caching that is done in TempDB. I am not going to try to explain here how objects are cached in TempDB, but Kalen Delaney’s Inside SQL Server series is a great place to learn about it if you are interested. What is important is that the cached objects are tied to a query plan, and that by freeing the procedure cache you can make those objects go away, allowing you to shrink your files.
Trying again, this time freeing the procedure cache first:
DBCC FREEPROCCACHE
USE [tempdb]
DBCC SHRINKFILE (N'tempdev' , 5000)
This time it worked:

DBCC execution completed. If DBCC printed error messages, contact your system administrator.
DbId   FileId   CurrentSize   MinimumSize   UsedPages   EstimatedPages
------ -------- ------------- ------------- ----------- --------------
2      1        640000        640000        264         264

(1 row(s) affected)

DBCC execution completed. If DBCC printed error messages, contact your system administrator.
I think I got lucky that the shrink worked on the first try. There will certainly be times when you have to try freeing the procedure cache and shrinking multiple times to get a file to shrink, but eventually it will get the job done.
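
If you do end up repeating the free-and-shrink cycle, a small loop saves some typing. This is only a sketch under a few assumptions: the logical file name tempdev and the 5000 MB target are taken from the commands above, and the cap of 5 attempts is an arbitrary number I picked for illustration. Keep in mind that DBCC FREEPROCCACHE throws away every cached plan on the instance, so each pass has a cost.

```sql
USE [tempdb];

DECLARE @attempt int, @max_attempts int;
SET @attempt = 0;
SET @max_attempts = 5;   -- arbitrary cap (assumption), so the loop cannot run forever

-- sys.database_files.size is in 8 KB pages; dividing by 128 gives MB.
WHILE @attempt < @max_attempts
  AND EXISTS (SELECT 1
              FROM sys.database_files
              WHERE name = N'tempdev'
                AND size / 128 > 5000)
BEGIN
    DBCC FREEPROCCACHE;                   -- drops all cached plans, instance-wide
    DBCC SHRINKFILE (N'tempdev', 5000);   -- same target as before, in MB
    SET @attempt = @attempt + 1;
END;
```

If the file is still too large after the last attempt, something is most likely actively using TempDB, and it is time to look at who that is rather than keep flushing the cache.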

Tempdb is hard to shrink. Some pages cannot be moved because they are actively being used by system processes, which lowers your chances of a successful shrink.

I see that you have already tried clearing all the caches, which sometimes helps but is not guaranteed to work.
Try to identify what is using tempdb; the script at the end of this post lists usage per active session.

Even if usage is low, a single non movable page is enough to make the shrink process ineffective.
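
Before going session by session, it can help to see which kind of object is holding the space. A minimal sketch using sys.dm_db_file_space_usage (the column aliases are my own naming):

```sql
USE [tempdb];

-- Page counts are 8 KB pages, so dividing by 128 gives MB.
SELECT SUM(user_object_reserved_page_count)     / 128.0 AS user_objects_mb,
       SUM(internal_object_reserved_page_count) / 128.0 AS internal_objects_mb,
       SUM(version_store_reserved_page_count)   / 128.0 AS version_store_mb,
       SUM(unallocated_extent_page_count)       / 128.0 AS free_mb
FROM sys.dm_db_file_space_usage;
```

A large user_objects_mb points at temp tables and table variables, internal_objects_mb at sorts, hashes and spools, and version_store_mb at row versioning.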

If you can't shrink it, I suggest that you plan a downtime to restart the service and let tempdb restart from its initial size.

The script below lists TempDB usage for each active session.
It helps identify the sessions that use tempdb heavily with internal objects.

When internal object usage is high, the session is probably using big hash tables or spooling in worktables. It could be a symptom of an inefficient plan or a missing index.

Shrinking a TempDB full of internal objects will probably have no effect, because the engine will not release the deallocated space.
The only alternative to restarting the service is running DBCC FREESYSTEMCACHE('ALL'), which clears all cached objects: not only internal objects, but also cached query plans. Use it carefully on a production server.

;WITH task_space_usage AS (
    -- SUM alloc/dealloc pages per session and request
    SELECT session_id,
           request_id,
           SUM(internal_objects_alloc_page_count) AS alloc_pages,
           SUM(internal_objects_dealloc_page_count) AS dealloc_pages
    FROM sys.dm_db_task_space_usage WITH (NOLOCK)
    WHERE session_id <> @@SPID
    GROUP BY session_id, request_id
)
SELECT TSU.session_id,
       TSU.alloc_pages * 1.0 / 128 AS [internal object MB space],
       TSU.dealloc_pages * 1.0 / 128 AS [internal object dealloc MB space],
       -- Extract statement from sql text; fall back to the whole batch text
       -- when the offsets yield nothing. Offsets are 0-based byte positions
       -- in Unicode text, hence the / 2 and the + 1.
       ISNULL(
           NULLIF(
               SUBSTRING(
                   EST.text,
                   ERQ.statement_start_offset / 2 + 1,
                   CASE WHEN ERQ.statement_end_offset < ERQ.statement_start_offset THEN 0 ELSE ( ERQ.statement_end_offset - ERQ.statement_start_offset ) / 2 + 1 END
               ), ''
           ), EST.text
       ) AS [statement text],
       EQP.query_plan
FROM task_space_usage AS TSU
INNER JOIN sys.dm_exec_requests ERQ WITH (NOLOCK)
    ON  TSU.session_id = ERQ.session_id
    AND TSU.request_id = ERQ.request_id
OUTER APPLY sys.dm_exec_sql_text(ERQ.sql_handle) AS EST
OUTER APPLY sys.dm_exec_query_plan(ERQ.plan_handle) AS EQP
-- Heaviest internal object consumers first
ORDER BY TSU.alloc_pages DESC;
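
One thing to keep in mind: sys.dm_db_task_space_usage only covers tasks that are running right now. Pages allocated by tasks that have already finished are tracked per session in sys.dm_db_session_space_usage, whose counters are updated when a task ends. A companion sketch, not part of the script above, with column aliases chosen to match it:

```sql
-- Space used by already-completed tasks, per session.
SELECT SSU.session_id,
       SSU.internal_objects_alloc_page_count   / 128.0 AS [internal object MB space],
       SSU.internal_objects_dealloc_page_count / 128.0 AS [internal object dealloc MB space],
       SSU.user_objects_alloc_page_count       / 128.0 AS [user object MB space],
       SSU.user_objects_dealloc_page_count     / 128.0 AS [user object dealloc MB space]
FROM sys.dm_db_session_space_usage AS SSU WITH (NOLOCK)
WHERE SSU.session_id <> @@SPID
  AND SSU.database_id = 2   -- tempdb is always database_id 2
ORDER BY SSU.internal_objects_alloc_page_count DESC;
```

A session that shows little in the first script but a lot here has probably finished its heavy statement already and may still be holding user objects such as temp tables.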