SSIS Server Maintenance - cleanup_server_retention_window Job error

Category: sql server ssis

Question

Sivasankar chinnaiyan on Fri, 21 Dec 2018 00:24:54


we are receving the following error message in sql server 2017(CU11) when the SSISDB Maintenance job runs. this job scheduled daily once and SSISDB database recovery model is simple. we have observed this issue after upgraded to SQL 2017 not for all the SQL servers. many SQL server with 2017 are completing without any error. 
when run this cleanup proc step, its blocking other SSIS DB jobs also.

Date 12/10/2018 12:00:00 AM
Log Job History (SSIS Server Maintenance Job)

Step ID 1
Server AZ4OBIUCMLSQL03
Job Name SSIS Server Maintenance Job
Step Name SSIS Server Operation Records Maintenance
Duration 00:34:21
Sql Severity 16
Sql Message ID 16916
Operator Emailed
Operator Net sent
Operator Paged
Retries Attempted 0

Message
Executed as user: ##MS_SSISServerCleanupJobLogin##. A cursor with the name 'execution_cursor' does not exist. [SQLSTATE 34000] (Error 16916)


SQL version : 

Microsoft SQL Server 2017 (RTM-CU11) (KB4462262) - 14.0.3038.14 (X64)   Sep 14 2018 13:53:44   Copyright (C) 2017 Microsoft Corporation  Enterprise Edition:


I have referred below links but no able to fix the issue.

https://social.msdn.microsoft.com/Forums/sqlserver/en-US/9ce94c74-d552-48ec-8621-8be3124c53bc/ssisdb-2017-cu8-running-in-always-on-group-maintenance-procedure-bug?forum=sqlintegrationservices

https://nakulvachhrajani.com/2016/09/19/0386-sql-server-cursor-scope-a-cursor-with-the-name-cursor-name-does-not-exist-msg-16916/

https://stackoverflow.com/questions/50517509/ssis-server-maintenance-job-error

Replies

Yang.Z on Fri, 21 Dec 2018 11:25:59


Hi Sivasankar, 

-->we have observed this issue after upgraded to SQL 2017 not for all the SQL servers.

Do you have any idea about the difference between these "good" server and "failed" server? 

As Pirlo suggested in the thread you post, you can try to uninstall the CU temporarily and feedback on SQL Server User Voice

Tom Phillips on Fri, 21 Dec 2018 14:02:14


I believe this is due to the encryption level changed without the log cleaned up properly.

Try following the instructions the "Remarks" section, but don't change the encryption level:

https://docs.microsoft.com/en-us/sql/integration-services/system-stored-procedures/catalog-cleanup-server-log?view=sql-server-2017#remarks

Sivasankar chinnaiyan on Fri, 21 Dec 2018 22:03:26


There is no difference in that. I can see only difference is number of records in ssisdb log tables.

Yang.Z on Tue, 25 Dec 2018 08:54:39


Hi Sivasankar, 

Have you tried the solution proposed by Tom Phillips? 

Does that help you solve the issue? 

Hoping for your reply. 

Sivasankar chinnaiyan on Wed, 26 Dec 2018 23:51:54


Hi Yang , We couldn't try now and will try after year end and update you. thank you for following up on this.

Yang.Z on Thu, 27 Dec 2018 01:51:05


Hi Sivasankar, 

You are welcome.  

Thanks for your update:-)

justjoedbdude on Fri, 04 Jan 2019 03:27:51


Resolution: The cursor deallocate statement on line 175 of proc "ssisdb.internal.cleanup_server_retention_window" needs to be moved two lines down.

Cause: IF/ELSE blocks run depending on the SSIS catalog encryption level, and each block opens a cursor within a loop itself.  The 'ELSE' part of the IF/ELSE logic simply has the 'deallocate' statement within its loop instead of outside of it like the first block properly has.

Posted on: 2019-01-01

SQL Version: Microsoft SQL Server 2017 (RTM-CU12) (KB4464082) - 14.0.3045.24


Harshal Lokhande on Mon, 14 Jan 2019 09:50:36


Hi JustJoedbdude,

Thanks for your help. I am colleague of SivaSankar and we tested this change in Test environment and it does solved the failure. However the job runs longer than expected and also creating blocking for other jobs. 

You mentioned SQL Version: Microsoft SQL Server 2017 (RTM-CU12) (KB4464082) - 14.0.3045.24 here. Does CU12 upgrade resolved this issue if you have noticed somewhere?

We are currently on CU11.

Steve Rezhener on Fri, 18 Jan 2019 04:34:24



Steve Rezhener on Fri, 18 Jan 2019 04:42:11


With so many BEGIN and END blocks, no wonder DEALLOCATE ended up in the wrong place. Looks like this change was introduced with IF @server_operation_encryption_level = 1. So moving DEALLOCATE from line #175 to #177 should fix the problem, but ... IMHO, all 4 "garbage collectors" CLOSE, TRUNCATE, DEALLOCATE, and DROP x 2 for each IF condition) should completely removed and replaced with x1 time reference (CLOSE, TRUNCATE, DEALLOCATE, and DROP) on line #179.

Steve Rezhener on Fri, 18 Jan 2019 05:27:41


We are on CU12. Experienced an identical problem. Moving DEALLOCATE statement has fixed the problem. We've increased the batch size from 1000 to 10000 to make it faster => SET @delete_batch_size = 10000. The job was failing for 1.5 months, so there was a lot of operations to delete. The job is working again.