Rencore GmbH on Thu, 31 Jan 2013 21:50:33
Taking SharePoint Solution Deployer, my open-source PowerShell deployment script, to the next level: Bill Simser gave me the idea of making the deployment even smoother on farms with multiple WFEs and a load balancer, in order to achieve a no-downtime deployment.
The basic idea is to deploy the solutions on each WFE one by one:
1. Take one WFE offline
2. Install the solution with the -Local switch
# Solution deployment
Install-SPSolution -Identity <solutionname>.wsp -GACDeployment -CASPolicies -Local
# Solution upgrade
Update-SPSolution -Identity <solutionname>.wsp -LiteralPath LocalPathOfTheSolution.wsp -GACDeployment -Local
3. Run post-deployment actions on the WFE (i.e. restart services, recycle app pools or do an IIS reset, warm up the server), which my script already does for each server
4. Take the WFE online again
5. Repeat steps 1-4 for all other WFEs
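The loop above could be sketched roughly like this. This is only an illustration: the server names are hypothetical, and Remove-FromLoadBalancer / Add-ToLoadBalancer are placeholder functions, since how you take a node out of rotation depends entirely on your load balancer.

```powershell
# Rough sketch of the rolling per-WFE deployment described above.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

$wfes = @("WFE01", "WFE02", "WFE03")                   # hypothetical server names
foreach ($wfe in $wfes) {
    Remove-FromLoadBalancer $wfe                       # 1. take the WFE offline (placeholder)
    Invoke-Command -ComputerName $wfe -ScriptBlock {   # 2. install only on this server
        Add-PSSnapin Microsoft.SharePoint.PowerShell
        Install-SPSolution -Identity "mysolution.wsp" -GACDeployment -Local
    }
    Invoke-Command -ComputerName $wfe -ScriptBlock {   # 3. post-deployment actions
        iisreset                                       #    (recycle IIS, warm up, ...)
    }
    Add-ToLoadBalancer $wfe                            # 4. take the WFE online again (placeholder)
}                                                      # 5. loop repeats for all WFEs
```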
I am struggling with three things here:
1. The whole deployment process could be quite risky if something goes wrong in between. In order to roll back, I would need the original solution if one was already deployed before (which I can of course back up before replacing it).
Anything which involves changing the content DBs should of course be done after the solution is deployed to the whole farm, so this should not hurt in this case.
Anyway, MSDN says that the "DeployLocal" method (which I assume is the same as the -Local switch in PowerShell) should only be used for troubleshooting purposes.
So it would be great to hear about anyone's experiences with it.
2. As there can be different types of load balancers (hardware, software) which might not be configurable through my script, I assume that taking the WFE out of the load balancer may not always be possible.
So I thought about just taking the server offline.
I haven't found an option yet to take only one server in the farm offline (without removing it from the farm, of course), so maybe I am missing something. Any ideas?
3. Before taking a single WFE offline, I would like to ensure that this server does not have any open sessions or ongoing user operations. Unfortunately, I have only found the possibility to quiesce the whole farm, but not a single server. Am I missing something?
Appreciate any ideas which might point me in the direction to solve the overall goal!
SharePoint Architect, Speaker, MCP, MCPD, MCITP, MCSA, MCTS, Scrum Master/Product Owner
Blog: www.matthiaseinig.de, Twitter: @mattein
CodePlex: SharePoint Software Factory, SharePoint Solution Deployer
Chris Givens on Fri, 01 Feb 2013 01:03:00
If you ever do figure this out, I want to know how you do it. I can't tell you how many times I have had the uncomfortable conversation about total yearly downtime when it comes to deploying changes to SharePoint. It would be nice if they just built this into the product and the community didn't have to do things like what you are doing.
But do wish you the best of luck...we are cheering for you!
Rencore GmbH on Fri, 01 Feb 2013 07:34:59
Thx for the moral support, Chris! ;-)
I'll keep you posted on any progress.
Bill mentioned the PowerShell cmdlets for the Windows software NLB cluster as something to look into,
which would help at least for this kind of LB, but I would rather prefer a more general approach.
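For the Windows NLB case, a minimal sketch using the NetworkLoadBalancingClusters cmdlets could look like this (the host name is a hypothetical example; these cmdlets must run on an NLB node or via remoting):

```powershell
Import-Module NetworkLoadBalancingClusters

# Drainstop: stop accepting new connections and wait up to 10 minutes
# for the existing connections on this node to finish
Stop-NlbClusterNode -HostName "WFE01" -Drain -Timeout 10

# ...deploy the solution on WFE01 here...

# Put the node back into rotation
Start-NlbClusterNode -HostName "WFE01"
```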
Also the quiescing problem for the single WFE will remain.
MAllen99 on Tue, 28 Oct 2014 22:52:01
Hi Matthias, I figured this might be better than starting a fresh discussion. I was wondering if you had found a decent solution for this.
Really, all that I've found is similar to what you have written. We are using MS NLB, so we can drainstop the NLB cluster so that one node stops taking new connections, but it takes a while to wait for the current connections to end.
This blog talks about their similar process. http://sharing-the-experience.blogspot.com/2012/06/sharepoint-farm-solution-deployment.html.
It seems the issue might be that it still uses the -Local switch which, as you mentioned, is not recommended by MS except when troubleshooting. It also discusses that -Local cannot be used when uninstalling a solution, so an uninstall and reinstall would cause an outage.
On a side note I'm curious if you are aware if Avepoint or any other software vendor has a solution for this.
Thanks for any update
Rencore GmbH on Wed, 29 Oct 2014 14:42:56
Unfortunately not. I tried several different approaches but didn't really succeed reliably with any of them, so eventually I gave up on it.
An interesting idea, though, is what Eric Hasley suggests in the comments on the blog post you mentioned:
"There is another approach that has worked for me in the past. Because the deployment to each server is handled through a timer job, by stopping the timer service in a controlled fashion you can rollout your solution without incurring any user outage."
It could work like that (in theory).
- Stop the SPTimerV4 on all servers in the farm apart from one.
- Take out the one to deploy to from the NLB
- Wait until it has no connections
- Deploy the solutions on it in the ordinary way (eg. with my SharePoint Solution Deployer ;))
- Put it back into the NLB and take the others out
- Wait until they have no connections left
- Activate the timer service on the other servers and let them deploy
- Put them back into the NLB
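In theory, the timer-service approach above might look something like this. This is an untested sketch: the server names are hypothetical, and the NLB steps are left as comments since they depend on your load balancer.

```powershell
$primary = "WFE01"                  # hypothetical: the server to deploy to first
$others  = @("WFE02", "WFE03")

# Stop the timer service on all servers except the first one
Invoke-Command -ComputerName $others { Stop-Service SPTimerV4 }

# (take $primary out of the NLB and wait until it has no connections)

# Ordinary farm deployment -- only $primary's timer service runs the job
Install-SPSolution -Identity "mysolution.wsp" -GACDeployment

# (put $primary back into the NLB, take $others out, wait for drain)

# Restart the timer service on the other servers so they deploy too
Invoke-Command -ComputerName $others { Start-Service SPTimerV4 }

# (put $others back into the NLB once their deployment timer jobs finish)
```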
No clue whether this actually works, and you still have the problem with the NLB, so it could take a while.
Also, I am not certain what happens in step 5 if users use different versions of your solutions at the same time (the old version on the remaining open connections, the new version on the updated server).
I do not have a suitable farm at hand to play with, though, so I can't test it.
Matthias Einig, CEO, SharePoint MVP
Blog: www.matthiaseinig.de, Twitter: @mattein
Projects: SharePoint Code Analysis Framework (SPCAF),SharePoint Code Check (SPCop), SharePoint Software Factory, SharePoint Solution Deployer
MAllen99 on Thu, 30 Oct 2014 15:45:07
Thanks for the reply and your time. I'll give it a test. The other possibility I was thinking of is similar and entails using the WSP solution setting that doesn't force an IIS reset. This way we can deploy without using the -Local switch; the solution is then in the GAC, just waiting on the IIS reset. Drainstop the 1st node, iisreset, allow connections back in; then drainstop the 2nd node, iisreset, and finally allow connections back in on that node.
Admittedly, this runs into the same issues that you mentioned regarding speed and/or any complications with running two versions.
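That variant could be sketched like this, assuming MS NLB and a WSP whose manifest suppresses the automatic IIS reset (the ResetWebServer attribute on the Solution element; node names are hypothetical):

```powershell
Import-Module NetworkLoadBalancingClusters

# Farm-wide deployment, no -Local; with ResetWebServer="false" in the
# manifest the new assembly waits in the GAC until IIS is recycled
Install-SPSolution -Identity "mysolution.wsp" -GACDeployment

foreach ($wfe in @("WFE01", "WFE02")) {                     # hypothetical nodes
    Stop-NlbClusterNode -HostName $wfe -Drain -Timeout 10   # drainstop the node
    Invoke-Command -ComputerName $wfe { iisreset }          # pick up the new version
    Start-NlbClusterNode -HostName $wfe                     # allow connections back in
}
```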
AhCheng on Fri, 23 Dec 2016 01:28:09
How I wish SharePoint could do no-downtime deployments. That would save a lot of our after-working-hours time :)