How to provide hard statistics on your build sequences.
A while back my boss gave me two goals for our OS deployments: 1) a target of 90% successful builds and 2) build times as close to 1 hour as possible. Okay, getting there is one thing, but how do I report on that?
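To make those two targets concrete, here is a minimal Python sketch of the arithmetic behind them. It assumes you have already pulled a list of build results out of your site by some means; the `summarize_builds` function and the (succeeded, duration) tuple layout are hypothetical and not part of Configuration Manager.

```python
from datetime import timedelta


def summarize_builds(builds):
    """Summarize build results.

    builds: list of (succeeded, duration) tuples, where succeeded is a bool
    and duration is a timedelta. This layout is an assumption for illustration.
    """
    total = len(builds)
    if total == 0:
        return {"total": 0, "success_rate": 0.0, "avg_minutes": 0.0}

    # Only successful builds count toward the average build time.
    good = [duration for succeeded, duration in builds if succeeded]
    success_rate = len(good) / total * 100

    if good:
        avg_minutes = sum(good, timedelta()).total_seconds() / len(good) / 60
    else:
        avg_minutes = 0.0

    return {"total": total, "success_rate": success_rate, "avg_minutes": avg_minutes}


if __name__ == "__main__":
    sample = [
        (True, timedelta(minutes=60)),
        (True, timedelta(minutes=70)),
        (False, timedelta(minutes=20)),
        (True, timedelta(minutes=50)),
    ]
    print(summarize_builds(sample))
```

Failed builds are left out of the average on purpose, since a build that dies after 20 minutes would otherwise make the build times look better than they are.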
Configuration Manager 1511
After completing the first three parts of this series you would have a virtual lab with 4 separate network segments all connected to and routed through a Windows 2012 R2 server (RTR01) acting as the router. This server will also provide Internet access to any virtual machines that are connected to the 4 network segments. You also would have an Active Directory domain controller (DC01) that provides DHCP and DNS services to the lab.
In Part 4 we are going to build out a Configuration Manager 1511 infrastructure. This will include a Primary site server (CM16) and 2 Distribution Points (DP16a & DP16b).
We’ve amassed a very large number of task sequences since migrating to Configuration Manager 2012 and it got me thinking about ways to archive off older sequences so that we can clean house. So I came up with this script.
The script will first collect all of the task sequences in your site. Next it will enumerate through them, write the sequence to an XML named after the task sequence ID and finally create a CSV index of the TSID, task sequence name, the last modified date of the sequence and when the sequence was backed up.
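The flow described above can be sketched in Python as follows. This is not the script itself, only an illustration of the steps under stated assumptions: the task sequences are passed in as plain dictionaries with hypothetical keys (`id`, `name`, `last_modified`, `xml`), and how you retrieve them from your site is left out.

```python
import csv
from datetime import datetime
from pathlib import Path


def archive_task_sequences(sequences, out_dir, backed_up=None):
    """Write each sequence to <TSID>.xml and build a CSV index.

    sequences: iterable of dicts with the hypothetical keys
    'id', 'name', 'last_modified' and 'xml'.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    backed_up = backed_up or datetime.now()

    index_path = out / "index.csv"
    with index_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["TSID", "Name", "LastModified", "BackedUp"])
        for ts in sequences:
            # One XML file per task sequence, named after its ID.
            (out / f"{ts['id']}.xml").write_text(ts["xml"], encoding="utf-8")
            writer.writerow(
                [ts["id"], ts["name"], ts["last_modified"], backed_up.isoformat()]
            )
    return index_path
```

Keeping the index in a CSV means you can sort by the last modified date later and decide which of the archived sequences are safe to delete from the site.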
Ran into a problem deploying build 10061 using SCCM 2012 R2. It would get as far as the standard “Setup Windows and ConfigMgr” action, reboot into the full OS and fail to continue the OSD task sequence. My test machine would join the domain and I could log in, but it was as if it just got bored and gave up on running the rest of the task sequence.
I would open up Control Panel and there would not be a Configuration Manager applet. The client folder (C:\Windows\CCM) didn’t exist either.
Did some searching and ran across this posting by Jörgen Nilsson. This was my exact problem.
The workaround is to skip the Windows Update Agent installation. So in my task sequence I added “/skipprereq:windowsupdateagent30-x64.exe” to the Installation Properties:
And now I’m off and running.
I’ve been working on integrating Windows 10 into our environment and ran across a couple of issues (or rather, learning opportunities) while doing so.
Upgrading from 9926 to 10049
First off I hit a snag attempting to upgrade my test machines from build 9926 to build 10049. The SCCM Team published a blog article back in October of last year on how to use a task sequence to upgrade a client to Windows 10. You can find the article here. A couple of weeks ago I had the opportunity to work with Paul Winstanley (SCCMentor and WMUG author) on a live blog he was writing on upgrading from one build to another using this method. In the lab environment it worked wonderfully, but when I tried it outside of the lab it failed every time in my environment.
Now, your first question might be along the lines of, “Why are you upgrading builds like that?” That is a good question. I cannot use Windows Update to upgrade my machines as new builds come out because the company I work for uses a combination of Group Policy and SCCM client policies to block access to WU and use WSUS/SCCM as the source for all of our updates. So I have to upgrade builds outside of the native process, hence using the task sequence in SCCM to perform the upgrade.
This was an old problem that I first ran into last Spring and I gave up after getting nowhere. I had forgotten all about it until this morning when a friend and fellow SCCM warrior Paul Winstanley wrote and asked me about it as he was getting the same failure. (Check out his writings here and here.)
First, some background…
Back in May 2014 I was having problems getting the Export-CMDriverPackage and Export-CMTaskSequence PowerShell cmdlets working. At the time I was looking for a way to easily move content from our development site to our production site.
This is probably one of those “Duh” moments that we all have but I thought I’d share it anyway.
I was getting frustrated when I was importing the MAC address of a new, out of the box computer into SCCM 2012 to be used to test my latest development build. I had a testing collection used solely for testing this new build. I’d import the computer and have the wizard place it into my testing collection, but it would never show up. I’d search All Systems and it would be there, so I knew the import worked. I tried importing using a CSV. I tried adding the resource manually. I tried adding the object to the collection using a query. Nothing worked. No matter what I tried the imported computer would not appear in my test collection.
Then I noticed that my test collection was limited to a custom collection we have set up for only Windows 7 computers. I could only chuckle and laugh at myself for missing that in the first place.
So, what happened?
When you import a computer into SCCM it is added to the All Systems collection. That’s why when I searched All Systems I could find my imported machine. If, while going through the Import Computer wizard, you specify a collection to add it to, SCCM will create a direct membership rule for your newly imported computer account.
Where things went off the tracks for me was that my test collection was limited to that Windows 7 custom collection. That collection was, for the record, built using a query that looks at the OS info returned from Hardware Inventory. Since my imported computer had never reported inventory the query to scoop it up would pass right over it. So it only sat in All Systems, and since my test collection was not looking at All Systems it would never find it. No matter what I wanted.
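A small Python sketch may make the limiting behavior easier to see. It models collections as plain sets, which is a simplification of what SCCM does, and the function and computer names are made up for illustration.

```python
def collection_members(direct_members, query_members, limiting_members):
    """Return the effective members of a collection.

    Simplified model: membership rules (direct or query) propose members,
    but only resources that are also in the limiting collection are kept.
    """
    candidates = set(direct_members) | set(query_members)
    return candidates & set(limiting_members)


if __name__ == "__main__":
    # Hypothetical names: NEWPC01 was just imported and has no inventory yet.
    all_systems = {"NEWPC01", "WIN7-001", "WIN7-002"}
    windows7_only = {"WIN7-001", "WIN7-002"}  # built from Hardware Inventory

    # Test collection limited to the Windows 7 collection: the import is dropped.
    print(collection_members({"NEWPC01"}, set(), windows7_only))

    # Same direct rule, but limited to All Systems: the import shows up.
    print(collection_members({"NEWPC01"}, set(), all_systems))
```

The direct membership rule is identical in both calls; only the limiting collection changes, and that alone decides whether the imported machine appears.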
Moral of the story?
If you’re going to be using a test collection for something like build testing, be sure that it is limited to the All Systems collection if you’re going to be importing new, out of the box computers for testing.
[Edit: I found a better analogy to explain what happened. See the bottom of this post]
Over the last three weeks we’ve been hit by an intermittent outage that knocked our SCCM infrastructure essentially offline for hours at a time. It would mysteriously start and after 2-4 hours it would mysteriously stop. During that time you could PXE boot a machine but it would not find any task sequences available and WinPE would reboot out from under the system. We were running down leads on network problems, DNS issues, WINS problems, SCCM infrastructure problems, just about everything under the sun. The problem would correct itself though before we could get anywhere.
We started combing through the status messages during the last outage and found something unusual. During the outage there was a flood of 5101 status messages (“Policy Provider successfully updated the policy and the policy assignment that were created from package…”) from the SMS_POLICY_PROVIDER log. A flood to the tune of nearly 8000 per hour. It appeared that just about every package in every task sequence we had was having its policy updated.
Our next problem was finding out what triggered all of these policy updates. Digging deeper into the status messages found that immediately prior to the 5101 messages pouring in was a 30001 message (“User domain\user modified the Package Properties of a package…”) showing that someone had modified the properties of one of our task sequences in development.
That someone was me.
At this point things fell together. The morning of the latest outage I had been working on a new development OSD task sequence. We use NomadBranch from 1E in our environment. The product has extensions that add a “Nomad” tab to the property page, which lets you configure the software’s settings. On the properties of a task sequence you can ensure that all packages referenced will be configured correctly.
That morning I enabled the “Enable Nomad” check box. The Nomad extensions then cycled through all 112 packages referenced by the task sequence and ensured that setting was enabled on each and every one. A very convenient option. It prevents us from having to manually check each and every package to ensure that the Alternate Content Provider is set.
Great, except that modifying the Alternate Content Provider is one of those package properties that triggers a policy update in SCCM. And if a package requires a policy update, SCCM will cycle through all references to that package and update the policies for all deployments/advertisements for those references.
So for each of those 112 packages SCCM then found every other task sequence that referenced them. And then for each of those instances it would initiate a policy update for each and every deployment.
This problem is not a NomadBranch issue though. You can accomplish the same thing with any mass-manipulation of the package properties. If you use a script to alter the “Disconnect users from distribution points” option (found on the Data Access tab) on a series of packages, SCCM will start cross referencing each and every package, find all of the task sequence deployments that reference that package, and update the policy on them. Then it will repeat the process for the next package and so on and so on and so on….
This is easy to duplicate.
[Warning, do not attempt this in a production environment!]
Within the SCCM 2012 console open the Monitoring node. Then expand the System Status branch and select Status Message Queries. Right-click on the All Status Messages query and select Show Messages.
Now, select a package that you know is in a couple of task sequences. Perhaps the OS image, or a driver package. Something that will be referenced by multiple sequences. Open the properties of that package, select the Data Access tab and toggle the “Disconnect users from the distribution point” option and click Apply.
Go back to your status message query and refresh (F5). You should right away see the 30001 and 23xx messages showing that you have updated a package and it is being processed by SCCM. Within a few moments the 5101 messages should appear, one for every package+sequence+deployment combination. Now, imagine that multiplied by every package within your task sequence.
What’s the moral of this story?
Use caution when doing any kind of mass update of package properties.
In our situation the flood of policy updates appears to have overwhelmed our Management Points. They were too busy fielding the policy updates to handle policy requests from the systems attempting to start OS deployments.
Is this Nomad’s Fault or SCCM’s Fault?
We shot ourselves in the foot though and brought this down on ourselves. What did us in was the vast number of deployments/advertisements we have out there. If we had been better stewards of SCCM and cleaned up after ourselves this wouldn’t have knocked us out of the water. We had hundreds of stale, out of date deployments that had never been cleaned up after they were done. It was those old deployments that acted as the gas being poured on the fire.
When explaining what happened I came up with a better explanation of what exactly caused the problem.
It was this vast number of deployments that brought the house down. It was like a series of nested FOREACH statements….
FOREACH package in the task sequence
    FIND all other task sequences that reference the package
    FOREACH of those task sequences
        FIND each deployment for that task sequence
        FOREACH of those deployments
            UPDATE the policy
That’s a lot of multipliers there. That’s what ultimately killed us, the large number of deployments (~500) we had lingering around, most of which were out of date.
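To put rough numbers on those multipliers, the nested loops above can be written as a short Python sketch. It only counts how many policy updates a change would trigger in this simplified model; the data structures and the IDs in the example are hypothetical and nothing here talks to a real site.

```python
def count_policy_updates(changed_packages, ts_packages, ts_deployments):
    """Count policy updates triggered by changing a set of packages.

    ts_packages:    dict of task sequence ID -> set of package IDs it references
    ts_deployments: dict of task sequence ID -> list of deployment IDs
    One update is counted per package + sequence + deployment combination.
    """
    updates = 0
    for package in changed_packages:
        for ts_id, packages in ts_packages.items():
            if package in packages:
                updates += len(ts_deployments.get(ts_id, []))
    return updates


if __name__ == "__main__":
    ts_packages = {"TS1": {"PKG1", "PKG2"}, "TS2": {"PKG1"}}
    ts_deployments = {"TS1": ["DEP1", "DEP2"], "TS2": ["DEP3"]}
    print(count_policy_updates(["PKG1", "PKG2"], ts_packages, ts_deployments))
```

Even in this toy example, two packages, two sequences and three deployments produce five updates. Removing stale deployments shrinks the innermost factor, which is why cleanup matters so much here.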
Had we been better about cleaning up old, out of date deployments I don’t think we would have ever had an issue.
Okay, it’s been a busy few days but I finally needed to get back to this topic.
In part 1 I talked about the custom actions needed to deploy a Windows 10 image using SCCM 2012. I’m re-inventing my OS deployment process and eliminating the use of a custom-built base image and instead will be using the factory WIM file and performing all customizations at build-time. We had trimmed down the custom actions used to generate the base image over time, so I felt it was a good time to cut the custom base image loose and streamline the process.
In this part I’ll go over the changes I needed to make to get the .NET 3.5 feature added. Windows 10 included .NET 4.5 but we have some applications that specifically require .NET 3.x.
I could not use the native “Install Roles and Features” step in the SCCM task sequence. Since my task sequence is applying the factory WIM as an image and not running it through setup, I cannot reference the SXS files in the OS source. Windows 10 is imported as an Operating System Image and not an Operating System Installer so there are no supporting files.
To get .NET 3.5 installed will take a little creativity.
I want to thank Niall Brady for the info on this process.
[If you’re looking for some serious information on how to do SCCM, his site is the place.]
We have 2 “Run Command Line” actions required to get this done.
- Make the needed SXS files available
- Use DISM to add the .NET feature
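As a rough illustration of what those two actions might look like, here is a Python sketch that only builds the two command lines as strings. The details are assumptions on my part rather than the exact steps from the task sequence: that the SXS files are copied to a local folder with xcopy, that the folder is C:\sxs, and that DISM is run against the online OS. The `netfx3_commands` helper is hypothetical.

```python
def netfx3_commands(sxs_source, sxs_target=r"C:\sxs"):
    """Build the two command lines for adding .NET 3.5.

    sxs_source: wherever your SXS files live (assumption: a package folder).
    sxs_target: local staging folder (assumption, not from the original post).
    """
    # Step 1: make the SXS files available locally.
    copy_cmd = ["xcopy", sxs_source, sxs_target, "/E", "/I", "/Y"]

    # Step 2: enable the NetFx3 feature from the local source only.
    dism_cmd = [
        "dism.exe",
        "/Online",
        "/Enable-Feature",
        "/FeatureName:NetFx3",
        "/All",
        f"/Source:{sxs_target}",
        "/LimitAccess",
    ]
    return " ".join(copy_cmd), " ".join(dism_cmd)


if __name__ == "__main__":
    # The source path below is a placeholder.
    for line in netfx3_commands(r".\sxs"):
        print(line)
```

The /LimitAccess switch keeps DISM from reaching out to Windows Update for the files, which matters in an environment where access to WU is blocked.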