Windows Autopilot diagnostics: Digging deeper

By Michael Niehaus on July 12, 2020 • ( 9 Comments )

I’ve posted quite a few blogs talking about troubleshooting Windows Autopilot (such as this one), with some additional posts (three separate ones here, here, and here) talking about a script named Get-AutopilotESPStatus that can help display information about what went on during an Autopilot provisioning process. While initial versions of that script were designed to read information from the device itself, later versions added logic to extract information from a CAB file created by the standard “MDMDiagnosticsTool.exe -area Autopilot -cab c:\autopilot.cab” command line. Even with that, I found myself still digging through more of the logs to figure out the timeline of events. So back to Visual Studio Code to make some additional enhancements.

By the time I was done, it felt like the script went beyond the original name, so I’ve posted a new version under a new name: Get-AutopilotDiagnostics. (I’ll leave the old one around for a little while, but this one will replace it.) So let’s go through an example execution, first for a user-driven Azure AD Join scenario. The script can be installed directly from the PowerShell Gallery and then executed (make sure you enable script execution with “Set-ExecutionPolicy Bypass” or equivalent first). Since I included the “-online” switch, it will retrieve details about the app and policy names from Intune using Graph API, just to make the later output more useful.

The first section of output provides details about the device, tenant, and Autopilot profile.

After that, summary information about the items tracked by the Enrollment Status Page are displayed, broken down between device ESP (tracking device-targeted apps, certs, and policies during OOBE) and user ESP (tracking user-targeted apps, certs, and policies after the user has signed in).

Next is a new timeline view, showing not only ESP details but also important events from the process:

So you can see when the Autopilot profile was downloaded to the machine (right after internet connectivity was established), when the MDM enrollment completed (right after the user entered their Azure AD credentials in this case), the installation details for Intune Management Extensions (IME, a.k.a. sidecar), and then the tracking information for everything else. A few points to call out on that:

The times for profile download, sidecar installation, and (in Hybrid Azure AD Join cases) ODJ blob acquisition (more on that later) are exact times from event logs, while the times for the rest of the items are what ESP observed. Since ESP isn’t watching real-time (it polls every so often), these times aren’t exact to the second, but they do give you a decent idea as to the ordering and progress of the various items.
Notice how sidecar started to install about 39 seconds after the MDM enrollment? The instructions for that came from the initial Intune MDM sync, along with plenty of other policies. (For example, Office would have started installing at the same time.) But we’re still in the “Device preparation” phase, waiting for “Preparing the device for mobile management” – that really means “waiting for IME to provide a list of Win32 apps to track.” See this blog for more details on that.
See the “Policy …EntDMID” line? That’s really the point where the “Device setup” phase starts and ESP has started tracking. So we know at that point that Intune and IME have provided ESP the list of items to track. Since this is a “dummy” policy that is set right after enrollment, it is marked as “complete” immediately as ESP starts checking.
See the gap in time between 21:17 and 21:24? That’s the transition from device ESP to user ESP, where the time was spent first showing the first sign-in animation (FSIA) and then waiting for a list of user policies from Intune and IME.

So if you do some math based on those results, you can see how long the process took:

From enrollment through “Device preparation” took about 13 minutes.
From “Device setup” (ESP tracking) through the end of OOBE took about 4.5 minutes.
From user sign-in (which happened automatically) to the desktop took about 7 minutes. (We think we can make that faster, more on that in a future blog.)

Let’s look at a similar result from a user-driven Hybrid Azure AD Join scenario, first with the summary details:

There are two additional lines at the bottom, which indicate that the device has already received an offline domain join blob (a good sign that your ODJ Connector processed a request, created a computer account, provided the details to Intune, and then that the device received those details from Intune), and that the new “Skip connectivity check” option (see this blog for details on that) was enabled.

And then the timeline includes more details on the offline domain join (ODJ) process:

So you can see when the device started waiting for the ODJ blob (relayed by Intune, from the ODJ connector), and when it was received and applied. Only after that happens will the device reboot, display the ESP, and begin processing apps, certs, and policies. So we can do similar time calculations here:

It took just over a minute after the MDM enrollment to receive the ODJ blob. The computer would have then rebooted almost immediately after that.
There was then an almost 15-minute delay after the reboot before any additional policies were received from Intune, hence the installation of sidecar was delayed for that long as well. (Yes, we’re working on eliminating that delay.)
From when the “Device setup” phase started (with the EntDMID completion) to the end of device ESP took 10 minutes. (This is a Hybrid Azure AD Join device where user ESP was disabled, hence no user ESP items are seen in the timeline).

One final thought on this to wrap up. I was testing out this script with some logs sent to me by a customer (captured using the MDMDiagnosticsTool) and noticed an interesting output (with customer-specific identifiers erased):

See the problem? The MDM enrollment and ODJ steps happened *after* everything else completed. Obviously that’s impossible as the device can’t apply policies and install apps prior to being MDM enrollment or joined to AD. So what happened? Simple, the time changed because the real-time clock on the device was off by three hours (fairly common when you reimage a device, e.g. placing a PST [-8] image on a device that was running in EST [-5]). As soon as the ODJ blob was applied, a time sync was forced (since Kerberos is very time-sensitive), and that fixed the clock, shifting it back three hours. So to read through that timeline you need to remember that those 14:43 times were really 11:43 (matching up with the profile download, which recorded the *server* time that the profile was downloaded, not the client time). More on that in a future blog as well (as if I don’t have enough blog posts about time and time zones).

Categories: Windows Autopilot

9 replies »

Carl Barrett (@CarlBarrettUK) says:

July 13, 2020 at 7:06 am

Excellent addition again, thanks for posting this, i’m sure it will be useful in the near future

LikeLike
Rich Lilly (@richlilly) says:

July 13, 2020 at 2:32 pm

Awesome as always!

Have you seen this yet in any of your envs? I’m going to run it on a fresh build tomorrow.

RecordStatus : Cannot bind argument to parameter ‘status’ because it is an empty string.
At C:\Program Files\WindowsPowerShell\Scripts\Get-AutopilotDiagnostics.ps1:246 char:64
+ … etail “MSI $msiKey” -status $officeStatus[$status.ToString()] -color …
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : InvalidData: (:) [RecordStatus], ParameterBindingValidationException
+ FullyQualifiedErrorId : ParameterArgumentValidationErrorEmptyStringNotAllowed,RecordStatus

LikeLike
- Michael Niehaus says:
  
  July 13, 2020 at 2:43 pm
  
  I have seen similar errors before. If you can capture a set of logs with “mdmdiagnosticstool.exe -area Autopilot -cab c:\autopilot.cab” so I can reproduce it I should be able to work around it. (It’s a little tricky as the Office info available is different in CAB files vs. on a live system.)
  
  LikeLike
Rustybucket (@russellduma) says:

July 13, 2020 at 9:57 pm

Thanks Michael, I am trying to troubleshoot some erratic behaviour of autopilot at the moment. One day a two stage ESP works just fine, next day I get a reboot after Device setup. I check the intunemanagement log file and I see “win32App rebootRequiredState NotRequired ” for all the win32 apps (thats the only one I see).

I’ve removed apps and added apps but one time it works just fine, no reboot, no second login, then a reboot / second login occurs for 5 other devices or after a wipe on the same device that just worked.

Trawling through all the event logs now for a crumb as to why I am getting a reboot mid enrollment

LikeLike
Dave Wenger (@WengerDave) says:

July 16, 2020 at 1:06 am

Hey Michael, the script only works partital on my machines (different tenants), with the -online switch I get an error that the autopilotddsztdfile.json cannot be found and a second “cannot convert syste.datetime” (german OS, 1909). any idea? Br from rainy Switzerland

LikeLike
lfnnetworks says:

August 3, 2020 at 12:49 pm

Hi, Michael, this is a follow up to our MDT upgrade question in your June 3rd post.
(Lfnnetworks – posted June 12, 2020)

We performed an upgrade-in-place, from 8450 to 8456.
We have been deploying 1709, and will add a TS for 1909.
After upgrading to ADK\ADKPE 1903, WISM and MDT 8456, those 1709 deployments are successful, but there are a few issues:

1. Capture window now comes up in the Deployment Wizard for one deployment TS.
2. Admin password issue- ‘incorrect password’ after successful deployment(Same TS)
3. BDD log shows MDT version 6.3.8450 in Lite Touch, Wizard, and ZTIGather. All other logs correctly show 8456, as do version .xml and Help\About.
What would you recommend to resolve these issues? Mostly #2 and #3.
I found an entry where you advise reintegrating modified scripts after upgrading. Would this mean recreating the task sequences with the same info? Thanks!

LikeLike
Lee Eastham says:

August 4, 2020 at 1:40 am

Hi Michael,

I’ve tried this on two Autopilot devices and it simply report they aren’t Autopilot devices. One is AAD joined and working fine, the other was Hybrid joined but is failing on a single user app and I can’t find anything in the logs to tell me which app it is. Every time I build another device via Autopilot another issue appears, it’s been 4 months now and I thought I had finally got it working but alas no.

Thanks
Lee

LikeLike
lfnnetworks says:

August 4, 2020 at 7:24 am

Update:
The admin password issue seems to be tied to that 1709 task sequence, so we’re testing a workaround.
That only leaves the BDD log issues.
Thanks for your help!

LikeLike
lfnnetworks says:

August 11, 2020 at 6:35 am

The BDD log issues have been resolved, thanks again

LikeLike