I’ve done a variety of blogs on troubleshooting Windows Autopilot, which you can read up on for historical reference:
- Troubleshooting Windows AutoPilot (level 100/200)
- Troubleshooting Windows AutoPilot (level 300/400)
- Troubleshooting Improvements in Windows Autopilot
- TPM Attestation: What can possibly go wrong?
And there are troubleshooting notes in several other blogs as well. And yet, reading back through these I realize a few things:
- I’ve learned a lot about troubleshooting Windows Autopilot, MDM, Azure AD, etc. since then.
- There’s always more to learn.
- The overall troubleshooting process is still too difficult. I would go as far as saying it sucks to a certain degree – even after looking at hundreds of sets of logs, it’s still easy to get lost. (Yes, we want to implement improvements to make it easier overall – those are on our list.)
So I wanted to take a step back and talk about a suggested methodology to use for troubleshooting. Let’s start with the basics. If you run into an issue, use the Intune portal to open a support case via the “Help and Support” node. (They are free support cases, included with your Intune subscription, so take advantage of them.) You should be prepared to provide a few things:
- A useful description of the problem. While this seems obvious, you’d be surprised at the number of “it didn’t work” descriptions. How far did you get in the process? What error did you see? Did it work before and is now failing? What scenario are you using (user-driven, self-deploying, white glove; AAD vs. Hybrid AAD)?
- A screenshot or photo of the error. I’ve seen page-long descriptions of the problem encountered, and still am no closer to understanding the situation. A screenshot goes a long way.
- A full set of logs. We’ll always ask you to run the MDMDiagnosticsTool to create a CAB file. You will need to upload that file somewhere, or enclose it in a zip file before sending it, because O365 will strip CAB files. (You’ll slow down the process if it takes two days for us to get a CAB file.)
Let’s focus in on that last point for a bit. You should run MDMDiagnosticsTool on Windows 10 1809 and above to collect logs. (If you are still using older Windows 10 versions, you really should move forward – the MDM and Autopilot capabilities are much better. On older versions, you would need to use the older LicensingDiag.exe. That’s tied back to Autopilot’s initial implementation that was built on top of the Windows 10 licensing and activation stack. But with newer versions, you’ll get much better info from MDMDiagnosticsTool.) You might see two different variations on the command, based on the scenario that you are performing:
- MDMDiagnosticsTool.exe -area Autopilot -cab c:\Autopilot.cab
- This gathers most of the available logs related to Windows Autopilot, OOBE, MDM, Azure AD, etc. It also gathers the hardware details (via the hardware hash), registry information, and much more. We’ll go through that in detail in a moment.
- MDMDiagnosticsTool.exe -area Autopilot;TPM -cab c:\autopilot.cab
- This adds additional TPM diagnostics to the previous set of logs. Those details would be needed when you are having issues with a self-deploying or white glove scenario. There’s no harm in gathering them in other scenarios, but there are cases where the additional TPM info can generate error messages or popups, so to avoid people thinking that’s somehow related to their issue, you can leave it off for other scenarios.
After you run the command, you can copy off the resulting CAB file to a USB key, a file share, or wherever you want. The support personnel should be able to provide details on how to get the file to them. Once you’ve done that, others on the team are able to look at those logs to help diagnose the issue.
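If you collect these logs often, the two command variations and the "wrap the CAB in a zip" step can be scripted. What follows is only a minimal sketch in Python, not an official tool: the function names (`build_command`, `zip_cab`, `collect_logs`) are made up for illustration, and it assumes MDMDiagnosticsTool.exe is on the path of the Windows device you run it on. Passing the arguments as a list also sidesteps any shell quoting questions around the semicolon in `Autopilot;TPM`.

```python
import subprocess
import zipfile
from pathlib import Path


def build_command(cab_path, include_tpm=False):
    """Return the MDMDiagnosticsTool argument list for the given scenario."""
    # Add the TPM area only for self-deploying or white glove issues.
    area = "Autopilot;TPM" if include_tpm else "Autopilot"
    return ["MDMDiagnosticsTool.exe", "-area", area, "-cab", str(cab_path)]


def zip_cab(cab_path):
    """Wrap the CAB in a zip file next to it and return the zip path."""
    cab_path = Path(cab_path)
    zip_path = cab_path.with_suffix(".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # Store only the file name, not the full folder path.
        archive.write(cab_path, arcname=cab_path.name)
    return zip_path


def collect_logs(cab_path, include_tpm=False):
    """Run the tool (Windows only) and zip the resulting CAB."""
    subprocess.run(build_command(cab_path, include_tpm), check=True)
    return zip_cab(cab_path)
```

Calling `collect_logs(r"c:\Autopilot.cab", include_tpm=True)` on the affected device would then leave a zip file next to the CAB, ready to send.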
Alright, but what if you wanted to try to figure it out yourself? First, it’s useful to understand what’s inside that CAB file (focusing on Windows 10 1903 – earlier versions will have fewer files), so let’s go through a list.
|File||Priority||Description|
|CloudExperienceHostOobe.etl.*||Low||ETL trace files.|
|LicensingDiag.cab||Low||If you’re running into Windows activation issues, you might care about this, but otherwise, it’s not useful for Autopilot troubleshooting.|
|AgentExecutor.log||Low||This is picked up from the Intune Management Extensions log folder (C:\ProgramData\Microsoft\IntuneManagementExtension\Logs) but I’ve never found anything useful in it.|
|AutopilotConciergeFile.json||Low||At this point, this file is not used.|
|AutopilotDDSZTDFile.json||High||This file contains the Autopilot profile settings being used for the device.|
|CertReq_enrollaik_Output.txt||High||This file only exists when the TPM area is included. It provides a simulation of the TPM attestation process and logs the results, so it’s useful to see why the “real” TPM attestation might be failing.|
|CertUtil_tpminfo_Output.txt||Medium||This file only exists when the TPM area is included. It provides more details about the TPM chip or firmware used in the device.|
|DeviceHash_*.csv||High||This contains the serial number and full hardware hash for the device. While that hash might not look useful to you, it tells us a lot about the device, including the version of Windows 10, patches that are installed, TPM firmware version, and a lot more stuff.|
|DiagnosticLogCSP_Collector_Autopilot.etl||Low||ETL trace files.|
|DiagnosticLogCSP_Collector_Autopilot.etl.merged||Low||ETL trace files.|
|DiagnosticLogCSP_Collector_DeviceEnrollment.etl||Low||ETL trace files.|
|DiagnosticLogCSP_Collector_DeviceProvisioning.etl||Low||ETL trace files.|
|IntuneManagementExtension.log||High||This log will capture excruciating detail about the installation of Win32 apps being deployed via Intune. (Use one of the ConfigMgr log viewing tools, e.g. CMTrace.exe, to view this.)|
|LicensingDiag_Output.log||Low||This captures the output of the LicensingDiag.exe command that generated the previously-mentioned LicensingDiag.cab.|
|MDMDiagHtmlReport.html||Medium||This is the same report you can get from the Settings app that provides more details on all the MDM policies that have been applied to the device.|
|MdmDiagLogMetadata.json||Low||This records the areas that were specified on the MDMDiagnosticsTool command line (or those added automatically).|
|MDMDiagReport.xml||Medium||This is a machine-readable XML version of the HTML report above.|
|MdmDiagReport_RegistryDump.reg||Medium||This dumps the contents of a variety of registry keys that are useful for determining the state of the machine, including MDM enrollment details, Autopilot details, and related info. Support technicians may use this to find related information in Intune.|
|MdmLogCollectorFootPrint.txt||Low||This shows everything that MDMDiagnosticsTool tried to collect and put into the CAB file.|
|microsoft-windows-aad-operational.evtx||High||This event log shows Azure AD join and Hybrid Azure AD Join-related info.|
|microsoft-windows-appxdeploymentserver-operational.evtx||Low||This event log shows details from UWP app installations.|
|microsoft-windows-assignedaccess-admin.evtx||Low||This event log contains events related to kiosk configuration.|
|microsoft-windows-assignedaccessbroker-admin.evtx||Low||This event log contains even more events related to kiosk configuration.|
|microsoft-windows-assignedaccessbroker-debug.evtx||Low||This event log contains even more events related to kiosk configuration.|
|microsoft-windows-assignedaccessbroker-operational.evtx||Low||This event log contains even more events related to kiosk configuration.|
|microsoft-windows-assignedaccess-operational.evtx||Low||This event log contains even more events related to kiosk configuration.|
|microsoft-windows-devicemanagement-enterprise-diagnostics-provider-admin.evtx||High||This event log covers MDM enrollment (including failure reasons) and other pertinent MDM activities.|
|microsoft-windows-devicemanagement-enterprise-diagnostics-provider-debug.evtx||Low||This event log is usually empty.|
|microsoft-windows-devicemanagement-enterprise-diagnostics-provider-operational.evtx||Low||This event log has lots of MDM-related activity in it, but I’ve never found any of it to be of any value.|
|microsoft-windows-moderndeployment-diagnostics-provider-autopilot.evtx||High||This is the key event log used by Autopilot, and one that you’ll almost always want to look at.|
|microsoft-windows-moderndeployment-diagnostics-provider-managementservice.evtx||Low||This event log has some Autopilot-related activity in it, but this is more “housekeeping” stuff that isn’t typically useful.|
|microsoft-windows-provisioning-diagnostics-provider-admin.evtx||Low||This event log contains events related to the application of provisioning packages (PPKGs), which are used to configure some Windows default settings. Typically you can ignore this one.|
|microsoft-windows-shell-core-operational.evtx||Medium||This is the event log that the shell uses for most things, including tracking the OOBE process, registering apps when a user signs in, etc.|
|microsoft-windows-user device registration-admin.evtx||Medium||This event log shows details around Hello for Business and related configuration details.|
|setupact.log||Medium||If you are familiar with the logs created by Windows Setup, you’ll recognize this one. This logs all the stuff going on in OOBE, and can be useful for troubleshooting any OOBE weirdness.|
|TpmHliInfo_Output.txt||High||This log (which is created even when not specifying the TPM area) contains basic details about the TPM in the device: the manufacturer, the firmware level of that TPM, whether it has a required EK cert, etc.|
While that might seem intimidating, if we focus initially on only the ones marked High in the list above, it’s a little easier to follow. Let’s walk through this somewhat methodically.
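To make that first pass quicker, you could filter an extracted CAB down to just the high-priority files. This is a rough sketch that assumes you have already extracted the CAB into a folder; the helper name `high_priority_files` is hypothetical, and the patterns are simply the High entries copied from the list above.

```python
from fnmatch import fnmatch
from pathlib import Path

# High-priority names taken from the list above.
HIGH_PRIORITY_PATTERNS = [
    "AutopilotDDSZTDFile.json",
    "CertReq_enrollaik_Output.txt",
    "DeviceHash_*.csv",
    "IntuneManagementExtension.log",
    "microsoft-windows-aad-operational.evtx",
    "microsoft-windows-devicemanagement-enterprise-diagnostics-provider-admin.evtx",
    "microsoft-windows-moderndeployment-diagnostics-provider-autopilot.evtx",
    "TpmHliInfo_Output.txt",
]


def is_high_priority(name):
    """Case-insensitive match of a file name against the patterns."""
    lowered = name.lower()
    return any(fnmatch(lowered, pattern.lower()) for pattern in HIGH_PRIORITY_PATTERNS)


def high_priority_files(extracted_dir):
    """Return the sorted names of high-priority files found in a folder."""
    return sorted(
        path.name
        for path in Path(extracted_dir).iterdir()
        if path.is_file() and is_high_priority(path.name)
    )
```

Some of these files only exist in certain cases (for example, the CertReq output needs the TPM area), so a shorter result is not necessarily a problem.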
First, look at the AutopilotDDSZTDFile.json and check these settings:
- CloudAssignedDomainJoinMethod. If this is 0, the device has been configured to join Azure AD. If it is 1, the device has been configured to join AD (ODJ, Hybrid Azure AD Join).
- DeploymentProfileName. This will tell you the name of the Autopilot profile that was assigned to this device.
- CloudAssignedOobeConfig. This is a set of flags (described here) that specify how OOBE should behave (e.g. which pages should be skipped). For user-driven scenarios, you’ll typically see a value of 28 (user should not be an admin) or 30 (user should be an admin), although that could change as new settings are added. For self-deploying, bits 5 (32), 6 (64), and 7 (128) will typically be set, so the value will be bigger. (Note that you won’t see a value for white glove, as that is checked service-side, not on the client.)
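Those three checks are easy to do by eye, but a small script can make the bit arithmetic less error-prone. The sketch below is an illustration only: the function names are made up, it assumes the JSON stores these settings as top-level numeric values, and the "looks self-deploying" test is just the heuristic described above (bits 5, 6, and 7 all set), not an official definition.

```python
import json

SELF_DEPLOYING_BITS = 32 | 64 | 128  # bits 5, 6, and 7

JOIN_METHODS = {0: "Azure AD", 1: "AD (ODJ, Hybrid Azure AD Join)"}


def summarize_profile(profile):
    """Summarize the settings discussed above from a parsed profile dict."""
    oobe_config = int(profile.get("CloudAssignedOobeConfig", 0))
    return {
        "profile_name": profile.get("DeploymentProfileName"),
        "join": JOIN_METHODS.get(profile.get("CloudAssignedDomainJoinMethod"), "unknown"),
        "oobe_config": oobe_config,
        # Positions of the bits that are set, lowest first.
        "set_bits": [bit for bit in range(oobe_config.bit_length()) if (oobe_config >> bit) & 1],
        "looks_self_deploying": (oobe_config & SELF_DEPLOYING_BITS) == SELF_DEPLOYING_BITS,
    }


def summarize_profile_file(path):
    """Load AutopilotDDSZTDFile.json from disk and summarize it."""
    # utf-8-sig tolerates a byte-order mark if the file has one (an assumption).
    with open(path, encoding="utf-8-sig") as handle:
        return summarize_profile(json.load(handle))
```

For example, a value of 28 decodes to bits 2, 3, and 4, and 30 adds bit 1, which matches the two user-driven values mentioned above.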
So now you should know which scenario (user-driven vs. self-deploying) and join method (AAD vs. AD/ODJ/Hybrid AADJ) the device is using. Next, open the Autopilot event log (microsoft-windows-moderndeployment-diagnostics-provider-autopilot.evtx) and look for errors. If you see any errors like this:
AutopilotManager reported that MSA TPM is not configured for hardware TPM attestation even though the profile indicates it is required. Autopilot cannot proceed
Then you know the device is trying to use white glove or self-deploying mode, and TPM attestation is failing. Check this blog for more info on that. As a quick summary, there are a few other items you can look at:
- TpmHliInfo_Output.txt. If you see an error that there is a missing EK cert, that’s a problem.
- CertReq_enrollaik_Output.txt. Look for errors communicating with the test TPM attestation service, or processing the resulting certificate response. (This can indicate a problem with the EK cert, fixed in a later 1903 cumulative update – see the known issues list – or a problem with the date/time on the device.)
If you know that TPM attestation has succeeded, but you’re still seeing an error in the Autopilot event log with error 0xC1036501:
Autopilot discovery failed to find a valid MDM. Confirm that the AAD tenant is properly provisioned and licensed for exactly one MDM.
Then make sure there is only one MDM app defined in Azure AD (as described in my previous blog).
If you received any “something went wrong” errors with an 8018* error code, check the microsoft-windows-devicemanagement-enterprise-diagnostics-provider-admin.evtx to see what caused the MDM enrollment error. You should find error events there that describe why the enrollment failed.
The reason text in those events is what you’re looking for.
If you received any “something went wrong” errors with an 801C* error code, check the microsoft-windows-aad-operational.evtx and microsoft-windows-shell-core-operational.evtx event logs to see what caused the Azure AD Join failure.
If you get an error like “OOBEIDPS”, check the microsoft-windows-shell-core-operational.evtx event log and setupact.log to see if they reported any network-related errors.
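The three rules of thumb above amount to a small lookup from the error you saw to the logs worth opening first. Here is one way to write that down, purely as a sketch: the table and the function name `logs_to_check` are illustrative, and the mapping covers only the cases mentioned in this post.

```python
# Error prefix -> logs to open first, as described above.
LOGS_BY_ERROR_PREFIX = {
    "8018": [
        "microsoft-windows-devicemanagement-enterprise-diagnostics-provider-admin.evtx",
    ],
    "801C": [
        "microsoft-windows-aad-operational.evtx",
        "microsoft-windows-shell-core-operational.evtx",
    ],
    "OOBEIDPS": [
        "microsoft-windows-shell-core-operational.evtx",
        "setupact.log",
    ],
}


def logs_to_check(error_text):
    """Return the logs suggested for an error code, or an empty list."""
    code = error_text.strip().upper()
    if code.startswith("0X"):
        code = code[2:]  # Accept codes written with a 0x prefix.
    for prefix, logs in LOGS_BY_ERROR_PREFIX.items():
        if code.startswith(prefix):
            return logs
    return []
```

An empty result just means the error isn't one of the three cases covered here, not that there is nothing to look at.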
If you get a timeout, things get really interesting:
- If there is a timeout during ESP and you’re using conditional access, check the IntuneManagementExtension.log to see if there are any AAD conditional access-related errors.
- If there is a timeout during user ESP and it’s a hybrid AADJ scenario, check the microsoft-windows-aad-operational.evtx event log to see if the device registration process completed. You’ll have to do this by omission: You’ll see an event that says “Device is not cloud domain joined: 0xC00484B2” (event 1089) every few minutes until the device registration process completes, which can take up to 30 minutes (as AAD Connect only syncs every 30 minutes). When that event stops, the device has been registered. (Then you may see events about the user not having an AAD user token…)
- If you’ve added or changed an app recently and now you’re getting ESP timeouts, the app may not be configured properly. Check the IntuneManagementExtension.log to see if any apps failed to install. Look for entries like this:
[Win32App] Got InstanceID: Win32App_aaaaaaaa-70f7-4570-8394-92909f0f9919_1, installationStateString: 2
State “2” means the install failed; state “3” means the install was successful. Right now, if an app install fails, the failure is not reported by ESP, so it will sit until the timeout happens.
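Since the IntuneManagementExtension.log can be enormous, it may be easier to pull those entries out with a script than to scroll for them. The following is a minimal sketch that assumes the entries look like the sample line above; the function names are hypothetical, and only states 2 and 3 are interpreted because those are the only ones described here.

```python
import re

# Matches entries shaped like the sample line shown above.
STATE_PATTERN = re.compile(
    r"\[Win32App\] Got InstanceID: (?P<instance>\S+?), "
    r"installationStateString: (?P<state>\d+)"
)

STATE_MEANING = {"2": "failed", "3": "succeeded"}


def win32_app_states(log_text):
    """Return (instance_id, state, meaning) for each matching log entry."""
    results = []
    for match in STATE_PATTERN.finditer(log_text):
        state = match.group("state")
        results.append((match.group("instance"), state, STATE_MEANING.get(state, "other")))
    return results


def failed_apps(log_text):
    """Return the instance IDs whose install state was 2 (failed)."""
    return [instance for instance, state, _ in win32_app_states(log_text) if state == "2"]
```

You would read the log file into a string and pass it to `failed_apps`; the GUID portion of each instance ID is what you would then try to match up with an app.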
There are probably 101 other possible failure causes, which may get into more logs or even the ETL trace files (which you can try to use with Windows Performance Analyzer, Microsoft Message Analyzer, TRACEFMT, or similar tools – more challenging). At some point, you may need to admit defeat and open a support case. But hopefully you can make an initial stab at it.