It feels like I’ve written this blog before – many times actually. But given the amount of interest recently, it’s time to cover the topic again: How to troubleshoot Windows Autopilot Hybrid Azure AD Join. This process involves the following steps:
Here’s a description of those numbered steps:
1. The device will send its hardware hash to the Windows Autopilot services.
2. If the device is registered with Windows Autopilot and has an Autopilot profile assigned to it, the profile details will be provided to the device. In the Hybrid Azure AD Join case, the profile would tell the device what Azure AD tenant the device is associated with and that the device needs to be joined to Active Directory, but it does not specify the Active Directory domain details.
3. The user will be prompted for their Azure Active Directory credentials (or if using white glove, the device will perform TPM attestation) to get an Azure AD token; that token will be used to enroll the device in Intune. Intune will be notified as part of the enrollment process that it needs to get the device joined to Active Directory.
4. Intune will look for a Domain Join device configuration profile assigned to the device (via the groups that device is part of). Assuming it finds one, it will create a request for the Offline Domain Join connector (officially named the “Intune Connector for Active Directory”). If it doesn’t find one, steps #5 and #6 will never happen, and the device will time out waiting for an ODJ blob that will never come.
5. The ODJ connector picks up the ODJ request from the Intune service (it polls Intune looking for requests). If it finds a request, it will attempt to create an Active Directory object in the specified domain and OU using the naming prefix specified (all from that Domain Join profile). If that succeeds, it will upload the resulting ODJ blob representing that computer account to the Intune service.
6. When the device performs its next MDM sync (usually every 3 minutes, possibly even more frequently), it will receive that ODJ blob from Intune and apply it to the device. If the “skip connectivity check” setting is specified in the Autopilot profile, the device will immediately reboot to complete the domain join process. If the “skip connectivity check” setting is not specified, or if the device doesn’t meet the requirement for that setting, the device will first try to ping a domain controller for the domain (to ensure connectivity) before rebooting. If that ping test never succeeds, this step will time out and you’ll never get to step #7.
7. Finally, the user needs to sign into the device using Active Directory credentials, which need to be validated by an Active Directory domain controller, hence connectivity is required at this point; VPN connectivity can be used. See this post for more details on that.
So what can possibly go wrong? There are a few points of failure in this process:
- There is no Domain Join profile targeted to the device.
- The ODJ connector can’t create the ODJ blob for the device.
- The device can’t establish connectivity.
- The user can’t sign in.
- User ESP times out after the user signs in.
Strangely, I routinely see people run into one of these issues and then someone else will say “I’m seeing the exact same problem.” And more often than not, that’s not at all true. Being able to recognize the differences between these failures is a key troubleshooting skill. You should be able to answer the following questions:
- Did the ODJ connector process a request and upload a blob for the device?
- Did the device receive and apply the ODJ blob?
- Did the device try to check connectivity?
- If using a VPN connection off the corporate network, can it connect so the user can sign in?
- Can the Hybrid Azure AD Join process complete so the user can get an Azure AD user token, needed to talk to Intune and other Azure AD-based services?
Methodically, you need to make your way down that list.
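As a rough way to picture that walk down the list, here is a minimal Python sketch that maps a handful of observations to the most likely failure point. The `Observations` fields and the function name are hypothetical, invented for illustration; the sketch assumes the deployment did fail somewhere and only encodes the reasoning described in this post.

```python
from dataclasses import dataclass


@dataclass
class Observations:
    """Hypothetical yes/no findings gathered while troubleshooting."""
    connector_saw_request: bool     # request events near the deployment time
    connector_logged_errors: bool   # errors while creating the computer account
    odj_blob_applied: bool          # the device received and applied the blob
    connectivity_established: bool  # check passed (or was skipped)
    user_signed_in: bool            # AD sign-in worked


def likely_failure_point(o: Observations) -> str:
    """Return the first failure point consistent with the observations."""
    if not o.odj_blob_applied:
        if o.connector_logged_errors:
            return "The ODJ connector can't create the ODJ blob for the device."
        if not o.connector_saw_request:
            return "There is no Domain Join profile targeted to the device."
        return "Undetermined: the connector handled a request but no blob was applied."
    if not o.connectivity_established:
        return "The device can't establish connectivity."
    if not o.user_signed_in:
        return "The user can't sign in."
    # Everything up to sign-in worked, so the remaining candidate is the user ESP.
    return "User ESP times out after the user signs in."


print(likely_failure_point(Observations(False, False, False, False, False)))
```

The order of the checks matters more than the code: each answer only means something once the earlier ones are known.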
Did the ODJ connector process a request and upload a blob for the device?
This might be the first thing on the list, but it’s usually the second thing I check. In any case, it’s at least something that’s reasonably easy to do by checking the event log on the server running the ODJ Connector (“Intune Connector for Active Directory”). You’ll see the log like so:
And as I mentioned back in 2018, there’s a lot of noise in that event log, so it’s useful to filter that noise out: click the “Filter current log” link and enter “-30121,-30150” in the event ID field. That tells Event Viewer to include all events except 30121 and 30150 (the dash at the beginning means “don’t include”). After you do that, you can see more interesting stuff:
The series of three events (30120, 30130, 30140) shows a request being downloaded (30120), processed (30130), and uploaded (30140). Looks good, right? You might think so, but pay attention to the date and time: If you don’t see any events from around the time you deployed the machine, the service never got a request. And since I deployed my device on 7/19 and the last successful request was on 7/18, that’s proof that the connector never got a request for my device. So time to move to question #2 to continue troubleshooting.
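The same reasoning can be sketched in a few lines of Python, assuming the events have already been exported as (timestamp, event ID) pairs. The function names, the one-hour window, and the year in the sample data are assumptions made for illustration; nothing here reads the real event log.

```python
from datetime import datetime, timedelta

NOISE_EVENT_IDS = {30121, 30150}          # the events filtered out above
REQUEST_SEQUENCE = (30120, 30130, 30140)  # downloaded, processed, uploaded


def filter_noise(events):
    """Drop the noisy events, keeping everything else."""
    return [e for e in events if e[1] not in NOISE_EVENT_IDS]


def request_completed_near(events, deployed_at, window=timedelta(hours=1)):
    """True if 30120, 30130, 30140 occur in order close to the deployment time."""
    remaining = list(REQUEST_SEQUENCE)
    for timestamp, event_id in sorted(filter_noise(events)):
        if abs(timestamp - deployed_at) > window:
            continue  # ignore events from other time periods
        if remaining and event_id == remaining[0]:
            remaining.pop(0)
    return not remaining


# Sample data mirroring the situation above: last success on 7/18, deployment on 7/19.
events = [
    (datetime(2020, 7, 18, 10, 0, 0), 30120),
    (datetime(2020, 7, 18, 10, 0, 5), 30130),
    (datetime(2020, 7, 18, 10, 0, 9), 30140),
    (datetime(2020, 7, 19, 9, 0, 0), 30121),
]
print(request_completed_near(events, datetime(2020, 7, 19, 9, 30)))  # False
```

A successful sequence from the day before says nothing about today's deployment, which is why the time window is part of the check.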
If there were any errors recorded (again, during the timeframe that you were trying to deploy the device – don’t get thrown off by errors during any other time period) while creating the computer account, it could represent a problem with the delegation of rights to the service account (the computer account) or a problem with the OU path that you specified (should always be “OU=something, DC=yourdomain, DC=com” and should never start with “CN=”). Fix the issue, try again.
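For the OU path in particular, a quick sanity check along these lines can catch the common mistakes before you redeploy. This is a minimal sketch with a hypothetical function name; it splits naively on commas, so it does not handle escaped commas inside a name, and it does not check that the OU actually exists.

```python
def check_ou_path(path: str) -> list:
    """Return a list of problems found in an OU path (empty list means it looks fine)."""
    problems = []
    parts = [p.strip() for p in path.split(",") if p.strip()]
    first = parts[0].upper() if parts else ""
    if first.startswith("CN="):
        problems.append('path starts with "CN="; it should start with "OU="')
    elif not first.startswith("OU="):
        problems.append('path should start with "OU="')
    if not any(p.upper().startswith("DC=") for p in parts):
        problems.append('path has no "DC=" components')
    return problems


print(check_ou_path("OU=something, DC=yourdomain, DC=com"))  # []
print(check_ou_path("CN=Computers, DC=yourdomain, DC=com"))
```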
Also, while you’re looking at the log, make it bigger. The default size isn’t big enough, given the “noise” events. Add a zero or two to the maximum size from the “Properties” page:
Did the device receive and apply the ODJ blob?
This is usually where I start, since there’s no need to check the event log on the ODJ Connector server if you can see proof on the device that it received the ODJ blob. And with the new Get-AutopilotDiagnostics script, that’s pretty easy to check. First, recognize the symptoms of what will happen if an ODJ blob is not received. The device would have already enrolled in Intune, but then failed after a period of time (around 25 minutes) with an error that looks something like this:
The error code could be 80070002 (basically “not found”), 80070774 (“domain not found”), or 80004005 (effectively just a generic error, usually caused by a timeout). But don’t assume that these error codes always mean “an ODJ blob wasn’t received” – it’s not that simple. Make sure you understand where the error happened: the user authenticated, the device was enrolled in Intune, and then there was an error prior to the ESP tracking the device provisioning steps. There would have been no reboot (which is triggered after the ODJ blob is received).
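So how do we see if the ODJ blob was received? Just press Shift-F10 to open a command prompt, run PowerShell.exe, then run “Set-ExecutionPolicy bypass” and “Install-Script Get-AutopilotDiagnostics” to install the script, and then run it to see the diagnostics output.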
There are two key items in that output:
- ODJ applied: No
- Timed out waiting for ODJ blob
Both of these are telling you the same thing: The device never got an ODJ blob. (But we already knew that because there were no events in the ODJ event log on the server.) Why? Because there was no “Domain Join” profile targeted to the device. I did that on purpose for this device, but it can certainly happen accidentally too, especially if you target that profile to a group containing only the Azure AD Join device created for the device when you registered it with Autopilot. If you then went through a full Hybrid Azure AD Join scenario, Intune would switch its targeting to the new Hybrid Azure AD Join device, so subsequent redeployments (reimaging, reset) would not work. As a simple workaround, you can target the “Domain Join” profile (assuming you only have one) to “All devices” to avoid problems like this. Want to understand more about this scenario? Read this article.
But wait, what about that 80070774 error code? Doesn’t that mean that the device got an ODJ blob and is now trying to communicate with an AD domain controller? No, it doesn’t. You can get that same error code when you are trying to talk to a “null” domain name, e.g. when no ODJ blob was received. So don’t get confused about that.
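To keep the three codes straight, a small lookup like the following can help. It is only a sketch: the hint text paraphrases this post, the function name is hypothetical, and a code on its own never identifies the failing step.

```python
# Hints paraphrased from the discussion above; a code alone does not pinpoint the failure.
ERROR_HINTS = {
    "80070002": 'basically "not found"',
    "80070774": '"domain not found"; also seen when no ODJ blob was received (null domain name)',
    "80004005": "generic error, usually caused by a timeout",
}


def describe_error(code: str) -> str:
    """Look up a hint for an error code, accepting an optional 0x prefix."""
    key = code.strip().lower()
    if key.startswith("0x"):
        key = key[2:]
    return ERROR_HINTS.get(key, "not one of the codes discussed here")


print(describe_error("0x80070774"))
```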
Did the device try to check connectivity?
There are two situations where Autopilot does not check connectivity to a domain controller in a Hybrid Azure AD Join scenario:
- The Autopilot profile has been configured to “Skip AD connectivity check,” and the device is running either Windows 10 2004 or the December cumulative update for Windows 10 1903 or 1909, as specified in the requirements.
- You are performing a white glove scenario. Since no user needs to sign in when using white glove, there’s no need for a connectivity check (and there’s never been one in this scenario – the user still needs connectivity later to sign in).
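Those two situations boil down to a simple decision, sketched here in Python. The function and parameter names are hypothetical, and the sketch only knows about the Windows 10 releases named above; it takes the cumulative update status as a plain yes/no rather than guessing at build numbers.

```python
def connectivity_check_skipped(skip_setting: bool, release: str,
                               has_december_cu: bool, white_glove: bool) -> bool:
    """Return True if Autopilot will not try to ping a domain controller."""
    if white_glove:
        return True   # no connectivity check in the white glove scenario
    if not skip_setting:
        return False  # profile is not set to "Skip AD connectivity check"
    if release == "2004":
        return True
    # 1903 and 1909 qualify only with the December cumulative update installed.
    return release in ("1903", "1909") and has_december_cu


print(connectivity_check_skipped(skip_setting=True, release="1909",
                                 has_december_cu=False, white_glove=False))  # False
```

The case in the usage line is the easy one to miss: the setting is enabled in the profile, but the device doesn't meet the requirement, so the check still runs.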
So if you don’t meet those conditions, you’ll see this screen for a long time:
And it will eventually time out (typically with an 80004005 or 80070774 error code):
From this point, we can again press Shift-F10 to open a command prompt, then run PowerShell to “Set-ExecutionPolicy bypass” and “Install-Script Get-AutopilotDiagnostics.” When running the script, you should then see output similar to this:
The “Could not establish connectivity” message confirms that the device was indeed trying to ping a domain controller, and eventually gave up and reported the timeout error. (See anything else interesting there? The device then kept moving forward, installing the Intune Management Extension. It will then install all the apps and policies too – it’s fully enrolled in Intune, so all that happens in the background, even though it does no good due to the connectivity check failure.)
Can the user sign in?
Assuming you skipped the connectivity check (so we don’t know for sure if there is connectivity to a domain controller to authenticate the user), it’s possible to see an error like this when the user signs in:
This is basic Active Directory networking: The device needs to be able to find a domain controller via a DNS lookup, then talk to that domain controller to authenticate the user’s ID and password, apply GPOs, etc. If there’s no connectivity, you’ll see that error. Possible causes:
- The device isn’t on the corporate network, so it’s not able to find the domain or domain controller. (Yes, this one should be obvious.)
- No VPN client configuration (potentially with a machine certificate) has been installed, so the user can’t establish connectivity, either manually by clicking the “network connection” icon or automatically via an auto-connecting VPN.
To troubleshoot, you have some options. My preference is to start with a VM that can be moved between the “internet” (anywhere off the corporate network, but with internet access) and the corporate network. When it fails on the “internet,” move it to the corporate network so an AD user can sign in, then move it back to the “internet” for further troubleshooting of the VPN client (a whole lot easier to do when signed in).
If that’s not an option, you can also add a local account during the process by pressing Shift-F10 during OOBE to open a command prompt. From there you can create a local account with a simple command, and give it administrator access:
net user Troubleshoot P@ssword /add
net localgroup Administrators Troubleshoot /add
Then you can use that local account to sign in and troubleshoot more.
Can the Hybrid Azure AD Join process complete?
Autopilot and Intune take care of getting the device joined to Active Directory and enrolled in Intune, but they don’t take care of the Hybrid Azure AD Join process – that’s a separate process that happens in the background. If you are using ADFS, that should happen pretty quickly (assuming you have ADFS set up properly with all the claims rules needed). If you aren’t using ADFS, i.e. you are using pass-through authentication or password hash synchronization, then the process is more involved and could take a half hour or more to complete; if the user signs in before that process completes, the user won’t get an Azure AD user token and won’t be able to talk to Intune. (And if you are using ADFS and haven’t configured the needed claims rules, it will fall back to the non-ADFS behavior.)
The non-ADFS flow is described in detail in my previous blog post. That process is only finished once the SCP has been read, the userCertificate property has been updated, and AAD Connect has synced the object to AAD. Unless you can guarantee that all of that will happen before the user signs in, you will want to disable the user ESP, which can be done using a custom OMA-URI policy in Intune.
To do that, create a device configuration profile in Intune, specifying Windows 10 and above and a type of “Custom.” You can give the profile a name (e.g. “Disable user ESP”), and then add one custom OMA-URI setting:
- Name: SkipUserStatusPage (or whatever you want)
- Description: (whatever you want)
- OMA-URI: ./Vendor/MSFT/DMClient/Provider/MS DM Server/FirstSyncStatus/SkipUserStatusPage
- Data type: Boolean
- Value: True
That ends up looking like this:
There’s plenty of additional troubleshooting that can be done for Hybrid Azure AD Join issues, starting with the official documentation. But that’s well beyond the scope of what I can cover in this article. Just remember: Windows Autopilot and Intune set the device up for Active Directory, and Windows takes care of doing the Hybrid Azure AD Join process in the background, asynchronously.
Still having issues?
You can always open a support case via the Intune “Help and Support” node (under “Troubleshooting + support” in the Endpoint Manager admin center portal) or work with https://fasttrack.microsoft.com or your local Microsoft team.