Windows Autopilot

TPM Attestation: What can possibly go wrong?

First off, it would be good to touch on what TPM attestation is, and then talk about why you care.  From some older Windows Server documentation, here’s a decent overview:

With TPM key attestation, a new management paradigm is now possible: An administrator can define the set of devices that users can use to access corporate resources (for example, VPN or wireless access point) and have strong guarantees that no other devices can be used to access them. This new access control paradigm is strong because it is tied to a hardware-bound user identity, which is stronger than a software-based credential.

And that same article goes on to describe how it works:

In general, TPM key attestation is based on the following pillars:

  1. Every TPM ships with a unique asymmetric key, called the Endorsement Key (EK), burned by the manufacturer. We refer to the public portion of this key as EKPub and the associated private key as EKPriv. Some TPM chips also have an EK certificate that is issued by the manufacturer for the EKPub. We refer to this cert as EKCert.
  2. A CA establishes trust in the TPM either via EKPub or EKCert.
  3. A user proves to the CA that the RSA key for which the certificate is being requested is cryptographically related to the EKPub and that the user owns the EKPriv.
  4. The CA issues a certificate with a special issuance policy OID to denote that the key is now attested to be protected by a TPM.

Alright, but what does any of that have to do with Windows Autopilot?  Simply, we needed a mechanism to allow the device to prove that it wasn’t an imposter.  A device can leverage TPM attestation to prove to Azure AD that it is the same device that was registered with Windows Autopilot.  Azure AD will then provide a device token, enabling Azure AD Join or MDM enrollment, without anyone ever typing in any credentials.  Most customers want assurances that random devices can’t join or enroll, so this is the mechanism that we decided to use.  Since all new PCs manufactured in 2016 or later should support TPM attestation, this seemed like a reasonable idea.

But then there was a complication: a TPM vulnerability that required us to block certain TPMs because they hadn’t yet been patched.  See the CERT bulletin and related MSRC bulletin for more details.  While some customers diligently updated their TPM firmware (e.g. for Surface devices), others still haven’t done that – and then try to use those devices with Windows Autopilot scenarios that require TPM attestation.

We also discovered some challenges with devices that don’t ship with an EK certificate.  Instead, they are supposed to acquire that certificate when they start up.  We can attempt to force that to happen by running the TPM maintenance task.

But for various reasons (bad drivers, network connectivity challenges, TPM operating in reduced functionality mode, etc.) that process might not complete successfully.

We made as many improvements as we could in Windows 10 version 1903 to make this process more reliable.  Because those changes were extensive, we were unable to backport them to Windows 10 version 1809, so we’ve decided to support scenarios that depend on TPM attestation (namely, self-deploying mode) only with version 1903 and above.

That doesn’t mean the process is completely smooth though – things can still go wrong.  Some of these are obvious (e.g. devices that don’t meet the requirements) but others are much more subtle.  Here are some things that can cause issues:

  • The device doesn’t support TPM attestation.  Possible causes:
    • The device doesn’t support TPM 2.0.  (The full details can be found at https://docs.microsoft.com/en-us/windows-hardware/design/minimum/minimum-hardware-requirements-overview#37-trusted-platform-module-tpm.)
    • The device doesn’t have an EK certificate or is unable to get one.  (Some devices get one of these over the internet when they first start up; make sure you’ve whitelisted those locations.  If you are using a Surface Go or other devices with an Intel TPM, make sure you have a reasonably-current Intel iCSL driver.)
    • The device doesn’t have the needed TPM firmware updates.  (See the links above.)
    • The device’s TPM hasn’t been whitelisted (not a common issue, unless you’re on a VM – we explicitly block TPMs from VMs).
  • The device’s clock is more than 10 minutes behind.  The TPM attestation process gets a cert from a remote server that has been “back-dated” 10 minutes, so if the client’s clock is more than 10 minutes slow, it will reject the cert because its validity period starts in the future.
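The clock issue in the last bullet is easy to reason about with a little arithmetic.  Here is a minimal sketch, assuming only what is described above (a cert whose validity starts 10 minutes before the server’s current time).  The function names are hypothetical, and the model deliberately ignores the cert’s expiration time:

```python
from datetime import datetime, timedelta, timezone

# Assumption taken from the text: the server back-dates the cert by 10 minutes.
BACKDATE = timedelta(minutes=10)


def cert_not_before(server_now):
    """Start of validity for a cert issued at server_now (hypothetical helper)."""
    return server_now - BACKDATE


def client_accepts(client_now, server_now):
    """True if the client's clock is at or past the cert's start of validity."""
    return client_now >= cert_not_before(server_now)


server = datetime(2019, 6, 1, 12, 0, tzinfo=timezone.utc)
for minutes_slow in (5, 10, 15):
    client = server - timedelta(minutes=minutes_slow)
    print(minutes_slow, "minutes slow, accepted:", client_accepts(client, server))
```

In this simplified model, a client that is 5 or 10 minutes slow still accepts the cert, while one that is 15 minutes slow sees a cert that is not yet valid and rejects it.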

So what can you do to verify?  To start, if you have a device running Windows 10, you can check in the Windows Security app under Device Security and then Security processor details.  Assuming it says “Ready” under both “Attestation” and “Storage” you’re starting from a good place.

Next up, actually try one of the Windows Autopilot scenarios that requires TPM attestation:

  • Self-deploying mode.  This one should be fairly obvious: the device deploys itself and joins Azure AD without anyone entering any credentials at all.
  • White glove.  This one is a little more subtle.  While white glove is designed as an optimization of the user-driven process, behind the scenes it uses the same self-deploying capability so that the technician can enroll the device in Intune and join it to Azure AD.  (For Active Directory, the white glove scenario joins Azure AD first; the device is later un-joined and then joined to Active Directory.)

If those succeed, you’re golden.  If either fails (typically with a meaningless timeout error like “800705B4” after trying over and over again), then you need to do some additional troubleshooting.  First, verify that the issue is indeed with TPM attestation (as there can be other causes for timeout errors).  Look for these events in the Microsoft-Windows-ModernDeployment-Diagnostics-Provider/Autopilot event log:

  • Event 302: AutopilotManager device enrollment failed during stage AADEnroll with error 0x801C0003.  (This is an error from Azure AD join – it means that the device wasn’t authorized to join.  By itself this doesn’t necessarily mean that the device failed to do TPM attestation, as it can also happen for other reasons, e.g. exceeding device limits.)
  • Event 156: AutopilotManager reported that MSA TPM is not configured for hardware TPM attestation even though the profile indicates it is required.  Autopilot cannot proceed.  (This normally indicates that something interfered with the hardware TPM attestation process, but it doesn’t tell you what.)
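If you have already pulled the event IDs out of that log by some other means, the triage logic above can be sketched in a few lines.  This is only an illustration: the dictionary and function names are hypothetical, the hints are paraphrased from the two bullets above, and it assumes event 302 was logged with error 0x801C0003 as described:

```python
# Hints paraphrased from the two events described above. These names are
# hypothetical and are not part of any Windows API.
EVENT_HINTS = {
    302: "Azure AD join was not authorized (0x801C0003); not necessarily TPM-related.",
    156: "Hardware TPM attestation was required but something interfered with it.",
}


def triage(event_ids):
    """Return (event_id, hint) pairs for the events of interest, in input order."""
    return [(eid, EVENT_HINTS[eid]) for eid in event_ids if eid in EVENT_HINTS]


def points_to_tpm(event_ids):
    """Of the two events, only 156 specifically implicates TPM attestation."""
    return 156 in set(event_ids)


for event_id, hint in triage([100, 302, 156]):
    print(event_id, hint)
```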

The next step would be to gather the Windows Autopilot log files using this command (Windows 10 1903):

MDMDiagnosticsTool.exe -area Autopilot;TPM -cab c:\autopilot.cab

This CAB file will grab all sorts of information (and will typically be requested if you open a support case), but there are two files that are particularly interesting for TPM investigations:

  • TpmHliInfo_Output.txt.  This file captures information about the TPM and should show that it supports TPM 2.0.  If you see something like this, then there’s an issue:  -NoValidEkCert: No valid EK cert found
  • Certreq_enrollaik_output.txt.  This file captures an attempt to enroll an AIK key for the device (see the initial doc link).  If that succeeds, then attestation should be OK.  If it fails, you might see something like this (indicating that the time is off on the device):  Certificate Request Processor: A required certificate is not within its validity period when verifying against the current system clock or the timestamp in the signed file. 0x800b0101 (-2146762495 CERT_E_EXPIRED)
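Once the CAB has been extracted, checking those two files for the strings above is easy to script.  The following is a rough sketch rather than a supported tool: the file names and marker strings come from the bullets above, while the function names, the folder layout, and the text encoding are assumptions you may need to adjust (file-name casing also matters on case-sensitive file systems):

```python
from pathlib import Path

# File names and marker strings come from the text above; everything else
# (function names, folder layout, encoding) is a hypothetical illustration.
MARKERS = {
    "TpmHliInfo_Output.txt": [
        ("NoValidEkCert", "no valid EK cert found"),
    ],
    "Certreq_enrollaik_output.txt": [
        ("0x800b0101", "certificate outside its validity period; check the device clock"),
    ],
}


def scan_text(file_name, text):
    """Return the notes for every known marker found in one file's contents."""
    lowered = text.lower()
    return [note for marker, note in MARKERS.get(file_name, [])
            if marker.lower() in lowered]


def scan_folder(folder, encoding="utf-8"):
    """Scan an extracted CAB folder; files that are missing are skipped."""
    results = {}
    for name in MARKERS:
        path = Path(folder) / name
        if path.is_file():
            results[name] = scan_text(name, path.read_text(encoding=encoding, errors="replace"))
    return results


print(scan_text("TpmHliInfo_Output.txt", "-NoValidEkCert: No valid EK cert found"))
```

An empty list for a file only means that neither known marker was found, not that attestation is healthy.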

As I mentioned before, don’t assume that all 800705B4 errors are caused by TPM errors, and don’t assume all self-deploying and white glove failures are caused by TPM errors.  If you get stuck and can’t figure it out, open a support case via the Intune Help and Support node.

6 replies

  1. Thanks for the article Michael!

    I have one question with an issue regarding this topic:

    What could be the issue if TPM attestation on one of two identical laptop models (Dell E7470) keeps failing while the other one enrolls successfully? The Autopilot event log on the faulty unit shows that the TPM attestation step times out after 10 attempts, which causes the 800705B4 error in the Securing your hardware phase.

    One interesting line shows “MSA TPM keystate has been updated. New server state = unattested key, new client state = attested key”. It seems that the key is attested on the client but not on the server side? CertReq_enrollaik_Output.txt shows that a certificate is enrolled. Both TPMs are the same (Nuvoton) with Dell firmware 1.3.2.8 and TPM spec 2.0, and I’ve tested the whole process with Windows 10 1903.

    Thanks!

    Regards,

    Kevin

    • A good question. We have seen issues with devices that were sysprepped, leaving some “remnants” that make Autopilot think that there is no need to do TPM maintenance even though there is (to pull the certs into the registry again). Are you seeing this on the device even after reimaging it?

      • I’ve just done a reinstall on the faulty client by hand (USB), which unfortunately resulted in the same behaviour. I left the Autopilot entry in Intune (normally I would reimport the device before a new attempt), reinstalled Windows and immediately monitored the Autopilot event log when entering OOBE.

        The Windows AIK certificate enrollment task gets triggered, pulls a new certificate from one of the Azure SCEPS, locates the AIK certificate and key, reports that the TPM maintenance task is skipped since the AIK certificate is already present, reports the line “MSA TPM keystate has been updated. New server state = unattested key, new client state = attested key”, tries another 10 times and then times out with the error 0x800705B4.

        I’m not sure what’s happening here but it seems that the client TPM isn’t able to verify itself in Azure or something?

      • Not sure what’s up with that. Can you open a case via the Intune “Help and Support” node? (There’s no charge for Intune cases.) E-mail me the case number and I can check it. If you can attach the result of “mdmdiagnosticstool.exe -area Autopilot;TPM -cab c:\autopilot.cab” that would be useful too.
