Beneath the MASQUE - a dive into Network Relay technology on Apple platforms

A technology that Apple describes as a "modern alternative to VPNs" is in production use today on hundreds of millions of devices and has been embraced by some of the largest players on the internet. Sounds intriguing, but outside of select big tech and academics, how much do we really know about it?

At WWDC in 2023, Apple presented Ready, set, relay - an introductory session which included a call to "start replacing the use of VPNs with relays that are easier to manage and provide a more seamless user experience". During this session, Cisco Secure received an explicit callout as an example of a relay service designed to facilitate access to a private enterprise server. We are now almost 2 years removed, and have seen only a small number of additional commercial offerings that explicitly lead with relay technologies, and only very sparse discourse that usually remains extremely theoretical in nature. In particular, I'm surprised that we haven't seen more community discussion or questioning about prescriptive implementations of this technology, and hopefully, this article can kick off some interesting conversations within the Apple admins community at large.

A recreation of an incredibly simple graphic from Apple's Ready, Set, Relay session. As I hope you'll see through this article, in some ways this is an obvious oversimplification, and in other ways, not at all. At its core, it really should (and can) be this simple.

In this article, I'll try to introduce some of the concepts and history behind Network Relay technologies as well as dive into an open-source implementation using Envoy proxy. Through this, we'll explore some of the current limitations and implementation considerations and discuss some interesting use cases both today and into the future. And yes - we will even get to integrate with and show off an awesome use case for Managed Device Attestation; one of my favourite topics!

Much has been written about the underpinnings of MASQUE & relay technologies, so this article won't go too deep down into the OSI model but will instead focus on the practical implementation realities and possibilities around Network Relay on Apple platforms. If you want a deeper understanding of the underlying transports, intention and guts of the protocols, then there are a few companion pieces that I have enjoyed and are definitely worth reading (and can even be consumed first if you are genuinely interested in a comprehensive technical deep dive):

As I think you'll see, this topic is a wide-ranging maze that can lead you off on all sorts of garden paths as you traverse the server infrastructure, the network and the behaviour of the client, amongst other futurist concepts and dreams for the web. Consequently, this article ended up fairly long form in an attempt not to assume too much knowledge and appeal to a wider audience of Apple admins, Infosec and network folks. If you just want to see some cool examples, there will be no hurt feelings if you skip down the page.

For the most part, this is a written piece, but for the technical examples, I've included videos that better show off how relays operate across macOS, iOS and the proxy infrastructure. As I was writing this, I realised it's almost impossible to fully show off Network Relay on the client and server side simultaneously just by presenting some log entries and dry configurations. So with all that, strap in, put on some appropriate background music and let's dive in together to Network Relay on Apple platforms!


📖 A brief history of HTTP

In the beginning, there was HTTP/1.0 (well, technically 0.9) - a simple, text-based protocol built on top of TCP. It did exactly what the early web needed: fetch and deliver static documents, one request at a time. HTTP/1.1 followed with important upgrades, including persistent connections, basic pipelining, and the introduction of the CONNECT method, a feature that enabled early proxying scenarios.

Despite these improvements, HTTP/1.1 still had critical limitations, and these were particularly exposed when being used as part of a proxy. Persistent connections and pipelining reduced overhead, but also exposed a key bottleneck: head-of-line blocking. The concept here is pretty simple - with multiple requests sharing a single TCP connection, and a need to send and receive in a specific order, a single slow or delayed response could stall everything behind it. This made proxying inefficient and unreliable, especially under high load or poor network conditions.

A simplified representation of HTTP/1 pipelining and its impact on Head of Line Blocking. A client requests 4 resources (HTML, CSS, JS, images, etc.) from a HTTP server - these must be processed in order, so if B is very slow, the client must wait for C and D (even if they would be individually quicker to access).

Additionally, traditional HTTP proxies were built for one job: handling HTTP traffic. They could do so either explicitly (by being configured on the client) or transparently (on an upstream middle-box device such as a firewall), and in the case of HTTPS traffic, often terminated TLS and deliberately man-in-the-middled client traffic. As such, this style of proxy was (and very much still is) commonly used for caching, filtering, or outbound access control.

HTTP/2 wasn’t standardised until 2015 - almost 16 years after HTTP/1.1. It marked a significant departure from its predecessors, moving from a text-based protocol to a more efficient binary format, which allowed for more efficient parsing and improved performance. This also introduced connection multiplexing, which allowed multiple requests and responses to be sent over a single connection at the same time without blocking one another, helping to mitigate some of the head-of-line blocking issues in earlier versions of the protocol.

A simplified representation of the same 4 resources being streamed from a HTTP server over HTTP/2 - the resources can be fetched using independent streams on a single TCP connection, and the client can make use of D if it arrives before C, B or A.

HTTP/2's later enhancements also expanded the original CONNECT method, enabling it to tunnel arbitrary TCP streams and not just HTTP. Whilst initially associated with WebSockets, the arbitrary nature of the established tunnel meant that any TCP traffic could be proxied (SSH, file sharing, remote desktop, etc.). This enhancement, known as Extended CONNECT, forms a core part of the technology behind Apple's Network Relay and other modern proxying approaches. Much of HTTP/2's design was influenced by Google's experimental SPDY protocol, which had aimed to make the web faster and more efficient during the interim years.

More recently, HTTP/3 has now taken things a step further by moving away from TCP entirely and building upon a new transport protocol called QUIC, which runs over UDP. This change has allowed for faster connections, better performance on unreliable networks, and further ability to handle multiple streams of independent data. One of the biggest differences is that encryption is truly integrated by default - unlike older versions of HTTP, which relied on encryption as a separate TLS layer. This makes HTTP/3 not just faster, but also more inherently secure. Another incredible example of QUIC's resilience comes from its addition of a "connection ID" to the UDP paradigm - after this stateful element has been coordinated between client and server, even the underlying IP addresses can change! This enables seamless connection migration, meaning that a client can roam between wired, Wi-Fi and cellular connections (even transitioning from IPv4 to IPv6 and back again) without terminating or even needing to re-handshake!

QUIC was originally developed by Google to speed up its own services, and later became a foundational base for HTTP/3 through standardisation work with the IETF. Some people go as far as to view QUIC as a potential replacement for much TCP traffic in the future, as it attempts to blend the best parts of TCP and UDP into a single and structurally secure transport that is connection-oriented and extremely efficient and adaptable.

One thing that is important to understand is that these generations of HTTP aren’t strict or exclusive replacements for one another. In practice, most browsers use a mix of HTTP/1.1, HTTP/2, and HTTP/3 depending on the site, the server, and the network conditions. Most of the time, clients find out about the presence of HTTP/3 by first loading a site through HTTP/1.1 or HTTP/2 and using the Alt-Svc header to discover it. This site is served through Cloudflare Pages, and you may be reading it now thanks to any one of those HTTP generations.
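
For the curious, that discovery step is just a response header - a server that also speaks HTTP/3 typically advertises it on its HTTP/1.1 or HTTP/2 responses along these lines (values illustrative):

alt-svc: h3=":443"; ma=86400

A capable client can then attempt QUIC on UDP port 443 for subsequent connections, quietly falling back if the attempt doesn't succeed.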


🍜 The terminology soup - it's not just you

As you start reading more about Network Relay and its accompanying technologies, you may find that you fall into a large pot of terminology soup. You'll see terms like private relay, proxy, CONNECT, QUIC, MASQUE, hop, tunnel, transport, substrate; used interchangeably and often describing a mix of different technologies. In many ways, some of this is just semantics, but I think it’s worth unpacking, because I found it really confusing at first, and it might help others new to the topic. In its Platform Deployment Guide, Apple uses the term Network Relay, a solid name choice that well defines the intent and helps it stand alone from traditional proxies. In that guide, Apple then go on to say:

"Network relays are built on the modern and standardised MASQUE protocols and can be used to proxy all TCP and UDP traffic."

This is broadly true, but it also highlights our soup problem. MASQUE (Multiplexed Application Substrate over QUIC Encryption), as the name itself spells out, specifically refers to a family of protocols (or technically a substrate) built on HTTP/3 over QUIC - however, Apple's implementation of Network Relay also uses the more traditional Extended CONNECT method, which isn't MASQUE proper (but is definitely foundational and adjacent), and can also fall back entirely to HTTP/2 and not use QUIC or UDP at all! Confused yet? Stick with me.

If you’re down in the weeds enough for it to matter, you probably already understand the nuances. But if you’re an Apple admin just looking for a modern and secure way to replace a VPN or architect a zero-trust network flow, I think it's totally fine to leave some of the semantics at the door. If you are deep in the weeds, you might notice a few simplifications in this article in order to make things more accessible - please don’t hold them against me.


🚚 What is a relay, and why does it matter?

So, what specifically is a Network Relay? In short, it's a HTTP proxy capable of tunnelling traffic to a target host. This is currently done in one of two ways. For TCP traffic (including standard web traffic), this is done using Extended CONNECT:

A simplified representation of a CONNECT request to a network relay proxy. The client establishes a secure outer tunnel through the relay, and once established, sends its raw TCP packets (including any TLS handshake) through to the target host. For encrypted traffic, the relay itself has no way of decrypting, so client privacy is preserved and the TLS connection survives end-to-end.

Under this, a client uses the CONNECT HTTP verb to establish a "raw" TCP tunnel through the relay, which allows it to send any arbitrary TCP packets in such a way that they will be streamed back and forth to the target server. The relay opens a simple upstream TLS connection to the target and then returns a 200 status back to the client in order to let it know the tunnel is ready. Importantly, for encrypted workloads (such as HTTPS and anything protected by TLS), the Network Relay is not performing any form of TLS termination; it simply passes data through unencumbered, which allows the client to perform its own TLS handshake with the target host. In doing so, the relay has no way (beyond knowing the host name through authority or SNI) of inspecting or decrypting the traffic between the client and the target.
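
Stripped right back, the exchange looks something like this on the wire (shown here as HTTP/2 pseudo-headers; the target hostname matches the Wireshark capture below):

client → relay:   :method: CONNECT
                  :authority: prometheus.internal.credibly.cc:443
relay → client:   :status: 200
(the stream then carries raw TCP bytes in both directions, including the client's own TLS handshake with the target)

Everything after that 200 is opaque payload as far as the relay is concerned.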

A Wireshark capture from a macOS client showing the "outer" QUIC Initial packet (to co-ordinate the encryption for the client to relay connection), and then the "inner" TLS handshake for a relayed host prometheus.internal.credibly.cc (which is simply passed along through the tunnel by the Network Relay proxy server to the target host). The relay is responsible for reconstructing a TCP packet with its own IP and handling the data frame exchange between the client and target.

For UDP traffic, MASQUE & the CONNECT-UDP upgrade are used to create a similar tunnel to a target host. As UDP is a connectionless protocol, this gets a little more interesting, as the Relay technically has no way of knowing that the upstream socket "tunnel" to the target host is even reachable, so MASQUE has some built-in mechanisms to signal to clients and hold sessions open in order for data to flow. For this article, I think it's too hard to get into the specific treatment of datagrams and context identifiers, except to say that MASQUE provides an effective way for a single TCP or QUIC connection to carry multiple UDP streams simultaneously (we will talk about the challenges of encapsulating UDP in TCP later on in this article).
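
For reference, the standardised shape of that request (per RFC 9298) is an extended CONNECT using a connect-udp protocol and a templated path that identifies the target host and port - something like the following, with the target purely illustrative:

:method: CONNECT
:protocol: connect-udp
:scheme: https
:authority: relay.credibly.cc
:path: /.well-known/masque/udp/time.apple.com/123/

Once the relay responds with a 200, UDP payloads flow in both directions as HTTP Datagrams (or wrapped in capsules on the stream when falling back to HTTP/2). You'll see this exact well-known path pattern appear in the Envoy access logs later in the article.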

From a client's perspective, it really does feel simple, and for the most part, it just works. Simply install a configuration that points at a compatible Network Relay proxy and watch as your TCP & UDP traffic is magically tunnelled where it needs to go. It's a bit like a duck on water; so simple above the surface, but a whole lot going on in the (substrate) below.

Network relays are an exciting technology because they can offer a far more flexible experience compared to traditional (layer 3/4) VPNs. Traditional VPNs such as IPSec, SSL, or even Wireguard come with the complexity of managing IP addressing, maintaining routing tables, and potentially handling specific interactive authentication flows. In contrast, network relays can utilise fairly standard HTTP connections with simple, low-latency certificate-based authentication, making them much more streamlined and efficient. The tunnels can be managed and multiplexed by the operating system for maximum performance, and applications don't need to know it's even happening. Additionally, Network Relay is one of the Apple Device Management payload types that is supported alongside ACME certificate payloads and Managed Device Attestation, meaning we can use cryptographically attested hardware identifiers as part of our authentication and authorisation efforts. This is an extremely powerful security design that I will show off in detail in some of the technical examples below.

There’s no need to bring up a system-wide network interface or modify routing tables when using relays, so traffic can be easily matched and immediately shipped to a defined proxy capable of securely tunnelling only that specific connection. This streamlined approach means that network relays can provide a much simpler connectivity solution compared to traditional VPNs, but they also play nicely with them - because there are no changes to routing tables or network interfaces, relayed traffic simply follows the rules of the underlying network stack, so you can easily blend these technologies and not impact the user experience. I've successfully run relayed traffic over Wireguard and Tailscale networks, and it just works seamlessly. When it does come to IP and routing, the MASQUE draft standards also define another new method, CONNECT-IP, which would allow for the full proxying of IP packets inside of HTTP. Whilst not currently implemented in Apple's Network Relay, in future this could open up even more interesting possibilities and is an exciting component to watch evolve.

Through Network Relay configuration and transport flexibility, it’s also far easier for individual domains and even managed applications to be transported through relays, leaving all other device traffic unaffected. While similar behaviour can be emulated with per-app VPNs and on-demand VPN rules, it’s clear that relays provide a simpler technology that easily allows a straightforward horizon to be defined for different types of data and network connectivity on Apple devices.

As we look towards the future, the standard HTTP (& TLS or QUIC) nature of connections will make Network Relays much more cryptographically agile; over time, new cipher suites can be introduced and negotiated between client and proxy at the TLS and QUIC layer, and ideally make the transition to post-quantum encryption much simpler (or, through their proxying nature, even provide a post-quantum encryption "wrapper" for legacy solutions). No need for configuration files with pre-agreed Diffie-Hellman parameters and phase configurations; just allow a client and server to negotiate a cipher and session key like they do many thousands of times on your device every single day as you browse the secure web.

Network Relays already solve real problems, and the roadmap suggests the capability set will only grow.


🍿 Demystifying NERelay & peculiar connection flows (it's always DNS)

Sometimes I find that in order to make good sense of a topic, you've got to take the long way round, and unfortunately for you, dear reader (and this extremely long article), that is the case here. Whilst most of our focus as Apple admins (and the bulk of the examples in this article) will be on how we can deploy relay configurations from MDM solutions, there are two distinct ways to deploy system-wide relays on Apple devices. We have our standard Device Management payload, but we also have NERelay, a Swift-based API that allows developers to customise and deploy system-wide Relays directly from their apps (with explicit user consent). This could be used by a SASE or Zero Trust vendor to implement a relay directly into their agent, and I hope in the future we see more commercial solutions doing so.

A system-wide NERelay configured Network Relay asking for explicit user consent on install. This is one of the only places you'll see any user-facing reference to "Proxy" relating to relays - it's clear that Apple is trying to differentiate them and let them stand under their own terminology.

We are going to explore the Swift API first, as it turns out there is some additional functionality exposed as part of NERelay that I think can help in understanding the particulars of connection flows and implementation pre-requisites. Don't worry - you don't need to be a developer or understand Swift to follow along here; we won't be writing or looking at any code. We will simply be talking at a high level about the configuration and structure parameters that developers use to configure and work with this API.

To start, let's drop down another layer by first talking about a sibling (or perhaps cousin) of Network Relay - VPN on Demand. On Apple Platforms, the VPN on Demand technology allows an administrator to deploy a VPN configuration that can evaluate and auto-connect based on specific domains or network conditions. In an example flow (using a Connect action), you'd see something like:

A very simple example flow for a VPN on-demand configuration. Importantly, once the tunnel is established, the DNS resolution happens inside the tunnel using internal resolvers.

This flow is reliant on Apple's connect-by-name APIs and effectively hooks a connection in such a way that its influence is invisible to the process that requested it. It also feels obvious - if we are deploying access to private resources from the internet, then we want both the DNS resolution and the access to the resource to happen securely inside the tunnel (so that it is both encrypted and the DNS records are not published on the internet). It should be noted that we are talking about simple Connect actions, and other rule types can operate differently.

Why do I bring up VPN on Demand behaviour? Well, Network Relays, despite sharing a lot of the same connective tissue, operate quite differently. Let's have a look at that same connection flow as before, but this time for a very basic configured relay:

A very simple example flow for a Network Relay configuration. Note that the initial DNS request uses the standard system resolvers and is only "matched" by the Relay configuration if it is resolvable.

Did you catch the very large distinction? In this default configuration, the client's standard DNS resolvers are used for the initial resolution, and if they fail - the connection is never relayed. In most deployments, this would require the client to have a way of resolving internal subdomains from wherever it is in the world. Once the IP has been successfully resolved, the system then forwards the hostname (and not the resolved IP) as part of its authority request to the Relay, and in doing so, simply discards its personally resolved IP. I don't know about you, but to me, that just feels weird!

This brings us to how the dual Relay APIs handle this behaviour, and why I decided to bring up NERelay first (or at all). Here you can see an overview of the shared and divergent options for NERelay and MDM deployable (payload/profile) Network Relays:

As you can see, there is a lot of shared configuration; namely, the ability to set both a HTTP/3 and HTTP/2 relay, domain matching criteria and client identities. The major differences are the DNS behaviour modifiers - Synthetic DNS prefixes and an explicit DNS over HTTPS resolver - which are present in NERelay but missing from Device Management configuration payloads.

The first of these, Synthetic DNS Prefixes, provides a simple way for the system to handle the DNS behaviour we observed above. When you set these, the client will simply synthesise an IP address from the provided range for matched domains in the initial connection. As an IP is magically returned from the system DNS resolver, the Relay will be activated and the CONNECT method established. This can be done for both IPv4 and IPv6 networks, making things incredibly easy, and best matching the behaviour of the initial VPN on Demand flow that we discussed. Here is an example of what this lookup presents when configured for test IPv4 prefix 192.0.2.0/24:

A dns-sd lookup showing a synthetic IPv4 resolution as part of a NERelay configuration on macOS. In this configuration, the synthetic IPv4 range 192.0.2.0/24 has been set for the domain *.internal.credibly.cc.

I've found that dns-sd is actually a very helpful command when working with Relays, as traditional DNS lookup tools like dig and nslookup don't use the full system resolver that integrates these (and DNS over HTTPS) lookups:

% dns-sd -G v4v6 [hostname]

The second DNS-based configuration option in NERelay looks great on the tin. Based on its name and description in the documentation (dnsOverHTTPSURL), it looks to be an ability to explicitly define a DNS over HTTPS resolver directly in the relay configuration. That, however, is not what it does, and this parameter does not refer to a standard DNS-over-HTTPS (DoH) resolver at all. When specified, this option configures support for Oblivious DNS over HTTPS (ODoH). In this setup, the system issues an Oblivious DNS POST request to the relay itself, which in this system is supposed to act as an ODoH proxy. This one took me a while to find and troubleshoot, as it was fairly unexpected behaviour. Whilst this is a bit of a let down for most enterprise scenarios, it aligns very clearly with some of Apple's other commentary and directly hints at Apple’s use of NERelay with iCloud Private Relay; in which DNS queries are deliberately encrypted and sent through an Apple-operated proxy, then resolved by a separate upstream "egress" resolver like Cloudflare or Fastly. This option can also be combined with NERelay's ability to chain multiple relays together, which would probably lead you closer to creating your own personal service in the vein of iCloud Private Relay.

Finally, NERelay exposes support for the same style of "on-demand" rules as VPN on Demand. We won't explore these in this article (and I'll admit I haven't tested them in depth), but they would theoretically allow for split-horizon style scenarios where relays could be activated and relied on for secure access when on the wider internet, and then bypassed when a secure traditional network route to a resource exists on an internal network or wireless SSID.


📲 Coming back around to Device Management

That was a long way around to come back to device management deployment of Network Relay, but I hope you can see why I chose to introduce it this way. I think it's important to understand how the DNS flow operates and how Apple chooses to specifically deal with this inside NERelay.

Why is this particularly important? Well, when we deploy Network Relay via MDM, we simply have to bring our own DNS strategy. We don't get the simplicity of those cool synthetic DNS prefixes, or any way to define an explicit DoH resolver (oblivious or otherwise). In real world terms - if you have a user sitting at a coffee shop who wants to access projects.internal.credibly.cc on their iPhone, that phone must have a way to resolve the internal hostname, or the connection will simply fail and the relay will never occur, even if it is otherwise successfully configured.

So it looks like we have a problem to solve and need a way to resolve internal domains on distributed internet devices - surely that can't be too challenging, right?

One thought might be to try and run a DNS over HTTPS resolver through the Relay. This seems obvious and would solve our problem quite nicely. If DNS over HTTPS is just arbitrary TCP, then let's just relay it by combining managed DNS settings and a matching relay payload:

A simple visualisation of how managed DNS over HTTPS (DoH) would look when running inside a relay.

💣 Unfortunately, it doesn't seem to work! I would love to be proven wrong here, but setting up a configuration like the above causes DNS lookups on the client to silently timeout and fail. The OS knows that the DoH resolver is there, as we can see a placeholder for it when using scutil --dns:

scutil --dns output showing the placeholder (internal DoH) resolver config on a macOS client with a DNS over HTTPS com.apple.dnsSettings.managed profile pushed out via MDM (I'm actually not sure why there are 2, but this always seems to be the case).

When we try to run this DNS over HTTPS (or TLS for that matter - the behaviour is the same) configuration through a Relay (with a resolvable internet facing IP for the resolver a.doh.credibly.cc), we get no resolution for the *.internal.credibly.cc domain, just a timeout:

A dns-sd lookup showing a failure to resolve *.internal.credibly.cc domains through an MDM-configured Network Relay.

What a pain! I suspect that this is deliberate, and Apple’s DNS client is blocking recursive resolution and transport through the same relay in order to avoid loops, but it's frustrating as it would have been a clean way to solve our problem, and the behaviours are largely undocumented.

If we can't run DNS through our relay, maybe we can run a secure internet-facing DNS over HTTPS responder to resolve our internal domain. On the surface, this looks doable - the com.apple.dnsSettings.managed payload type has a PayloadCertificateUUID property, which claims that the "system uses this identity to authenticate the user to the DNS resolver". In theory, we should be able to set up a route on our Envoy proxy to an internal DoH resolver and then authenticate to it using the same certificate that we are presenting for our relay!

💣 Again, it doesn't appear to work! The issue here is that when configuring a PayloadCertificateUUID (even one that is being successfully used in the same profile to authenticate to the relay), Apple's implementation strangely doesn't present the client certificate when asked (this is true for both DoH and DoT). I've tested this thoroughly across macOS & iOS and see the same behaviour, and at least one other individual seems to have come across this, but it doesn't seem to be otherwise widely reported (🍎 folks pretty please see FB17391418). It's easy to reproduce by spinning up a simple stunnel proxy and pointing a com.apple.dnsSettings.managed configured macOS or iOS client to it:

2025.05.07 19:51:01 LOG5[14]: Service [DoH] accepted connection from 172.29.80.58:52223
2025.05.07 19:51:01 LOG6[14]: Peer certificate required
2025.05.07 19:51:01 LOG7[14]: TLS state (accept): before SSL initialization
[...]
2025.05.07 19:51:01 LOG7[14]: TLS state (accept): SSLv3/TLS read client hello
[...]
2025.05.07 19:51:01 LOG7[14]: TLS state (accept): TLSv1.3 write server certificate verify
2025.05.07 19:51:01 LOG7[14]: TLS state (accept): SSLv3/TLS write finished
2025.05.07 19:51:01 LOG7[14]: TLS state (accept): TLSv1.3 early data
2025.05.07 19:51:01 LOG7[14]: TLS alert (write): fatal: unknown
2025.05.07 19:51:01 LOG3[14]: SSL_accept: ../ssl/statem/statem_srvr.c:3509: error:0A0000C7:SSL routines::peer did not return a certificate
2025.05.07 19:51:01 LOG5[14]: Connection reset/closed: 0 byte(s) sent to TLS, 0 byte(s) sent to socket
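
If you want to reproduce this yourself, a minimal stunnel service definition along these lines is enough (paths and ports here are placeholders) - verify = 2 is what forces the client certificate request seen in the log above:

[DoH]
accept = 443
connect = 127.0.0.1:8053
cert = /etc/stunnel/doh-server.pem
key = /etc/stunnel/doh-server.key
CAfile = /etc/stunnel/devices-ca.pem
verify = 2

Point a com.apple.dnsSettings.managed configured client at it and you should see the same "peer did not return a certificate" error.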

Though it's absolutely there in the docs and spec, PayloadCertificateUUID isn't even configurable in the UI of most MDMs, so perhaps there just aren't a lot of people out there wanting to use certificate-based authentication for their DNS!

So, where does this leave us? Well, in a bit of a strange place, really. As it turns out, because we just need any old successful resolution, we can use internet-facing Wildcard DNS entries or simply plain old junk records, and that will be enough for clients to relay. As stated before, what the A or AAAA records resolve to seems to be largely irrelevant - the actual connection will use the Relay's DNS resolver for the target host. For my labs, I've simply CNAME'd to my external relay hostname of relay.credibly.cc, and everything just works! If you are just wanting to play with Relays or run a similar lab, this might be enough.

An example of a Wildcard DNS record for *.internal.credibly.cc out on the internet (in this case, Cloudflare DNS).
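
In plain zone-file terms, that record is the equivalent of something like this (TTL illustrative):

*.internal.credibly.cc.    300    IN    CNAME    relay.credibly.cc.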

Equally, you could run an unauthenticated DoH resolver for your internal domain(s), and that works too - your appetite for this depends on if you consider your internal IP ranges confidential or if you just don't care (as they aren't accessible from the internet, just resolvable).

This whole DNS dance has plagued me, and I hope that I've either got something wrong and will be community corrected, or Apple will address the issue with client certificates, and we can run securely that way.


⚒️ Implementing an open-source Network Relay with Envoy

Envoy - a flexible open source application proxy - has supported Extended CONNECT for quite some time, and fairly quietly added support for MASQUE's CONNECT-UDP in late 2023. In doing so, it became the first open-source proxy to support the mechanisms required for Network Relay.

To show off what Envoy can do, I built out a very simple lab that looks like this:

A simple overview of our lab setup. Internet-based clients can get an MDA client certificate from devices.pki.credibly.cc and then use it to connect to relay.credibly.cc. The relay has access to internal DNS, which allows it to resolve hosts for *.internal.credibly.cc hosts (as well as any other standard DNS lookups such as apple.com, etc.)

And to support our relay with the required certificates, I also built out a very simple two-tier PKI; one intermediate for services (our relay, DNS over HTTPS resolvers, other internal resources), and another for devices:

Our very simple two-tier PKI with a shared root. We will use this in our lab examples to attest device certificates and build a mutual trust model between our devices and the relay.

For the sake of simplicity, I won't be detailing the setup of the servers, lab, PKI or the online ACME provisioners; however, it's fairly simple work with Docker, a DNS server, an Envoy container and a couple of step-ca instances. All you really need to know for these examples is that we have an online authority at devices.pki.credibly.cc capable of issuing hardware-attested certificates from the Credibly Intermediate Devices E1 authority to devices via an ACME payload, and an internet-facing Envoy network relay listening on port 443 (both TCP & UDP) and presenting a TLS certificate for relay.credibly.cc signed by Credibly Intermediate Services E1. We can and will use the ACME-issued client certificate to authenticate and perform mutual TLS with our network relay using a hardware-attested identity - extremely powerful stuff.

This is probably where we need to discuss a limitation with Envoy's QUIC implementation, and it's a fairly significant one in the face of all this attestation talk. Envoy's QUIC implementation is powered by Google's QUICHE library, which at the present time lacks support for certificate-based authentication (mTLS) for HTTP/3. This means that whilst our Envoy proxy will fully support a very clean and secure hardware-attested PKI-based authentication mechanism for HTTP/2 connections, connections over HTTP/3 are limited to significantly less secure forms of authentication within additional HTTP headers. This limits the usability of Envoy as a production HTTP/3 Network Relay proxy, as even if you were comfortable decoupling from hardware identity and using a form of header-based authentication, these would be awkward to manage using MDM. Luckily, the Envoy/QUICHE team have indicated that support for client certificates in QUIC/HTTP3 is coming, and is on track to be added this year.

Envoy is incredibly flexible and can be used across a swathe of use cases - many of which they actively provide example configurations for. Whilst the provided example configs cover off many individual elements that we need for a functioning edge proxy to support network relay (TLS inspection and termination, mTLS validation, CONNECT upgrade support, dynamic forward proxying, etc.), I struggled to find a singular defined example that covered everything we need, so I created one for testing with Network Relay. This was built out of a mix of recommended settings for edge proxy and upgrade configurations, Envoy's extremely detailed (and humbling) documentation, and some good old-fashioned trial and error.
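
To give a flavour of the moving parts without reproducing the whole thing, the core of it is a HTTP connection manager that permits CONNECT and CONNECT-UDP upgrades and hands them to a dynamic forward proxy. Conceptually, that looks something like the heavily trimmed sketch below - treat the full configuration linked just below as the reference, as this is abridged and omits the listener, TLS, route and cluster detail entirely:

upgrade_configs:
  - upgrade_type: CONNECT
  - upgrade_type: CONNECT-UDP
http2_protocol_options:
  allow_connect: true
http_filters:
  - name: envoy.filters.http.dynamic_forward_proxy
  - name: envoy.filters.http.router

On top of this, matching routes are needed that actually allow the tunnels to be established and resolved upstream.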

⚠️
Be cautious if you are tinkering with this, as unless you specify an authentication config, it can allow any client to proxy arbitrary TCP & UDP through your server. In particular, the HTTP/3 config I have provided is fully open and unauthenticated, so don't just expose this to the internet unless you have a plan for how to protect it!

For those interested in the specific Envoy configuration, I've made it available here, and the following video walks through some of the config parameters and what they mean for supporting Network Relay (if you aren't that interested in the backend and just want to see it from the client perspective, feel free to scroll on past):

As a final quick note on Envoy; it is not particularly verbose by default and will happily sit there rejecting clients for expired & invalid certificates, dropping datagrams that are too large and erroring out on upstream DNS without logging a single thing to stdout unless you put it in trace mode. I've supplied my docker-compose.yml with the line I used to do this, and if you are playing around you might find this useful.


👨🏻‍💻 Finally - let's see some relayed data flows

Great! Now that we have our proxy configured, we can start relaying some connections! Let's push an MDM profile to an iPad that contains:

  • Our Credibly Root E1 self-signed root certificate (so that the device trusts the relay and ACME servers)
  • An ACME payload requesting a hardware bound key & Managed Device Attestation certificate
  • A Network Relay payload configuring a HTTP/2 Relay (https://relay.credibly.cc) using our ACME payload as the identity certificate and matching *.apple.com domains

We wait a moment, and then see this start to appear in our Envoy server access logs:

[2025-05-28T03:31:33.020Z] "CONNECT - HTTP/2" response_code=200 duration=33337 bytes_received=1431 bytes_sent=8829 authority=ocsp2.apple.com:443 client_cert_subject=CN=G2RM8N14VD client_cert_issuer=CN=Credibly Intermediate Devices E1
[2025-05-28T03:31:38.534Z] "GET /.well-known/masque/udp/amp-api.apps.apple.com/443/ HTTP/2" response_code=200 duration=12643 bytes_received=4253 bytes_sent=11193 authority=amp-api.apps.apple.com:443 client_cert_subject=G2RM8N14VD client_cert_issuer=Credibly Intermediate Devices E1
[2025-05-28T03:31:44.534Z] "GET /.well-known/masque/udp/time.apple.com/123/ HTTP/2" response_code=200 duration=69 bytes_received=51 bytes_sent=51 authority=time.apple.com:123 client_cert_subject=G2RM8N14VD client_cert_issuer=Credibly Intermediate Devices E1

Let's break down what we are seeing:

  • We open up with a pretty simple CONNECT tunnel being opened for a HTTP/2 call to ocsp2.apple.com. In this (and all of our requests) we can see our attested hardware identifier: the device's serial (G2RM8N14VD).
  • We then see a GET request which configures a CONNECT-UDP MASQUE tunnel for a HTTP/3 call to amp-api.apps.apple.com. This is now HTTP/3 (and QUIC) being tunnelled inside HTTP/2! You can see this only transfers a small amount of data (11KB) and the tunnel is open for about 12 seconds.
  • Then we see something really cool - some proper arbitrary UDP! The client again uses a CONNECT-UDP MASQUE tunnel to time.apple.com in order to synchronise its time using NTP.

Relaying apple.com domains like this is quite informative as you get to see a consistent stream of interesting domains that Apple devices reach out to as well as the mix of HTTP/2, HTTP/3 and other arbitrary TCP and UDP ports. It should be noted that these log examples are just a fairly simple representation of what is observable - Envoy's access logging is very detailed and can be customised to your needs if you want to see specific timings or other detailed connection metadata (both downstream to the client and upstream to the target host).
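
The entries above come from a custom format string. The exact one I used isn't important, but something along these lines in the access logger's log_format will produce very similar entries, including the client certificate fields:

log_format:
  text_format_source:
    inline_string: >
      [%START_TIME%] "%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%"
      response_code=%RESPONSE_CODE% duration=%DURATION%
      bytes_received=%BYTES_RECEIVED% bytes_sent=%BYTES_SENT%
      authority=%REQ(:AUTHORITY)% client_cert_subject=%DOWNSTREAM_PEER_SUBJECT%
      client_cert_issuer=%DOWNSTREAM_PEER_ISSUER%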

If we keep watching the logs, we will see interesting entries like:

[2025-05-28T07:50:13.665Z] "CONNECT - HTTP/2" response_code=200 duration=909048 bytes_received=5609 bytes_sent=3922 authority=20-courier.push.apple.com:5223 client_cert_subject=G2RM8N14VD client_cert_issuer=Credibly Intermediate Devices E1

A longer-lived 15-minute tunnel for APNS push notifications using port 5223, and:

[2025-05-28T03:31:44.896Z] "CONNECT - HTTP/2" response_code=200 duration=117445 bytes_received=517 bytes_sent=76392249 authority=iosapps.itunes.apple.com:443 client_cert_subject=G2RM8N14VD client_cert_issuer=Credibly Intermediate Devices E1

A larger 76MB download from the App Store.

All of these are silently relayed through the Envoy proxy without any impact or visibility on the client, and these connections are all authenticated and gated by the client's presentation of a valid leaf signed by Credibly Intermediate Devices E1.

So what about internal resources? Our lab has some observability set up using Envoy's exposed Prometheus metrics that I've graphed in a simple Grafana dashboard at grafana.internal.credibly.cc. Let's change the matching domain in the Relay config to *.internal.credibly.cc and load it up in Safari:

[2025-05-28T03:45:44.896Z] "CONNECT - HTTP/2" response_code=200 duration=23909 bytes_received=1091 bytes_sent=16596 authority=grafana.internal.credibly.cc:443 client_cert_subject=G2RM8N14VD client_cert_issuer=Credibly Intermediate Devices E1

Perfect! A very simple browser-based flow, but nonetheless one that is now accessible globally, protected by mTLS, and scoped purely to our *.internal.credibly.cc domain without touching a client routing table or assigning an IP.

To better illustrate what the experience of using a configured Network Relay can look like, check out the video below where I show off some browser flows, arbitrary TCP connections for SSH & Microsoft Remote Desktop, as well as some OS level behaviours for HTTP/2 and HTTP/3 traffic and relays:

Because it is built into the OS and doesn't require an agent or interactive authentication flow, a Network Relay configured by an MDM profile comes alive and can be used during the setup assistant - unlocking some powerful deployment flows such as this one with integrated Managed Device Attestation:

And what about some actual load? Let's see what traffic looks like for 200 iPads connecting to *.apple.com domains as well as how HTTP/3 to HTTP/2 fallback works in reality:


📊 A few notes on performance

It's important to note that the examples I walked through weren't specifically optimised for performance and were mainly to show the general utility of Network Relay on Apple platforms. I'm positive that there are additional window size, flow control and other settings in Envoy that can be better tuned to production scenarios, and I'd love to hear any feedback from anyone who works on this.

With that in mind however, I was interested to see the impacts of encapsulating HTTP/2 inside HTTP/3 and vice versa, as well as the effects of QUIC's loss detection and recovery on a less than perfect connection. To test this I developed a very simple test client (and accompanying server) that uses Swift's URLSession and URLSessionTaskMetrics to benchmark and record specific timings and performance across both HTTP/2 and HTTP/3. This was then run over both a HTTP/2 and a HTTP/3 relay, giving us a 4-way matrix (2 over 2, 3 over 2, 2 over 3, 3 over 3) of tests and benchmarks. I will likely write a followup with a deep dive into this data, but for now let's just examine the results of a single test - 25MiB of data divided across 50 streams in a single multiplexed connection. This test was then repeated 1000 times for each type and the results averaged:

A multiplexing test (25MiB across 50 streams) run 1000 times and then averaged. The baseline with no loss shows a slight pure throughput edge to HTTP/2 but as soon as we introduce a small amount of loss to the link, HTTP/3's QUIC resiliency quickly dominates over the TCP (and TCP inside TCP!) retransmissions of HTTP/2.

For non-network folks, these results may shock - just 0.1% (or on average 1 in every 1000) of lost or malformed packets doesn't translate to a 0.1% loss in throughput, but potentially up to a 90% one! This has to do with how congestion control is handled in TCP and highlights why QUIC can be such an important upgrade; particularly if we are relying on Network Relay to facilitate access to clients on disparate networks where we can't always guarantee link quality. Another metric where we clearly see the benefits of QUIC is the initial handshake and Time to First Byte:

The same test shows the effect on time to first byte (the time taken to establish, handshake and for the client to start receiving data). We continue to see the same pattern - TCP retransmissions in HTTP/2 cause pain whilst QUIC's improved handshakes, multiplexing and resiliency help HTTP/3 respond much faster under loss as well as protecting the inner HTTP/2 stream.

Here, the benefit can be seen in the no-loss baseline, and it's then magnified under loss. QUIC's more efficient handshake, which can typically be completed in a single round trip versus the two required by TCP & TLS, allows data on new connections to flow sooner.
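
For anyone wanting to poke at these timings themselves, the measurement side of a client like this is very approachable. Below is a minimal sketch (not my actual test client, and with a placeholder URL) of the URLSession and URLSessionTaskMetrics pieces used to observe which protocol a transaction negotiated and how long it took to reach the first byte:

import Foundation

// Minimal sketch: log the negotiated protocol and time to first byte for one request.
final class MetricsRecorder: NSObject, URLSessionTaskDelegate {
    func urlSession(_ session: URLSession, task: URLSessionTask,
                    didFinishCollecting metrics: URLSessionTaskMetrics) {
        for transaction in metrics.transactionMetrics {
            let proto = transaction.networkProtocolName ?? "unknown"   // "h2", "h3", etc.
            if let start = transaction.fetchStartDate,
               let firstByte = transaction.responseStartDate {
                print("\(proto): \(firstByte.timeIntervalSince(start))s to first byte")
            }
        }
    }
}

let session = URLSession(configuration: .default, delegate: MetricsRecorder(), delegateQueue: nil)
var request = URLRequest(url: URL(string: "https://speedtest.internal.credibly.cc/25MiB")!)   // placeholder URL
request.assumesHTTP3Capable = true   // hint that the origin supports HTTP/3
session.dataTask(with: request).resume()

Running the same requests with a HTTP/2 or HTTP/3 relay payload installed is then all that's needed to build up the 2-over-2, 3-over-2, 2-over-3 and 3-over-3 comparisons discussed above.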

Tommy Pauly alluded to this in his unusually open presentation on Apple's QUIC implementations in 2021. When relaying traffic via HTTP/3 for iCloud Private Relay, measurements showed benefit when using QUIC for the Client to Relay hop, even if the underlying request is HTTP/2, effectively "shielding" TCP congestion and loss by encapsulating it on the imperfect networks that users experience every day.

🧪
For network folks; I know I have skipped many of the intricacies here and provided only Keynote style headline graphs. These tests aren't completely fair as they pit CUBIC vs BBR in a consistent lossy pipe, but these are the defaults. This article would strain with detailed descriptions of congestion control algorithms, reliable streams, unreliable datagrams, test patterns, concurrency distribution and OS level multiplexing. I do however plan to delve into all this in a more technically focused piece in the near future, so stay tuned!

🫆 Impacts on ZTNA & SaaS application access

As I'm sure you would agree, Network Relays unlock some strong Zero-Trust patterns on Apple platforms. In some ways this is nothing new, as we can achieve much of the same with well-scoped VPN and SASE tooling, however the additional flexibility and integration that Relays bring make it possible to build out some very compelling workflows.

An additional config parameter that we skipped over a little is RawPublicKeys, which can bring some additional benefit as a form of certificate pinning for the device-to-relay TLS session. All combined, this means that Network Relay can give us:

  • A secure tunnel out of a source network using cryptography we can control and enforce pinning to. This provides protection for relayed traffic against on-path interception (be it legitimate inspection or attacks)
  • Flexibility on which domains and applications are relayed without needing to modify routing tables or per-connection DNS
  • The ability to egress our traffic out of trusted IPs (which can then be used for access control in upstream applications or IdPs)
  • An ability to enforce all of this behind a securely provisioned, hardware backed X.509 identity that can be cryptographically proven to be resident on a specific Apple device

This is the case both for domains we control (such as our *.internal.credibly.cc example) and for ones that we don't, like our SaaS platforms or IdPs. Additionally, this power from Relays can be combined with tools like Cloudflare Zero Trust or Microsoft Entra ID Conditional Access policies to build out powerful authentication and access control flows based on trusted egress IPs.

This looks to be what Smallstep is trying to achieve with its Smallstep Enterprise Relay - an early-adopter commercial MASQUE relay capable of egressing to a set of defined SaaS applications - very cool! I've chatted a fair bit with the Smallstep team on device attestation and step-ca and, like me, they are extremely passionate about the topic and its potential, so it's great to see them using it here.

I've always been very interested in the idea of distinct device and user identity, and believe that too often in our industry these have been conflated together. One thing that I really like about Network Relay (when integrated with device attestation) is that it is capable of giving us 2 separate (but integratable) AAA stacks that can be configured to form a real and significant gauntlet for resource access without adding too many layers of complexity. In this, a device must meet attestable compliance criteria in order to achieve network-level access to an application before a user has a chance to log in. Ideally this is done with phishing-resistant credentials, with all of these signals and audit trails neatly shipped into a SIEM.

I'm sure you will agree that is a fairly powerful security story, and I hope we will see a much wider adoption of this amongst both commercial vendors and open source solutions as the technology continues to mature.


🕵️‍♀️ MASQUE, obfuscation and the impact on traditional filtering

One of the original privacy goals of MASQUE was to make relayed traffic appear indistinguishable from normal web traffic through explicit obfuscation techniques. While the protocol's standardisation efforts through the IETF working group have focused on simpler use cases, this obfuscation intent remains embedded in the RFCs and may resurface as a focal point as deployment and development continues. The idea behind it is simple but powerful: a user could browse to what appears to be a regular website, yet the server behind it may also function as a MASQUE relay with concealed authentication, quietly and efficiently tunnelling arbitrary TCP or UDP traffic to anywhere else on the internet. This concept (whilst not yet fully realised) will likely terrify many network and security teams who employ rigid controls of data egress on their networks.

😱
As a note here, the "not yet fully realised" component is the concealed nature of the authentication, but there is absolutely nothing stopping someone running a CONNECT or CONNECT-UDP proxy today and using it for data exfiltration, C2, or just as an employee or student wanting a clean unfiltered link! If this concerns you, maybe check to see how your edge firewall or network access solution would handle this today.

Initially, when QUIC was an experimental protocol used mainly by Google, many firewall and security solutions recommended blocking it entirely. This was primarily because QUIC, being a UDP-based protocol, could bypass traditional HTTPS inspection and man-in-the-middle techniques that organisations relied on for security controls (content filters, DLP, C&C protection, etc.). Blocking QUIC traffic generally had the effect of forcing Chrome (and later other browsers) back to HTTP/1.1 or HTTP/2 and through traditional edge proxies or inspection.

An example outbound LAN->WAN configuration blocking QUIC on a Sophos XG network edge firewall.

For many organisations, these traditional edge proxies form a major part of their perimeter security strategy. In education settings for example; these controls are crucial for content filtering and ensuring students can’t access inappropriate or harmful subject matter online. In enterprise environments, they might be used as part of DLP solutions or protection against malicious traffic.

The newly presented challenge is that the encryption in HTTP/3 & QUIC conceals much of the information that traditional firewalls used to rely on for filtering. While it is possible for firewalls to decrypt parts of the initial handshake in current versions of QUIC (and recover details such as the SNI), doing so is computationally expensive and can impact performance and compatibility. Additionally, QUIC and HTTP/3 are evolving, and new versions of QUIC are being discussed that would completely lock out this style of inspection.

Network Relay's built-in HTTP/2 fallback behaviour shows that Apple clearly understands there will be many network environments, both today and into the future, where HTTP/3 traffic flow isn't promised. Interestingly, their current advice on how to block the use of iCloud Private Relay on networks simply involves DNS poisoning the mask.icloud.com and mask-h2.icloud.com domains. This causes clients to fail the initial lookup, and data flow to fall back outside the relay. This style of blockage is obviously achievable for this one specific Relay, but in a future world where almost anyone can run a network relay, relying on DNS blocking is clearly not going to be scalable or sustainable.
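
For what it's worth, on a resolver you control that advice boils down to a couple of lines. On an Unbound resolver, for example, something like the following returns NXDOMAIN for the Private Relay bootstrap hostnames (most other resolvers and commercial DNS filtering products expose an equivalent control):

server:
    local-zone: "mask.icloud.com" always_nxdomain
    local-zone: "mask-h2.icloud.com" always_nxdomain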

As HTTP/3 and MASQUE become more common, it will be interesting to see how vendors and network admins respond with new observability and control technologies and techniques. Legacy filtering methods are already being stretched, and may not scale in a world of heavily encrypted, multiplexed traffic. A shift toward zero trust, where access decisions are based on device agents, identity and posture, is clearly underway. But for many organisations, edge firewalls and traditional controls aren't going anywhere fast. This will be an interesting challenge to watch play out as the traditional perimeter melts away and the true edge is harder to define.


🥂 The bouncer, the bartender & some final closing thoughts

As a very wise man once posed to me: if you go to a bar, do you have to show ID to the bartender every time you buy a drink, or is the wristband from the bouncer at the door enough? In our world, this all depends on your views on Zero Trust (and your individual threat model), but it raises some interesting questions about relying parties and how we consume identity.

Our Envoy examples with MDA certificates were powerful, but lack a good authorisation story. It's similar to the paradigm I have previously discussed around Managed Device Attestation: just because a device can be attested to be a genuine Apple device, it doesn't mean it's a device that belongs to you. Here, a valid certificate alone doesn't say whether the device still meets MDM compliance, whether the user is on shift, or whether a relay to the requested target destination should be allowed. We can of course choose to deal with some of this upstream as part of our PKI issuance process, but this is precisely the point of the bouncer and the bartender! Where and when do we check the ID, and how thoroughly? Do we issue attested identity for a long time and then rely on consumers to validate it, or do we simply issue it for short periods and then allow consumers to piggyback on our sessional trust?

If we really want to lean into Zero Trust principles, I think we need to consider that upon consuming a certificate with cryptographically rooted identifiers, we perform some validation and verification of its current state (outside of outright certificate revocations). If we have permanent identifiers for a device, there is nothing stopping us using those as a cross platform signal; potentially verifying enrollment or compliance status in an MDM, ensuring the health of security telemetry, or enforcing other forms of continuous authentication.

Envoy has some interesting abilities to integrate with control plane services and external authorisation filters that may be able to assist in solving some of these challenges. I plan to do some work and experimentation on this over the next little while, as I think there is a missing link here that needs some further consideration.
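
As a rough sketch of the shape this could take (purely illustrative - I haven't built this, and the authorisation cluster name here is hypothetical), Envoy's external authorisation filter can consult a gRPC service on each request before it is routed, which would be a natural place to check an attested serial or enrollment identifier against an MDM or compliance API before allowing a CONNECT tunnel to proceed:

http_filters:
  - name: envoy.filters.http.ext_authz
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthz
      transport_api_version: V3
      grpc_service:
        envoy_grpc:
          cluster_name: device_authz   # hypothetical cluster pointing at an authorisation service
  - name: envoy.filters.http.router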

In some ways, we've just scratched the surface on the use of Network Relays on Apple Platforms as well as where they could head in the future. We looked at some cool examples, but didn't even touch concepts like relay chaining, scaling (through round-robin DNS or similar) or different deployment scenarios.

I have some thoughts around how Network Relay and attestation could be a very powerful part of Apple's BYOD story alongside Account Driven User Enrollment. To fully realise this, we'd need to see some updates and enhancements to Device Attestation; specifically the long awaited eventuation of Secure Enclave Enrollment ID. This was introduced and discussed by Apple at WWDC 2023 as an independent and private identifier linked to the enrolment of a device. Importantly, this would be present across all current deployment types, including User Based Enrollment and Managed Apple Account driven scenarios. I spoke about this briefly when I joined the MacAdmins Podcast to talk about Managed Device Attestation, and discussed how the Secure Enclave Enrollment ID could open up some awesome attestation workflows for BYOD devices. Here is just one such imagined workflow as it relates to network relay:

An example future workflow showing how Secure Enclave Enrollment ID could be used as a consistent signal or primary key between systems even when the permanent identifiers aren't known. In this model, access and authorisation decisions at a relay level could be made via external lookups to MDM or other control plane tooling.

This hypothetical workflow would allow a Relay to accept attested certificates as identification and authentication, and then use the Secure Enclave Enrollment ID to perform authorisation lookups. This could be as simple as ensuring the device is correctly enrolled, or as detailed as checking specific compliance requirements, device state and security agent status and then authorising CONNECT tunnels to specific hosts in an attribute-based access control model.

Sadly, as of May 2025, a freshly attested device doesn't provide this identifier, and it isn't present in documentation outside of that WWDC session. I have some theories around how its use could be linked in with the new ManagedApp framework and the multiple device attestations it can provide, so maybe it will finally get some love at WWDC this year. I believe having a consistent co-residency "primary key" that can be used for cryptographically verifiable signalling across different tools would massively oxygenise the use of attestation in the Apple ecosystem.

This article has also only scratched the surface of the ongoing development of MASQUE at the IETF, and how new technologies like QUIC-Aware Proxying aim to improve performance and security even more in future iterations. Recent discussions at IETF 122 and 121 have highlighted some of the planned improvements as well as providing interesting insights about the "zoo of options" and some of the terminology confusion I discussed in my intro. I will be fascinated to watch as this exciting set of protocols and standards continues to evolve, as well as how it is implemented in Apple's operating systems.

As you can probably tell by now, I'm fairly passionate about Network Relay and allied technologies, and would love to discuss it further if you too have found this article interesting or have further questions or clarifications. Much of my research and work here has been largely in an under-documented world, so whilst I have made every attempt to test thoroughly, it's possible I have made errors or incorrect assumptions - if you have any feedback or corrections to offer, I would love to chat. I've created a new channel, #network-relay, on the MacAdmins Slack (🎉 happy 10-year anniversary!) and would love for it to become an active place to explore this topic in more detail. If that's not your thing, you are always very welcome to reach out to me directly on LinkedIn or elsewhere.


🎭 Special Thanks


🪏 Technical Tidbits

Here are just a couple of technical tidbits that, whilst they couldn't find a home in the article proper, might be interesting to the right folks:

  • Whilst I have heard discussion on how Network Relay and iCloud Private Relay can intertwine and sit within one another, I didn't see specific evidence of that in my testing with MDM-deployed relay. A Network Relay configured for all .icloud.com domains will never attempt a CONNECT for mask.icloud.com or mask-h2.icloud.com, but other domains in the same relay configuration (apple.com, icloud.com) will be sent to the system-wide Network Relay as expected. I haven't done a deeper dive on this at a packet-capture level, but there definitely seems to be some form of split in how Apple treats this that is not well documented, and this may obviously be different for agent- or app-implemented relay that is not a system-wide MDM deployment. Note that a configured relay will respect things like WireGuard, Tailscale (in that it will route THROUGH them if the route dictates) or other VPN interfaces from what I have seen, so this seems to be specific to mixed relay.
  • You do need to be a little careful with MTU if you are dealing with anything non-standard. The extra encapsulation and limits in QUICHE can get you in trouble.
  • If you are doing any testing of your own, remember that browsers, apps and the OS love to keep connections open for longer than you think. Sometimes, you may need to quit and re-open Safari (or any app or browser) or even do a good old fashioned reboot if you are making configuration changes and not seeing the results you expect!
]]>
<![CDATA[Mac Admins Podcast: E370 - Managed Device Attestation]]>It was an absolute pleasure to sit down with the crew this week for an episode of the Mac Admins Podcast where we discussed the current state and future possibilities of Managed Device Attestation on Apple platforms.

Our robust chat covered attestation fundamentals, Secure Enclave primitives, MDA workflow

]]>
https://jedda.me/mac-admins-podcast-e370/6811760a12089e000142e250Tue, 02 Jul 2024 01:01:00 GMTIt was an absolute pleasure to sit down with the crew this week for an episode of the Mac Admins Podcast where we discussed the current state and future possibilities of Managed Device Attestation on Apple platforms.

Our robust chat covered attestation fundamentals, Secure Enclave primitives, MDA workflow and deployment model considerations as well as some really interesting discussions around the future of the technology.

If you are involved in managing or securing Apple devices at scale, I think it's a worthwhile listen on a topic that will become increasingly important as MDM and identity & trust providers start to integrate this technology at the heart of our tooling.

Episode 370: Jedda Wignall on Managed Device Attestation
Trust is a subject we regularly discuss with our guests. How do we trust our users, how do we trust the software they want to run, how do we trust the devices they are on. In the modern world where…
]]>
<![CDATA[Cross Platform ECIES encryption with Swift & Apple's Secure Enclave]]>https://jedda.me/cross-platform-encryption-with-apples-secure-enclave/660564abcdfe2b0001c600fcThu, 11 Apr 2024 00:51:45 GMTA discussion of fundamentals, and some examples of cross-platform Elliptic curve-based hybrid cryptography compatible with Apple platforms and hardware-bound keys.


Apple's Secure Enclave, amongst its many other utilities, provides excellent opportunities for data integrity and confidentiality. The ability to encipher data only decryptable by a single device and hardware-bound key is a powerful workflow, potentially even more so when this data originates from cross-platform infrastructure or applications.

For an upcoming project, I needed a consistent way to exchange encrypted data between a series of microservices and Apple devices, using keys held in their Secure Enclaves. This short article will briefly touch on some hybrid cryptography fundamentals and one of Apple's implementations as part of their Security framework, as well as link to some new resources I have created to help build and understand cross-platform applications.

Whilst this article and the projects and code samples it links to are inherently technical and require a base understanding of programming in Swift and Golang, the underlying concepts may be valuable to non-programmers with an interest in security on Apple platforms. In particular, I've tried to step out my commentary in the ECIES-Swift-Playground described below to be relevant to someone not familiar with the actual code and help to establish an understanding of the fundamentals.

Whilst CryptoKit provides a modern hybrid cryptography option through its implementation of the HPKE standard, this can prove a more complex build-out due to the myriad of AEADs, KDFs and KEMs supported by the standard, its nonce and sequence generation, and its use of authenticated encryption. Elliptic Curve Integrated Encryption Scheme (ECIES) is the primary Elliptic Curve encryption standard supported by Apple's Security framework and is a great choice for simple, secure data exchange. It shares a lot of connective tissue with HPKE, but is a little easier to implement across platforms (I plan to write a follow-up article discussing a similar implementation of cross-platform HPKE senders and receivers, and will go into more detail about this).

ECIES is a hybrid encryption scheme using Elliptic Curve keys supported by the Secure Enclave (when using a NIST P-256 curve). As a hybrid scheme, it uses asymmetric keys exchanged and trusted between a sender and receiver to derive a symmetric key that is used to actually encrypt the data. It is simple to encrypt and decrypt data on Apple platforms using this scheme, as Apple provides functions that handle most of the lift for you: namely ephemeral key generation, ECDH exchange, symmetric key derivation, nonce generation, AES-GCM sealing/opening, as well as encapsulating the ciphertext along with the ephemeral public key. David Schuetz provides an excellent breakdown of Apple's ECIES implementation here, and has done a huge amount of detective work to document the technical underpinnings of the process.

When data is encrypted on an Apple device under the scheme, this is effectively the process that happens behind the scenes: an ephemeral key pair is generated, an ECDH exchange is performed with the receiver's public key, a symmetric key (and, in some variants, an IV) is derived via a KDF, the plaintext is sealed with AES-GCM, and the resulting ciphertext is encapsulated along with the ephemeral public key.

Selection of the underlying hashing algorithm and variable IV setting used during the process is governed by a SecKeyAlgorithm, a Swift type that determines which hash the KDF function uses (SHA-224, SHA-256, SHA-384, etc.) and how the Initialisation Vector (IV) for the symmetric AES-GCM encryption is computed.

When exchanging data between Apple platforms and devices, these resultant "Combined Ciphertext Data" objects can be decrypted back to plaintext with the appropriate key simply by passing them (along with the correct algorithm selection) to SecKeyCreateDecryptedData(). The receiver retrieves and uses the ephemeral public key to perform the ECDH with the private half of their key pair, and derives the same shared key that was used to encrypt. This is great if you are working entirely within an Apple ecosystem, but if you want to exchange data cross-platform or with a backend service, good examples have been harder to find.
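
As a minimal sketch of the Apple side (the function name here is illustrative, and the algorithm choice is an assumption - it must match the SHA-256, variable-IV variant the sender used), decryption is a single call:

import Foundation
import Security

// decrypts a "combined ciphertext" produced by a cross-platform sender
func decryptCombinedCiphertext(_ combinedCiphertext: Data, with privateKey: SecKey) throws -> Data {
    // this algorithm choice is an assumption; it must match the variant the sender used
    let algorithm: SecKeyAlgorithm = .eciesEncryptionCofactorVariableIVX963SHA256AESGCM

    var decryptError: Unmanaged<CFError>?
    guard let plaintext = SecKeyCreateDecryptedData(privateKey,
                                                    algorithm,
                                                    combinedCiphertext as CFData,
                                                    &decryptError) as Data? else {
        throw decryptError!.takeRetainedValue() as Error
    }
    return plaintext
}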

If you search around online for ECIES examples, you are likely to come across many that utilise the EC curve secp256k1, most known for its use as part of Bitcoin and other cryptocurrencies. Note that this curve is different from secp256r1 (which is supported by Apple implementations whereas secp256k1 is not), leading to many ECIES implementations (that hardcode the use of k1) being incompatible with Apple's. To aid in a better understanding on how to exchange cross-platform encrypted data with Apple platforms, I've created the following open source resources:

  • ECIES - a Go module that implements functions to perform ECIES encryption, decryption and key derivation compatible with Security framework. It basically does the same uplift as Apple performs for you in their Swift/Objective-C API.
  • ECIES-Go-Example - a sample Go project that utilises the ECIES module to encrypt and decrypt some example data.
  • ECIES-Swift-Playground - a macOS Swift playground that details the process of importing public and private keys from OpenSSL PEM representations, then encrypting and decrypting. It can be used alongside ECIES-Go-Example to learn how to exchange ECIES encrypted data cross-platform.

These projects and samples taken together, whilst hardly revolutionary, should contribute a fair bit towards learning and coding cross-platform ECIES data exchange. With all this in mind, encrypting data for a device's Secure Enclave is fairly simple. We can take Apple's documentation around the protection of keys with the Secure Enclave here:

Protecting keys with the Secure Enclave | Apple Developer Documentation
Create an extra layer of security for your private keys.

And use their examples to build out something like this:

do {
  // create our access control policy
  // so that our key is only accessible when the device is unlocked
  // and valid biometrics are presented
  var accessError: Unmanaged<CFError>?
  guard let access = SecAccessControlCreateWithFlags(
      kCFAllocatorDefault,
      kSecAttrAccessibleWhenUnlockedThisDeviceOnly,
      [.privateKeyUsage, .biometryAny],
      &accessError) else {
          throw accessError!.takeRetainedValue() as Error
  }

  // setup our key creation attributes
  // for a 256 bit (P-256) EC key generated by the Secure Enclave
  // and with a tag we can use to query and use it later
  let attributes: NSDictionary = [
              kSecAttrKeyType: kSecAttrKeyTypeECSECPrimeRandom,
              kSecAttrKeySizeInBits: 256,
              kSecAttrTokenID: kSecAttrTokenIDSecureEnclave,
              kSecPrivateKeyAttrs: [
                  kSecAttrIsPermanent: true,
                  kSecAttrApplicationTag: "me.jedda.ECIESSecureEnclaveExample.key",
                  kSecAttrAccessControl: access
              ]
          ]

  // create our key
  var createError: Unmanaged<CFError>?
  guard let secKey = SecKeyCreateRandomKey(attributes as CFDictionary, &createError) else {
    throw createError!.takeRetainedValue() as Error
  }

  // export the raw public key bytes so they can be shared with the sender
  let publicSecKey = SecKeyCopyPublicKey(secKey)
  var publicSecKeyError: Unmanaged<CFError>?
  guard let publicKeyData = SecKeyCopyExternalRepresentation(publicSecKey!, &publicSecKeyError) as Data? else {
    throw publicSecKeyError!.takeRetainedValue() as Error
  }

  // now we have our raw public key bytes in publicKeyData
  // which should be easily portable to wherever they need to go
  // we can encrypt data with this key, and it will only be decryptable
  // on this device with the hardware bound key in the Secure Enclave
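
  // as a rough sketch of that round trip (the algorithm choice below is an
  // assumption - it simply needs to match whatever the encrypting/decrypting
  // counterpart uses):
  let algorithm: SecKeyAlgorithm = .eciesEncryptionCofactorVariableIVX963SHA256AESGCM
  let message = Data("only this device can read me".utf8)

  var encryptError: Unmanaged<CFError>?
  guard let combined = SecKeyCreateEncryptedData(publicSecKey!, algorithm,
                                                 message as CFData, &encryptError) as Data? else {
    throw encryptError!.takeRetainedValue() as Error
  }

  // decryption uses the hardware-bound private key (and, given the access
  // control policy above, valid biometrics), so it will only ever succeed
  // on this device
  var decryptError: Unmanaged<CFError>?
  if let decrypted = SecKeyCreateDecryptedData(secKey, algorithm,
                                               combined as CFData, &decryptError) as Data? {
    print(String(data: decrypted, encoding: .utf8) ?? "")
  }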

} catch {
  print("ERROR \(error)")
}

You need to think carefully about how you use this type of encryption. Remember that with non-extractable keys, if the device dies or is lost, the data is effectively crypto-shredded. For short-lived messages and commands, this might be desirable, but make sure that you consider these risks if you choose to encrypt anything more permanent.

If you use some of the resources I've created and linked to above, I hope you find them useful. If you have any questions or feedback, please reach out.

]]>
<![CDATA[Managed Device Attestation for Apple devices - a technical exploration]]>https://jedda.me/managed-device-attestation-a-technical-exploration/65547ceaa1f5560001ac7c66Tue, 28 Nov 2023 08:10:00 GMT

A deep dive into the current state of cryptographic device attestation on Apple platforms.


Managed Device Attestation (MDA) for Apple devices. Announced with a flurry of interest at WWDC 2022, and then..... not much activity at all. Apart from a small subset of cutting-edge users, Managed Device Attestation hasn't yet found its way to the masses - mainly hindered by slow adoption by major MDM providers and some limited use cases.

If you Google it today, you'll likely find some Apple developer documentation and a smattering of blogs from MDM & PKI vendors to explain the high-level concepts but no tangible or shipping examples of real-world use cases.

I believe we are sitting on the frontier of a transformative platform security mechanism, and hope that through reading this article, you'll agree with me that device attestation is about to become a very big deal.

This article aims to do 3 things:

  • Detail the high-level concepts of cryptographic attestation and Managed Device Attestation on Apple platforms in their current state today
  • Provide a technical breakdown of the attestation process and ACME payload execution on iOS & macOS using commonly available open-source tools
  • Contribute some editorial on the current and future use cases and direction of Managed Device Attestation on Apple platforms

Just a note that the information in this article has been brought together from many sources along with my own lab testing. As this is a quickly changing (and sporadically documented) topic, some of the following breakdowns may be slightly off on specifics and order of operations, however, the general concepts should be fairly sound. If there is anything here that you feel is incorrect, please reach out. I've included a list of sources in the footer.

Breaking down the core concepts

The concepts behind cryptographic device attestation aren't unique to Apple platforms. There are efforts inside Google, Microsoft and other manufacturers to achieve the same thing - strong signed evidence of genuine, identifiable devices with hardware-bound private keys. The initial ideas around utilising WebAuthn attestations to verify manufacturer hardware identifiers stirred out of Google, and Brandon Weeks' contributions and thought leadership on this topic are to be strongly admired.

The power of manufacturer-backed attestations should be easily understood. As networks and resources become more distributed and threats to data confidentiality become more aggressive, organisations are looking for more secure ways to ensure that only authorised users on authorised devices have access to sensitive resources.

Traditional efforts in this space have focused on the user & their observable properties; enforcement of strong authentication factors, session management & contextual policy on network location, to name just a few. As the perimeter shifts beyond that of traditional firewalls & the popularity of SASE & ZTNA architectures surges, the need for greater evidence of authorisation & posture of managed devices has surfaced and continues to grow. Many commercial ZTNA solutions such as Cloudflare ZTNA & Jamf Security Cloud have built-in device posture capability checking to ensure compliance with specific criteria prior to allowing access to specific applications and resources. In more traditional campus architectures, NAC solutions such as Aruba ClearPass & Cisco ISE can be integrated to consider device posture as part of their network authentication & authorisation decisions.

Device identity & posture information can come via MDM and interconnected systems, or sometimes via a client or agent running on a device. However, for many current implementations, we are making critical assumptions that the information our devices and services report is trustworthy.

Why is cryptographic attestation important?

Devices can lie.

They can lie about their core identity and they can lie about specific properties - hardware configuration, OS & software versions, MAC addresses & more. Pre-Apple Silicon, Mac Admins were often using this to their advantage by overstamping known serials on VMs to test device enrollment workflows. Prior to T2, even the process of changing physical Mac hardware serials was trivial for those with the right tools.

With MDM becoming ubiquitous, we are finding ourselves more reliant on it as a source of truth, and are using management UDIDs and device identifiers to certify and link devices & posture to other security mechanisms and systems (802.1X, IdPs, ZTNA, etc). To ensure security integrity for these managed devices, we need strong evidence that a device is who it says it is, and we need this prescribed independently of the OS or user space, which, if compromised, could just find another way to lie about its identity or properties. By utilising each Apple device's Secure Enclave as part of the attestation process, this strong evidence is now available to us.

In most managed Apple environments, we are used to using hardware serial numbers and device UDIDs to identify and reference devices. But here's the thing: the Secure Enclave in an Apple device is an isolated secure subsystem and can't actually access the device's UDID or serial number. What it can access is its root cryptographic key UID (which is burned into each SE during manufacturing) as well as a series of register-stored hashes from each stage of the most recent device boot chain. During the attestation process on Apple's servers, the former is used to match the device's permanent identifiers (serial number & UDID), and the latter are used to match versions of macOS/iOS. These are signed using Apple's attestation CA and returned to the client.


When you use MDA as strong evidence of device posture, you are choosing to place trust in the integrity & security of two primary components: the Secure Enclave and Apple's attestation of its manufacturing records. As always, you should consider your individual risk profile; however, many organisations will consider this level of trust acceptable.

When asked to perform an attestation via a device management profile payload, a device reaches out over TLS to https://register.appattest.apple.com; however, like most Apple services, this connection is certificate-pinned, so we can't get direct visibility into the transaction. We do end up with Apple's response as part of a CBOR payload that is included with our ACME order (see below), which, if we unpack it (with the help of this code from Smallstep), lets us see how this initial attestation is represented. So what does that attestation certificate look like? Here's one from an iPad running iOS 16.5.1:

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 1700180740331 (0x18bdaab48eb)
        Signature Algorithm: ecdsa-with-SHA384
        Issuer: CN = Apple Enterprise Attestation Sub CA 1, O = Apple Inc., C = US
        Validity
            Not Before: Nov 16 00:25:40 2023 GMT
            Not After : Nov 16 00:25:40 2024 GMT
        Subject: CN = 84d19f4396ae30c72ec7e4c78ae20fd14cc4bfc5f079ee0b992e928c6fc0ce43, OU = AAA Certification, O = Apple Inc., ST = California
        Subject Public Key Info:
            Public Key Algorithm: id-ecPublicKey
                Public-Key: (384 bit)
                pub:
                    04:bd:ed:3d:85:8e:65:bd:8a:5d:c0:99:19:22:65:
                    5e:31:08:0e:26:73:3b:7e:f6:b9:c0:c9:c5:d8:54:
                    b7:c0:79:fe:c2:34:29:33:b1:0c:31:a7:b7:f2:5d:
                    0a:6d:91:e2:b4:bc:e4:34:f9:f5:dc:e9:79:c7:39:
                    c7:13:b4:6d:16:73:b8:26:0a:3d:c2:09:fd:2f:7e:
                    1f:2c:3c:c2:1f:20:ef:d4:9c:36:0b:ea:8f:e0:78:
                    fe:27:5e:de:4f:fd:46
                ASN1 OID: secp384r1
                NIST CURVE: P-384
        X509v3 extensions:
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Key Usage: critical
                Digital Signature, Non Repudiation, Key Encipherment, Data Encipherment
            1.2.840.113635.100.8.10.2: 
                16.5.1
            1.2.840.113635.100.8.9.1: 
                DMH9ZK607VP8
            1.2.840.113635.100.8.9.2:
                4a6ec46c7aaf514cc1abf85c37a87fd0fda903e4
            1.2.840.113635.100.8.11.1: 
                =.=.dx...............B..).5..8].
    Signature Algorithm: ecdsa-with-SHA384
    Signature Value:
        30:64:02:30:60:1b:eb:16:24:9e:30:5b:a2:d9:92:7a:ee:9a:
        98:4e:da:1c:8f:40:7a:4a:69:8b:b3:fa:8d:16:1c:5b:cf:71:
        17:3d:89:b3:e7:b8:8f:0a:c5:28:f9:38:ff:4c:8e:bb:02:30:
        10:1a:2b:6f:fa:94:c7:6f:73:d2:7f:c9:ea:36:d0:ee:28:02:
        c1:36:c3:7d:3c:49:7e:7a:57:7e:c3:a3:40:16:99:c1:39:28:
        88:9d:73:b4:98:8b:cb:32:3b:47:19:1a

I've highlighted some interesting parts of the certificate info in yellow, namely:

  • An issuer of Apple Enterprise Attestation Sub CA 1. This is an intermediate CA signed by the Apple Enterprise Attestation Root CA. Interestingly the root is not available via any form of authorityInfoAccess, but is available for download from Apple's private PKI repository. All Apple attestations must chain back to this root CA during challenges in order to be validated. This validation is the responsibility of the ACME server processing the device-attest-01 challenge.
  • A public key starting 04:bd:ed:3d:85:8e:65:bd. This is the public pairing to our hardware-bound private key generated and stored in the Secure Enclave.
  • 16.5.1 - the device's current iOS version. This has been matched using the device's boot chain hashes and is available as an attested property.
  • DMH9ZK607VP8 - the device serial number. One of the permanent identifiers that can be used to complete the device-attest-01 challenge.
  • 4a6ec46c7aaf514cc1abf85c37a87fd0fda903e4 - the device UDID. Another permanent identifier that can be used to complete the device-attest-01 challenge. This is an interesting identifier that can differ in use case based on platform. I'll speak more to this later in the Behaviours on macOS vs iOS section of this article.

Now the device has its Apple attestation, but that's nowhere near the end of the story. Those familiar with X.509 might note that the attestation certificate doesn't have any EKUs - it's not usable nor designed to be used for anything beyond attestation, and it's not linked to or issued from your PKI. For that, we need ACME.

ACME & the device-attest-01 challenge

Those not familiar with the ACME protocol will likely have heard of Let's Encrypt - an extremely popular free online certificate authority that uses ACME for certificate issuance. ACME is a client & server protocol that uses "challenges" to ensure a client can prove control of specific identifiers before a certificate is issued. Traditional ACME challenges include http-01, a challenge where a client must place a token file on a publicly available web server to prove control of the filesystem, and dns-01, a challenge that asks for a specific value in a DNS TXT record for a domain to prove control of the DNS zone.

These challenges, alongside open-source automatable clients such as Certbot, have led to Let's Encrypt issuing approximately 7% of all public TLS certificates on the internet using the ACME protocol. These challenges are, however, focused (for the most part) on the issuance of server certificates (for a website, proxy or other TLS service) and not on client identification or authentication.

Enter device-attest-01 – a draft extension by Brandon Weeks that specifies new mechanisms utilising WebAuthn to allow a device to complete a challenge as to its permanent identity. It sets out many of the concepts I discuss in this article and is a good read if you want to understand more at a technical level. As a draft, this extension is still open for development and change, however, it has been accepted by IETF as a WG Document (working group document) which means it is being actively developed with the goal of becoming a part of the standard.

For their implementation of Managed Device Attestation, Apple has utilised ACME and device-attest-01 as part of its com.apple.security.acme device management payload. This allows devices to request attestation from Apple, then forward the signed attestation & CSR to a supported ACME server in exchange for a signed certificate from organisational PKI.

For the following breakdown, we are going to be looking at JSON requests from Apple's com.apple.security.acmeclient as well as logs from step-ca, an excellent open-source Certificate Authority with built-in support for ACME & the device-attest-01 challenge. Smallstep, the developers of step-ca, have done a massive amount of work to build support & foster interest in device attestation, and they should be commended for their efforts.

I think the following breakdown does a pretty good job of outlining how the ACME client-server transaction works, but note that for brevity a lot of identifiers, JSON wrapping, and some beige steps have been removed. If you want to see all this for yourself in full colour, watching the step-ca logs whilst also snooping an ACME transaction through something like Proxyman will expose all the data you need.

Where you see "Credibly" or "pki.credibly.cc" referenced below - that is my example organisation and domain used for testing these flows.

Here's what an ACME transaction looks like for an Apple device:

The client performs a GET on the ACME directory URL (embedded in the device management payload) to discover specific endpoint URIs for ACME account creation & certificate ordering:

method=GET path=/acme/mda/directory response="{"newNonce":"https://pki.credibly.cc/acme/mda/new-nonce","newAccount":"https://pki.credibly.cc/acme/mda/new-account","newOrder":"https://pki.credibly.cc/acme/mda/new-order","revokeCert":"https://pki.credibly.cc/acme/mda/revoke-cert","keyChange":"https://pki.credibly.cc/acme/mda/key-change"}" size=677 status=200 user-agent=com.apple.security.acmeclient/1.0

Note that whilst endpoints are available for certificate revocation and key management, to my knowledge Apple's ACME client doesn't currently use them.

The client performs a HEAD on the "newNonce" endpoint to get a nonce value that is used as a one-time anti-replay nonce on subsequent operations:

method=HEAD path=/acme/mda/new-nonce size=0 status=200 user-agent=com.apple.security.acmeclient/1.0

You can read more about how this nonce is used in the ACME spec.

The client performs a POST to register a new account on the ACME server:

method=POST path=/acme/provisioner/new-account response="{"status":"valid","orders":"https://pki.credibly.cc/acme/mda/account/.../orders"}" size=188 status=201 user-agent=com.apple.security.acmeclient/1.0

During this, it agrees to your ACME server's TOS on your behalf!

The client now places an order for a certificate using the ClientIdentifier property set in the payload. In step-ca's current implementation, this must match one of the permanent identifiers the device has achieved attestation for:

{"identifiers":[{"type":"permanent-identifier","value":"DMH9ZK607VP8"}]}

The requirement for the ClientIdentifier to match a permanent identifier is not a part of the specification and this may change in the future.

The server confirms the order and returns an "authorizations" endpoint for the client to check for challenges:

path=/acme/mda/new-order response="{"id":"...","status":"pending","expires":"2023-11-17T08:13:01Z","identifiers":[{"type":"permanent-identifier","value":"DMH9ZK607VP8"}],"notBefore":"2023-11-16T08:12:01Z","notAfter":"2023-11-17T08:13:01Z","authorizations":["https://pki.credibly.cc/acme/mda/authz/..."],"finalize":"https://pki.credibly.cc/acme/mda/order/.../finalize"}" size=575 status=201 user-agent=com.apple.security.acmeclient/1.0


The client checks for appropriate challenges and finds a device-attest-01 which it attempts to solve by forwarding the WebAuthn attestation statement it acquired earlier. The server validates this against Apple's attestation root CA, and if valid, accepts the challenge and lets the client know that it was successful:

method=POST path=/acme/mda/authz/... response="{"type": "device-attest-01", "status": "valid","validated": "2023-11-17T00:25:40Z"}" size=366 status=200 user-agent=com.apple.security.acmeclient/1.0


After finalising by providing a CSR with a public key matching that of the attestation leaf, the client finally receives a signed certificate from the server:

method=POST path=/acme/attest_nohook/certificate/CEE... certificate="MII..." issuer="Credibly Intermediate Devices CA"  public-key="ECDSA P-384" size=2067 status=200 subject=DMH9ZK607VP8 user-agent=com.apple.security.acmeclient/1.0 valid-from="2023-11-21T03:57:47Z" valid-to="2023-11-22T03:58:47Z"


Here's what the issued certificate looks like (using a default step-ca template):

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            3d:fb:bd:f5:75:47:c2:a4:d8:96:72:44:a8:3e:c1:73
        Signature Algorithm: ecdsa-with-SHA384
        Issuer: C = AU, O = Credibly, CN = Credibly Intermediate Devices CA
        Validity
            Not Before: Nov 21 03:57:47 2023 GMT
            Not After : Nov 22 03:58:47 2023 GMT
        Subject: CN = DMH9ZK607VP8
        Subject Public Key Info:
            Public Key Algorithm: id-ecPublicKey
                Public-Key: (384 bit)
                pub:
                    04:bd:ed:3d:85:8e:65:bd:8a:5d:c0:99:19:22:65:
                    5e:31:08:0e:26:73:3b:7e:f6:b9:c0:c9:c5:d8:54:
                    b7:c0:79:fe:c2:34:29:33:b1:0c:31:a7:b7:f2:5d:
                    0a:6d:91:e2:b4:bc:e4:34:f9:f5:dc:e9:79:c7:39:
                    c7:13:b4:6d:16:73:b8:26:0a:3d:c2:09:fd:2f:7e:
                    1f:2c:3c:c2:1f:20:ef:d4:9c:36:0b:ea:8f:e0:78:
                    fe:27:5e:de:4f:fd:46
                ASN1 OID: secp384r1
                NIST CURVE: P-384
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature
            X509v3 Extended Key Usage: 
                TLS Web Client Authentication
            X509v3 Subject Key Identifier: 
                B1:A3:9A:6A:E2:7A:17:19:BA:9C:B0:44:3D:6E:7A:7A:57:60:34:FD
            X509v3 Authority Key Identifier: 
                8B:2A:03:BC:F7:9B:F9:77:F8:FF:AB:ED:32:4C:71:5E:F9:61:DA:14
            1.3.6.1.4.1.37476.9000.64.1: 
                mda

Let's note a few things about this certificate (again highlighted in yellow):

  • An issuer of Credibly Intermediate Devices CA. This is our intermediate Certificate Authority, part of the PKI that we own and issue from.
  • A certificate common name (CN) with the serial number of the device; DMH9ZK607VP8. Currently, in step-ca it is a requirement that the CN match one of the attested permanent identifiers.
  • A public key starting 04:bd:ed:3d:85:8e:65:bd - note this is the same public key that was originally attested by Apple. With this, we know that Apple has attested this pair's private key to be hardware bound in the device's Secure Enclave.
  • An OID (1.3.6.1.4.1.37476.9000.64.1) with a value mda. This is the name of the provisioner within step-ca used to issue our certificate.

You might also note that this certificate now has an EKU for "TLS Web Client Authentication", which means it can be used for 802.1X, VPN & Mutual TLS, etc. Its properties are actually fully customisable and are governed by step-ca X.509 templates. In a section below, I'll detail a method of enriching this certificate with additional data about the device and assigned users.

OK - we did it. We have a beautiful ECC private key stored in and protected by our Secure Enclave and a matching certificate signed by our organisational PKI. Surely this is the end of the story, right?

No - as you'll read, there are some specifics on how we can use these resultant certificates on Apple platforms that are very important to understand.

Behaviours on macOS vs iOS

MDA on iOS devices came first in 2022 with iOS 16, whilst the Mac gained support in 2023 with macOS 14.0 Sonoma (although you might not know it beyond some WWDC content and this reference here).

Whilst the same MDM profiles can deploy ACME certificates to both iOS & macOS systems, there are some significant differences in the behaviour and usability of these certificate identities across each platform.

These differences provide challenges in consistent use cases for MDA ACME certificates, and it's a little unclear if some of these differences are deliberate design decisions, or were born out of intricacies in variances between Apple's OSes.

Keychains

Admins familiar with macOS may be used to only a couple of file-based keychain types, traditionally a login keychain for each user of a Mac and a System keychain that stores both root certificates and global certificates and credentials (such as Wi-Fi passwords).

There is another class of keychain that is more opaque to user space called the Data Protection keychain. This is a complex topic that dances outside of the scope of this article, but suffice it to say that items in the Data Protection keychain may not be available to all apps (and sometimes not at all in user space with public APIs).

With the Data Protection Keychain, we also come across Keychain Access Groups. A feature born in iOS and later brought back to macOS, this allows for sandboxing and sharing of keychain items with a group of applications sharing a team or code signing entitlement. Purists out there will know this description & illustration is a bit of a simplification, but forgive me whilst illustrating a point:

(Diagram: a simplified illustration of Keychain Access Groups and how keychain items are shared between apps with a common entitlement.)

On iOS, Apple uses a shared Keychain Access Group across a slew of its apps and OS services. This includes Safari, Mail, and crucially, MDM-issued ACME certificates. Apps in other keychain groups (or no group at all) have no access or visibility into Apple's keychain items, and thus can't make use of MDM-provisioned certificate identities.

This is obviously a great implementation of credential sandboxing that severely limits third-party ability to access secrets for a developer's apps. However, because there is no centralised control within MDM (or elsewhere) to allow or whitelist identities to other apps or groups, provisioned identities are only available within the MDM profile itself (802.1X or VPN), or to Apple applications (Safari & Mail). On iOS, this allows for some powerful Mutual TLS (mTLS) flows using MDA ACME certificates within Safari; however, the lack of a Certificate Preference payload on iOS means the ease and enforcement of mTLS browser flows is limited. Additionally, apps such as Chrome, Edge, Brave, or managed browser contexts or application-embedded WebViews have no access to MDM-provisioned certificate identities, which further limits mTLS usage.

macOS is currently an even more conservative story. Presumably due to differences in keychain implementations compared to iOS (or as a deliberate design decision), a fully issued MDA certificate is almost opaque to the OS itself and doesn't appear in Keychain Access or even via the security command. MDM can get the full certificate via the CertificateList command, and a client can get limited information via the /usr/libexec/mdmclient QueryCertificates command, but for the most part, can't easily access the public or private keys. Additionally, because the certificate exists outside the appropriate keychain access groups (or isn't available at all through public APIs), the identity isn't available within Safari for mTLS use, nor to any third-party applications.

I've been doing a fair bit of work lately on ZTNA platform implementation with Apple endpoints. In doing so, we are looking at user and device posture as part of an ABAC model for resource authorisation, and in some cases, the gauntlet we are asking a subject to run & conquer is significant (device compliance baselines, running service binaries, code signing validation, etc.). If a client such as Cloudflare's could access and validate an MDA certificate from your corporate PKI as part of a device posture assessment, wouldn't you say that (paired with other challenges) it would represent very strong evidence of an authorised device in an understood state?

Additionally, mTLS is a great use case for MDA-verified certificates which works today on iOS but not on macOS. Built-in support for mTLS in cloud platforms such as Amazon Web Services, Azure, Cloudflare ZTNA & many more often makes it simple to require mutual authentication and encryption for internet services without traditional VPNs. In an increasingly mobile world, mutual TLS makes natural sense as a strong security control - either for standalone services & APIs or as part of a more mature zero-trust strategy. Cloudflare has a great rundown on Mutual TLS which is worth a read if you aren't familiar with its use.

This is pure conjecture, but perhaps we could see a future implementation across all Apple platforms similar to the whitelisting of System Extensions for hardware-bound, attested ACME identities - one where specific team or bundle identifiers are whitelisted to allow them access through the SecItem API. I think device identity is going to become a big enough deal that a mechanism like this (or something else like a developer entitlement) makes natural sense.

Device UDIDs

Another difference in behaviour that I have observed relates to device UDIDs and how they are reported and used on iOS vs macOS (and then Apple Silicon vs Intel with T2). This is undoubtedly expected behaviour; however, I'm yet to find any definitive information or documentation, and as you'll see, these differences make usage of UDIDs more difficult on macOS when paired with external tools.

On iOS, the UDID returned as part of a device attestation is the standard device UDID reported to MDM. It can be easily searched and matched using most MDM APIs and is a ubiquitous identifier that is consistent across many platforms and apps referencing iOS devices and their posture.

On macOS, this is a different story. On Apple Silicon Macs, the permanent identifier returned via a device attestation is the Provisioning UDID viewable in a Hardware Overview in System Report. On attestations for Intel Macs with a T2 chip, the returned permanent identifier is a different 24-character UDID in the same format as the Provisioning UDID - however it doesn't match up or show in any system profile or hardware dumps that I can locate. Additionally, on these systems, System Report shows the Provisioning UDID and Hardware UUID as identical.

Here's an example of what this looks like on an Apple Silicon MacBook Pro vs an Intel MacBook Air:

(Screenshot: the Provisioning UDID as shown in System Report's Hardware Overview on an Apple Silicon MacBook Pro vs an Intel MacBook Air.)

Documentation is sparse around the Provisioning UDID on the Mac platform, so I'm inferring a lot from my lab testing. I'm familiar with its use within developer ecosystems but can't find good references on its use in platform security. If anyone has anything further to offer on this, please reach out as I'd love to better understand this and update this article with something more definitive.

In any case, these differences make the UDID hard to rely on for Macs when trying to match an identifier in any MDM or separate system. In the next section, I'll look a little more at why you may want to do any matching at all.

Gatekeeping device-attest-01 in production

device-attest-01 is an interesting challenge. Paired with manufacturer attestation (and the ACME server's validation of an attestation chain), it can provide powerful evidence of a device's genuine identity, but neither Apple nor your PKI has any direct visibility as to that device's connection to your organisation or to any particular user.

An appendix item in the device-attest-01 draft details a concept called External Account Binding as a control to gatekeep authorised devices during the ACME account creation and order process, however at the current time, this is not supported by Apple's ACME client.

Consider this in production. Perhaps your users are already utilising a mature ZTNA, SASE or cloud architecture and you consider it trivial to protect access to an HTTPS endpoint - awesome! For most deployments, users will be distributed and enrollment into PKI may be a postured prerequisite to the aforementioned secure connectivity pathways (if they exist at all), so it's natural to assume that many ACME CAs and provisioners will end up internet accessible. The issue with this is that presently, as long as an Apple device can attest itself as any piece of genuine Apple hardware, it can enroll in your enterprise PKI - this is obviously not workable for most organisations.

Controls need to be put in place to ensure that only authorised devices can obtain signed ACME certificates from your PKI. In the future, better MDM integration with ACME (particularly for actual MDM enrollment) may facilitate these controls naturally; however, additional ACME payloads may still be required for other services (802.1X, VPN, etc), just as they are today with SCEP and MDMs such as Jamf.

For now, Smallstep suggests some security through obscurity by randomising ACME provisioner names and blocking the enumeration of provisioners when step-ca is internet-facing. In addition, they outline and recommend the use of webhooks to provide decisioning and enrichment of all of their certificate types. I respect their position here, as they are limited by the constraints of both device-attest-01 and Apple's ACME client implementation, but there has to be a better way. It's worth noting that Smallstep's commercial products are excellent and have these authorisation controls built in.

I've attempted to fill a hole here with a new open-source middleware for step-ca called step-posture-connector. It utilises the webhook functionality in step-ca to provide device identifier authorisation as well as the ability to enrich resultant X.509 certificates with user data. Today it supports flat files (JSON & CSV) and Jamf as providers of device, group & user information, but if it proves helpful, its vision would be to expand to additional MDM providers (Intune, Kandji, Mosyle, etc.) as well as potentially other sources such as asset management and ITSM platforms.

If you are considering rolling production ACME flows, or even just testing using step-ca, it's worth a look, and I'd love to hear feedback or see contributions from the community if you feel there is value here.

Mutability of attested properties

When attesting devices, Apple includes some properties that are considered immutable (hardware serial number & UDID) and some that are mutable, such as OS versions that are subject to change as a device updates.

To this point, this article has focused on device attestation via ACME Device Management payloads; however, another option exists for MDM itself, via a direct DeviceInformation query command. The idea here is that an MDM can ask for a refreshed attestation statement from Apple on a device's identity & properties, effectively getting a signed statement of a device's component versions & other properties independent of its own checks (via binary or scripting or whatever other method it normally uses). I think this will be broadly used in the future to allow MDM to verify and validate important device security parameters. The only current caveat is that Apple's servers limit fresh device attestations to one every 7 days, so the cadence of use must be considered, and this shouldn't be relied upon for quick, punchy workflows where you need to see an attested change of these mutable properties.


Announcements at WWDC 2023 discussed some additional properties coming to attestations on the Mac, including SIP & Secure Boot status, Kernel Extension allowance, and specific Secure Enclave enrollment IDs, so the assumption is that as time goes on, Apple will continue to add select properties to attestations in DeviceInformation queries; most of which will likely be some form of mutable value (versions, status booleans, or other descriptors). It's a good flexible mechanism that allows Apple to provide access to new feature sets over time using a very consistent approach that can be supported by MDM providers without much additional effort.

It will be interesting to see how MDM providers and other security tech use and expose these properties, and it's definitely worth knowing about them as you consider how you might take advantage of attestation on Apple platforms.

Sending your own ACME payloads via MDM

At the present time, there is limited support for the required payload (com.apple.security.acme) within the GUI of major MDM platforms (although it's there in Mosyle).

If you wish to test ACME payloads, you may need to author your own .mobileconfig plist and upload it into your MDM. I've included an example below that can be tailored to your needs:
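
A minimal payload might look something like the following sketch (treat this as illustrative only - the UUIDs, identifiers and values here are placeholders for your own environment, and you should check the key set against Apple's current ACME Certificate payload documentation):

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>PayloadContent</key>
    <array>
        <dict>
            <key>PayloadType</key>
            <string>com.apple.security.acme</string>
            <key>PayloadIdentifier</key>
            <string>cc.credibly.acme.example</string>
            <key>PayloadUUID</key>
            <string>5A1B2C3D-0000-0000-0000-000000000001</string>
            <key>PayloadVersion</key>
            <integer>1</integer>
            <key>PayloadDisplayName</key>
            <string>ACME Device Attestation Certificate</string>
            <!-- the directory URL of your ACME provisioner -->
            <key>DirectoryURL</key>
            <string>https://pki.credibly.cc/acme/mda/directory</string>
            <!-- must match a permanent identifier the device can attest (serial or UDID) -->
            <key>ClientIdentifier</key>
            <string>DMH9ZK607VP8</string>
            <key>Subject</key>
            <array>
                <array>
                    <array>
                        <string>CN</string>
                        <string>DMH9ZK607VP8</string>
                    </array>
                </array>
            </array>
            <!-- request a hardware-bound key and an Apple attestation -->
            <key>HardwareBound</key>
            <true/>
            <key>Attest</key>
            <true/>
            <key>KeyType</key>
            <string>ECSECPrimeRandom</string>
            <key>KeySize</key>
            <integer>384</integer>
            <!-- digital signature usage plus TLS client authentication EKU -->
            <key>UsageFlags</key>
            <integer>1</integer>
            <key>ExtendedKeyUsage</key>
            <array>
                <string>1.3.6.1.5.5.7.3.2</string>
            </array>
        </dict>
    </array>
    <key>PayloadType</key>
    <string>Configuration</string>
    <key>PayloadIdentifier</key>
    <string>cc.credibly.acme.example.profile</string>
    <key>PayloadUUID</key>
    <string>5A1B2C3D-0000-0000-0000-000000000002</string>
    <key>PayloadVersion</key>
    <integer>1</integer>
    <key>PayloadDisplayName</key>
    <string>ACME Device Attestation</string>
</dict>
</plist>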

You'll need to modify the ClientIdentifier value if you wish to target a UDID as the device's permanent identifier - however make sure to note the platform differences outlined above. You'll be able to see issues with the attestation (such as non-matching permanent identifiers) in the step-ca logs, so that's a great place to start if you want to experiment.

Permanent identifiers & MDM enrollment types

Apple being a privacy-focused company, it wouldn't be a new Apple feature without some caveats around device anonymisation. For MDA, this comes in the form of differences when a device is User Enrolled into MDM; namely, the serial number and UDID are completely omitted from attestations when User Enrollment is used for a device.

This is obviously a design decision, and for the types of organisations likely wanting to implement the initial voicings of Apple device attestation, probably not a big deal. It's worth remembering that one of the features of Account Driven User Enrollment is ongoing authentication, and that, paired with a cryptographically attested device, could be a very effective security control in BYOD environments. During my research I've seen references to some other anonymised identifiers for devices and the Secure Enclave, so hopefully we will see some future updates in this space that provide better support for all enrollment types.

Ephemeral identities & the STAR dream on Apple platforms

If you'll allow, I'll close this out with a little bit of editorial on where I believe cryptographic device attestation on Apple platforms could lead us.

The concept of STAR (Short-Term, Automatically Renewed) speaks to a world beyond traditional revocation methods: one where short-lived, ephemeral certificates are issued by PKI and require renewal or re-issue at extremely short intervals. By pairing short-lived certificates with the principle of least privilege, we massively limit our exposure window for compromised credentials. We see this in the DevOps world, where ephemeral containers and serverless architectures use short-lived certificates and mTLS connectivity in completely automated settings. If a key and its certificate only live for minutes or hours, do we need to worry about CRLs or OCSP in most environments?

At the present time, Apple's com.apple.security.acmeclient doesn't support certificate renewal and treats each order as a brand new one - including the generation of a new private key. Given that there are no real current content encryption use cases for ACME payloads (S/MIME, etc.), and the key is hardware-bound anyway, this probably doesn't really matter. For mTLS & 802.1X and VPN payloads, we aren't generally validating or pinning to specific public keys - just issuance from a trusted CA.

We haven't yet seen widespread adoption and push of MDA from the most popular MDM providers. I suspect this is rooted in the following (amongst other things):

  • SCEP's position as the technical core of many MDM enrollments today
  • Some of the platform inconsistencies discussed in this article
  • A little bit of arrested development stemming from Declarative Device Management and natural changes to how MDM will interact with devices in the future

When we do see better upfront adoption, hopefully we see some of the niceties baked in (such as automated renewals, etc.) that we are used to with SCEP profiles. I think it's natural that ACME will replace SCEP on Apple platforms over the next few years, but how quickly this happens remains to be seen.

Imagine a future where Declarative Device Management is leveraged to allow the automatic issuance and renewal of an ephemeral attested device identity at short intervals on Apple platforms (and the keychain access issue is solved, and MDM profile replacement/failure/rollback is consistent across platforms) - 🤩. Paired with modern zero-trust concepts, this would be an extremely effective technical security control.

I really hope that you've found this deep dive into a lesser-understood concept interesting, and I would love to hear from you if you have questions, comments, or disagreements.

I truly believe that the future is extremely bright for secure device identity on Apple platforms. I think manufacturer attestation is going to become a massive part of the enterprise security story, and I can't wait to see where the path leads.


Sources

The following is a list of sources that I've referenced and might be of interest to you if you'd like to continue learning about this topic:

]]>