As distributed computing spreads, so do heterogeneous environments in which compute can sit anywhere geographically. Network components, IoT devices, and edge systems are becoming more software-oriented and computationally capable so that they can serve more purposes. As a result, edge systems are becoming part of critical infrastructure: they can join a centralized Kubernetes cluster, host applications directly on a customer’s machine, and process critical data at the customer’s end.
These benefits come with challenges: how to trust remote devices, and how to be sure that only an authorized device joins the cluster, that only authorized applications are deployed, and that communication between the central and edge systems is authorized.
Device Identification
Device identification is usually handled by software-driven user authentication and authorization. But many systems today are designed for autonomous, unattended operation, so it is essential to verify that only trusted devices join the network. Otherwise, a malicious device can join the network and capture details of authenticated users or of the network, which it can use in later attacks. Likewise, malicious applications can spoof their identity, connect to applications in the central system, and capture sensitive data.
Attestation of devices and workloads addresses this problem by assigning each of them an identity. These identities can then be mutually validated to ensure that each party is who it claims to be. However, if attestation is purely software-based, it too can be spoofed.
In this article, we will see how we can use SPIRE to attest remote nodes and the workloads running on them. SPIRE supports TPM as a node attestor plugin, which means we can use hardware-based secure identities generated by the TPM for nodes. For workload attestation, we will use Envoy proxy, which connects to SPIRE agents to get identities for the workloads and sets up mutual authentication with other or central clusters.
Overview of TPM, SPIRE & Envoy
TPM
Trusted Platform Module (TPM) provides hardware-based security functions such as managing cryptographic keys, secure boot, key storage, and random number generation. It is a computer chip that can securely store artifacts used to authenticate the platform. It is a crypto-coprocessor residing on the motherboard that can be used for purposes like generating, storing, and limiting the use of cryptographic keys. The details are available in the TPM specification.
Envoy
Envoy is an open source service and edge proxy. It is designed in consideration of modern service-oriented architectures. It isolates the network management from the application. It can be used for things like load-balancing, routing, injecting network filters, observability, service discovery, TLS, etc.
SPIRE
SPIRE is an open source project that implements SPIFFE standards. SPIFFE is a set of open source standards for securely identifying software systems in distributed and heterogeneous environments. SPIRE does this by implementing node and workload attestation features, where it securely issues SPIFFE Verifiable Identity Documents (SVIDs) to nodes and applications and it can also verify SVIDs of other workloads.
SPIRE is based on a server and agent model. The server is responsible for issuing SVIDs and maintaining registration entries, i.e., the selectors and conditions for issuing an SVID. The agent runs on every node of the cluster and exposes an endpoint through which workloads can request an SVID. The agent validates every request against the selectors received from the server, and if all conditions are met it issues the SVID to the workload.
Node attestation and workload attestation
SPIRE requires that every agent authenticates and verifies itself when it first connects to the server. This process is called node attestation. The server uses preset conditions from the registration entry to validate that the connecting agent is the expected one. How this information is configured and collected on the agent side is determined by plugins in SPIRE.
SPIRE has node attestor plugins that run on both the server and the agent. The attestor checks whether the agent really possesses the information shared during registration. Once a node/agent is successfully attested, it receives a SPIFFE ID; this ID acts as the parent for all the workloads that run on that node.
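The parent relationship can be sketched in a few lines of Python. This is only an illustration, not SPIRE’s data model: the `Entry` class, the helper names, and the example IDs (including the agent path) are assumptions made up for this sketch. The only fixed part is the general shape of a SPIFFE ID, `spiffe://<trust domain>/<path>`.

```python
from dataclasses import dataclass
from typing import List, Tuple
from urllib.parse import urlsplit


def parse_spiffe_id(raw: str) -> Tuple[str, str]:
    """Split a SPIFFE ID into (trust_domain, path); raise ValueError if malformed."""
    parts = urlsplit(raw)
    if parts.scheme != "spiffe" or not parts.netloc:
        raise ValueError(f"not a SPIFFE ID: {raw!r}")
    return parts.netloc, parts.path


@dataclass(frozen=True)
class Entry:
    """Hypothetical, simplified registration entry."""
    spiffe_id: str  # identity to issue to the workload
    parent_id: str  # SPIFFE ID of the attested agent (node)


def entries_for_agent(entries: List[Entry], agent_id: str) -> List[Entry]:
    """Return the workload entries whose parent is the given agent."""
    return [e for e in entries if e.parent_id == agent_id]


# Example values only; real agent IDs are assigned by the node attestor.
AGENT_ID = "spiffe://example.org/spire/agent/tpm_devid/edge-01"
ENTRIES = [
    Entry("spiffe://example.org/workload/web", AGENT_ID),
    Entry("spiffe://example.org/workload/db",
          "spiffe://example.org/spire/agent/tpm_devid/edge-02"),
]
```

Calling `entries_for_agent(ENTRIES, AGENT_ID)` returns only the `web` entry, because the `db` entry is parented to a different node.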
Secure Device Identity
Secure Device Identity (DevID) is based on the IEEE 802.1AR standard. A DevID is cryptographically bound to a device and supports authentication of the device’s identity. Possession of a DevID allows a network-attached device to assert its identity in authentication protocols.
Each DevID comprises:
- A DevID secret that is the private key portion of the public-private key pair;
- A DevID certificate containing the corresponding public key and a subject name that identifies the device; and
- The certificate chain from the DevID certificate up to a trust anchor contained in the DevID trust anchor store available to potential authenticators.
IDevID (Initial Device Identifiers)
IDevIDs are created before the device is supplied to the customer, usually by the OEM. An IDevID credential is intended to be usable for the lifetime of the device.
LDevID (Locally Significant Device Identifier)
An LDevID is usually created by the customer when a device first joins the network or during the onboarding process of the device. An LDevID can make use of the IDevID secret, or the DevID module can generate a separate secret for each LDevID. LDevID certificates are not expected to be long-lived. They are expected to be removed when the device is zeroized or reaches its end of life.
TPM for DevID
As noted above, an IDevID is expected to be usable for the lifetime of the device and is usually provisioned by the supplier or OEM. Keeping the DevID secret in secure storage is therefore critical: if it is copied to another device, the identity no longer uniquely identifies a single device.
TPM is a secure Root of Trust for Storage (RTS). It protects private keys, preventing keys from one device from being used on another device or in another TPM. Private keys stored inside the TPM can never be taken out or decrypted. Only the TPM can use them for cryptographic operations.
TPM generates DevID credentials using keys. There are multiple ways to generate these credentials, depending on the manufacturing processes, applications, and user’s privacy requirements. TCG specification for TPM 2.0 Keys for Device Identity and Attestation describes this in detail.
However, if you want to try out this feature and generate DevID quickly, you can use this DevID provisioning tool. It is based on the enrolment protocol defined in the TCG specification.
SPIRE and TPM
As discussed above, SPIRE uses a plugin model to add node and workload attestors. It has a plugin for node attestation using TPM DevID. The plugin expects TPM 2.0 to be available on every node that needs to be attested, and it expects a DevID to be provisioned already.
Attestation process flow
The attestation process involves the SPIRE agent solving two challenges issued by the SPIRE server. As stated in the plugin docs, the challenges establish:
- Proof of possession: verifies that the node is in possession of the private key that corresponds to the DevID certificate. Additionally, the server verifies that the DevID certificate is rooted to a trusted set of CAs.
- Proof of residency: proves that the DevID key pair was generated and resides in a TPM. Additionally, the server verifies that the TPM is authentic by verifying that the endorsement certificate is rooted in a trusted set of manufacturer CAs.
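To make the idea of proof of possession concrete, here is a toy nonce-based challenge-response in Python. It is a sketch under loud assumptions, not the plugin’s protocol: it uses textbook RSA over two small Mersenne primes (insecure, chosen only so the example runs with the standard library), the function names are hypothetical, and it covers neither the certificate chain check nor proof of residency. In a real deployment the private exponent never leaves the TPM.

```python
import hashlib
import secrets

# Toy key pair built from two Mersenne primes. NOT secure; illustration only.
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q                           # public modulus (stands in for the DevID public key)
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (would live inside the TPM)


def _digest(nonce: bytes) -> int:
    """Hash the nonce and reduce it into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(nonce).digest(), "big") % N


def server_challenge() -> bytes:
    """Server side: issue a fresh random nonce so old responses cannot be replayed."""
    return secrets.token_bytes(32)


def agent_sign(nonce: bytes) -> int:
    """Agent side: sign the nonce with the private key."""
    return pow(_digest(nonce), D, N)


def server_verify(nonce: bytes, signature: int) -> bool:
    """Server side: check the signature using only the public key."""
    return pow(signature, E, N) == _digest(nonce)


nonce = server_challenge()
print(server_verify(nonce, agent_sign(nonce)))
```

A response signed for one nonce fails verification against a different nonce, which is what prevents a captured response from being replayed.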
Node Attestation workflow
SPIRE TPM plugin configuration example
Server Configuration
NodeAttestor "tpm_devid" {
    plugin_data {
        devid_ca_path = "/opt/spire/conf/server/devid-cacert.pem"
        endorsement_ca_path = "/opt/spire/conf/server/endorsement-cacert.pem"
    }
}
Agent Configuration
NodeAttestor "tpm_devid" {
    plugin_data {
        devid_cert_path = "/opt/spire/conf/agent/devid.crt.pem"
        devid_priv_path = "/opt/spire/conf/agent/devid.priv.blob"
        devid_pub_path = "/opt/spire/conf/agent/devid.pub.blob"
    }
}
Workload Attestation
In this process, the SPIRE agent alone is responsible for issuing SVIDs to workloads. Once node attestation is complete, the agent pulls the registration entries from the server and caches them. When a workload calls the Workload API exposed by the agent, the agent attests the workload by validating it against the cached registration entries and their selectors. If an entry matches and all of its selectors are satisfied, the agent issues the SVID to the workload.
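The matching step can be sketched as a subset check: an entry applies when every selector it lists is among the selectors discovered for the calling workload. The class, function, and selector values below are illustrative assumptions, not SPIRE’s actual types.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class RegistrationEntry:
    """Hypothetical, simplified view of a cached registration entry."""
    spiffe_id: str
    selectors: FrozenSet[str]


def match_entries(cached: Iterable[RegistrationEntry],
                  workload_selectors: Iterable[str]) -> List[str]:
    """Return the SPIFFE IDs of entries whose selectors are all satisfied."""
    discovered = frozenset(workload_selectors)
    return [e.spiffe_id for e in cached if e.selectors <= discovered]


# Example cache; the selector strings are sample values.
CACHE = [
    RegistrationEntry("spiffe://example.org/workload/web",
                      frozenset({"unix:uid:1000"})),
    RegistrationEntry("spiffe://example.org/workload/db",
                      frozenset({"unix:uid:1001", "unix:gid:2000"})),
]

# A process running as uid 1000 / gid 2000 satisfies only the first entry.
print(match_entries(CACHE, ["unix:uid:1000", "unix:gid:2000"]))
```

A workload that satisfies only some of an entry’s selectors does not match it, which is why the `db` entry is skipped here.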
In the case of distributed systems or remote devices that need to communicate with other remote nodes/clusters, we saw how TPM can be used to give hardware-based node identities that can be used for attestation. However, once a node is verified we also need to ensure that we are running trusted workloads and any communication between workloads is also secure.
mTLS with Envoy Proxy
To achieve secure communication we can use mutual TLS (mTLS). It ensures that the parties on both sides of a connection are who they claim to be: each side verifies the other’s certificate and confirms that the peer holds the matching private key. To add this to an application we can use a service proxy, which layers mTLS over the existing application without changes to the application code. Envoy is one such open source service proxy. It is used to provide secure and authenticated communication between services.
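What “mutual” means in TLS terms can be shown with Python’s standard `ssl` module. This is a minimal sketch of the settings a proxy enforces on the application’s behalf, not Envoy configuration; the function names and the optional file path parameters are assumptions for illustration. It also relies on ordinary certificate verification, whereas SPIFFE-aware peers validate the SPIFFE ID carried in the certificate, which this sketch does not cover.

```python
import ssl
from typing import Optional


def server_context(cert: Optional[str] = None, key: Optional[str] = None,
                   ca: Optional[str] = None) -> ssl.SSLContext:
    """Server side of mTLS: present a certificate AND require one from the client."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.verify_mode = ssl.CERT_REQUIRED  # plain TLS servers leave this at CERT_NONE
    if ca:
        ctx.load_verify_locations(cafile=ca)             # CAs trusted to sign clients
    if cert and key:
        ctx.load_cert_chain(certfile=cert, keyfile=key)  # server's own identity
    return ctx


def client_context(cert: Optional[str] = None, key: Optional[str] = None,
                   ca: Optional[str] = None) -> ssl.SSLContext:
    """Client side of mTLS: verify the server AND present a client certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)  # verifies the server by default
    if ca:
        ctx.load_verify_locations(cafile=ca)             # CAs trusted to sign servers
    if cert and key:
        ctx.load_cert_chain(certfile=cert, keyfile=key)  # client's own identity
    return ctx
```

The only difference from one-way TLS is on the server: setting `verify_mode` to `CERT_REQUIRED` makes the handshake fail for clients that cannot present a certificate signed by a trusted CA.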
Envoy has a rich extensible configuration system. SDS (Secret Discovery Service) is one such extension to provide secrets and certificates to Envoy from remote endpoints. An SDS server can push certificates to Envoy instances.
On the expiration of certificates, the SDS server will push new certificates to Envoy instances and they can start using those without any restarts.
SPIRE agents can be configured as SDS providers. An agent can provide the key material that Envoy instances require for TLS authentication, i.e., X.509 SVIDs and CA certificates. It can also perform all SDS functions, such as key rotation and certificate generation.
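Rotation before expiry comes down to a timing decision, sketched below. The policy shown, rotating once a fixed fraction of the certificate’s lifetime has elapsed, and the default of one half are assumptions chosen for illustration; the actual threshold is an implementation detail of the SDS provider.

```python
from datetime import datetime, timedelta, timezone


def should_rotate(not_before: datetime, not_after: datetime,
                  now: datetime, fraction: float = 0.5) -> bool:
    """Return True once `fraction` of the certificate lifetime has elapsed."""
    lifetime = not_after - not_before
    return now >= not_before + lifetime * fraction


# Example: a certificate valid for one hour.
issued = datetime(2024, 1, 1, 0, 0, tzinfo=timezone.utc)
expires = issued + timedelta(hours=1)

print(should_rotate(issued, expires, issued + timedelta(minutes=20)))
print(should_rotate(issued, expires, issued + timedelta(minutes=45)))
```

Rotating well before `not_after` leaves time to deliver the new certificate, so connections never have to fall back to an expired one.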
The diagram below depicts what the setup will look like when using node attestation with TPM and secure workload communication using mTLS with Envoy. This setup will allow high trust in onboarding new remote devices as well as communicating with applications running on remote devices.
To try out Envoy with SPIRE, you can follow this example setup from SPIRE docs.
TPM with secure workload communication using Envoy
Summary
In this post, we saw how we can handle some of the challenges in distributed and edge computing with the use of TPM, SPIRE, and Envoy. Paired together, these three technologies form a base for addressing the secret zero problem. With DevIDs held in a TPM, we can validate the identities of devices deployed at remote locations. The use of mTLS with Envoy and SPIRE adds a second layer of protection: we know which workload we are talking to, and our central servers are not exposed to anonymous workloads.
I hope you found this post informative and engaging. I’d love to hear your thoughts on this post, so start a conversation on Twitter or LinkedIn :) Stay tuned for more related blog posts in the future.