Survey data makes it pretty clear that many people are showing strong interest in using a service mesh in their projects, and many are already running one in production. Nearly 69% are evaluating Istio, and 64% are evaluating Linkerd. You might not know that Linkerd was the first service mesh on the market, but Istio made service meshes more popular. Both projects are cutting edge and very competitive, which makes choosing between them tough. In this blog post, we will learn about the Istio and Linkerd architectures, their moving parts, and compare their offerings to help you make an informed decision.
Introduction to Service Mesh
Over the past few years, microservices architecture has become a popular style for designing software applications. In this architecture, we break down the application into independently deployable services. The services are usually lightweight, polyglot in nature, and often managed by various functional teams. This style works well until the number of services grows to the point where they become difficult to manage and are no longer simple. This leads to challenges in managing various aspects like security, network traffic control, and observability. A service mesh helps address these challenges.
The term service mesh is used to describe the network of microservices that make up such applications and the interactions between them. As the number of services grows in size and complexity, the network becomes harder to scale and manage. A service mesh typically offers service discovery, load balancing, failure recovery, metrics, and monitoring. It often also covers more complex operational requirements, like A/B testing, canary rollouts, rate limiting, access control, and end-to-end authentication. A service mesh provides an easy way to create a network of services with load balancing, service-to-service authentication, monitoring, and more, with few or no changes to service code.
Let’s go through the architecture of Istio and Linkerd. Note that both projects are evolving fast and this article is based on Istio version 1.6 and Linkerd version 2.7.
Istio is an open-source platform that provides a complete service mesh solution, offering a uniform way to secure, connect, and monitor microservices. It is backed by industry leaders like IBM, Google, and Lyft. Istio is one of the most popular and complete solutions, with advanced offerings suitable for enterprises of all sizes. It is a first-class citizen of Kubernetes and is designed as a modular, platform-independent system. For a quick demo of Istio, please refer to our previous post.
An Istio mesh is logically split into a data plane and a control plane.
Data plane – composed of proxies (Envoy) deployed as sidecars. These proxies mediate and control all network communication between microservices and also collect telemetry on all mesh traffic.
Control plane – manages and configures the proxies to route traffic.
Istio Architecture Source: istio.io
Envoy is a high-performance proxy developed at Lyft in C++, which mediates all inbound and outbound traffic for all services in the service mesh. It is deployed as a sidecar proxy alongside the service.
Envoy provides the following features:
Dynamic service discovery
HTTP/2 and gRPC proxies
Staged rollouts with percentage-based traffic split
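To make the percentage-based traffic split more concrete, here is a minimal sketch in Python of weighted backend selection. It only illustrates the general idea and is not how Envoy implements it; the backend names `reviews-v1` and `reviews-v2` and the helper `pick_backend` are hypothetical.

```python
import random
from collections import Counter


def pick_backend(weights, rng=random):
    """Pick a backend with probability proportional to its weight.

    `weights` maps a (hypothetical) backend name to a non-negative weight,
    for example a percentage of traffic.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    point = rng.random() * total  # uniform in [0, total)
    cumulative = 0
    for name, weight in weights.items():
        cumulative += weight
        if point < cumulative:
            return name
    return name  # fallback for floating-point edge cases


# Hypothetical canary rollout: about 90% of requests to v1, 10% to v2.
split = {"reviews-v1": 90, "reviews-v2": 10}
rng = random.Random(42)  # seeded only to make the demo repeatable
counts = Counter(pick_backend(split, rng) for _ in range(10_000))
print(counts)
```

In a staged rollout, the weights would be shifted gradually (for example 90/10, then 50/50, then 0/100) as confidence in the new version grows.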
Pilot provides service discovery for the sidecar proxies, traffic management capabilities, and resiliency. It converts high-level routing rules that control traffic behaviour into Envoy-specific configurations.
Citadel enables strong service-to-service and end-user authentications with built-in identity and credential management. It can enable authorization and zero-trust security in the mesh.
Galley is Istio's configuration validation, ingestion, processing, and distribution component.
Traffic Management – Intelligent traffic routing rules, flow control, and management of service-level properties like circuit breakers, timeouts, and retries. It lets us easily set up A/B testing, canary rollouts, and staged rollouts with percentage-based traffic splits.
Security – Provides secure communication channels between services and manages authentication, authorization, and encryption at scale.
Observability – Robust tracing, monitoring, and logging features provide deep insights and visibility. This helps detect and resolve issues efficiently.
Istio also has add-on infrastructure services that support the monitoring of microservices. Istio integrates with applications like Prometheus, Grafana, Jaeger, and the service mesh dashboard Kiali.
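The circuit breakers mentioned under Traffic Management follow a general pattern that can be sketched in a few lines of Python. A mesh applies this kind of logic in the sidecar proxy rather than in application code, so treat this as an illustration of the pattern under simple assumptions, not as Istio's or Envoy's implementation. The names `CircuitBreaker`, `CircuitOpenError`, `max_failures`, and `reset_after` are hypothetical.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures, then allows a
    trial call once `reset_after` seconds have passed."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open: call rejected")
            # Cool-down elapsed: let this call through as a trial.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Failing fast while the circuit is open gives a struggling upstream service time to recover instead of piling more requests onto it.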
Linkerd is an open-source, ultralight service mesh designed for Kubernetes by Buoyant. It was rewritten for version 2, with the data plane proxy written in Rust, to make it ultralight and performant. It provides runtime debugging, observability, reliability, and security without requiring code changes in your distributed application.
Linkerd has three components – a UI, a data plane, and a control plane. It works by installing lightweight transparent proxies next to each service instance.
The control plane is a set of services that provide the core functionality of the mesh. It aggregates telemetry data, provides a user-facing API, and provides control data to the data plane proxies. Below are the components of the control plane.
Controller – It consists of a public API container that provides an API for CLI and Dashboard.
Destination – Each proxy in the data plane queries this component to look up where to send requests. It holds the service profile information used for per-route metrics, retries, and timeouts.
Identity – It provides a Certificate Authority that accepts CSRs from proxies and returns certificates signed with the correct identity. It provides mTLS functionality.
Proxy Injector – It is an admission controller that looks for the annotation (linkerd.io/inject: enabled) and mutates the pod specification to add both an initContainer and a sidecar containing the proxy itself.
Service Profile Validator – It is also an admission controller; it validates new service profiles before they are saved.
Tap – It receives requests from the CLI or dashboard to watch requests and responses in real-time to provide observability in the applications.
Web – It provides a web dashboard.
Grafana – Linkerd provides out-of-the-box dashboards through Grafana.
Prometheus – It collects and stores all Linkerd metrics by scraping each proxy's /metrics endpoint on port 4191.
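The Proxy Injector behaviour described above can be sketched as a plain Python function working on a pod represented as a dictionary. This is a simplified illustration, not Linkerd's code: a real admission webhook responds with a patch rather than a whole object, and the container names and images used here are hypothetical placeholders.

```python
import copy

INJECT_ANNOTATION = "linkerd.io/inject"


def inject_proxy(pod):
    """Return a copy of `pod` with an init container and a proxy sidecar
    added when the pod is annotated with `linkerd.io/inject: enabled`."""
    annotations = (pod.get("metadata") or {}).get("annotations") or {}
    if annotations.get(INJECT_ANNOTATION) != "enabled":
        return pod  # not opted in: leave the pod untouched

    mutated = copy.deepcopy(pod)
    spec = mutated.setdefault("spec", {})
    # Hypothetical names and images, for illustration only.
    spec.setdefault("initContainers", []).append(
        {"name": "example-proxy-init", "image": "example.invalid/proxy-init"}
    )
    spec.setdefault("containers", []).append(
        {"name": "example-proxy", "image": "example.invalid/proxy"}
    )
    return mutated


pod = {
    "metadata": {"annotations": {"linkerd.io/inject": "enabled"}},
    "spec": {"containers": [{"name": "app"}]},
}
injected = inject_proxy(pod)
print([c["name"] for c in injected["spec"]["containers"]])
```

Because the mutation happens at admission time, the application's own manifest does not need to mention the proxy at all; the annotation is the only opt-in.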
The Linkerd data plane consists of lightweight proxies that are deployed as sidecar containers alongside each instance of the service container. The proxy is injected during the initialization phase of any pod that carries the specific annotation (see Proxy Injector above). The proxy has been very lightweight and performant since 2.x, when it was completely rewritten in Rust. These proxies intercept communication to and from each pod to provide instrumentation and encryption (TLS) without any change in application code. The proxy offers the following features:
Transparent, zero-config proxying for HTTP, HTTP/2, and arbitrary TCP protocols.
Automatic Prometheus metrics export for HTTP and TCP traffic.
Transparent, zero-config WebSocket proxying.
Automatic, latency-aware, layer-7 load balancing.
Automatic layer-4 load balancing for non-HTTP traffic.
An on-demand diagnostic tap API.
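As a rough illustration of the latency-aware load balancing in the list above, the following Python sketch keeps an exponentially weighted moving average (EWMA) of observed latency per endpoint and routes to the endpoint with the lowest average. It is a deliberately simplified sketch, not Linkerd's actual balancer; the class name `EwmaBalancer` and the smoothing factor `alpha` are assumptions made for the example.

```python
class EwmaBalancer:
    """Route to the endpoint with the lowest EWMA of observed latency."""

    def __init__(self, endpoints, alpha=0.3):
        if not endpoints:
            raise ValueError("at least one endpoint is required")
        self.alpha = alpha  # weight given to the newest sample
        self.latency = {endpoint: None for endpoint in endpoints}

    def observe(self, endpoint, latency_ms):
        """Fold a new latency sample into the endpoint's moving average."""
        previous = self.latency[endpoint]
        if previous is None:
            self.latency[endpoint] = float(latency_ms)
        else:
            self.latency[endpoint] = (
                self.alpha * latency_ms + (1 - self.alpha) * previous
            )

    def pick(self):
        """Pick the endpoint with the lowest average; untried ones go first."""
        def score(endpoint):
            value = self.latency[endpoint]
            return 0.0 if value is None else value

        return min(self.latency, key=score)


balancer = EwmaBalancer(["pod-a", "pod-b"], alpha=0.5)
balancer.observe("pod-a", 100)
balancer.observe("pod-b", 20)
print(balancer.pick())
```

Because recent samples carry more weight than old ones, an endpoint that suddenly slows down quickly loses traffic and regains it once its latency recovers.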
Comparison: Istio vs Linkerd
|Feature||Istio||Linkerd|
|Ease of Installation||Can be overwhelming for teams due to various configuration options and flexibility||Relatively easier to adopt due to opinionated and out-of-the-box configuration|
|Supported Protocols||gRPC, HTTP/2, HTTP/1.x, Websockets, and all TCP traffic||gRPC, HTTP/2, HTTP/1.x, Websockets, and all TCP traffic|
|Ingress Controller||Envoy, Istio gateway itself||Any – Linkerd doesn’t provide ingress capability by itself|
|Multi-Cluster Mesh and Expansion Support||Multi-cluster deployment is supported in the stable release with various configuration options, and the mesh can be extended outside Kubernetes clusters||Multi-cluster deployment is experimental as of release 2.7. As of the latest release, 2.8, multi-cluster deployment is stable.|
|Service Mesh Interface (SMI) Compatibility||Through third party CRD||Native for traffic splitting and metrics, not for traffic access control|
|Tracing Support||Jaeger, Zipkin||All backends supporting OpenCensus|
|Routing Features||Various load balancing algorithms (Round-Robin, Random, Least Connection), supports percentage-based traffic splits, supports header- and path-based traffic splits||Supports EWMA (exponentially weighted moving average) load balancing algorithm, supports percentage-based traffic split through SMI|
|Resilience||Circuit breaking, Retries and Timeouts, fault-injection, delay injection||No circuit breaking and no delay injection support|
|Security||mTLS support for all protocols, external CA certificate/Key is possible, Supports authorization rules.||mTLS supported except for TCP, external CA/key is possible but no support for authorization rules yet|
|Performance||With the recent 1.6 release, Istio's resource footprint and latency have improved.||Linkerd is designed to be very light; according to some third-party benchmarks, it is approximately 3-5x faster than Istio.|
|Enterprise Support||Not available for the OSS version. If you are using Google's GKE with Istio, or Red Hat OpenShift with Istio as a service mesh, you may get support from the respective vendors||Full enterprise-class engineering, support, and training available from Buoyant, which developed the OSS version of Linkerd|
Service meshes are becoming an essential building block in cloud-native solutions and in microservice architectures. A service mesh does the heavy lifting for jobs like traffic management, resiliency, and observability, and frees developers to focus on business logic. Istio and Linkerd are both mature and are being used in production by various enterprises. Planning and analysis of your requirements are essential in picking which service mesh to use. Please spend sufficient time on the analysis phase, because it is complex to move from one mesh to the other later in the game.
Comparing two technologies with such depth and breadth, and with ever-evolving features, is not possible in a single article. Also, when choosing a technology as complex and as critical as a service mesh, the context in which it will be used matters far more than the technology alone. Without that context, it is hard to say A is better than B, because the answer really is: it depends. I loved the simplicity of Linkerd, both when getting started and later when managing the service mesh. Linkerd has also been hardened over the years by users from enterprise companies. Some features may seem attractive in one mesh, but you should check whether the other has that feature planned for the near future, and make an informed decision based not just on a theoretical evaluation but on trying both out in a proof-of-concept sandbox. This proof of concept should focus on ease of use, feature match, and, more importantly, the operational aspects of the technology. It is relatively easy to introduce a technology; the hard and long effort is spent running and managing it through its lifecycle.
Please let us know your thoughts and comments.
Architecture diagrams sourced from the documentation of Istio and Linkerd.