This blog post was updated on 09-March-2021.
The 2020 CNCF annual survey makes it clear that many people are highly interested in using a service mesh in their projects, and many already run one in production. Nearly 69% are evaluating Istio, and 64% are evaluating Linkerd. Both projects are cutting edge and very competitive, which makes it tough to choose one. In this blog post, we will learn about the Istio and Linkerd architectures and their moving parts, and compare Linkerd vs Istio offerings to help you make an informed decision.
What is a Service Mesh?
Over the past few years, microservices architecture has become an increasingly popular style for designing software applications. In this architecture, we break the application down into independently deployable services. The services are usually lightweight, polyglot in nature, and often managed by various functional teams. This style works well up to the point where the number of services grows so large that they become difficult to manage and are no longer simple. This leads to challenges in managing aspects like security, network traffic control, and observability. A service mesh helps address these challenges.
The term service mesh is used to describe the network of microservices that make up such applications and the interactions between them. As the number of services grows in size and complexity, this network becomes harder to scale and manage. A service mesh typically offers service discovery, load balancing, failure recovery, metrics, and monitoring. It also often handles more complex operational requirements, like A/B testing, canary rollouts, rate limiting, access control, and end-to-end authentication. A service mesh provides an easy way to create a network of services with load balancing, service-to-service authentication, monitoring, and more, with few or no changes to service code.
Let’s go through the architecture of Istio and Linkerd. Note that both projects are evolving fast and this article is based on Istio version 1.9.1 and Linkerd version 2.10.
What is Istio?
Istio is an open-source service mesh platform that provides a uniform way to secure, connect, and monitor microservices. It is backed by industry leaders like IBM, Google, and Lyft. Istio is one of the most popular solutions, with advanced offerings suitable for enterprises of all sizes. It is a first-class citizen of Kubernetes and is designed as a modular, platform-independent system. For a quick demo of Istio, please refer to our previous post.
Let’s look at Istio Architecture
An Istio mesh is logically split into a data plane and a control plane.
- The data plane is composed of proxies (Envoy) deployed as sidecars. These proxies mediate and control all network communication between microservices and also collect telemetry on all mesh traffic.
- The control plane manages and configures the proxies to route traffic.
What is Envoy?
Envoy is a high-performance proxy developed at Lyft in C++, which mediates all inbound and outbound traffic for all services in the service mesh. It is deployed as a sidecar proxy alongside the service.
Envoy provides the following features:
- Dynamic service discovery
- Load balancing
- TLS termination
- HTTP/2 and gRPC proxies
- Circuit breakers
- Health checks
- Staged rollouts with percentage-based traffic split
- Fault injection
- Rich metrics
- Pluggable extensions model based on WebAssembly that allows for custom policy enforcement and telemetry generation for mesh traffic.
In newer versions of Istio, the sidecar proxy has taken over the responsibilities that Mixer used to have. In previous releases of Istio (<1.6), Mixer was used to collect telemetry information from the mesh.
Istiod provides service discovery, configuration, and certificate management. It includes Pilot, Citadel, and Galley.
Pilot provides service discovery for the sidecar proxies, traffic management capabilities, and resiliency. It converts high-level routing rules that control traffic behavior into Envoy-specific configurations.
Citadel enables strong service-to-service and end-user authentication with built-in identity and credential management. It can enable authorization and zero-trust security in the mesh.
Galley is Istio’s configuration validation, ingestion, processing, and distribution component.
Traffic Management: Intelligent traffic routing rules, flow control, and management of service-level properties like circuit breakers, timeouts, and retries. It lets us easily set up A/B testing, canary rollouts, and staged rollouts with percentage-based traffic splits.
Security: Provides secure communication channels between services and manages authentication, authorization, and encryption at scale.
Observability: Robust tracing, monitoring, and logging features provide deep insights and visibility, which helps in detecting and resolving issues efficiently. Istio also has add-on infrastructure services that support the monitoring of microservices. Istio integrates with applications such as Prometheus, Grafana, Jaeger, and the service mesh dashboard Kiali.
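To make the percentage-based traffic split mentioned under Traffic Management more concrete, here is a minimal Python sketch of the idea. It is not Istio or Envoy code; the backend names `reviews-v1` and `reviews-v2`, the 90/10 weights, and the `pick_backend` helper are all hypothetical and only illustrate how a canary rollout sends a small share of requests to a new version.

```python
import random
from collections import Counter


def pick_backend(weights, rng):
    """Pick one backend name with probability proportional to its weight."""
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical canary rollout: roughly 90% of requests to v1, 10% to v2.
split = {"reviews-v1": 90, "reviews-v2": 10}

# A seeded generator keeps the simulation repeatable.
rng = random.Random(42)

# Simulate 10,000 requests and count where each one was routed.
counts = Counter(pick_backend(split, rng) for _ in range(10_000))
```

In a real mesh the weights live in routing configuration rather than application code, so shifting traffic from 10% to 50% is a configuration change and not a redeploy.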
What is Linkerd?
Linkerd is an open-source, lightweight service mesh designed for Kubernetes by Buoyant. Linkerd 1.x ran on the JVM (it was written in Scala); the proxy was then rewritten completely in Rust to make it ultralight and performant. Similar to other service meshes, it gives you runtime debugging, observability, reliability, and security without requiring code changes in your distributed application.
Let’s look at Linkerd architecture
Linkerd has three components: a UI, a data plane, and a control plane. It works by installing lightweight transparent proxies next to each service instance.
The control plane is a set of services that provide the core functionality of the mesh. It aggregates telemetry data, provides a user-facing API, and provides control data to the data plane proxies. Below are the components of the control plane.
Controller: It consists of a public API container that provides an API for CLI and Dashboard.
Destination: Each proxy in the data plane looks into this component to look up where to send the request. It has the service profile information used for per-route metrics, retries, and timeouts.
Identity: It provides a Certificate Authority that accepts CSRs from proxies and returns certificates signed with the correct identity. It provides mTLS functionality.
Proxy Injector: It is an admission controller which looks for the annotation linkerd.io/inject: enabled and mutates the pod specification to add both an initContainer and a sidecar containing the proxy itself.
Service Profile Validator: It is also an admission controller that validates the new service profiles before they are saved.
Tap: It receives requests from the CLI or dashboard to watch requests and responses in real-time to provide observability in the applications.
Web: It provides a web dashboard.
Grafana: Linkerd provides out-of-the-box dashboards through Grafana.
Prometheus: It collects and stores all Linkerd metrics by scraping the proxy’s /metrics endpoint on port 4191. The metrics are scraped every 10 seconds.
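The Proxy Injector behaviour described above can be sketched in a few lines of Python. This is a simplified model of the mutation step only, not Linkerd's actual admission webhook: the pod is a plain dictionary, the `inject_proxy` function is hypothetical, and the image names are placeholders.

```python
import copy

INJECT_ANNOTATION = "linkerd.io/inject"


def inject_proxy(pod):
    """Return a mutated copy of the pod when injection is enabled.

    Pods without the annotation set to "enabled" are returned unchanged.
    """
    annotations = pod.get("metadata", {}).get("annotations", {})
    if annotations.get(INJECT_ANNOTATION) != "enabled":
        return pod

    mutated = copy.deepcopy(pod)  # never modify the caller's object
    spec = mutated.setdefault("spec", {})

    # The init container sets up traffic redirection before the app starts.
    spec.setdefault("initContainers", []).append(
        {"name": "linkerd-init", "image": "example/linkerd-init"}  # placeholder image
    )
    # The sidecar container runs the proxy next to the application.
    spec.setdefault("containers", []).append(
        {"name": "linkerd-proxy", "image": "example/linkerd-proxy"}  # placeholder image
    )
    return mutated
```

The point of the sketch is that the application's own container definition is left untouched; the mesh is added around it at admission time.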
The Linkerd data plane consists of the lightweight proxies which are deployed as sidecar containers with each instance of the service container. The proxy is injected during the initialization phase of the pod which has the specific annotation (see Proxy Injector above).
The proxy has been very lightweight and performant since 2.x, when it was completely rewritten in Rust. These proxies intercept communication to and from each Pod to provide instrumentation and encryption (TLS) without any change in application code.
- Transparent, zero-config proxying for HTTP, HTTP/2, and arbitrary TCP protocols.
- Automatic Prometheus metrics export for HTTP and TCP traffic.
- Transparent, zero-config WebSocket proxying.
- Automatic, latency-aware, layer-7 load balancing.
- Automatic layer-4 load balancing for non-HTTP traffic.
- Automatic TLS.
- An on-demand diagnostic tap API.
The proxy supports service discovery via DNS and destination gRPC API.
To make its operation truly transparent, Linkerd uses linkerd-init containers that run before every other container in each Kubernetes pod where the Linkerd sidecar is configured. This init container executes iptables and configures the flow of traffic.
There are two main rules that iptables applies:
- Any traffic that is sent to the Pod’s external IP address is forwarded to a specific port on the proxy (4143). Using the SO_ORIGINAL_DST socket option, the proxy is able to forward the traffic to the original destination port that the application is listening on.
- Any traffic that originates from the Pod and is sent to an external IP address is forwarded to a specific port on the proxy (4140). Because SO_ORIGINAL_DST is available on the socket, the proxy is able to forward the traffic to the original recipient. This avoids a traffic loop because the iptables rules explicitly skip the proxy’s UID.
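The two redirect rules can be modelled as a small decision function. The following Python sketch is only an illustration of the logic, not the real iptables configuration: the `redirect_port` function is hypothetical, the `PROXY_UID` value is an assumption chosen for the example, and treating loopback traffic as not redirected is an assumption derived from the rules applying to external IP addresses.

```python
import ipaddress

INBOUND_PROXY_PORT = 4143   # from the first rule above
OUTBOUND_PROXY_PORT = 4140  # from the second rule above
PROXY_UID = 2102            # assumed UID of the proxy process, for illustration


def redirect_port(direction, dst_ip, sender_uid=None):
    """Return the proxy port a connection is redirected to, or None if untouched."""
    if direction == "inbound":
        # Traffic arriving at the pod goes to the proxy's inbound port.
        return INBOUND_PROXY_PORT
    if direction == "outbound":
        if sender_uid == PROXY_UID:
            # The proxy's own connections are skipped to avoid a loop.
            return None
        if ipaddress.ip_address(dst_ip).is_loopback:
            # Assumption: local traffic inside the pod is not redirected.
            return None
        return OUTBOUND_PROXY_PORT
    raise ValueError(f"unknown direction: {direction!r}")
```

Without the UID check, a connection opened by the proxy would be redirected back to the proxy again, which is exactly the traffic loop the second rule avoids.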
The other two pieces of the puzzle are the CLI and the dashboard, which are used to interact with, manage, and observe the services.
Comparison: Linkerd vs Istio
Please keep in mind that both projects add new features often, so this comparison is subject to change.
| Feature | Istio | Linkerd |
|---|---|---|
| Ease of installation | Istio has improved in this area recently and is now easier to try | Relatively easier to adopt thanks to out-of-the-box configuration |
| Supported protocols | gRPC, HTTP/2, HTTP/1.x, WebSockets, and all TCP traffic | gRPC, HTTP/2, HTTP/1.x, WebSockets, and all TCP traffic |
| Ingress controller | Envoy, Istio gateway itself | Any; Linkerd doesn’t provide ingress capability by itself |
| Multi-cluster mesh and expansion support | Multi-cluster deployment is supported in the stable release with various configuration options; extending the mesh outside Kubernetes clusters is possible | Multi-cluster deployment is stable |
| Service Mesh Interface (SMI) compatibility | Through third-party CRD | Native for traffic splitting and metrics, not for traffic access control |
| Tracing | | All backends supporting OpenCensus |
| Load balancing and traffic splitting | Various load balancing algorithms (Round-Robin, Random, Least Connection); supports percentage-based traffic splits; supports header- and path-based traffic splits | Supports the EWMA (exponentially weighted moving average) load balancing algorithm; supports percentage-based traffic splits through SMI |
| Resiliency | Circuit breaking, retries and timeouts, fault injection, delay injection | Retries, timeouts, and fault injection are supported; delay injection is not possible; circuit breaking is not supported yet in v2 (see issue#2846) |
| Security | mTLS support for all protocols; external CA certificate/key is possible; supports authorization rules | mTLS supported for most TCP traffic (also see caveats); external CA/key is possible, but no support for authorization rules yet (issue#3342) |
| Performance | With recent releases, Istio’s resource footprint is getting better and latency has improved | Linkerd is designed to be very light; as per some third-party benchmarks, it is lean and slightly faster than Istio |
| Enterprise support | Available from various vendors such as AspenMesh, solo.io, and Tetrate | Full enterprise-class engineering, support, and training available from Buoyant, who developed the OSS version of Linkerd |
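The EWMA (exponentially weighted moving average) load balancing mentioned in the comparison can be illustrated with a short Python sketch. This is a deliberately simplified model and not Linkerd's implementation, which is more involved; the `EwmaBalancer` class, the `alpha` smoothing factor, and the endpoint names are all hypothetical.

```python
class EwmaBalancer:
    """Route to the endpoint with the lowest smoothed latency."""

    def __init__(self, endpoints, alpha=0.3):
        self.alpha = alpha  # weight given to the newest latency sample
        self.latency = {endpoint: None for endpoint in endpoints}

    def observe(self, endpoint, latency_ms):
        """Fold a new latency sample into the endpoint's moving average."""
        previous = self.latency[endpoint]
        if previous is None:
            self.latency[endpoint] = latency_ms
        else:
            self.latency[endpoint] = (
                self.alpha * latency_ms + (1 - self.alpha) * previous
            )

    def pick(self):
        """Return the endpoint with the lowest average; untried endpoints win."""
        def score(endpoint):
            value = self.latency[endpoint]
            return 0.0 if value is None else value

        return min(self.latency, key=score)
```

Because recent samples carry more weight than old ones, an endpoint that suddenly slows down loses traffic quickly, while a single slow response does not exclude it forever.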
Service meshes are becoming an essential building block in cloud-native solutions and in microservice architecture. They do the heavy lifting for traffic management, resiliency, and observability, and free developers to focus on business logic. Both Istio and Linkerd are mature and are being used in production by various enterprises. Planning and analysis of your requirements are essential in picking which service mesh to use. Please spend sufficient time on the analysis phase, because moving from one mesh to another later in the game is complex.
Comparing two technologies with this depth and breadth of functionality, and with ever-evolving features, is not possible in a single article. Also, when choosing a technology as complex and as critical as a service mesh, the context in which it will be used matters far more than the technology alone. Without that context, it is hard to say A is better than B, because the answer really is "it depends". I loved the simplicity of Linkerd, both in getting started and in managing the service mesh later on. Linkerd has also been hardened over the years by users from enterprise companies. Some features may seem attractive in one mesh, but you should check whether the other has that feature planned for the near future, and make an informed decision based not just on theoretical evaluation but on trying both out in a proof-of-concept sandbox. This proof of concept should focus on ease of use, feature match, and, more importantly, the operational aspects of the technology. It is relatively easy to introduce a technology; the hard and long effort is spent in running and managing it through its lifecycle.
Also, if you get time, please read William Morgan’s Service Mesh Manifesto.
I hope this Linkerd vs Istio comparison helps you make an informed decision. Do let us know your thoughts; you can start a conversation with me on Twitter. If you’re looking to implement a service mesh, do check our capabilities and how we could help you make this seamless.