A strong internal developer platform (IDP) handles accessibility issues and abstracts away pipeline sprawl. With a paved path, even newcomers can get to "hello world" without scavenging through a repo. But here's the catch: too often, IDPs lean heavily on portals. They add platform elements like shiny dashboards, discovery tools, and templates without wiring them to the core capabilities, the muscle that makes workflows seamless.
Code itself isn't the hard part. The system wrapped around it is. Platforms don't fail because the UI lacks polish. They fail because they lack the muscle, i.e., the APIs and application layer that actually move work forward.
If using the platform is harder than skipping it, developers will skip it.
We love a good portal. But a portal without muscle is a showroom with no factory. Imagine a portal that doesn't have any helpful documents or usable templates. Teams will click around, admire the catalog, and then... go back to hand-rolled scripts because the heavy lifting isn't actually wired up.
Muscle-first flips the sequence:
Build an application/API layer that encapsulates the workflows, policy, and security you want every team to enjoy.
Expose that muscle through thin faces: a portal, a CLI, ChatOps, or whatever the team prefers.
Keep the faces lightweight so they can evolve; keep the muscle strong so the experience is consistent, auditable, and reliable.
With this approach, adoption will stop needing mandates. Engineers will choose the golden path because it is much smoother and faster.
In an API-first approach, the muscle sits as a service layer (built with any framework you choose, such as GraphQL or FastAPI, behind auth) that any interface can call. Here's what it centralizes:
Golden-path workflows: The platform provides the end-to-end flows your teams care about: service bootstrap → CI → GitOps deployment. It doesn't just mint files; it orchestrates the steps with idempotence and guardrails. The easy path becomes the correct path, giving developers the quickest and cleanest way to deploy services.
Auth, audit, policy: Compliance and guardrails are a cornerstone of platforms. Who can do what, when, and where is enforced in one place to ensure conformance to standards, best practices, and regulatory requirements. Every action is logged. Options are provided to handle any explicit requirement, making the platform flexible.
Self-service infrastructure: Developers provide intent, such as "I need a bucket for logs", not 40 lines of IaC. The platform maps this intent to org-approved Terraform/Crossplane modules with encryption, tagging, and naming baked in. This ensures developers get what they need when they need it and keeps them out of approval loops with the IT team.
Day-0 observability: Observability is integrated into the platform. Telemetry and baseline dashboards attach automatically when a service is created. Developers see latency, errors, and throughput on the first deployment, not the fifth incident.
Explainable abstractions: Every "easy button" links to "what got created" and "how to override safely." Abstraction without explanation is a trap. Good abstractions help new developers get started quickly while providing escape hatches for architects who want more control over their deployments.
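To make the intent-to-module idea concrete, here is a minimal sketch in Python. It assumes a hypothetical catalog of approved modules (the paths, tag keys, and naming scheme are illustrative, not taken from any real setup) and shows how a small intent could be expanded into module inputs with encryption, tagging, and naming applied by default, along with an explanation of what was created.

```python
from dataclasses import dataclass

# Hypothetical catalog of org-approved modules; paths are illustrative only.
APPROVED_MODULES = {
    "bucket": "modules/s3-bucket",
    "database": "modules/rds-postgres",
}


@dataclass(frozen=True)
class Intent:
    """The minimal information a developer supplies."""
    kind: str      # what they need, e.g. "bucket"
    purpose: str   # why they need it, e.g. "logs"
    team: str
    env: str


def resolve(intent: Intent) -> dict:
    """Expand an intent into module inputs with org defaults baked in."""
    if intent.kind not in APPROVED_MODULES:
        raise ValueError(f"no approved module for kind {intent.kind!r}")
    name = f"{intent.team}-{intent.purpose}-{intent.env}"  # assumed naming rule
    return {
        "module": APPROVED_MODULES[intent.kind],
        "inputs": {
            "name": name,
            "encryption": True,  # enforced default, not a developer choice
            "tags": {"team": intent.team, "env": intent.env, "managed-by": "idp"},
        },
        # Explainability: tell the developer what the easy button produced.
        "explain": f"Creates {name} from {APPROVED_MODULES[intent.kind]}",
    }
```

A call such as resolve(Intent("bucket", "logs", "payments", "prod")) would return a plan that a Terraform or Crossplane runner could consume; how that hand-off happens is outside this sketch.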
All of this execution muscle gives developers speed and safety, but it also raises a design question: how do you decide what belongs inside the platform's muscle and what should remain outside, left to team choice? Put too much in the core and you risk rigidity. Leave too much out and you lose reliability.
The muscle defines the core: golden paths, auth, audit, and observability that must be consistent everywhere. The periphery is where teams can choose their own frameworks, databases, or eventing tools, so long as they connect through clean, reliable interfaces. The balance between core and periphery gives platforms trust and autonomy.
That balance avoids the two classic failure modes:
Cages: Over-abstracted portals that collapse during incidents. For example, a platform team forces every service onto a single CI/CD pipeline template. When a team needs GPU jobs, the template breaks. With no way out, platform adoption stalls.
Chaos: Every team invents pipelines, policies, and infra shapes from scratch. For example, a large enterprise lets each app team choose any CI/CD tool. Different teams use Jenkins, GitHub Actions, and CircleCI, each with its own secrets management, audit logs, and naming. Debugging production issues takes days because no two systems behave alike.
Golden paths are the middle way: attractive to use, trivial to leave when justified.
To avoid both cages and chaos, a polished portal wasn't enough; we needed muscle. This is the turn we made. Let's walk through our experience of building an IDP for one of our customers.
We made the classic mistake first: portal-first. Backstage looked great; usage spiked for a week, then flattened. Pairing with devs revealed why: "Nice surface, but the heavy lifting isn't wired up." So we flipped the order to API-first and built the muscle.
A backend service (GraphQL + FastAPI, since our team already had those skills) behind org SSO that exposes idempotent endpoints for:
bootstrapService → wireCI → provisionGitOps → attachObservability → requestInfraResource
A single policy point: authZ, audit, naming/labeling conventions, cost tags, and environment protections live here.
Template versioning: every scaffold stamps a scaffoldVersion and goldenPathId, so we can upgrade older services safely.
Dual faces, one backend: Backstage plugin and CLI call the same endpoints, so UI/terminal parity is guaranteed.
Create service → repo with sensible skeleton, unit test, Dockerfile, standard CI, security baselines, and day-0 instrumentation pre-wired.
Deploy with GitOps → the backend generates an Argo CD Application and manifests with consistent labels, SLO hooks, and cost tags; PR → merge → deploy.
Open dashboards → baseline Grafana board appears automatically (latency, errors, RPS, plus app metrics).
Ask for S3 (or RDS, etc.) → dev provides minimal intent; backend maps it to vetted Terraform/Crossplane modules (encryption, tagging, policies enforced by default).
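As an illustration of the GitOps step, the sketch below builds an Argo CD Application manifest as a Python dict. The apiVersion, kind, and spec fields follow the Argo CD Application resource; everything else is an assumption made for the example: the function name, the deploy/<env> path layout, the namespace naming, and the team, env, and cost-center label keys.

```python
def argo_application(service: str, team: str, env: str,
                     repo_url: str, cost_center: str) -> dict:
    """Build an Argo CD Application manifest with consistent labels."""
    labels = {
        "app.kubernetes.io/name": service,
        "app.kubernetes.io/managed-by": "idp",
        "team": team,                # assumed label keys for ownership,
        "env": env,                  # environment,
        "cost-center": cost_center,  # and cost attribution
    }
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {
            "name": f"{service}-{env}",
            "namespace": "argocd",
            "labels": labels,
        },
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,
                "path": f"deploy/{env}",  # assumed repo layout
                "targetRevision": "HEAD",
            },
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": f"{team}-{env}",  # assumed namespace convention
            },
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }
```

In a flow like the one described, the backend would serialize this manifest to YAML and open a pull request against the GitOps repo, so the deploy still goes through PR → merge → deploy.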
Channel-agnostic access (UI/CLI parity): Teams could choose the interface that fits their workflow without learning two different behaviors. The Backstage plugin and CLI invoked the same routes, so docs, demos, and runbooks worked everywhere. This cut training overhead and removed the common "the button vs. the script" drift.
Lower cognitive load via golden paths: Instead of assembling boilerplate, secrets, pipelines, and IaC manually, devs followed a single golden path that made secure defaults the path of least resistance. Standard repo shape, pre-wired CI, and opinionated IaC meant fewer forks, fewer decisions, and dramatically less tool-hopping.
Faster feedback loops from day one: Preflight checks and seeded CI trimmed the "first build" and "first deploy" loops. Day-0 observability meant the first request showed up on a dashboard, with no extra tickets or manual wiring, so teams could iterate while momentum was high.
Operational trust (policy + audit in one place): Authorization, tagging, naming, and environment protections lived in the muscle. Every action was logged, every exception explicit. Platform/infra and security could verify posture without slowing devs.
Evolvability without rewrites: We could add new golden-path steps (or update templates) in the backend and get the improvement in both UI and CLI instantly. Conversely, we could tweak the UI without touching the workflows. scaffoldVersion and the idp upgrade flow let early services catch up safely.
Onboarding that actually starts work: Because access, scaffolding, CI, and dashboards came alive in one flow, new hires moved from "waiting on tickets" to "pushing a PR" quickly. Less Slack back-and-forth, more shipping.
Adoption didn't grow because we mandated it. It grew because we earned it.
If you're starting (or rebooting) your IDP journey, here's how to take the first slice without boiling the ocean. These tips will help you build a platform that succeeds.
Observe before you abstract: Shadow your developers to understand their tools, processes, and pain points. Observe them all, whether it's a new hire or your most experienced architect. Count waits, handoffs, and tool switches. Write the top three shared friction points as jobs-to-be-done.
Ship the thinnest viable platform: Based on your analysis, choose the most common task your developers perform and create a golden path for it (create service → CI → GitOps → day-0 telemetry), then make it excellent. Resist supporting "every stack" from day one.
Build muscle, then face: Your platform's core is its capabilities, so make those robust first. Create the API layer that triggers workflows and enforces policy. Once the capabilities are in, wire Backstage and a CLI on top. Keep the faces thin and functional, and ensure that everything developers need is easily accessible.
Market internally: What's the use of building a platform when nobody knows about it? Plan internal roadshows and communications about new releases, run live demos, and publish "what changed" notes. Celebrate teams that got faster because they used the platform. Turn users into advocates and champions.
Design for exceptions: Golden paths must be the easiest paths, but not mandatory. Platforms are not only about tools. Not every developer needs the default settings to build their services; senior developers and architects need more configurable options within the same golden path. Provide escape hatches, but with proper checks, reviews, and logging.
It's easy to count YAML lines and portal clicks; neither proves you removed friction. What matters is whether teams can deliver with less drag and more safety. To see this, blend delivery and experience signals:
Lead time & deployment frequency: These show the flow of work through your system. If changes move from idea to production faster and teams can deploy more often with confidence, you've reduced bottlenecks. For example, a commit taking hours instead of days to reach production is a clear signal of healthier delivery flow.
Change failure rate & MTTR: These indicate whether your safety nets are working. Shipping quickly means little if every second release breaks production. A low failure rate plus a fast mean time to recovery (like rolling back or patching within an hour) demonstrates that guardrails are real, not just policy.
Feedback loop time: The faster developers get feedback, the easier it is to maintain momentum. Waiting 30 minutes for a build versus 3-5 minutes is the difference between staying in flow and switching context. Bloated build times often act as a hidden tax on velocity.
Cognitive load: Every extra tool, login, or step adds friction. Mental overhead is high if developers need five dashboards and three approvals just to release a fix. Teams with lower cognitive load can focus on delivering value instead of navigating complexity.
Adoption: The best signals come from what teams choose, not what leaders mandate. You've earned trust if most developers use the "golden path" CI/CD pipeline without hacks or workarounds. Conversely, lots of bypassing (custom scripts, shadow pipelines) suggests the paved path doesn't meet real-world needs. Adoption, therefore, is a measure of trust in the platform.
All these numbers point you to patterns. The conversations and interviews explain the "why" behind them. You need both to learn whether you're truly removing friction.
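The delivery signals above can be computed from plain deployment records. The following sketch assumes a hypothetical Deployment record with commit, deploy, and restore timestamps; real data would come from your CI/CD and incident tooling, and the exact definitions (for example, what counts as a failed change) are choices your team has to make.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False                    # did this change break production?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def _hours(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 3600


def delivery_signals(deploys: list[Deployment], window_days: int) -> dict:
    """Summarize lead time, deployment frequency, failure rate, and MTTR."""
    if not deploys:
        raise ValueError("need at least one deployment")
    lead_times = [_hours(d.committed_at, d.deployed_at) for d in deploys]
    failures = [d for d in deploys if d.failed]
    recoveries = [_hours(d.deployed_at, d.restored_at)
                  for d in failures if d.restored_at is not None]
    return {
        "median_lead_time_hours": median(lead_times),
        "deploys_per_day": len(deploys) / window_days,
        "change_failure_rate": len(failures) / len(deploys),
        # None when no failed change has been restored in the window.
        "mttr_hours": sum(recoveries) / len(recoveries) if recoveries else None,
    }
```

Tracking these per team over time, rather than as a single org-wide number, makes it easier to pair the trend with the interviews that explain it.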
Measuring flow, failure, and feedback loops is useful only if it informs the mechanism. For us, that mechanism was an API-first platform.
Channel-agnostic: The same capabilities are reachable from UI, CLI, or chat. You're not forcing behavior; you're meeting teams where they are.
Lower cognitive load: Developers stop being amateur release engineers. They express intent; the platform does the plumbing.
Operational trust: Security, audit, and policy live in one place, not five forks of a script.
Evolvability: You can add features to the muscle without rewriting the face. Or change faces (Backstage → custom UI) without touching the muscle.
Explainability: When something goes sideways, engineers can drill down from the golden path to raw artifacts and logs. No black boxes.
These are some antipatterns you should avoid:
Portal-as-platform: A UI is best treated as a friendly face, not the heart of the platform. The real power lies in APIs, automation, and the raw capabilities underneath. If all the logic lives only in the portal, you create a bottleneck and limit how developers can scale or script workflows.
Abstracting before you observe: Creating abstractions without understanding real developer pain often leads to the wrong details being hidden. For example, assuming everyone needs a generic "deploy" button without studying how teams actually ship may oversimplify or block critical steps. Spend time watching the day-to-day flow before masking complexity.
Gatekeeping in disguise: If your golden path is slower or clunkier than the workarounds, developers will naturally bypass it. A "secure pipeline" that takes 30 minutes versus a script that ships in 3 will never win adoption. The golden path must be safer and faster, or it becomes gatekeeping masquerading as governance.
Policy by wiki: Policies written in documentation are rarely followed if the tooling doesn't enforce them. Expecting developers to remember and apply rules manually guarantees drift. Instead, build guardrails directly into templates, workflows, and CI/CD checks so compliance happens by default, without relying on a wiki page.
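As a contrast to policy by wiki, here is a minimal sketch of a guardrail expressed as code that a CI step or the platform backend could run. The required tag keys and the resource shape are assumptions for illustration; dedicated policy engines exist for this job, and this sketch only shows the principle of enforcing rules in tooling.

```python
# Hypothetical org policy: tags every resource must carry.
REQUIRED_TAGS = {"team", "env", "cost-center"}


def check_resource(resource: dict) -> list[str]:
    """Return a list of policy violations; an empty list means compliant."""
    violations = []
    missing = REQUIRED_TAGS - set(resource.get("tags", {}))
    if missing:
        violations.append(f"missing tags: {', '.join(sorted(missing))}")
    if not resource.get("encryption", False):
        violations.append("encryption must be enabled")
    return violations
```

A CI job could call check_resource on each planned resource and fail the build when the returned list is non-empty, which turns the wiki rule into a default that developers do not have to remember.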
An internal developer platform isn't a stack of tools. It's a product with real users who sit a few desks away. Build the muscle first so the face, whatever face you choose, has something worth smiling about.
If you want the whole tour, including golden paths vs. cages, the reference architecture, and a live walkthrough of the muscle-and-faces approach, watch the talk that Ninad and Ruturaj delivered at KubeCon. If you're starting this journey now, start small: one thin slice, shipped well. Everything good flows from there.
We hope this article helps you understand the theory of building platforms, but things are even more complicated in real life. If you get stuck at any stage, feel free to contact our expert platform engineers to take a look.