Before You Write a Composition

[Image: Crossplane hero image. Source: github.com/crossplane/artwork]

Introduction
The previous post made the case for why a control plane is the right architecture for a platform team. Crossplane is one way to get there; you could also write your own operators with controller-runtime, or reach for tools like the Azure Service Operator (ASO) or AWS Controllers for Kubernetes (ACK) for cloud-specific needs. Crossplane is the most complete off-the-shelf option for teams that don’t want to write and maintain custom controllers, and it is the one this series follows. Before writing a single composition, there is a mental model to internalise: vocabulary, building blocks, and the design decisions that shape everything downstream. This post covers all of that.

Recommendation
Before continuing, check out my previous post: Your Platform Should Be a Control Plane.

There is a temptation, when picking up a new tool, to start with the documentation index and work down. Install the operator. Apply the first example. See what happens. With Crossplane, this approach produces a working control plane before it produces understanding. The examples work. The resources appear. And then, when you try to design something of your own, you hit a wall, not because the tool is difficult, but because you are still thinking about it in the wrong terms.

The right place to start is not the documentation. It is the mental model.

The Same Machinery

Crossplane does not introduce a new API server, a new storage layer, or a new operational model. It runs as a set of controllers inside a Kubernetes cluster, built on the same controller-runtime foundation as any Kubernetes operator. Every piece of infrastructure you manage with Crossplane is a Kubernetes object. You create it with kubectl. You inspect it with kubectl. You delete it with kubectl.

This is not an implementation detail. It is the entire design philosophy.

The Kubernetes API becomes your universal interface. Every tool in your existing stack that works with Kubernetes objects (ArgoCD, Flux, Backstage, kubectl diff, your RBAC, your audit logs) works with Crossplane resources without modification. You are not adopting a new platform. You are extending the one your organisation already operates, using the same machinery that runs your workloads to also manage the infrastructure those workloads depend on.

This framing changes what Crossplane feels like to operate. Debugging a resource that will not reconcile looks exactly like debugging a Deployment that will not come up: you inspect the object’s status conditions, you read the controller logs, you check whether the provider has the credentials it needs. The mental model and operations tooling transfer directly; only the domain is new: infrastructure instead of workloads.

Providers: The Bridge to the Cloud

Crossplane itself is not cloud-aware. It does not know what an Azure Storage Account or an AWS S3 bucket is. That knowledge lives in providers.

A provider is a package, a bundle of CRDs and controllers, that knows how to reconcile a specific external system. provider-azure-storage understands Azure Storage Accounts: how to create them, how to detect when they have drifted from their declared state, and how to bring them back into conformance. Once installed, the provider registers its CRDs in the cluster. From that point on, those types are native Kubernetes objects.
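As a rough sketch, installing a provider is itself just a Kubernetes manifest. The package path below assumes the Upbound registry layout, and the version tag is a placeholder rather than a recommendation; check the marketplace for a current release.

```yaml
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-azure-storage
spec:
  # Placeholder tag: replace <version> with a release listed on the marketplace.
  package: xpkg.upbound.io/upbound/provider-azure-storage:<version>
```

Once the package is healthy, the CRDs it ships show up alongside every other API type in the cluster.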

Each provider is configured through a ProviderConfig, which carries the credentials needed to authenticate with the external API. A cluster can have multiple ProviderConfigs, one per Azure subscription or one per AWS account, and each managed resource specifies which ProviderConfig to use. The provider itself is stateless; all credential and configuration context flows through the ProviderConfig.
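A minimal sketch of a ProviderConfig for one Azure subscription might look like the following. The Secret name, namespace, and key are assumptions for illustration; the Secret itself would hold the service principal credentials and is not shown.

```yaml
apiVersion: azure.upbound.io/v1beta1
kind: ProviderConfig
metadata:
  name: sub-example-dev
spec:
  credentials:
    source: Secret
    secretRef:
      # Hypothetical Secret holding the credentials for this subscription.
      namespace: crossplane-system
      name: azure-sub-example-dev
      key: credentials
```

A second subscription would get its own ProviderConfig with a different name, and managed resources choose between them through providerConfigRef.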

The Upbound provider catalog covers the major cloud platforms (Azure, AWS, GCP), as well as Helm, Kubernetes, SQL databases, and a growing number of SaaS systems. In principle, a provider can be written for any system that exposes an API. For example, check out provider-spotify to create a playlist for the next GitHub outage. 🫣 Because providers are versioned packages, you can pin provider-azure-storage at one version while running a newer provider-azure-network, and upgrade them independently as your needs evolve.

provider-terraform is worth a specific mention for teams with an existing Terraform footprint. Rather than rewriting every module from scratch, it lets Crossplane invoke Terraform workspaces as part of a composition. Any system you are already provisioning through Terraform (on-premises infrastructure, legacy systems, SaaS APIs without a native provider) can be brought into the same control plane without migrating away from the Terraform that already manages it. The composition calls the workspace, Crossplane tracks the result, and the consumer sees none of the underlying mechanism.
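For illustration, a Workspace that points at an existing Terraform module could be sketched like this. The repository URL, variable name, and ProviderConfig name are hypothetical, and the exact fields are worth checking against the provider-terraform release you install.

```yaml
apiVersion: tf.upbound.io/v1beta1
kind: Workspace
metadata:
  name: legacy-dns-zone
spec:
  forProvider:
    # Remote pulls the module from a repository; Inline embeds HCL directly.
    source: Remote
    # Hypothetical module location.
    module: git::https://git.example.com/platform/terraform-dns.git?ref=v1.0.0
    vars:
      - key: zone_name
        value: internal.example.com
  providerConfigRef:
    name: default
```

Inside a composition, the same object would be emitted by a function rather than applied by hand.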

Check out the marketplace for a list of all available providers: marketplace.upbound.io/providers

Managed Resources: The Raw Material

Every CRD a provider registers corresponds to a single external resource. These are called managed resources: the bottom layer of Crossplane’s abstraction stack.

A managed resource is a one-to-one mapping between a Kubernetes object and a cloud resource. An Account in provider-azure-storage is an Azure Storage Account. A FlexibleServer in provider-azure-dbforpostgresql is an Azure Database for PostgreSQL Flexible Server. Create one, and Crossplane creates the cloud resource. Update it, and Crossplane applies the change. Delete it, and Crossplane deletes the cloud resource, unless you have configured a deletion policy that says otherwise.
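As a sketch of the deletion policy mentioned above, the Storage Account below is marked so that deleting the Kubernetes object leaves the cloud resource in place. The resource names are made up, and the API version may differ between provider releases.

```yaml
apiVersion: storage.azure.upbound.io/v1beta1
kind: Account
metadata:
  name: stexampleweu1
spec:
  # Orphan leaves the Azure Storage Account in place when this object is deleted.
  # The default, Delete, removes the cloud resource as well.
  deletionPolicy: Orphan
  forProvider:
    location: westeurope
    resourceGroupName: rg-example-weu-1
    accountTier: Standard
    accountReplicationType: LRS
  providerConfigRef:
    name: sub-example-dev
```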

A managed resource for an Azure Resource Group looks like this:

apiVersion: azure.upbound.io/v1beta1
kind: ResourceGroup
metadata:
  name: rg-example-weu-1
spec:
  forProvider:
    location: westeurope
  providerConfigRef:
    name: sub-example-dev

The spec.forProvider block is where cloud-specific configuration lives. Every field maps directly to a property of the cloud resource. There is no abstraction here. This is cloud configuration expressed in YAML, and it is deliberately transparent.

That directness is the point, and it is also why managed resources are not meant for consumers. Nobody should be asking a developer to fill in location: westeurope, decide which ProviderConfig to reference, or know that a Resource Group needs to exist before a Storage Account can be created inside it. These are platform concerns. Managed resources are the raw material that platform engineers assemble into something useful. Exposing them directly to developers would be no different from handing someone a bag of bolts and calling it a car.

Managed resources can also adopt existing cloud infrastructure. If a resource already exists (provisioned by Terraform, created manually, or predating your Crossplane adoption), you can bring it under management by annotating the managed resource with crossplane.io/external-name: <existing-resource-identifier>. Crossplane will observe the existing resource, reconcile any configuration drift, and manage it going forward without recreating it. More broadly, the annotation holds the identifier Crossplane uses to refer to a resource in the external system, which decouples the Kubernetes object name from the cloud resource name. That is useful when naming constraints diverge: cloud resources do not always carry the same naming and uniqueness rules that Kubernetes objects do.
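A hedged sketch of adoption, assuming a Resource Group that already exists in Azure under a name that would not be valid as a Kubernetes object name. Both names are hypothetical, and the format of the external identifier varies per resource type, so check the provider documentation for the resource you are adopting.

```yaml
apiVersion: azure.upbound.io/v1beta1
kind: ResourceGroup
metadata:
  # Kubernetes object name: lowercase, chosen by the platform team.
  name: rg-legacy-billing
  annotations:
    # Name of the Resource Group as it already exists in Azure (hypothetical).
    crossplane.io/external-name: RG-Billing-Prod-01
spec:
  forProvider:
    # Should match the existing resource, otherwise Crossplane will try to reconcile the difference.
    location: westeurope
  providerConfigRef:
    name: sub-example-dev
```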

Intent and Implementation

Above managed resources sit two concepts that always work together.

A composite resource is the Kubernetes object a developer creates to express intent. It lives at the level of a consumer-facing API: three fields, no cloud primitives. Kubernetes validates it against a schema, stores it, and waits.

A composition is the recipe the platform team writes that translates that composite resource into managed resources. When a composite resource appears or changes, the composition runs: it executes each function in the pipeline in order, building up the desired set of managed resources from the intent in the object’s spec, and reconciles those resources into the desired state.

The two are inseparable. A composite resource without a composition is a stored object that nothing acts on. A composition without a composite resource is a recipe with no ingredient. Together, they are the mechanism by which a developer’s three-field declaration becomes a running, continuously reconciled piece of infrastructure.

Start at the Consumer

This is where most teams go wrong the first time. They look at the managed resources a provider offers, sketch out which ones they need, and start wiring them together into a composition. The consumer API becomes an afterthought: something that takes shape as the plumbing comes together.

The result is an API that reflects cloud infrastructure rather than developer intent. Consumers end up filling in parameters that exist because the underlying managed resources require them, not because they communicate anything meaningful about what the consumer actually wants. The composition becomes a leaky abstraction. Teams file tickets asking for more escape hatches, and the interface grows in the wrong direction: more fields, more options, more decisions delegated to people who should not be making them.

The right starting point is the other end: what does the consumer need to express?

Spend time on this question before opening a file. Write the resource YAML your developers will fill in. Give the fields names that make sense to the team using them: names that speak to intent, not implementation. What does that schema leave out? A developer-facing resource for a production Postgres database might have three fields: environment, size, and highAvailability. If you are using a consumption/serverless model, you might not even need size because it scales automatically. No subnet delegation. No backup retention window. No server SKU. No region. Those details exist downstream, in the composition, and the consumer has no reason to know they exist.

Once the consumer API is clear, everything downstream is derivable from it. The schema is the contract. And that contract, once stable, can survive any number of changes to the implementation beneath it.

Composite Resource Definitions: The Schema

A Composite Resource Definition (XRD) is, structurally, a CRD for a composite resource. It defines the schema that composite resources must conform to: which fields exist, what types they have, which are required, and what values are valid.

If you have written CRDs before, XRDs will feel familiar. The key difference is that you do not write a controller to act on them. The controller is the composition.

An XRD can produce cluster-scoped or namespace-scoped composite resources via spec.scope. Setting scope: Namespaced gives each team an isolated surface: their composite resources live in their own namespace, governed by standard role bindings. Use scope: Cluster for platform-wide infrastructure that intentionally spans team boundaries. When in doubt, default to Namespaced.
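To make the role-binding point concrete, here is a minimal sketch that lets one team manage its own databases and nothing else. It uses the example API group and resource name from the XRD in this post; the namespace and group name are assumptions.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: postgresdatabase-editor
  namespace: team-payments
rules:
  - apiGroups: ["platform.example.com"]
    resources: ["postgresdatabases"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: postgresdatabase-editor
  namespace: team-payments
subjects:
  - kind: Group
    # Hypothetical group name coming from your identity provider.
    name: team-payments
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: postgresdatabase-editor
  apiGroup: rbac.authorization.k8s.io
```

Nothing here is Crossplane-specific, which is the point: the composite resource is governed like any other namespaced Kubernetes object.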

The defaultCompositionRef field points to the composition Crossplane uses when no explicit composition is specified, so consumers never need to reference it directly. The XRD is where the consumer API you designed in the previous step becomes formal.

apiVersion: apiextensions.crossplane.io/v2
kind: CompositeResourceDefinition
metadata:
  name: postgresdatabases.platform.example.com
spec:
  scope: Namespaced
  group: platform.example.com
  names:
    kind: PostgresDatabase
    plural: postgresdatabases
  defaultCompositionRef:
    name: postgresdatabases.azure
  versions:
    - name: v1alpha1
      served: true
      referenceable: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              required:
                - environment
                - size
              properties:
                environment:
                  type: string
                  enum: ["development", "staging", "production"]
                size:
                  type: string
                  enum: ["small", "medium", "large"]
                highAvailability:
                  type: boolean
                  default: false

The schema enforces the contract at the API layer. When a developer creates a composite resource, Kubernetes validates it against this schema before the composition ever runs. The enum constraint on environment is not just documentation: it is enforcement. A developer cannot submit environment: test123 and have it silently pass. The error surfaces immediately, at admission, not hours later when a managed resource fails to provision against a value the composition was never written to handle.

This is what a developer actually creates:

apiVersion: platform.example.com/v1alpha1
kind: PostgresDatabase
metadata:
  name: payments-db
  namespace: team-payments
spec:
  environment: production
  size: medium
  highAvailability: true

Three fields. No region, no SKU, no ProviderConfig, no Resource Group. The developer expresses what they need. Everything else is the platform’s responsibility. This is the object Crossplane watches: when it appears, the composition runs; when it changes, the composition reconciles; when it disappears, the managed resources are cleaned up.

Once the composition has run and the underlying cloud resources are ready, the composite resource surfaces that state through its own status:

❯ kubectl get postgresdatabases -n team-payments

NAME          SYNCED   READY   AGE
payments-db   True     True    4m

SYNCED: True means the composition pipeline ran without errors: the desired managed resources were computed and handed off for reconciliation. READY: True means all composed managed resources are themselves ready; the full provisioning tree is healthy. A resource stuck at SYNCED: True, READY: False means the pipeline ran but one or more managed resources haven’t become ready yet, either still provisioning or with an error downstream. kubectl describe on the composite resource will surface the condition message.
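For orientation, the SYNCED: True, READY: False case corresponds roughly to a status block like the one below. This is an illustrative sketch rather than captured output: timestamps and messages are omitted, and the exact reasons can vary between Crossplane versions.

```yaml
status:
  conditions:
    - type: Synced
      status: "True"
      reason: ReconcileSuccess
    - type: Ready
      status: "False"
      # Creating is the reason typically reported while composed resources are not yet ready.
      reason: Creating
```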

Invest time in the schema. Getting it right at this stage prevents painful versioning work later. Every field you include is part of the API contract. Every field you include unnecessarily is a field consumers will use, and a field you will need to support through every future change to the implementation beneath it.

Compositions: The Implementation

A Composition is the implementation of an XRD. It describes how to translate a composite resource into a set of managed resources, and it is the layer where the platform team’s knowledge lives: regions, retention policies, naming conventions, tagging standards. Multiple compositions can satisfy the same XRD, one per cloud for instance, and Crossplane selects the right one based on the default set on the XRD, or an explicit reference in the composite resource.
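Most consumers will rely on the default, but an explicit reference could be sketched as follows. This assumes Crossplane v2, where Crossplane-specific fields live under spec.crossplane (in v1 the field sits directly under spec), and the AWS composition name is hypothetical.

```yaml
apiVersion: platform.example.com/v1alpha1
kind: PostgresDatabase
metadata:
  name: payments-db
  namespace: team-payments
spec:
  environment: production
  size: medium
  crossplane:
    compositionRef:
      # Hypothetical second composition implementing the same XRD.
      name: postgresdatabases.aws
```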

Compositions run as a pipeline of functions. Each step references a Function, a separately installed package that Crossplane calls over gRPC. The function receives the observed state of the composite resource, along with any desired state produced by earlier steps, and returns an updated desired set of resources. One of the most flexible of these is function-go-templating, which renders standard Go templates that produce managed resource YAML. You install it once as a cluster resource:

apiVersion: pkg.crossplane.io/v1beta1
kind: Function
metadata:
  name: function-go-templating
spec:
  package: xpkg.upbound.io/crossplane-contrib/function-go-templating:v0.12.0

A composition that uses it looks like this:

apiVersion: apiextensions.crossplane.io/v1
kind: Composition
metadata:
  name: postgresdatabases.azure
spec:
  compositeTypeRef:
    apiVersion: platform.example.com/v1alpha1
    kind: PostgresDatabase
  pipeline:
    - step: render-resources
      functionRef:
        name: function-go-templating
      input:
        apiVersion: gotemplating.fn.crossplane.io/v1beta1
        kind: GoTemplate
        source: Inline
        inline:
          template: |
            {{- $name := .observed.composite.resource.metadata.name }}
            {{- $params := .observed.composite.resource.spec }}
            {{- $skuMap := dict "small" "B_Standard_B1ms" "medium" "GP_Standard_D2s_v3" "large" "GP_Standard_D4s_v3" }}
            {{- $storageMap := dict "small" 32768 "medium" 131072 "large" 524288 }}
            ---
            apiVersion: azure.upbound.io/v1beta1
            kind: ResourceGroup
            metadata:
              name: rg-postgres-{{ $name }}-weu-1
              annotations:
                gotemplating.fn.crossplane.io/composition-resource-name: resource-group
            spec:
              forProvider:
                location: westeurope
              providerConfigRef:
                name: sub-example-{{ $params.environment }}
            ---
            apiVersion: dbforpostgresql.azure.upbound.io/v1beta2
            kind: FlexibleServer
            metadata:
              name: {{ $name }}-psql
              annotations:
                gotemplating.fn.crossplane.io/composition-resource-name: flexible-server
            spec:
              forProvider:
                location: westeurope
                resourceGroupNameSelector:
                  matchControllerRef: true
                skuName: {{ index $skuMap $params.size }}
                version: "16"
                storageMb: {{ index $storageMap $params.size }}
                backupRetentionDays: 35
                highAvailability:
                  mode: {{ if $params.highAvailability }}ZoneRedundant{{ else }}SameZone{{ end }}
              providerConfigRef:
                name: sub-example-{{ $params.environment }}

The compositeTypeRef field declares which composite resource type this composition implements. Crossplane uses it to enforce that only PostgresDatabase resources at v1alpha1 can bind to this composition; a MySQLDatabase or any other type will not match, even if it happens to share the same field names. Combined with defaultCompositionRef on the XRD, this is the contract that binds the two objects together: the XRD says which composition to use by default, and the composition says which type it is willing to serve.

One thing worth keeping in mind: the composite resource is namespace-scoped, but the managed resources the template creates are cluster-scoped. The name derived from $name must be unique across the cluster, not just within the team’s namespace, so naming conventions in your composition need to account for that.

Assigning the composite resource to shorthand variables at the top, $name and $params, keeps the template readable as it grows. Both $skuMap and $storageMap follow the same pattern that appears constantly in real compositions: the consumer declares size: medium, and the template silently translates that to a D2s_v3 compute SKU and 131072 MB (128 GiB) of storage. The storage value must match one of Azure’s allowed storage tiers, or the managed resource will fail to provision. The consumer sees neither value. The same logic applies to the highAvailability boolean: the template resolves it to ZoneRedundant or SameZone, a cloud-specific string the consumer has no reason to know exists. A real composition would also need administratorLogin and a reference to a password secret; those are required fields in practice, omitted here for clarity.
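For reference, the omitted login fields could be sketched as the fragment below, which would sit inside the FlexibleServer’s forProvider block. The login name and the Secret coordinates are assumptions; the Secret would need to exist before the server can be provisioned.

```yaml
# Fragment of the FlexibleServer spec.forProvider block (not a complete manifest).
forProvider:
  administratorLogin: psqladmin
  administratorPasswordSecretRef:
    # Hypothetical Secret that already holds the administrator password.
    name: payments-db-admin
    namespace: crossplane-system
    key: password
```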

resourceGroupNameSelector.matchControllerRef: true tells Crossplane to find the ResourceGroup that was provisioned by the same composite resource, wiring the two together without hardcoding a name. Everything else (the region, the PostgreSQL version, the backup retention window) is hardcoded. Those are platform decisions, and the template is where platform decisions live.

The composition-resource-name annotation gives each managed resource a stable identity within the pipeline. Crossplane uses it to track resources across reconciliation cycles: to know that the FlexibleServer in this reconciliation is the same one as the last, and apply an update rather than attempt to create a duplicate.

When you update the composition, every composite resource reconciles against the new template on its next cycle. A change to the region, the retention policy, or the naming convention propagates automatically across every resource in the organisation, without touching a single developer’s YAML, without triggering a pipeline, without coordinating across teams.

This is the property that makes the composition the single lever for moving your platform forward. A new compliance requirement lands: update the composition, and every database in the organisation is brought into conformance automatically. A security patch requires changing the PostgreSQL version: one line in the template, propagated everywhere. A cost optimisation means consolidating to a different SKU tier: the composition changes, the cloud changes, and no developer files a ticket or updates their configuration. The platform team encodes the decision once. The control plane enforces it everywhere, continuously.

The template is not limited to cloud resources. Any Kubernetes object a function outputs (a ConfigMap, a Secret, a Deployment) is reconciled in the same pipeline run, alongside the managed resources that represent cloud infrastructure. A composition can provision an Azure database and create the connection Secret the application needs in a single pass. provider-kubernetes manages Kubernetes objects on any cluster and is typically used when targeting remote clusters; for the local cluster, composition functions can output Kubernetes resources directly. Crossplane is not a cloud-infrastructure tool that happens to run on Kubernetes. It is a general-purpose control plane engine.
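As a sketch of that idea, a further document appended to the same Go template could emit a plain ConfigMap next to the cloud resources. It assumes the $name variable from the template above and a Crossplane version that composes Kubernetes objects directly; the ConfigMap name and the composition-resource-name value are made up for illustration.

```yaml
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ $name }}-connection
  # Created in the same namespace as the composite resource.
  namespace: {{ .observed.composite.resource.metadata.namespace }}
  annotations:
    gotemplating.fn.crossplane.io/composition-resource-name: connection-config
data:
  # Azure Flexible Server hostnames follow <server-name>.postgres.database.azure.com.
  host: {{ $name }}-psql.postgres.database.azure.com
  port: "5432"
```

Anything sensitive, such as the password, belongs in a Secret rather than a ConfigMap.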

No Ceremony

The four building blocks connect in a specific direction.

Providers register managed resources: CRDs that map one-to-one to external API resources and contain the controller logic to reconcile them.

XRDs define composite resources: the platform’s intent-based API, expressed as a versioned schema that Kubernetes enforces at admission, with scope controlling whether they are namespace-scoped or cluster-scoped.

Compositions implement XRDs: they run as a pipeline of functions that translate consumer intent into managed resources, with the platform’s defaults and constraints hardcoded in the template.

Composite resources are what developers create: objects in their team’s namespace (when namespace-scoped) that express intent, trigger reconciliation, and represent the only surface your consumers ever need to touch.

When a developer creates a composite resource, the composition runs. The composition creates managed resources. Providers reconcile those managed resources against the external API. Each layer knows only what it needs to know.

This separation is not ceremony. It is what keeps the consumer-facing API stable while everything beneath it evolves freely. The three-field composite resource a developer creates today does not change when the platform team adds customer-managed encryption keys, migrates to a new database generation, or tightens the RBAC model on managed identities. When the composition changes, so does the cloud. The developer’s three-field declaration stays the same throughout.

Before You Write a Composition

The most useful habit I can describe is also the simplest: write the developer-facing resource first.

Before opening an IDE, before looking at a provider’s CRD list, write the YAML your developers will fill in. Get the field names right. Validate them with the team that will consume them. Check that every field expresses intent and that nothing in the schema forces a developer to make a decision that belongs to the platform.

When the consumer API is correct, the rest is derivable. The XRD is the schema for what you already designed. The composition is the implementation of what the schema implies. The managed resources are the cloud representation of what the composition produces.

Building in the opposite direction, starting from managed resources and surfacing fields upward, produces schemas shaped by cloud provider constraints rather than consumer needs. Those schemas accumulate parameters over time, because every new edge case demands a new escape hatch. They are harder to version, harder to simplify, and harder to maintain as the provider evolves.

The function approach gives you the full expressiveness of a programming language inside the composition layer: conditionals that map environment: production to the right SKU, loops that generate a resource per availability zone, string functions that enforce a consistent naming convention across every team. The next post goes deeper on composition functions: the different function types available, when to reach for each, and how to structure a pipeline that stays readable as complexity grows.
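A small, hedged sketch of those three ideas in a function-go-templating template: a conditional that picks regions by environment, a loop that emits one Resource Group per region, and string functions that keep names consistent. The two-region policy and the naming scheme are assumptions for illustration, not recommendations.

```yaml
{{- $name := .observed.composite.resource.metadata.name }}
{{- $env := .observed.composite.resource.spec.environment }}
{{- /* Conditional: production gets two regions, everything else one (hypothetical policy). */}}
{{- $regions := list "westeurope" }}
{{- if eq $env "production" }}
{{- $regions = list "westeurope" "northeurope" }}
{{- end }}
{{- range $region := $regions }}
---
apiVersion: azure.upbound.io/v1beta1
kind: ResourceGroup
metadata:
  # String functions: build the name, lowercase it, and cap it at 63 characters.
  name: {{ printf "rg-%s-%s" $name $region | lower | trunc 63 }}
  annotations:
    # Each composed resource needs its own stable name within the pipeline.
    gotemplating.fn.crossplane.io/composition-resource-name: resource-group-{{ $region }}
spec:
  forProvider:
    location: {{ $region }}
  providerConfigRef:
    name: sub-example-{{ $env }}
{{- end }}
```

The consumer still only sets environment; the number of regions is a platform decision that can change without touching their YAML.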

Start from the developer. Build the schema like an API designer, not an infrastructure engineer. Everything else follows.