Inside the Composition Pipeline

Introduction
The last post in this series covered the building blocks: providers, managed resources, XRDs, compositions, and the design principle of starting at the consumer. This one goes under the hood. It is about the mechanics of a composition: what flows through the pipeline, how functions read and produce state, how identity is maintained across reconciliation cycles, and the patterns that keep a composition reliable as it grows. The identity mechanics explain a class of failures most practitioners waste time debugging. Everything in this post has a working example you can browse at github.com/ToonVanDeuren/crossplane-composition-pipeline.

Recommendation
Before continuing, check out my previous post: Before You Write a Composition.

Disclosure: Parts of this post were refined with artificial intelligence using Claude (Opus 4.7, Sonnet 4.6). The opinions, thoughts, architectural choices, and the failures I learned from are my own; the AI was mainly a way to structure my thoughts efficiently.

You change one line in a composition. The naming convention for resource groups picks up a regional suffix that was not there before. You commit, push, the GitOps controller picks it up, and the next reconciliation runs. Crossplane sees that the desired state now contains a ResourceGroup with a new name. It also sees that the resource group from the previous cycle is no longer in the desired set. So it deletes the old one. The database that lived inside it goes with it. This almost happened to me the first time I deployed changes to my naming conventions; a colleague caught it in a pull request review before it reached production.

This is the failure that makes platform engineers cautious. The composition was correct. The change was small. The result was destruction. It happens because the underlying machinery (the way Crossplane decides whether a resource in the new desired state is the same resource as one from the previous cycle) is not what most people think it is. Once you understand that machinery (what flows through a composition pipeline, what each function actually sees, and which annotations tie one cycle to the next), the failure stops being mysterious and becomes preventable. Compositions stop feeling like fragile templates and start behaving like the deterministic systems they are.

The Pipeline Contract

flowchart TB
    subgraph api["Crossplane API Layer"]
        XRD["XRD\n(CompositeResourceDefinition)"]
        CR["Composite Resource\nimplements XRD schema"]
    end

    XCP(["Crossplane\nController"])

    subgraph pipeline["Composition Pipeline"]
        direction TB
        F1["① fetch-environment-config\nfunction-extra-resources"]
        F2["② render-database\nfunction-go-templating"]
        F3["③ render-connection-secret\nfunction-go-templating"]
        F4["④ ensure-ready\nfunction-auto-ready"]
    end

    EC[("EnvironmentConfig\n(cluster)")]

    subgraph reconcile["Reconciliation"]
        PA["Azure Provider"]
        PK["Kubernetes Provider"]
    end

    CLOUD["Azure Cloud"]
    K8S["Kubernetes APIs"]

    XRD -->|"defines schema + validation"| CR
    CR -->|"create / update triggers reconcile"| XCP
    XCP -->|"observed state · desired: {}"| F1
    F1 -. "requirements: fetch EnvironmentConfig" .-> XCP
    EC -. "fetched from cluster" .-> XCP
    XCP -. "re-invoke with resources populated" .-> F1
    F1 -->|"desired: {} · context: { env }"| F2
    F2 -->|"desired: { db MR } · context: { env }"| F3
    F3 -->|"desired: { db MR · secret } · context: { env }"| F4
    F4 -->|"final desired state"| XCP
    XCP -->|"diff desired vs observed"| reconcile
    PA -->|"reconcile"| CLOUD
    PK -->|"reconcile"| K8S

    linkStyle default stroke:#2563eb,color:#2563eb

Every step in the pipeline above is a function: a separate process that receives a request and returns a response. The request carries the observed state (the composite resource as it exists in the API server, plus every composed resource currently associated with it), the desired state accumulated by every function that ran before, and a shared context that functions use to pass data down the pipeline. The response carries a new desired state. The next function in the pipeline receives the same observed state and the desired state the previous function just returned.

The pipeline does not mutate state in place. Each function sees a snapshot and produces a snapshot. When the pipeline finishes, Crossplane compares the final desired state against the observed state and reconciles the difference. The functions never talk to the cloud, and they do not call the Kubernetes API server directly. When a function needs data from outside its input (an EnvironmentConfig, a ConfigMap, another resource in the cluster), it returns a requirements field in its response declaring what it needs. Crossplane fetches the requested resources and re-invokes the function with them populated. The function still does not make the call itself; it asks the pipeline to make the call on its behalf. This is what keeps a function deterministic from the pipeline’s perspective: given the same input and the same set of requested resources, it produces the same output.

This is the model that surprises people coming from imperative pipelines. There is no “step 1 creates the database, step 2 applies a patch to it.” Step 1 declares that a database should exist. Step 2 sees that declaration and can add to it, modify it, or leave it alone. Whatever the last function returns is what Crossplane reconciles. The pipeline is convergence, not sequence.
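The convergence model is easier to see in a toy. The sketch below is not Crossplane code: the step functions, their signatures, and the dictionary shapes are all made up for illustration. It only shows the contract described above, where each step receives the same observed state and the desired state of the step before it, and nothing is applied until the last step returns.

```python
from copy import deepcopy


def render_database(observed, desired, context):
    """Hypothetical step 1: declare that a database should exist."""
    desired = deepcopy(desired)
    desired["database"] = {"kind": "FlexibleServerDatabase", "charset": "UTF8"}
    return desired


def tune_database(observed, desired, context):
    """Hypothetical step 2: amend step 1's declaration. Nothing exists yet."""
    desired = deepcopy(desired)
    desired["database"]["collation"] = context["env"]["collation"]
    return desired


def run_pipeline(steps, observed, context):
    """Thread the desired state through every step; observed is never mutated."""
    desired = {}
    for step in steps:
        desired = step(observed, desired, context)
    return desired  # only this final snapshot would be reconciled


observed = {"composite": {"spec": {"databaseName": "payments"}}, "resources": {}}
context = {"env": {"collation": "en_US.utf8"}}
final = run_pipeline([render_database, tune_database], observed, context)
print(final)
```

Swapping the order of the two steps would raise a KeyError in this toy, which is the same dependency rule the real pipeline imposes: a step can only amend what an earlier step has already declared.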

References:

proto/fn/v1/run_function.proto (RPC contract: observed, desired, context, requirements)

internal/xfn/required_resources.go (the re-invoke loop, capped at MaxRequirementsIterations = 5)

composition_functions.go (per-step orchestration that threads desired/context between functions)

Composition functions docs

The example composition has four steps:

pipeline:
  - step: fetch-environment-config
    functionRef:
      name: function-extra-resources
  - step: render-database
    functionRef:
      name: function-go-templating
  - step: render-connection-secret
    functionRef:
      name: function-go-templating
  - step: ensure-ready
    functionRef:
      name: function-auto-ready

The first step pulls in shared infrastructure references from outside the composition. The second renders the database managed resource. The third renders the Kubernetes Secret the application will read. The fourth ties the composite’s Ready condition to the readiness of the resources beneath it. Each step is a separate function process. Each one receives the desired state of the previous step. The full composition is in 03-platform/composition.yaml.

The names you give steps matter. The example uses verb-object names (fetch-environment-config, render-database, render-connection-secret, ensure-ready) so a new engineer reading the pipeline understands the flow before they open a single function input. Order by dependency: anything that needs the EnvironmentConfig runs after fetch-environment-config; anything that reads the desired set runs after the steps that produce it. Beyond those constraints, the order you choose is the order a reader will assume is meaningful. A pipeline that needs a comment to explain its ordering should be reorganised until it does not.

Reading the Template Context

Most template authors get stuck at first on the shape of the data the template is handed. function-go-templating renders Go templates that produce managed resource YAML, and the inputs live in a few well-defined paths.

.observed.composite.resource is the composite resource the developer created. Its metadata and spec are reachable by the template. .observed.resources is a map of managed resources Crossplane already knows about for this composite (keyed by the composition resource name annotation, introduced in the next section). On the first reconciliation it is empty. On every subsequent one it contains everything the previous pipeline produced, including the live status Crossplane has pulled back from the cloud.

.desired.composite.resource is the composite as the pipeline is building it (you write to this when you need to surface fields into the composite’s own status). .desired.resources is the set of managed resources the pipeline has accumulated so far.

The template in render-database opens like this:

{{- $params := .observed.composite.resource.spec }}
{{- $env := (index $.context "apiextensions.crossplane.io/extra-resources" "postgres-env" 0) }}
{{- $namespace := .observed.composite.resource.metadata.namespace }}

$params is the developer’s intent. $env is the shared infrastructure reference pulled in by the previous step, which placed it in the pipeline context under a fixed key. Pulling these out at the top of the template costs nothing and keeps the rest of the file readable.

The template cannot make a network call. It cannot read a Kubernetes object that nobody put into observed or context. It cannot import a library or shell out. It is a pure function of its input. If a value needs to be in the template, something earlier in the pipeline (a separate function step) has to put it there.
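If the dotted paths are hard to picture, it can help to look at the same data as a plain nested structure. The dictionary below is a hand-written approximation of what the render-database template sees on a first reconciliation, assembled from the paths named in this section. It is an assumption for illustration, not a dump from a real cluster, and the three lookups mirror the three template variables.

```python
EXTRA_KEY = "apiextensions.crossplane.io/extra-resources"

# Hand-written approximation of the template data (illustrative only).
request = {
    "observed": {
        "composite": {
            "resource": {
                "metadata": {"name": "payments", "namespace": "team-payments"},
                "spec": {"environment": "production", "databaseName": "payments"},
            }
        },
        "resources": {},  # empty on the first reconciliation
    },
    "desired": {"composite": {"resource": {}}, "resources": {}},
    "context": {
        EXTRA_KEY: {
            "postgres-env": [
                {"data": {"serverFqdn": "<server>.postgres.database.azure.com"}},
            ]
        }
    },
}

# Same lookups as $params, $env and $namespace in the template.
params = request["observed"]["composite"]["resource"]["spec"]
env = request["context"][EXTRA_KEY]["postgres-env"][0]
namespace = request["observed"]["composite"]["resource"]["metadata"]["namespace"]

print(params["databaseName"], env["data"]["serverFqdn"], namespace)
```

If the previous step had not run, the context key would be missing and the lookup would fail, which is the Python equivalent of the template having nothing to index into.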

References:

crossplane-contrib/function-go-templating (template data shape: .observed, .desired, .context)

Stable Identity

Crossplane has two identity problems to solve, and it solves them with two different annotations. They are independent. Miss either one and things break in different ways.

The first problem is internal: across reconciliation cycles, how does Crossplane know that the FlexibleServerDatabase your template produced this time is the same one it produced last time? The template renders YAML, and YAML has no built-in concept of identity. The answer is the composition resource name annotation (gotemplating.fn.crossplane.io/composition-resource-name). It is the key Crossplane uses to track an output across pipeline runs. Two outputs with the same composition-resource-name are the same resource. Two outputs with different names are different resources, even if everything else about them is identical.

This is what makes the resource group rename at the top of this post destructive. If you render the new resource group without giving it the same composition-resource-name the old one had, Crossplane has no way to know it is supposed to be the same thing. The new one is added to the desired set, the old one drops out, and the reconciler does its job.
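A toy diff makes the failure concrete. The function below is a simplified stand-in for the comparison, not the reconciler's actual code, and the key names are hypothetical. It compares the composition resource names seen in the previous cycle with the ones rendered in this cycle and reports what would be created, kept, or deleted.

```python
def plan(observed_keys, desired_keys):
    """Compare resource keys between cycles (a toy model of the reconciler's view)."""
    observed_keys, desired_keys = set(observed_keys), set(desired_keys)
    return {
        "create": sorted(desired_keys - observed_keys),
        "update": sorted(desired_keys & observed_keys),
        "delete": sorted(observed_keys - desired_keys),
    }


previous_cycle = {"resource-group"}

# Key stays fixed: the resource is recognised and updated in place.
stable = plan(previous_cycle, {"resource-group"})

# Key picks up a suffix: Crossplane sees one new resource and one orphan.
drifted = plan(previous_cycle, {"resource-group-westeurope"})

print(stable)
print(drifted)
```

The second plan is the destructive one: nothing in it says "rename", only "create this" and "delete that".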

The second problem is external: when Crossplane talks to Azure, what is this resource called? The metadata.name of a managed resource is its Kubernetes name. It is not necessarily the name of the cloud resource. The crossplane.io/external-name annotation is what bridges the two. If you set it, Crossplane uses that value as the external identifier when it provisions, observes, or deletes the resource. If you do not set it, Crossplane uses the Kubernetes name, with all the constraints Kubernetes places on it (lowercase, max 253 characters, no underscores).

Cloud naming rules diverge from Kubernetes naming rules constantly. A PostgreSQL database name can contain underscores; a Kubernetes object name cannot. Azure Storage Account names must be globally unique across the entire cloud; Kubernetes object names only need to be unique in a namespace. Setting crossplane.io/external-name decouples the two, so that the Kubernetes object can have a name your platform finds easy to manage, and the cloud resource can have whatever name the provider requires.
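To see the mismatch in practice, here is a small check of the Kubernetes side of the rule. The regular expression follows the DNS subdomain format Kubernetes uses for most object names (lowercase alphanumerics, hyphens and dots, at most 253 characters). The helper names and the payments_ledger database name are made up for this sketch.

```python
import re

# DNS-1123 subdomain: lowercase alphanumerics, '-' and '.', alphanumeric at both ends.
_DNS1123_SUBDOMAIN = re.compile(
    r"[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*"
)


def is_valid_kubernetes_name(name: str) -> bool:
    """Return True if the name is acceptable as a Kubernetes object name."""
    return len(name) <= 253 and _DNS1123_SUBDOMAIN.fullmatch(name) is not None


def metadata_for(kubernetes_name: str, external_name: str) -> dict:
    """Build metadata where the cloud name is free of Kubernetes naming rules."""
    if not is_valid_kubernetes_name(kubernetes_name):
        raise ValueError(f"not a valid Kubernetes name: {kubernetes_name!r}")
    return {
        "name": kubernetes_name,
        "annotations": {"crossplane.io/external-name": external_name},
    }


print(is_valid_kubernetes_name("payments_ledger"))  # underscore is rejected
print(metadata_for("payments-ledger", "payments_ledger"))
```

The database keeps the underscore it needs in the annotation, while the Kubernetes object keeps a name the API server accepts.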

The template in the example sets both:

metadata:
  annotations:
    gotemplating.fn.crossplane.io/composition-resource-name: database
    crossplane.io/external-name: {{ $params.databaseName }}

composition-resource-name: database is a fixed string. It does not encode the database name, the environment, or anything that could change over the lifetime of the composite. It is identity inside the pipeline. crossplane.io/external-name is the database name the developer asked for. It is identity inside Azure. function-go-templating also exposes a setResourceNameAnnotation helper that produces the same annotation; the raw form is used here to keep the mechanism visible.

Notice what is missing: metadata.name. When function-go-templating produces a resource without a metadata.name, Crossplane generates one. The generated name is not meaningful to anything outside Crossplane. The two annotations carry the identity that matters; the generated name is just a Kubernetes detail.

References:

function-go-templating: composition-resource-name (the per-pipeline identity annotation)

Crossplane docs: naming external resources (crossplane.io/external-name semantics)

Referencing Shared Infrastructure

The composition in the example does not provision an Azure Flexible Server. It provisions a database on a server that already exists. The server is a piece of shared infrastructure, owned by the team that runs PostgreSQL across the organisation, provisioned through Terraform alongside the network, the firewall rules, the backup configuration, and everything else that belongs to a server-level concern.

This is the boundary between platform and self-service that compositions tend to get wrong. There is a temptation to put the server in the composition, on the theory that “the composition should be self-contained.” It produces a working example. It does not produce a working platform. A real Flexible Server has a network layout (subnet, private DNS zone, firewall) that has to fit into your organisation’s IP space; a backup retention window that has to satisfy a regulator; a sizing decision that has to be reconciled against load patterns nobody on the platform team has visibility into. None of those decisions belong in a developer-facing composition.

So the composition references the server. The reference comes from an EnvironmentConfig, a Crossplane-native ConfigMap-like object that the platform team populates with the values the composition needs at runtime. There is one EnvironmentConfig per environment:

apiVersion: apiextensions.crossplane.io/v1beta1
kind: EnvironmentConfig
metadata:
  name: postgres-production
  labels:
    scope: shared
    workload: postgres
    environment: production
data:
  serverId: /subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.DBforPostgreSQL/flexibleServers/<server>
  serverFqdn: <server>.postgres.database.azure.com
  adminUser: <admin-user>
  charset: UTF8
  collation: en_US.utf8

These keys are all the composition needs from the platform team: the server identity, plus the database defaults (charset, collation, admin user) the platform team set after consulting with the DBAs who actually run PostgreSQL across the organisation. The network topology, backup windows, firewall rules, and sizing stay behind this surface.

The composition pulls the right one in via function-extra-resources, selecting by label:

input:
  apiVersion: extra-resources.fn.crossplane.io/v1beta1
  kind: Input
  spec:
    extraResources:
      - kind: EnvironmentConfig
        apiVersion: apiextensions.crossplane.io/v1beta1
        into: postgres-env
        type: Selector
        selector:
          matchLabels:
            - key: scope
              type: Value
              value: shared
            - key: workload
              type: Value
              value: postgres
            - key: environment
              type: FromCompositeFieldPath
              valueFromFieldPath: spec.environment

Two of the labels are constants (scope: shared, workload: postgres). The third is derived from the composite’s spec.environment. function-extra-resources translates this input into a requirements response: it tells Crossplane “fetch the EnvironmentConfigs that match these labels and call me again.” Crossplane fetches them, re-invokes the function with the resources populated, and the function places them in the pipeline context under apiextensions.crossplane.io/extra-resources, in the entry named postgres-env (the value of into). Every later function step can read them without any further round trips.
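The ask-and-retry exchange can be modelled in a few lines. This is a simplified sketch under my own assumptions, not the function's or Crossplane's implementation: the response shape is invented, the staging EnvironmentConfig is hypothetical, and the loop simply stops once the function asks for nothing new. The iteration cap mirrors the MaxRequirementsIterations value mentioned in the references earlier.

```python
EXTRA_KEY = "apiextensions.crossplane.io/extra-resources"
MAX_ITERATIONS = 5  # same cap as MaxRequirementsIterations in the references


def matches(resource, labels):
    """True if the resource carries every requested label with the same value."""
    have = resource["metadata"].get("labels", {})
    return all(have.get(key) == value for key, value in labels.items())


def fetch_environment_config(composite, extra):
    """Toy function step: always states its needs, fills context once they are met."""
    wanted = {
        "scope": "shared",
        "workload": "postgres",
        "environment": composite["spec"]["environment"],
    }
    response = {"requirements": {"postgres-env": wanted}, "context": {}}
    if "postgres-env" in extra:
        response["context"][EXTRA_KEY] = {"postgres-env": extra["postgres-env"]}
    return response


def run_step(function, composite, cluster):
    """Toy runner: fetch what the function asks for, then call it again."""
    extra, previous = {}, None
    for _ in range(MAX_ITERATIONS):
        response = function(composite, extra)
        requirements = response["requirements"]
        if requirements == previous:  # nothing new requested, so we are done
            return response
        extra = {
            name: [r for r in cluster if matches(r, labels)]
            for name, labels in requirements.items()
        }
        previous = requirements
    raise RuntimeError("requirements never stabilised")


def env_config(name, environment):
    labels = {"scope": "shared", "workload": "postgres", "environment": environment}
    return {"metadata": {"name": name, "labels": labels}}


cluster = [
    env_config("postgres-production", "production"),
    env_config("postgres-staging", "staging"),  # hypothetical second environment
]
composite = {"spec": {"environment": "production", "databaseName": "payments"}}

result = run_step(fetch_environment_config, composite, cluster)
selected = [r["metadata"]["name"] for r in result["context"][EXTRA_KEY]["postgres-env"]]
print(selected)
```

The function never touches the cluster list itself. It only describes what it wants, and the runner does the fetching, which is the property that keeps the step deterministic.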

The split this creates is the one a platform actually needs. The PostgreSQL team owns the server: they create it, size it, patch it, monitor it. They also own the EnvironmentConfigs that describe it, populating the values when the server comes up. The platform team owns the XRD and the composition: the API that developers consume, and the implementation that uses the server. Developers own the composite resource: two fields, no knowledge of either layer below. Each layer has the surface it needs and nothing else.

function-extra-resources is not limited to EnvironmentConfigs. It can pull in any Kubernetes object: ConfigMaps, Secrets, other composites, cluster-scoped resources. The pattern is the same. Something earlier in the cluster’s lifecycle creates the object; the composition reads it; the template renders against it. When you find yourself wanting a value the composite does not carry, this is the seam to reach for.

References:

Crossplane docs: EnvironmentConfigs

crossplane-contrib/function-extra-resources (selector-based fetch, returned as a requirements response Crossplane satisfies before re-invoking)

Composition Complexity

Compositions decay into unreadable templates the same way every time. A new requirement lands; someone adds an if. Another requirement; another if. A consumer needs to declare a list of something; someone adds a range. Six months later the template is four hundred lines deep, no one can predict what it produces, and the only person who can change it safely is on parental leave.

The way out is to stop reaching for template logic. Every conditional and every loop is a decision that does not have to be in the template, and most of them are easier to handle one layer up.

The first layer is the XRD. The schema is where consumer decisions are gated. The example XRD lets developers pick development, staging, or production, and nothing else. There is no need for the template to validate the environment value, log a warning for unexpected inputs, or handle a default case. Kubernetes rejects the composite at admission if the value is wrong. The template proceeds knowing the value is one of three options, and the conditional becomes a dict lookup or a string interpolation.
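A toy version of that division of labour is shown below. The admit function stands in for schema validation at admission and is not how Kubernetes implements it. The mapping is a hypothetical example of a per-environment value: azure-production appears later in this post, while the other two names are invented.

```python
ALLOWED_ENVIRONMENTS = ("development", "staging", "production")


def admit(spec):
    """Stand-in for the XRD enum: reject the composite before any template runs."""
    environment = spec.get("environment")
    if environment not in ALLOWED_ENVIRONMENTS:
        raise ValueError(f"unsupported environment: {environment!r}")
    return spec


# Hypothetical per-environment value; only azure-production comes from the post.
PROVIDER_CONFIG = {
    "development": "azure-development",
    "staging": "azure-staging",
    "production": "azure-production",
}


def provider_config_for(spec):
    """No conditional and no default branch: the schema already did the checking."""
    return PROVIDER_CONFIG[spec["environment"]]


print(provider_config_for(admit({"environment": "production"})))
```

All the defensive code lives in one place, and the part that corresponds to the template is a single lookup.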

The second layer is the composition structure. If you find yourself writing a range over a list the consumer provided, you have handed complexity back to the consumer. Prefer fixed-cardinality resources. If a database needs to exist per environment, you create one composite per environment, not a list of environments inside one composite. The composition does one thing, the consumer creates more of them when they need more of them, and the template never has to reason about variable-length input.

The third layer is splitting compositions. When a composition grows a third nesting level of conditionals, the right answer is almost always two compositions. The XRD can specify a default composition; the composite can override it by selecting a different one. The branches do not have to coexist inside a single template.

The result is a template that is boring. The example composition is two function-go-templating steps, each short enough to read in one screen, neither containing a single conditional. The interesting work was done in the schema, not the template. That is the goal.

Connection Details

The application that consumes this database needs a hostname, a database name, and credentials. A managed resource provisions the cloud resource; it does not, on its own, surface those values to the consumer. Surfacing them is the composition’s job, and there are several patterns to do it. Each is suited to a different shape of value, and the ordering between them matters.

The simplest pattern is what the example does: emit a Secret with the values that are known at template time.

apiVersion: v1
kind: Secret
metadata:
  name: {{ $params.databaseName }}-credentials
  namespace: {{ $namespace }}
  annotations:
    gotemplating.fn.crossplane.io/composition-resource-name: app-connection-secret
type: Opaque
stringData:
  DATABASE_HOST: {{ $env.data.serverFqdn }}
  DATABASE_NAME: {{ $params.databaseName }}
  DATABASE_USER: {{ $env.data.adminUser }}

That is a regular Kubernetes Secret, not a managed resource and not wrapped in provider-kubernetes. Crossplane v2 composes native Kubernetes resources alongside managed ones: anything the pipeline emits is treated as a composed resource, regardless of whether the API group belongs to a provider or to core Kubernetes. The Secret lands in the developer’s namespace next to the composite, and the application reads it by name.

Values the composition does not yet know are a separate problem. An endpoint allocated by Azure, a generated identifier, a connection string assembled by the provider once the resource exists: none of these are available when the template runs. The mechanism for these is spec.writeConnectionSecretToRef, a field every Crossplane managed resource exposes. You set it to the name and namespace of a Secret; the provider writes the published connection details there asynchronously, after the cloud resource is ready. The composition does not template the value, because the value does not exist. It tells Crossplane where the value should land. Each provider documents the keys it publishes.

Credentials that have to exist somewhere are a third concern, and the right answer is usually to keep them out of the cluster’s etcd. The Azure Key Vault provider exposes a Secret managed resource that creates the entry in Key Vault directly. The composition includes it as a composed resource; the value comes from upstream (generated, or referenced from a Kubernetes Secret elsewhere in the cluster); the External Secrets Operator pulls the value back into the consumer’s namespace; and the application reads from that Secret. As long as Crossplane manages the Key Vault Secret, it owns the value. A manual rotation in Key Vault is reverted on the next reconciliation, because the managed resource’s spec is the source of truth, not the Key Vault entry. Rotation has to flow through whatever populates that spec. If a Key Vault rotation policy or an external process needs to drive the value, the cleaner split is to leave the Key Vault Secret outside the composition entirely and reference it by name.

The pattern that beats all of the above is no credential at all. Azure Workload Identity, federated to the application’s ServiceAccount, eliminates the password entirely. That setup is its own post.

The ordering between the four patterns is the point. Prefer no credential over a credential in Key Vault. Prefer Key Vault over a Secret in etcd. Prefer writeConnectionSecretToRef over templating a value the composition does not yet have. Each step removes a class of mistake the composition can make, and the composition gets simpler as you climb.

References:

Crossplane docs: managed resources (writeConnectionSecretToRef and the per-provider published keys)

Auto-Ready

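A composition’s pipeline finishes the moment the last function returns. It does not wait for Azure, and Crossplane does not work out readiness by itself: a composed resource counts as ready only when a function marks it ready in the desired state. If no step does that, the composite sits at READY: False long after the cloud resource exists. If a step marks everything ready unconditionally, the composite reports READY: True minutes before the database is there, and anything downstream watching its status starts trying to use a database that does not exist. Either way the signal is wrong.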

function-auto-ready solves this. It takes no input and is the final step in the pipeline:

- step: ensure-ready
  functionRef:
    name: function-auto-ready

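The function walks the desired state and, for every resource the pipeline produced, looks at the matching observed resource and its Ready condition. It marks a desired resource ready only when the live object reports Ready: True, and Crossplane sets the composite’s Ready condition to True only when all of them are. With the function in place: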

❯ kubectl get postgresdatabases.platform.example.com -A
NAMESPACE       NAME       SYNCED   READY   AGE
team-payments   payments   True     False   45s

SYNCED: True means the pipeline ran and produced the desired state successfully. READY: False means at least one managed resource is still being provisioned. Two minutes later, when Azure finishes:

❯ kubectl get postgresdatabases.platform.example.com -A
NAMESPACE       NAME       SYNCED   READY   AGE
team-payments   payments   True     True    3m12s

The composite is now telling the truth. Anything watching it (ArgoCD waiting on a sync, a Backstage plugin showing status, a downstream composite holding for this one to finish) sees the real state of the underlying resources, not the state of the pipeline that produced them.
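The readiness rule fits in a few lines. What follows is a minimal sketch of the generic condition check only, assuming readiness is read from the observed copy of each desired resource. The real function also has per-kind health checks (see the reference below), and none of these helper names come from its code.

```python
def is_ready(resource):
    """True if the live object reports a Ready condition with status "True"."""
    conditions = resource.get("status", {}).get("conditions", [])
    return any(
        c.get("type") == "Ready" and c.get("status") == "True" for c in conditions
    )


def composite_ready(desired, observed):
    """Ready only when every desired resource exists and reports Ready."""
    return all(name in observed and is_ready(observed[name]) for name in desired)


desired = {"database": {"kind": "FlexibleServerDatabase"}}

not_created_yet = {}
provisioning = {
    "database": {"status": {"conditions": [{"type": "Ready", "status": "False"}]}}
}
provisioned = {
    "database": {"status": {"conditions": [{"type": "Ready", "status": "True"}]}}
}

print(composite_ready(desired, not_created_yet))
print(composite_ready(desired, provisioning))
print(composite_ready(desired, provisioned))
```

The three observed states correspond to the first reconciliation, the 45-second snapshot, and the three-minute snapshot shown above.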

References:

crossplane-contrib/function-auto-ready (per-kind health checks plus the generic Ready=True condition check)

In the Cluster

Once the pipeline has run and Azure has finished provisioning, the composite and the two resources it owns look like this:

apiVersion: platform.example.com/v1alpha1
kind: PostgresDatabase
metadata:
  name: payments
  namespace: team-payments
spec:
  environment: production
  databaseName: payments
  crossplane:
    resourceRefs:
    - apiVersion: dbforpostgresql.azure.m.upbound.io/v1beta1
      kind: FlexibleServerDatabase
      name: payments-555a3ad1b29a
    - apiVersion: v1
      kind: Secret
      name: payments-credentials
status:
  conditions:
  - reason: ReconcileSuccess
    status: "True"
    type: Synced
  - reason: Available
    status: "True"
    type: Ready
---
apiVersion: dbforpostgresql.azure.m.upbound.io/v1beta1
kind: FlexibleServerDatabase
metadata:
  annotations:
    crossplane.io/composition-resource-name: database
    crossplane.io/external-name: payments
  labels:
    crossplane.io/composite: payments
  name: payments-555a3ad1b29a
  namespace: team-payments
  ownerReferences:
  - apiVersion: platform.example.com/v1alpha1
    kind: PostgresDatabase
    name: payments
    controller: true
spec:
  forProvider:
    charset: UTF8
    collation: en_US.utf8
    serverId: /subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.DBforPostgreSQL/flexibleServers/<server>
  providerConfigRef:
    kind: ClusterProviderConfig
    name: azure-production
---
apiVersion: v1
kind: Secret
metadata:
  annotations:
    crossplane.io/composition-resource-name: app-connection-secret
  labels:
    crossplane.io/composite: payments
  name: payments-credentials
  namespace: team-payments
  ownerReferences:
  - apiVersion: platform.example.com/v1alpha1
    kind: PostgresDatabase
    name: payments
    controller: true
type: Opaque
stringData:
  DATABASE_HOST: <server>.postgres.database.azure.com
  DATABASE_NAME: payments
  DATABASE_USER: <admin-user>

The composite holds the developer’s intent on top and spec.crossplane.resourceRefs underneath, listing the resources Crossplane produced. The FlexibleServerDatabase carries both identity annotations introduced earlier: composition-resource-name: database is the slot the pipeline writes to across every reconciliation, and crossplane.io/external-name: payments is the name Azure sees. The annotation key differs from the one in the template: function-go-templating consumes gotemplating.fn.crossplane.io/composition-resource-name to name the entry in the desired state, and Crossplane records that name on the live object as crossplane.io/composition-resource-name. serverId is the EnvironmentConfig value the template substituted in. Both children live in team-payments alongside the composite, with standard Kubernetes ownerReferences pointing back to it; cascade delete and garbage collection work the way Kubernetes already knows how. The crossplane.io/composite: payments label is the convenient way to find every resource belonging to a composite via selector.

Notice the FlexibleServerDatabase apiVersion: dbforpostgresql.azure.m.upbound.io/v1beta1. The m. infix marks the namespaced variant. The same Azure provider also exposes a cluster-scoped form under dbforpostgresql.azure.upbound.io, but that one is unreachable from a namespaced composite. Crossplane v2 forbids the mix because Kubernetes ownerReferences cannot point from a cluster-scoped child to a namespaced owner, and v2 wants ownership to be real, not papered over. The whole pattern of giving each team a namespace and letting them own their infrastructure inside it depends on these namespaced MRs existing.

Beyond Go Templating

function-go-templating is one of several functions Crossplane offers, and it has limits. The templates are stringly-typed, so a typo in a field name produces a managed resource that fails at admission rather than at template render. There is no compile-time check that the field you read from .observed.composite.resource.spec actually exists in the XRD schema. The functions you can call inside a template are Sprig’s standard set plus a handful of helpers the function ships itself (setResourceNameAnnotation is one); anything beyond that means writing a custom function, packaging it as an OCI image, and installing it as a Function resource in the cluster.

KCL is the direction my team is moving. It is a typed configuration language designed for exactly this problem: rendering structured output (Kubernetes manifests, in this case) from typed inputs, with the validation happening at render time rather than at apply time. The composition pipeline is the same. The function changes, the template language changes, and the surface area for errors shrinks. That comparison is the subject of the next post in this series.

The full example, with all four functions, the XRD, the EnvironmentConfigs, and a sample developer composite, is at github.com/ToonVanDeuren/crossplane-composition-pipeline. It runs against any Crossplane cluster with the Azure and Kubernetes providers installed. To exercise the pipeline locally without touching Azure:

crossplane render \
  04-examples/payments-db.yaml \
  03-platform/composition.yaml \
  01-setup/functions.yaml \
  --extra-resources 02-environment-configs/postgres-production.yaml

crossplane render runs the function pipeline against the provided composite resource and prints the desired managed resources to stdout. No API calls, no cloud credentials required. The --extra-resources flag provides the EnvironmentConfig locally so function-extra-resources can resolve it without a cluster.