Cluster Scaling with the CockroachDB Operator


This page explains how to add and remove CockroachDB nodes on Kubernetes.

Note:

The CockroachDB operator is in Preview.

Add nodes

Before scaling up CockroachDB, note the following topology recommendations:

  • Each CockroachDB node (running in its own pod) should run on a separate Kubernetes worker node.
  • Each availability zone should have the same number of CockroachDB nodes.

If your cluster has 3 CockroachDB nodes distributed across 3 availability zones (as in our deployment example), Cockroach Labs recommends scaling up by a multiple of 3 to retain an even distribution of nodes. You should therefore scale up to a minimum of 6 CockroachDB nodes, with 2 nodes in each zone.

  1. Run kubectl get nodes to list the worker nodes in your Kubernetes cluster. There should be at least as many worker nodes as the total number of CockroachDB pods that will run after scaling. This ensures that no more than one pod is placed on each worker node.
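
     For example, to list the worker nodes and confirm the count:

     kubectl get nodes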

  2. If you need to add worker nodes, resize your cluster by specifying the desired number of worker nodes in each zone. Using Google Kubernetes Engine as an example:

    gcloud container clusters resize {cluster-name} --region {region-name} --num-nodes 2
    

     This example runs 2 worker nodes in each of the region's 3 default zones, raising the total to 6 worker nodes.
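
     To confirm the new distribution, list the worker nodes together with their zones (topology.kubernetes.io/zone is the standard Kubernetes zone label, which GKE sets automatically):

     kubectl get nodes -L topology.kubernetes.io/zone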

  3. In the values file used to deploy the cluster, update nodes for the target region under cockroachdb.crdbCluster.regions (each region is identified by its code) to the desired size of the CockroachDB cluster in that region. This value is the number of CockroachDB nodes, each running in its own pod:

    cockroachdb:
      crdbCluster:
        regions:
        - code: us-central1
          cloudProvider: gcp
          domain: cluster.domain.us-central
          nodes: 6
    
  4. Apply the new settings to the cluster:

    helm upgrade --reuse-values $CRDBCLUSTER ./cockroachdb-parent/charts/cockroachdb --values ./cockroachdb-parent/charts/cockroachdb/values.yaml -n $NAMESPACE
    
  5. Verify that the new pods were successfully started:

    kubectl get pods -n $NAMESPACE
    
     NAME                             READY   STATUS    RESTARTS   AGE
     crdb-operator-655fbf7847-zn9v8   1/1     Running   0          30m
     cockroachdb-9swcg                1/1     Running   0          24m
     cockroachdb-bn6f7                2/2     Running   0          24m
     cockroachdb-nk2dw                2/2     Running   0          24m
     cockroachdb-f83nd                2/2     Running   0          30s
     cockroachdb-8d2ck                2/2     Running   0          30s
     cockroachdb-qopc2                2/2     Running   0          30s
    

     Each pod should be running on one of the 6 worker nodes, with no more than one CockroachDB pod per worker node.
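
     To confirm the pod-to-node placement, include the node name in the output with the -o wide flag:

     kubectl get pods -n $NAMESPACE -o wide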

Remove nodes

If your nodes are distributed across 3 availability zones (as in our deployment example), Cockroach Labs recommends scaling down by a multiple of 3 to retain an even distribution. If your cluster has 6 CockroachDB nodes, you should therefore scale down to 3, with 1 node in each zone.

Warning:

Do not scale down to fewer than 3 nodes. This is considered an anti-pattern on CockroachDB and will cause errors. Before scaling down CockroachDB, note that each availability zone should have the same number of CockroachDB nodes.

  1. In the values file used to deploy the cluster, update nodes for the target region under cockroachdb.crdbCluster.regions to the desired size of the CockroachDB cluster. For instance, to scale a cluster in Google Cloud down to 3 nodes:

    cockroachdb:
      crdbCluster:
        regions:
        - code: us-central1
          cloudProvider: gcp
          domain: cluster.domain.us-central
          nodes: 3
    
  2. Apply the new settings to the cluster:

    helm upgrade --reuse-values $CRDBCLUSTER ./cockroachdb-parent/charts/cockroachdb --values ./cockroachdb-parent/charts/cockroachdb/values.yaml -n $NAMESPACE
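
     Optionally, watch the pods while the operator scales the cluster down; removal may not be immediate:

     kubectl get pods -n $NAMESPACE -w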
    
  3. Verify that the pods were successfully removed:

    kubectl get pods -n $NAMESPACE
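
    To check cluster membership after the scale-down, one option is to run cockroach node status --decommission from one of the remaining pods. The pod name and certificate directory below are assumptions and may differ in your deployment:

    kubectl exec -it {cockroachdb-pod-name} -n $NAMESPACE -- ./cockroach node status --decommission --certs-dir=/cockroach/cockroach-certs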
    

Decommission nodes

When a Kubernetes node is scheduled for removal or maintenance, the CockroachDB operator can be instructed to decommission the CockroachDB nodes scheduled on that Kubernetes node. Decommissioning safely moves data and workloads away before the node goes offline.

Note:

Once annotated, the Kubernetes node is cordoned so no further pods are scheduled on the node. The annotation is not a mark for future removal, as CockroachDB is decommissioned on the node immediately.

If cluster capacity is limited, replacement pods may remain in the Pending state until new nodes are available.

The following prerequisites are necessary for the CockroachDB operator to be able to decommission a CockroachDB node:

  • The --enable-k8s-node-controller=true flag must be set in the operator's YAML values file, for example:


    containers:
        - name: cockroach-operator
          image: /:
          args:
            - "-enable-k8s-node-controller=true"
    
  • At least one replica of the operator must not be on the target node.

  • There must be no under-replicated ranges in the CockroachDB cluster (one way to check this is sketched below).
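
    One way to check for under-replicated ranges is to run cockroach node status --ranges from a CockroachDB pod and confirm that the ranges_underreplicated column is 0 for every node. The pod name and certificate directory below are assumptions and may differ in your deployment:

    kubectl exec -it {cockroachdb-pod-name} -n $NAMESPACE -- ./cockroach node status --ranges --certs-dir=/cockroach/cockroach-certs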

To mark a node for decommissioning, follow these steps:

  1. Identify the name of the Kubernetes node that is to be removed.

  2. Annotate the Kubernetes node with crdb.cockroachlabs.com/decommission="true". The decommissioning process begins immediately after this annotation is applied. Using kubectl, for example:

    kubectl annotate node {example-node-name} crdb.cockroachlabs.com/decommission="true"
    
  3. Monitor the cluster:

    • Confirm the decommissioned node's cordoned status:
      kubectl describe node {example-node-name}
    
    • Monitor operator events and logs for decommission start and completion messages:
      kubectl logs {operator-pod-name} -n $NAMESPACE
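
    • Optionally, list recent events in the namespace for operator activity; the exact event reasons and messages depend on the operator version:
      kubectl get events -n $NAMESPACE --sort-by=.lastTimestamp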
    

If the replacement pods remain in a Pending state, this typically means there is not enough available capacity in the cluster for these pods to be scheduled.
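
To see why a replacement pod is stuck in Pending, describe it and check the Events section for scheduling failures (the pod name is a placeholder):

    kubectl describe pod {pending-pod-name} -n $NAMESPACE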
