Kubernetes Cluster API Provider GCP
Kubernetes-native declarative infrastructure for GCP.
What is the Cluster API Provider GCP?
The Cluster API brings declarative Kubernetes-style APIs to cluster creation, configuration and management. The API itself is shared across multiple cloud providers allowing for true Google Cloud hybrid deployments of Kubernetes.
Documentation
Please see our book for in-depth documentation.
Quick Start
Check out our Cluster API Quick Start to create your first Kubernetes cluster on Google Cloud Platform using Cluster API.
Support Policy
This provider’s versions are compatible with the following versions of Cluster API:
| | Cluster API v1alpha3 (v0.3.x) | Cluster API v1alpha4 (v0.4.x) | Cluster API v1beta1 (v1.0.x) |
|---|---|---|---|
| Google Cloud Provider v0.3.x | ✓ | | |
| Google Cloud Provider v0.4.x | | ✓ | |
| Google Cloud Provider v1.0.x | | | ✓ |
This provider’s versions are able to install and manage the following versions of Kubernetes:
| | Google Cloud Provider v0.3.x | Google Cloud Provider v0.4.x | Google Cloud Provider v1.0.x |
|---|---|---|---|
| Kubernetes 1.15 | | | |
| Kubernetes 1.16 | ✓ | | |
| Kubernetes 1.17 | ✓ | ✓ | |
| Kubernetes 1.18 | ✓ | ✓ | ✓ |
| Kubernetes 1.19 | ✓ | ✓ | ✓ |
| Kubernetes 1.20 | ✓ | ✓ | ✓ |
| Kubernetes 1.21 | | ✓ | ✓ |
| Kubernetes 1.22 | | | ✓ |
Each version of Cluster API for Google Cloud will attempt to support at least two versions of Kubernetes e.g., Cluster API for GCP v0.1
may support Kubernetes 1.13 and Kubernetes 1.14.
NOTE: As the versioning for this project is tied to the versioning of Cluster API, future modifications to this policy may be made to more closely align with other providers in the Cluster API ecosystem.
Getting Involved and Contributing
Are you interested in contributing to cluster-api-provider-gcp? We, the maintainers and the community, would love your suggestions, support and contributions! The maintainers of the project can be contacted anytime to learn about how to get involved.
Before starting with the contribution, please go through the prerequisites of the project.
To set up the development environment, check out the development guide.
In the interest of getting new people involved, we have issues marked as good first issue. Although these issues have a smaller scope, they are very helpful in getting acquainted with the codebase.
For more, see the issue tracker. If you’re unsure where to start, feel free to reach out to discuss.
See also: Our own contributor guide and the Kubernetes community page.
We also encourage ALL active community participants to act as if they are maintainers, even if you don’t have ‘official’ written permissions. This is a community effort and we are here to serve the Kubernetes community. If you have an active interest and you want to get involved, you have real power!
Office hours
- Join the SIG Cluster Lifecycle Google Group for access to documents and calendars.
- Participate in the conversations on Kubernetes Discuss
- Provider implementers office hours (CAPI)
- Weekly on Wednesdays @ 10:00 am PT (Pacific Time) on Zoom
- Previous meetings: [ notes | recordings ]
- Cluster API Provider GCP office hours (CAPG)
- Monthly on first Thursday @ 09:00 am PT (Pacific Time) on Zoom
- Previous meetings: [ notes | recordings ]
Other ways to communicate with the contributors
Please check in with us in the #cluster-api-gcp channel on Slack.
Github Issues
Bugs
If you think you have found a bug, please follow the instructions below.
- Please spend a small amount of time giving due diligence to the issue tracker; your issue might be a duplicate.
- Get the logs from the custom controllers and paste them in the issue.
- Open a bug report.
- Remember users might be searching for the issue in the future, so please make sure to give it a meaningful title to help others.
- Feel free to reach out to the community on Slack.
Tracking new features
We also have an issue tracker to track features. If you have a feature idea that could make Cluster API Provider GCP even more awesome, follow these steps.
- Open a feature request.
- Remember users might be searching for the issue in the future, so please make sure to give it a meaningful title to help others.
- Clearly define the use case with concrete examples, e.g. type this and cluster-api-provider-gcp does that.
- Some of our larger features will require some design. If you would like to include a technical design in your feature request, please go ahead.
- After the new feature is well understood and the design is agreed upon, we can start coding the feature. We would love for you to code it, so please open up a WIP (work in progress) PR and happy coding!
Code of conduct
Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.
Getting started with CAPG
In this section we’ll cover the basics of how to prepare your environment to use Cluster API Provider for GCP.
Before installing CAPG, your Kubernetes cluster has to be transformed into a CAPI management cluster. If you have already done this, you can jump directly to the next section: Installing CAPG. If, on the other hand, you have an existing Kubernetes cluster that is not yet configured as a CAPI management cluster, you can follow the guide from the CAPI book.
Requirements
- Linux or macOS (Windows isn't supported at the moment).
- A Google Cloud account.
- Packer and Ansible to build images.
- `make` to use `Makefile` targets.
- Install `coreutils` (for `timeout`) on macOS.
Create a Service Account
To create and manage clusters, this infrastructure provider uses a service account to authenticate with GCP’s APIs.
From your cloud console, follow these instructions to create a new service account with `Editor` permissions.
If you plan to use GKE, the service account will also need the `iam.serviceAccountTokenCreator` role.
Afterwards, generate a JSON Key and store it somewhere safe.
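If you prefer the command line, a rough sketch of an equivalent `gcloud` workflow is shown below. The service account name `capg-sa` and the key path are placeholders, and `GCP_PROJECT` is assumed to hold your project ID; adjust them to your own setup.
# Create the service account (name is illustrative)
gcloud iam service-accounts create capg-sa --project "${GCP_PROJECT}"
# Grant it the Editor role on the project
gcloud projects add-iam-policy-binding "${GCP_PROJECT}" \
  --member "serviceAccount:capg-sa@${GCP_PROJECT}.iam.gserviceaccount.com" \
  --role "roles/editor"
# Generate a JSON key and store it somewhere safe
gcloud iam service-accounts keys create /path/to/gcp-credentials.json \
  --iam-account "capg-sa@${GCP_PROJECT}.iam.gserviceaccount.com"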
Installing CAPG
There are two major provider installation paths: using `clusterctl` or the Cluster API Operator.
`clusterctl` is a command line tool that provides a simple way of interacting with CAPI and is usually the preferred alternative for those who are getting started. It automates fetching the YAML files defining provider components and installing them.
The Cluster API Operator is a Kubernetes Operator built on top of `clusterctl` and designed to empower cluster administrators to handle the lifecycle of Cluster API providers within a management cluster using a declarative approach. It aims to improve user experience in deploying and managing Cluster API, making it easier to handle day-to-day tasks and automate workflows with GitOps. Visit the CAPI Operator quickstart if you want to experiment with this tool.
You can opt for the tool that works best for you or explore both and decide which is best suited for your use case.
clusterctl
The Service Account you created will be used to interact with GCP and it must be base64 encoded and stored in an environment variable before installing the provider via `clusterctl`.
export GCP_B64ENCODED_CREDENTIALS=$( cat /path/to/gcp-credentials.json | base64 | tr -d '\n' )
Finally, let’s initialize the provider.
clusterctl init --infrastructure gcp
This process may take some time and, once the provider is running, you'll be able to see the `capg-controller-manager` pod in your CAPI management cluster.
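One way to check that the provider came up, assuming the default `capg-system` namespace used by `clusterctl`:
# The capg-controller-manager pod should eventually report Running
kubectl get pods -n capg-system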
Cluster API Operator
You can refer to the Cluster API Operator book here to learn about the basics of the project and how to install the operator.
When using Cluster API Operator, secrets are used to store credentials for cloud providers rather than environment variables, which means you'll have to create a new secret containing the base64 encoded version of your GCP credentials; it will be referenced in the YAML file used to initialize the provider. As you can see, by using Cluster API Operator, we're able to manage provider installation declaratively.
Create GCP credentials secret.
export CREDENTIALS_SECRET_NAME="gcp-credentials"
export CREDENTIALS_SECRET_NAMESPACE="default"
export GCP_B64ENCODED_CREDENTIALS=$( cat /path/to/gcp-credentials.json | base64 | tr -d '\n' )
kubectl create secret generic "${CREDENTIALS_SECRET_NAME}" --from-literal=GCP_B64ENCODED_CREDENTIALS="${GCP_B64ENCODED_CREDENTIALS}" --namespace "${CREDENTIALS_SECRET_NAMESPACE}"
Define the CAPG provider declaratively in a file `capg.yaml`.
apiVersion: v1
kind: Namespace
metadata:
  name: capg-system
---
apiVersion: operator.cluster.x-k8s.io/v1alpha2
kind: InfrastructureProvider
metadata:
  name: gcp
  namespace: capg-system
spec:
  version: v1.8.0
  configSecret:
    name: gcp-credentials
After applying this file, Cluster API Operator will take care of installing CAPG using the set of credentials stored in the specified secret.
kubectl apply -f capg.yaml
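To confirm that the operator picked up the resource and installed the provider, something like the following should work (resource kind and namespace as defined in capg.yaml above):
kubectl get infrastructureproviders.operator.cluster.x-k8s.io -n capg-system
kubectl get pods -n capg-system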
Prerequisites
Before provisioning clusters via CAPG, there are a few extra tasks you need to take care of, including configuring the GCP network and building images for GCP virtual machines.
Set environment variables
export GCP_REGION="<GCP_REGION>"
export GCP_PROJECT="<GCP_PROJECT>"
# Make sure to use the same Kubernetes version here as was used when building the GCE image
export KUBERNETES_VERSION=1.22.3
export GCP_CONTROL_PLANE_MACHINE_TYPE=n1-standard-2
export GCP_NODE_MACHINE_TYPE=n1-standard-2
export GCP_NETWORK_NAME=<GCP_NETWORK_NAME or default>
export CLUSTER_NAME="<CLUSTER_NAME>"
Configure Network and Cloud NAT
Google Cloud accounts come with a `default` network which can be found under VPC Networks.
If you prefer to create a new Network, follow these instructions.
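As a rough sketch, a custom network in auto subnet mode could be created with `gcloud` as shown below; this assumes `GCP_PROJECT` and `GCP_NETWORK_NAME` are exported as described below, and the linked instructions remain the authoritative reference.
gcloud compute networks create "${GCP_NETWORK_NAME}" --project "${GCP_PROJECT}" --subnet-mode auto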
Cloud NAT
This infrastructure provider sets up Kubernetes clusters using a Global Load Balancer with a public IP address.
Kubernetes nodes need NAT access or a public IP in order to communicate with the control plane and pull container images from registries (e.g. gcr.io or Docker Hub). By default, the provider creates Machines without a public IP.
To make sure your cluster can communicate with the outside world and the load balancer, you can create a Cloud NAT in the region you'd like your Kubernetes cluster to live in by following these instructions.
NB: The following commands need to be run if `${GCP_NETWORK_NAME}` is set to `default`.
# Ensure the network list contains the default network
gcloud compute networks list --project="${GCP_PROJECT}"
gcloud compute networks describe "${GCP_NETWORK_NAME}" --project="${GCP_PROJECT}"
# Ensure the firewall rules are enabled
gcloud compute firewall-rules list --project "${GCP_PROJECT}"
# Create router
gcloud compute routers create "${CLUSTER_NAME}-myrouter" --project="${GCP_PROJECT}" --region="${GCP_REGION}" --network="default"
# Create NAT
gcloud compute routers nats create "${CLUSTER_NAME}-mynat" --project="${GCP_PROJECT}" --router-region="${GCP_REGION}" --router="${CLUSTER_NAME}-myrouter" \
  --nat-all-subnet-ip-ranges --auto-allocate-nat-external-ips
Building images
NB: The following commands should not be run as the `root` user.
# Export the GCP project id you want to build images in.
export GCP_PROJECT_ID=<project-id>
# Export the path to the service account credentials created in the step above.
export GOOGLE_APPLICATION_CREDENTIALS=</path/to/serviceaccount-key.json>
# Clone the image builder repository if you haven't already.
git clone https://github.com/kubernetes-sigs/image-builder.git image-builder
# Change directory to images/capi within the image builder repository
cd image-builder/images/capi
# Run the Make target to generate GCE images.
make build-gce-ubuntu-2004
# Check that you can see the published images.
gcloud compute images list --project ${GCP_PROJECT_ID} --no-standard-images --filter="family:capi-ubuntu-2004-k8s"
# Export the IMAGE_ID from the above
export IMAGE_ID="projects/${GCP_PROJECT_ID}/global/images/<image-name>"
Clean-up
Delete the NAT gateway
gcloud compute routers nats delete "${CLUSTER_NAME}-mynat" --project="${GCP_PROJECT}" \
--router-region="${GCP_REGION}" --router="${CLUSTER_NAME}-myrouter" --quiet || true
Delete the router
gcloud compute routers delete "${CLUSTER_NAME}-myrouter" --project="${GCP_PROJECT}" \
--region="${GCP_REGION}" --quiet || true
Self-managed clusters
This section contains information about how you can provision self-managed Kubernetes clusters hosted in GCP’s Compute Engine.
Provisioning a self-managed Cluster
This guide uses an example from the `./templates` folder of the CAPG repository. You can inspect the YAML file here.
Configure cluster parameters
While inspecting the cluster definition in `./templates/cluster-template.yaml` you probably noticed that it contains a number of parameterized values that must be substituted with the specifics of your use case. This can be done via environment variables and `clusterctl`, which effectively makes the template more flexible to adapt to different provisioning scenarios. These are the environment variables that you'll be required to set before deploying a workload cluster:
export GCP_REGION=us-east4
export GCP_PROJECT=cluster-api-gcp-project
export CONTROL_PLANE_MACHINE_COUNT=1
export WORKER_MACHINE_COUNT=1
export KUBERNETES_VERSION=1.29.3
export GCP_CONTROL_PLANE_MACHINE_TYPE=n1-standard-2
export GCP_NODE_MACHINE_TYPE=n1-standard-2
export GCP_NETWORK_NAME=default
export IMAGE_ID=projects/cluster-api-gcp-project/global/images/your-image
Generate cluster definition
The sample cluster templates are already prepared so that you can use them with `clusterctl` to create a self-managed Kubernetes cluster with CAPG.
clusterctl generate cluster capi-gcp-quickstart -i gcp > capi-gcp-quickstart.yaml
In this example, `capi-gcp-quickstart` will be used as the cluster name.
Create cluster
The resulting file represents the workload cluster definition and you simply need to apply it to your cluster to trigger cluster creation:
kubectl apply -f capi-gcp-quickstart.yaml
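You can then watch the provisioning progress from the management cluster, for example with the commands below (the cluster name matches the one generated above):
# High-level view of the workload cluster and its control plane
kubectl get clusters
clusterctl describe cluster capi-gcp-quickstart
# Machines and the kubeadm-based control plane
kubectl get kubeadmcontrolplane,machines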
Kubeconfig
When creating a GCP cluster, two kubeconfigs are generated and stored as secrets in the management cluster.
User kubeconfig
This should be used by users that want to connect to the newly created GCP cluster. The name of the secret that contains the kubeconfig will be `[cluster-name]-user-kubeconfig`, where you need to replace `[cluster-name]` with the name of your cluster. The `-user-kubeconfig` suffix in the name indicates that the kubeconfig is for user use.
To get the user kubeconfig for a cluster named `managed-test` you can run a command similar to:
kubectl --namespace=default get secret managed-test-user-kubeconfig \
-o jsonpath={.data.value} | base64 --decode \
> managed-test.kubeconfig
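You can then point kubectl at the generated file to reach the workload cluster, e.g.:
kubectl --kubeconfig=managed-test.kubeconfig get nodes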
Cluster API (CAPI) kubeconfig
This kubeconfig is used internally by CAPI and shouldn't be used outside of the management server. It is used by CAPI to perform operations, such as draining a node. The name of the secret that contains the kubeconfig will be `[cluster-name]-kubeconfig`, where you need to replace `[cluster-name]` with the name of your cluster. Note that there is NO `-user` in the name.
The kubeconfig is regenerated every `sync-period` as the token that is embedded in the kubeconfig is only valid for a short period of time.
CNI
By default, no CNI plugin is installed when a self-managed cluster is provisioned. As a user, you need to install your own CNI (e.g. Calico with VXLAN) for the control plane of the cluster to become ready.
This document describes how to use Flannel as your CNI solution.
Modify the Cluster resources
Before deploying the cluster, change the `KubeadmControlPlane` value at `spec.kubeadmConfigSpec.clusterConfiguration.controllerManager.extraArgs.allocate-node-cidrs` to `"true"`:
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
spec:
  kubeadmConfigSpec:
    clusterConfiguration:
      controllerManager:
        extraArgs:
          allocate-node-cidrs: "true"
Modify Flannel Config
NOTE: This is based on the instructions at deploying-flannel-manually.
You need to make an adjustment to the default flannel configuration so that the CIDR inside your CAPG cluster matches the Flannel Network CIDR.
View your capi-cluster.yaml and make note of the Cluster Network CIDR Block. For example:
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
        - 192.168.0.0/16
Download the file at https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml and modify the `kube-flannel-cfg` ConfigMap. Set the value at `data.net-conf.json.Network` to match your Cluster Network CIDR Block.
wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Edit kube-flannel.yml and change this section so that the Network section matches your Cluster CIDR
kind: ConfigMap
apiVersion: v1
metadata:
  name: kube-flannel-cfg
data:
  net-conf.json: |
    {
      "Network": "192.168.0.0/16",
      "Backend": {
        "Type": "vxlan"
      }
    }
Apply kube-flannel.yml
kubectl apply -f kube-flannel.yml
GKE Support in the GCP Provider
- Feature status: Experimental
- Feature gate (required): GKE=true
Overview
The GCP provider supports creating GKE-based clusters. Currently, the following features are supported:
- Provisioning/managing a GCP GKE Cluster
- Upgrading the Kubernetes version of the GKE Cluster
- Creating a managed node pool and attaching it to the GKE cluster
The implementation introduces the following CRD kinds:
- GCPManagedCluster - represents the properties needed to provision and manage the general GCP operating infrastructure for the cluster (i.e. project, networking, IAM)
- GCPManagedControlPlane - specifies the GKE cluster in GCP and is used by the Cluster API GCP Managed Control Plane
- GCPManagedMachinePool - defines the managed node pool for the cluster
And a new template is available in the templates folder for creating a managed workload cluster.
Provisioning a GKE cluster
This guide uses an example from the `./templates` folder of the CAPG repository. You can inspect the YAML file here.
Configure cluster parameters
While inspecting the cluster definition in `./templates/cluster-template-gke.yaml` you probably noticed that it contains a number of parameterized values that must be substituted with the specifics of your use case. This can be done via environment variables and `clusterctl`, which effectively makes the template more flexible to adapt to different provisioning scenarios. These are the environment variables that you'll be required to set before deploying a workload cluster:
export GCP_PROJECT=cluster-api-gcp-project
export GCP_REGION=us-east4
export GCP_NETWORK_NAME=default
export WORKER_MACHINE_COUNT=1
Generate cluster definition
The sample cluster templates are already prepared so that you can use them with `clusterctl` to create a GKE cluster with CAPG.
To create a GKE cluster with a managed node group (a.k.a. managed machine pool):
clusterctl generate cluster capi-gke-quickstart --flavor gke -i gcp > capi-gke-quickstart.yaml
In this example, `capi-gke-quickstart` will be used as the cluster name.
Create cluster
The resulting file represents the workload cluster definition and you simply need to apply it to your cluster to trigger cluster creation:
kubectl apply -f capi-gke-quickstart.yaml
Kubeconfig
When creating a GKE cluster, two kubeconfigs are generated and stored as secrets in the management cluster.
User kubeconfig
This should be used by users that want to connect to the newly created GKE cluster. The name of the secret that contains the kubeconfig will be `[cluster-name]-user-kubeconfig`, where you need to replace `[cluster-name]` with the name of your cluster. The `-user-kubeconfig` suffix in the name indicates that the kubeconfig is for user use.
To get the user kubeconfig for a cluster named `managed-test` you can run a command similar to:
kubectl --namespace=default get secret managed-test-user-kubeconfig \
-o jsonpath={.data.value} | base64 --decode \
> managed-test.kubeconfig
Cluster API (CAPI) kubeconfig
This kubeconfig is used internally by CAPI and shouldn't be used outside of the management server. It is used by CAPI to perform operations, such as draining a node. The name of the secret that contains the kubeconfig will be `[cluster-name]-kubeconfig`, where you need to replace `[cluster-name]` with the name of your cluster. Note that there is NO `-user` in the name.
The kubeconfig is regenerated every `sync-period` as the token that is embedded in the kubeconfig is only valid for a short period of time.
GKE Cluster Upgrades
Control Plane Upgrade
Upgrading the Kubernetes version of the control plane is supported by the provider. To perform an upgrade you need to update the `controlPlaneVersion` in the spec of the `GCPManagedControlPlane`. Once the version has changed, the provider will handle the upgrade for you.
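For example, assuming the resource lives in the default namespace, the field could be updated with a merge patch like the sketch below; `<name>` and `<NEW_VERSION>` are placeholders, and editing the object with `kubectl edit` works just as well.
kubectl patch gcpmanagedcontrolplane <name> --type merge \
  -p '{"spec":{"controlPlaneVersion":"<NEW_VERSION>"}}'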
Enabling GKE Support
Enabling GKE support is done via the `GKE` feature flag by setting it to `true`. This can be done before running `clusterctl init` by using the `EXP_CAPG_GKE` environment variable:
export EXP_CAPG_GKE=true
clusterctl init --infrastructure gcp
IMPORTANT: To use GKE, the service account used for CAPG will need the `iam.serviceAccountTokenCreator` role assigned.
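A sketch of granting that role with gcloud, where the service account email is a placeholder for the account you created earlier and `GCP_PROJECT` is assumed to hold your project ID:
gcloud projects add-iam-policy-binding "${GCP_PROJECT}" \
  --member "serviceAccount:<CAPG_SERVICE_ACCOUNT_EMAIL>" \
  --role "roles/iam.serviceAccountTokenCreator"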
Disabling GKE Support
Support for GKE is disabled by default when you use the GCP infrastructure provider.
ClusterClass
- Feature status: Experimental
- Feature gate: `ClusterTopology=true`
ClusterClass is a collection of templates that define a topology (control plane and machine deployments) to be used to continuously reconcile one or more Clusters. It is built on top of the existing Cluster API resources and provides a set of tools and operations to streamline cluster lifecycle management while maintaining the same underlying API.
CAPG supports the creation of clusters via Cluster Topology for self-managed clusters only.
Provisioning a Cluster via ClusterClass
This guide uses an example from the `./templates` folder of the CAPG repository. You can inspect the YAML file for the `ClusterClass` here and the cluster definition here.
Templates and clusters
ClusterClass makes cluster templates more flexible and versatile as it allows users to create cluster flavors that can be reused for cluster provisioning.
In this case, while inspecting the sample files, you probably noticed that there are references to two different YAML files:
- `./templates/cluster-template-clusterclass.yaml` is the class definition. It represents the template that defines a topology (control plane and machine deployments) but it won't provision the cluster.
- `./templates/cluster-template-topology.yaml` is the cluster definition that references the class. This workload cluster definition is considerably simpler than a regular CAPI cluster template that does not use ClusterClass, as most of the complexity of defining the control plane and machine deployment has been removed by the class.
Configure ClusterClass
While inspecting the templates you probably noticed that they contain a number of parameterized values that must be substituted with the specifics of your use case. This can be done via environment variables and `clusterctl`, which effectively makes the templates more flexible to adapt to different provisioning scenarios. These are the environment variables that you'll be required to set before deploying a class and a workload cluster from it:
export CLUSTER_CLASS_NAME=sample-cc
export GCP_PROJECT=cluster-api-gcp-project
export GCP_REGION=us-east4
export GCP_NETWORK_NAME=default
export IMAGE_ID=projects/cluster-api-gcp-project/global/images/your-image
Generate ClusterClass definition
The sample ClusterClass template is already prepared so that you can use it with `clusterctl` to create a CAPI ClusterClass with CAPG.
clusterctl generate cluster capi-gcp-quickstart-clusterclass --flavor clusterclass -i gcp > capi-gcp-quickstart-clusterclass.yaml
In this example, `capi-gcp-quickstart-clusterclass` will be used as the class name.
Create ClusterClass
The resulting file represents the class template definition and you simply need to apply it to your cluster to make it available in the API:
kubectl apply -f capi-gcp-quickstart-clusterclass.yaml
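A quick way to confirm the class was registered in the management cluster:
kubectl get clusterclasses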
Create a cluster from a class
ClusterClass is a powerful feature of CAPI because we can now create one or multiple clusters that are based on the same class that is available in the CAPI Management Cluster. This base template can be parameterized so clusters created from it can make slight changes to the original configuration and adapt to the specifics of the use case, e.g. provisioning clusters for different development, staging and production environments.
Now that the class is available to be referenced by cluster objects, let’s configure the workload cluster and provision it.
export CLUSTER_NAME=sample-cluster
export CLUSTER_CLASS_NAME=sample-cc
export KUBERNETES_VERSION=1.29.3
export CONTROL_PLANE_MACHINE_COUNT=1
export WORKER_MACHINE_COUNT=1
export GCP_REGION=us-east4
export GCP_CONTROL_PLANE_MACHINE_TYPE=n1-standard-2
export GCP_NODE_MACHINE_TYPE=n1-standard-2
export CNI_RESOURCES=./cni-resource
You can take a look at CAPG’s CNI requirements here
You can use `clusterctl` to create a cluster definition.
clusterctl generate cluster capi-gcp-quickstart-topology --flavor topology -i gcp > capi-gcp-quickstart-topology.yaml
And by simply applying the resulting template, the cluster will be provisioned based on the existing ClusterClass.
kubectl apply -f capi-gcp-quickstart-topology.yaml
You can now experiment with creating more clusters based on this class while applying different configurations to each workload cluster.
Enabling ClusterClass Support
Enabling ClusterClass support is done via the `ClusterTopology` feature flag by setting it to `true`. This can be done before running `clusterctl init` by using the `CLUSTER_TOPOLOGY` environment variable:
export CLUSTER_TOPOLOGY=true
clusterctl init --infrastructure gcp
Disabling ClusterClass Support
Support for ClusterClass is disabled by default when you use the GCP infrastructure provider.
This section contains information about relevant CAPG features and how to use them.
Running Conformance tests
Required environment variables
- Set the GCP region
export GCP_REGION=us-east4
- Set the GCP project to use
export GCP_PROJECT=your-project-id
- Set the path to the service account
export GOOGLE_APPLICATION_CREDENTIALS=path/to/your/service-account.json
Optional environment variables
- Set a specific name for your cluster
export CLUSTER_NAME=test1
- Set a specific name for your network
export NETWORK_NAME=test1-mynetwork
- Skip cleaning up the project resources
export SKIP_CLEANUP=1
Running the conformance tests
scripts/ci-conformance.sh
Machine Locations
This document describes how to configure the location of a CAPG cluster's compute resources. By default, CAPG requires the user to specify a GCP region for the cluster's machines by setting the `GCP_REGION` environment variable as outlined in the CAPI quickstart guide. The provider then picks a zone to deploy the control plane and worker nodes in and generates the corresponding portions of the cluster's YAML manifests.
It is possible to override this default behaviour and exercise more fine-grained control over machine locations as outlined in the rest of this document.
Control Plane Machine Location
Before deploying the cluster, add a `failureDomains` field to the `spec` of your `GCPCluster` definition, containing a list of allowed zones:
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: GCPCluster
metadata:
  name: capi-quickstart
spec:
  network:
    name: default
  project: cyberscan2
  region: europe-west3
+ failureDomains:
+   - europe-west3-b
In this example configuration, only a single zone has been added, ensuring the control plane is provisioned in `europe-west3-b`.
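If you are unsure which zones exist in your region, you can list them with gcloud; the region value below is illustrative and the exact filter syntax may vary with your gcloud version.
gcloud compute zones list --filter="region:europe-west3"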
Node Pool Location
Similar to the above, you can override the auto-generated GCP zone for your `MachineDeployment` by changing the value of the `failureDomain` field at `spec.template.spec.failureDomain`:
apiVersion: cluster.x-k8s.io/v1alpha4
kind: MachineDeployment
metadata:
  name: capi-quickstart-md-0
spec:
  clusterName: capi-quickstart
  # [...]
  template:
    spec:
      # [...]
      clusterName: capi-quickstart
-     failureDomain: europe-west3-a
+     failureDomain: europe-west3-b
When combined like this, the above configuration effectively instructs CAPG to deploy the CAPI equivalent of a zonal GKE cluster.
Preemptible Virtual Machines
GCP Preemptible Virtual Machines allow users to run a VM instance at a much lower price than normal VM instances.
Compute Engine might stop (preempt) these instances if it requires access to those resources for other tasks. Preemptible instances always stop after 24 hours.
When do I use Preemptible Virtual Machines?
A Preemptible VM works best for applications or systems that distribute processes across multiple instances in a cluster. While a shutdown would be disruptive for common enterprise applications, such as databases, it’s hardly noticeable in distributed systems that run across clusters of machines and are designed to tolerate failures.
How do I use Preemptible Virtual Machines?
To enable a machine to be backed by a Preemptible Virtual Machine, add the `preemptible` option to `GCPMachineTemplate` and set it to `true`.
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: GCPMachineTemplate
metadata:
  name: capg-md-0
spec:
  region: us-west-1
  template:
    osDisk:
      diskSizeGB: 30
      managedDisk:
        storageAccountType: STANDARD
      osType: Linux
    vmSize: E2
    preemptible: true
Spot VMs
Spot VMs are the latest version of preemptible VMs.
To use a Spot VM instead of a Preemptible VM, add `provisioningModel` to `GCPMachineTemplate` and set it to `Spot`.
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: GCPMachineTemplate
metadata:
  name: capg-md-0
spec:
  region: us-west-1
  template:
    osDisk:
      diskSizeGB: 30
      managedDisk:
        storageAccountType: STANDARD
      osType: Linux
    vmSize: E2
    provisioningModel: Spot
NOTE: specifying `preemptible: true` and `provisioningModel: Spot` is equivalent to only `provisioningModel: Spot`; Spot takes priority.
Everything you need to know about contributing to CAPG.
If you are new to the project and want to help but don’t know where to start, you can refer to the Cluster API contributing guide.
Developing Cluster API Provider GCP
Setting up
Base requirements
- Install go
  - Get the latest patch version for go v1.18.
- Install jq
  - `brew install jq` on macOS.
  - `sudo apt install jq` on Windows + WSL2.
  - `sudo apt install jq` on Ubuntu Linux.
- Install the gettext package
  - `brew install gettext && brew link --force gettext` on macOS.
  - `sudo apt install gettext` on Windows + WSL2.
  - `sudo apt install gettext` on Ubuntu Linux.
- Install KIND
  - `GO111MODULE="on" go get sigs.k8s.io/kind@v0.14.0`.
- Install Kustomize
  - `brew install kustomize` on macOS.
  - Install instructions for Windows + WSL2, Linux and macOS.
- Install Python 3.x, if not already installed.
- Install make.
  - `brew install make` on macOS.
  - `sudo apt install make` on Windows + WSL2.
  - `sudo apt install make` on Linux.
- Install timeout
  - `brew install coreutils` on macOS.
When developing on Windows, it is suggested to set up the project on Windows + WSL2, and the repository should be checked out on the WSL file system for better results.
Get the source
git clone https://github.com/kubernetes-sigs/cluster-api-provider-gcp
cd cluster-api-provider-gcp
Get familiar with basic concepts
This provider is modeled after the upstream Cluster API project. To get familiar with Cluster API resources, concepts and conventions (such as CAPI and CAPG), refer to the Cluster API Book.
Dev manifest files
Part of running cluster-api-provider-gcp is generating manifests to run. Generating dev manifests allows you to test dev images instead of the default releases.
Dev images
Container registry
Any public container registry can be leveraged for storing cluster-api-provider-gcp container images.
CAPG Node images
In order to deploy a workload cluster you will need to build the node images to use. For that, you can refer to the image-builder project and the image-builder book.
Please refer to the image-builder documentation in order to get the latest requirements to build the node images.
To build the node images for GCP: https://image-builder.sigs.k8s.io/capi/providers/gcp.html
Developing
Change some code!
Modules and Dependencies
This repository uses Go Modules to track vendor dependencies.
To pin a new dependency:
- Run `go get <repository>@<version>`
- (Optional) Add a replace statement in `go.mod`
Makefile targets and scripts are offered to work with go modules (see the example after this list):
- `make verify-modules` checks whether go modules are out of date.
- `make modules` runs `go mod tidy` to ensure proper vendoring.
- `hack/ensure-go.sh` checks that the Go version and environment variables are properly set.
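As a sketch, a typical flow for pinning a dependency might look like the following; the repository and version are placeholders for whatever module you actually need.
# Pin the dependency (module path and version are illustrative)
go get <repository>@<version>
make modules          # runs `go mod tidy` to clean up go.mod/go.sum
make verify-modules   # confirms the go modules are not out of date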
Setting up the environment
Your environment must have GCP credentials; check Authentication Getting Started.
Tilt Requirements
Install Tilt:
- `brew install tilt-dev/tap/tilt` on macOS or Linux
- `scoop bucket add tilt-dev https://github.com/tilt-dev/scoop-bucket` and `scoop install tilt` on Windows
After the installation is done, verify that you have installed it correctly with: tilt version
Install Helm:
- `brew install helm` on macOS
- `choco install kubernetes-helm` on Windows
- Install instructions for Linux
As the project lacks many features for Windows, it is suggested to follow the above steps on Windows + WSL2 rather than native Windows.
Using Tilt
Both of the Tilt setups below will get you started developing CAPG in a local kind cluster. The main difference is the number of components you will build from source and the scope of the changes you’d like to make. If you only want to make changes in CAPG, then follow CAPG instructions. This will save you from having to build all of the images for CAPI, which can take a while. If the scope of your development will span both CAPG and CAPI, then follow the CAPI and CAPG instructions.
Tilt for dev in CAPG
If you want to develop in CAPG and get a local development cluster working quickly, this is the path for you.
From the root of the CAPG repository, run the following to generate a `tilt-settings.json` file with your GCP service account credentials:
$ cat <<EOF > tilt-settings.json
{
"kustomize_substitutions": {
"GCP_B64ENCODED_CREDENTIALS": "$(cat PATH_FOR_GCP_CREDENTIALS_JSON | base64 -w0)"
}
}
EOF
Set the following environment variables with the appropriate values for your environment:
export GCP_REGION="<GCP_REGION>"
export GCP_PROJECT="<GCP_PROJECT>"
export CONTROL_PLANE_MACHINE_COUNT=1
export WORKER_MACHINE_COUNT=1
# Make sure to use the same Kubernetes version here as was used when building the GCE image
export KUBERNETES_VERSION=1.23.3
export GCP_CONTROL_PLANE_MACHINE_TYPE=n1-standard-2
export GCP_NODE_MACHINE_TYPE=n1-standard-2
export GCP_NETWORK_NAME=<GCP_NETWORK_NAME or default>
export CLUSTER_NAME="<CLUSTER_NAME>"
To build a kind cluster and start Tilt, just run:
make tilt-up
Alternatively, you can also run:
./scripts/setup-dev-enviroment.sh
It will also set up the network. If you have already set up the network, you can skip that step by running:
./scripts/setup-dev-enviroment.sh --skip-init-network
By default, the Cluster API components deployed by Tilt have experimental features turned off.
If you would like to enable these features, add `extra_args` as specified in The Cluster API Book.
Once your kind management cluster is up and running, you can deploy a workload cluster.
To tear down the kind cluster built by the command above, just run:
make kind-reset
And if you need to clean up the network setup, you can run:
./scripts/setup-dev-enviroment.sh --clean-network
Tilt for dev in both CAPG and CAPI
If you want to develop in both CAPI and CAPG at the same time, then this is the path for you.
To use Tilt for a simplified development workflow, follow the instructions in the cluster-api repo. The instructions will walk you through cloning the Cluster API (CAPI) repository and configuring Tilt to use `kind` to deploy the cluster api management components.
You may wish to check out the correct version of CAPI to match the version used in CAPG.
Note that `tilt up` will be run from the `cluster-api` repository directory and the `tilt-settings.json` file will point back to the `cluster-api-provider-gcp` repository directory. Any changes you make to the source code in the `cluster-api` or `cluster-api-provider-gcp` repositories will automatically be redeployed to the `kind` cluster.
After you have cloned both repositories, your folder structure should look like:
|-- src/cluster-api-provider-gcp
|-- src/cluster-api (run `tilt up` here)
After configuring the environment variables, run the following to generate your `tilt-settings.json` file:
cat <<EOF > tilt-settings.json
{
"default_registry": "${REGISTRY}",
"provider_repos": ["../cluster-api-provider-gcp"],
"enable_providers": ["gcp", "docker", "kubeadm-bootstrap", "kubeadm-control-plane"],
"kustomize_substitutions": {
"GCP_B64ENCODED_CREDENTIALS": "$(cat PATH_FOR_GCP_CREDENTIALS_JSON | base64 -w0)"
}
}
EOF
`$REGISTRY` should be in the format `docker.io/<dockerhub-username>`
The cluster-api management components that are deployed are configured at the `/config` folder of each repository respectively. Making changes to those files will trigger a redeploy of the management cluster components.
Debugging
If you would like to debug CAPG you can run the provider with delve, a Go debugger tool. This will then allow you to attach to delve and troubleshoot the processes.
To do this you need to use the debug configuration in tilt-settings.json. Full details of the options can be seen here.
An example tilt-settings.json:
{
  "default_registry": "gcr.io/your-project-name-here",
  "provider_repos": ["../cluster-api-provider-gcp"],
  "enable_providers": ["gcp", "kubeadm-bootstrap", "kubeadm-control-plane"],
  "debug": {
    "gcp": {
      "continue": true,
      "port": 30000,
      "profiler_port": 40000,
      "metrics_port": 40001
    }
  },
  "kustomize_substitutions": {
    "GCP_B64ENCODED_CREDENTIALS": "$(cat PATH_FOR_GCP_CREDENTIALS_JSON | base64 -w0)"
  }
}
Once you have run tilt (see section below) you will be able to connect to the running instance of delve.
For VS Code, you can use a launch configuration like this:
{
  "version": "0.2.0",
  "configurations": [
    {
      "name": "Core CAPI Controller GCP",
      "type": "go",
      "request": "attach",
      "mode": "remote",
      "remotePath": "",
      "port": 30000,
      "host": "127.0.0.1",
      "showLog": true,
      "trace": "log",
      "logOutput": "rpc"
    }
  ]
}
Create a new configuration and add it to the “Debug” menu to configure debugging in GoLand/IntelliJ following these instructions.
Alternatively, you may use delve straight from the CLI by executing a command like this:
delve -a tcp://localhost:30000
Deploying a workload cluster
After your kind management cluster is up and running with Tilt, ensure you have all the environment variables set as described in Tilt for dev in CAPG, and deploy a workload cluster with the following:
make create-workload-cluster
To delete the cluster:
make delete-workload-cluster
Submitting PRs and testing
Pull requests and issues are highly encouraged! If you’re interested in submitting PRs to the project, please be sure to run some initial checks prior to submission:
Do make sure to set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable with the path to your JSON file. Check out this doc to generate the credentials.
make lint # Runs a suite of quick scripts to check code structure
make test # Runs tests on the Go code
Executing unit tests
`make test` executes the project's unit tests. These tests do not stand up a Kubernetes cluster, nor do they have external dependencies.
Nightly Builds
Nightly builds are regular automated builds of the CAPG source code that occur every night.
These builds are generated directly from the latest commit of source code on the main branch.
Nightly builds serve several purposes:
- Early Testing: They provide an opportunity for developers and testers to access the most recent changes in the codebase and identify any issues or bugs that may have been introduced.
- Feedback Loop: They facilitate a rapid feedback loop, enabling developers to receive feedback on their changes quickly, allowing them to iterate and improve the code more efficiently.
- Preview of New Features: Users can get a preview of upcoming features or changes by testing nightly builds, although these builds may not always be stable enough for production use.
Overall, nightly builds play a crucial role in software development by promoting user testing, early bug detection, and rapid iteration.
CAPG Nightly build jobs run in Prow.
Usage
To try a nightly build, download the latest nightly CAPG manifests. You can find the available ones by executing the following command:
curl -sL -H 'Accept: application/json' "https://storage.googleapis.com/storage/v1/b/k8s-staging-cluster-api-gcp/o" | jq -r '.items | map(select(.name | startswith("components/nightly_main"))) | .[] | [.timeCreated,.mediaLink] | @tsv'
The output should look something like this:
2024-05-03T08:03:09.087Z https://storage.googleapis.com/download/storage/v1/b/k8s-staging-cluster-api-gcp/o/components%2Fnightly_main_2024050x?generation=1714723389033961&alt=media
2024-05-04T08:02:52.517Z https://storage.googleapis.com/download/storage/v1/b/k8s-staging-cluster-api-gcp/o/components%2Fnightly_main_2024050y?generation=1714809772486582&alt=media
2024-05-05T08:02:45.840Z https://storage.googleapis.com/download/storage/v1/b/k8s-staging-cluster-api-gcp/o/components%2Fnightly_main_2024050z?generation=1714896165803510&alt=media
Now visit the link for the manifest you want to download. This will automatically download the manifest for you.
Once downloaded you can apply the manifest directly to your testing CAPI management cluster/namespace (e.g. with kubectl), as the downloaded CAPG manifest will already contain the correct, corresponding CAPG nightly image reference.
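A minimal sketch of that flow, where the URL is one of the mediaLink values from the listing above (placeholder shown here):
# Download the selected nightly manifest
curl -L -o capg-nightly.yaml "<MEDIA_LINK_URL>"
# Apply it to your testing CAPI management cluster
kubectl apply -f capg-nightly.yaml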
Creating cluster without clusterctl
This document describes how to create a management cluster and workload cluster without using clusterctl. For creating a cluster with clusterctl, check out our Cluster API Quick Start.
For creating a Management cluster
- Build the required images by using the following commands:

  docker build --tag=gcr.io/k8s-staging-cluster-api-gcp/cluster-api-gcp-controller:e2e .
  make docker-build-all
- Set the required environment variables. For example:

  export GCP_REGION=us-east4
  export GCP_PROJECT=k8s-staging-cluster-api-gcp
  export CONTROL_PLANE_MACHINE_COUNT=1
  export WORKER_MACHINE_COUNT=1
  export KUBERNETES_VERSION=1.21.6
  export GCP_CONTROL_PLANE_MACHINE_TYPE=n1-standard-2
  export GCP_NODE_MACHINE_TYPE=n1-standard-2
  export GCP_NETWORK_NAME=default
  export GCP_B64ENCODED_CREDENTIALS=$( cat /path/to/gcp_credentials.json | base64 | tr -d '\n' )
  export CLUSTER_NAME="capg-test"
  export IMAGE_ID=projects/k8s-staging-cluster-api-gcp/global/images/cluster-api-ubuntu-2204-v1-27-3-nightly
You can check for other images to set the `IMAGE_ID` of your choice.
- Run `make create-management-cluster` from the root directory.
Jobs
This document provides an overview of our jobs running via Prow and Github actions.
Builds and tests running on the default branch
Legend
🟢 REQUIRED - Jobs that have to run successfully to get the PR merged.
Presubmits
Prow Presubmits:
- 🟢 pull-cluster-api-provider-gcp-test
  ./scripts/ci-test.sh
- 🟢 pull-cluster-api-provider-gcp-build
  ./scripts/ci-build.sh
- 🟢 pull-cluster-api-provider-gcp-make
  runner.sh ./scripts/ci-make.sh
- 🟢 pull-cluster-api-provider-gcp-e2e-test
  "BOSKOS_HOST"="boskos.test-pods.svc.cluster.local" ./scripts/ci-e2e.sh
- pull-cluster-api-provider-gcp-conformance-ci-artifacts
  "BOSKOS_HOST"="boskos.test-pods.svc.cluster.local" ./scripts/ci-conformance.sh --use-ci-artifacts
- pull-cluster-api-provider-gcp-conformance
  "BOSKOS_HOST"="boskos.test-pods.svc.cluster.local" ./scripts/ci-conformance.sh
- pull-cluster-api-provider-gcp-capi-e2e
  "BOSKOS_HOST"="boskos.test-pods.svc.cluster.local" GINKGO_FOCUS="Cluster API E2E tests" ./scripts/ci-e2e.sh
- pull-cluster-api-provider-gcp-test-release-0-4
  ./scripts/ci-test.sh
- pull-cluster-api-provider-gcp-build-release-0-4
  ./scripts/ci-build.sh
- pull-cluster-api-provider-gcp-make-release-0-4
  runner.sh ./scripts/ci-make.sh
- pull-cluster-api-provider-gcp-e2e-test-release-0-4
  "BOSKOS_HOST"="boskos.test-pods.svc.cluster.local" ./scripts/ci-e2e.sh
- pull-cluster-api-provider-gcp-make-conformance-release-0-4
  "BOSKOS_HOST"="boskos.test-pods.svc.cluster.local" ./scripts/ci-conformance.sh --use-ci-artifacts
Github Presubmits Workflows:
- Markdown-link-check
  find . -name \*.md | xargs -I{} markdown-link-check -c .markdownlinkcheck.json {}
- 🟢 Lint-check
  make lint
Postsubmits
Github Postsubmit Workflows:
- Code-coverage-check
make test-cover
Periodics
Prow Periodics:
- periodic-cluster-api-provider-gcp-build
  runner.sh ./scripts/ci-build.sh
- periodic-cluster-api-provider-gcp-test
  runner.sh ./scripts/ci-test.sh
- periodic-cluster-api-provider-gcp-make-conformance-v1alpha4
  "BOSKOS_HOST"="boskos.test-pods.svc.cluster.local" ./scripts/ci-conformance.sh
- periodic-cluster-api-provider-gcp-make-conformance-v1alpha4-k8s-ci-artifacts
  "BOSKOS_HOST"="boskos.test-pods.svc.cluster.local" ./scripts/ci-conformance.sh --use-ci-artifacts
- periodic-cluster-api-provider-gcp-conformance-v1alpha4
  "BOSKOS_HOST"="boskos.test-pods.svc.cluster.local" ./scripts/ci-conformance.sh
- periodic-cluster-api-provider-gcp-conformance-v1alpha4-k8s-ci-artifacts
  "BOSKOS_HOST"="boskos.test-pods.svc.cluster.local" ./scripts/ci-conformance.sh --use-ci-artifacts
Adding new E2E test
E2E tests verify a complete, real-world workflow ensuring that all parts of the system work together as expected. If you are introducing a new feature that interconnects with other parts of the software, you will likely be required to add a verification step for this functionality with a new E2E scenario (unless it is already covered by existing test suites).
Create a cluster template
The test suite will provision a cluster based on a pre-defined YAML template (stored in `./test/e2e/data`) which is then sourced in `./test/e2e/config/gcp-ci.yaml`. New cluster definitions for E2E tests have to be added and sourced before being available to use in the E2E workflow.
Add test case
When the template is available, you can reference it as a flavor in Go. For example, adding a new test for self-managed cluster provisioning would look like the following:
Context("Creating a control-plane cluster with an internal load balancer", func() {
    It("Should create a cluster with 1 control-plane and 1 worker node with an internal load balancer", func() {
        By("Creating a cluster with internal load balancer")
        clusterctl.ApplyClusterTemplateAndWait(ctx, clusterctl.ApplyClusterTemplateAndWaitInput{
            ClusterProxy: bootstrapClusterProxy,
            ConfigCluster: clusterctl.ConfigClusterInput{
                LogFolder:                clusterctlLogFolder,
                ClusterctlConfigPath:     clusterctlConfigPath,
                KubeconfigPath:           bootstrapClusterProxy.GetKubeconfigPath(),
                InfrastructureProvider:   clusterctl.DefaultInfrastructureProvider,
                Flavor:                   "ci-with-internal-lb",
                Namespace:                namespace.Name,
                ClusterName:              clusterName,
                KubernetesVersion:        e2eConfig.GetVariable(KubernetesVersion),
                ControlPlaneMachineCount: ptr.To[int64](1),
                WorkerMachineCount:       ptr.To[int64](1),
            },
            WaitForClusterIntervals:      e2eConfig.GetIntervals(specName, "wait-cluster"),
            WaitForControlPlaneIntervals: e2eConfig.GetIntervals(specName, "wait-control-plane"),
            WaitForMachineDeployments:    e2eConfig.GetIntervals(specName, "wait-worker-nodes"),
        }, result)
    })
})
In this case, the flavor `ci-with-internal-lb` is a reference to the template `cluster-template-ci-with-internal-lb.yaml`, which is available in `./test/e2e/data/infrastructure-gcp/cluster-template-ci-with-internal-lb.yaml`.
Release Process
Change milestone
- Create a new GitHub milestone for the next release
- Change milestone applier so new changes can be applied to the appropriate release
- Open a PR in https://github.com/kubernetes/test-infra to change this line
- Example PR: https://github.com/kubernetes/test-infra/pull/16827
Prepare branch, tag and release notes
- Update the file `metadata.yaml` if it is a major or minor release
- Submit a PR for the `metadata.yaml` update if needed, wait for it to be merged before continuing, and pull any changes prior to continuing.
- Create a tag with git
  - `export RELEASE_TAG=v0.4.6` (the tag of the release to be cut)
  - `git tag -s ${RELEASE_TAG} -m "${RELEASE_TAG}"`
    - `-s` creates a signed tag; you must have a GPG key added to your GitHub account
  - `git push upstream ${RELEASE_TAG}`
- Run `make release` from the repo; this will create the release artifacts in the `out/` folder
- Install the `release-notes`
tool according to instructions - Export GITHUB_TOKEN
- Run the release-notes tool with the appropriate commits. Commits range from the first commit after the previous release to the new release commit.
release-notes --org kubernetes-sigs --repo cluster-api-provider-gcp \
--start-sha 1cf1ec4a1effd9340fe7370ab45b173a4979dc8f \
--end-sha e843409f896981185ca31d6b4a4c939f27d975de \
--branch <RELEASE_BRANCH_OR_MAIN_BRANCH>
- Manually format and categorize the release notes
Promote image to prod repo
Promote image
- Images are built by the post push images job
- Create a PR in https://github.com/kubernetes/k8s.io to add the image and tag
- Example PR: https://github.com/kubernetes/k8s.io/pull/1462
- Location of image: https://console.cloud.google.com/gcr/images/k8s-staging-cluster-api-gcp/GLOBAL/cluster-api-gcp-controller?rImageListsize=30
To promote the image you should use a tool called `cip-mm`; please refer to https://github.com/kubernetes-sigs/promo-tools/tree/main/cmd/cip-mm
For example, to promote the v0.3.1 release, we can run the following command:
$ cip-mm --base_dir=$GOPATH/src/k8s.io/k8s.io/k8s.gcr.io --staging_repo=gcr.io/k8s-staging-cluster-api-gcp --filter_tag=v0.3.1
Release in GitHub
Create the GitHub release in the UI
- Create a draft release in GitHub and associate it with the tag that was created
- Copy paste the release notes
- Upload artifacts from the `out/` folder
- Publish release
- Announce the release
Versioning
cluster-api-provider-gcp follows the semantic versioning specification.
Example versions:
- Pre-release: `v0.1.1-alpha.1`
- Minor release: `v0.1.0`
- Patch release: `v0.1.1`
- Major release: `v1.0.0`
Expected artifacts
- A release yaml file `infrastructure-components.yaml` containing the resources needed to deploy to Kubernetes
- A `cluster-templates.yaml` for each supported flavor
- A `metadata.yaml` which maps release series to cluster-api contract version
- Release notes
Communication
Patch Releases
- Announce the release in Kubernetes Slack on the #cluster-api-gcp channel.
Minor/Major Releases
- Follow the communications process for pre-releases
- An announcement email is sent to `kubernetes-sig-cluster-lifecycle@googlegroups.com` with the subject `[ANNOUNCE] cluster-api-provider-gcp <version> has been released`
Cluster API GCP roadmap
This roadmap is a constant work in progress, subject to frequent revision. Dates are approximations. Features are listed in no particular order.
v0.4 (v1Alpha4)
Description | Issue/Proposal/PR |
---|---|
v1beta1/v1 | Proposal awaits |
Lifecycle frozen
Items within this category have been identified as potential candidates for the project and can be moved up into a milestone if there is enough interest.
Description | Issue/Proposal/PR |
---|---|
Enabling GPU enabled clusters | #289 |
Publish images in GCP | #152 |
Proper bootstrap of manually deleted worker VMs | #173 |
Correct URI for subnetwork setup | #278 |
Workload identity support | #311 |
Implement GCPMachinePool using MIGs | #297 |