Terraform + Elemental: Quickly Building an RKE2 Cluster

Introduction

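Terraform is an infrastructure as code (IaC) tool that lets users define and manage infrastructure resources in a declarative language. It automates provisioning and management across cloud providers and on-premises environments. Terraform uses the HashiCorp Configuration Language (HCL) to define infrastructure components, which makes infrastructure configuration easy to version, reuse, and share.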

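Elemental is an open source project from the SUSE Rancher team that aims to provide cloud native operating system management using Kubernetes-native technology. It automates building, deploying, upgrading, and maintaining operating systems across large fleets of clusters, and is particularly suited to edge computing, data center, and hybrid cloud scenarios.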

Prerequisites

Component                      Version
Rancher                        v2.9.2
Elemental                      1.6.4
Terraform                      v1.12.1
terraform-provider-rancher2    5.2.0

Build the OS Image

  1. Create a Registration Endpoint with the following configuration:
config:
  cloud-config:
    users:
      - name: root
        passwd: root
  elemental:
    install:
      device-selector:
        - key: Name
          operator: In
          values:
            - /dev/sda
            - /dev/vda
            - /dev/nvme0
        - key: Size
          operator: Gt
          values:
            - 25Gi
      reboot: true
      snapshotter:
        type: btrfs
    reset:
      reboot: true
      reset-oem: true
      reset-persistent: true
machineInventoryLabels:
  BlockDevices: ${System Data/Block Devices/Number Devices}
  CPUCores: ${System Data/CPU/Total Cores}
  CPUModel: ${System Data/CPU/Model}
  CPUThreads: ${System Data/CPU/Total Threads}
  CPUVender: ${System Data/CPU/Vendor}
  GPUVender: ${System Data/GPU/Vendor}
  Hostname: ${System Data/Runtime/Hostname}
  NetworkInterfaces: ${System Data/Network/Number Interfaces}
  TotalMemoryBytes: ${System Data/Memory/Total Physical Bytes}
  machineUUID: ${System Information/UUID}
  manufacturer: ${System Information/Manufacturer}
  serialNumber: ${System Information/Serial Number}
  env: dev

For the labels used here, see the official Elemental documentation.
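The same settings can also be kept in a manifest file rather than typed into the UI. The following is only a sketch: it assumes the Registration Endpoint corresponds to Elemental's MachineRegistration resource in the fleet-default namespace, the name dev-registration and the file name registration.yaml are made up for illustration, and only part of the configuration above is repeated.

```shell
# Quote the heredoc delimiter ('EOF') so the shell leaves the
# ${...} label templates untouched instead of trying to expand them.
cat > registration.yaml << 'EOF'
apiVersion: elemental.cattle.io/v1beta1
kind: MachineRegistration
metadata:
  name: dev-registration        # hypothetical name
  namespace: fleet-default      # assumed namespace
spec:
  config:
    elemental:
      install:
        device-selector:
          - key: Name
            operator: In
            values:
              - /dev/sda
              - /dev/vda
              - /dev/nvme0
        reboot: true
  machineInventoryLabels:
    machineUUID: ${System Information/UUID}
    env: dev
EOF
```

With an unquoted EOF the shell would try to expand ${System Information/UUID} itself and fail, which is why the delimiter is quoted here. The file could then be created on the Rancher local cluster with kubectl create -f registration.yaml.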

  2. Build the image.

When the build finishes, click Download Media on the right to download the image.

Create the Inventory of Machines

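  1. Import the image into your virtualization platform. In this example it is imported into Harvester.
  2. Create a virtual machine. The process is the same as for any other VM, except that in the last step you must select two options: Enable TPM and Booting in EFI mode.
  3. Wait for the node to register. Registration has succeeded once the node shows the active state.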

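If the node never registers, the cause is most likely network connectivity. Make sure the node can reach the external resources it needs, for example through a proxy.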

Build the RKE2 Cluster with Terraform

  1. Download Terraform onto any host that can reach Rancher.

For installation options, see the official Terraform documentation. Here I download the binary directly.

wget https://releases.hashicorp.com/terraform/1.12.1/terraform_1.12.1_linux_amd64.zip
unzip terraform_1.12.1_linux_amd64.zip
mv terraform /usr/local/bin/.
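Before moving the binary into place, it may be worth checking the archive against the SHA256SUMS file that HashiCorp publishes next to each release. The helper below is a minimal sketch that assumes GNU sha256sum and awk are installed; the function name verify_archive is my own.

```shell
# verify_archive SUMS_FILE ARCHIVE
# Succeeds only if ARCHIVE's SHA-256 digest matches its entry in SUMS_FILE.
verify_archive() {
  local sums_file="$1" archive="$2"
  local name expected actual
  name="$(basename "$archive")"
  # Each line of the sums file has the form "<hex digest>  <file name>".
  expected="$(awk -v f="$name" '$2 == f { print $1 }' "$sums_file")"
  if [ -z "$expected" ]; then
    echo "no checksum entry for $name" >&2
    return 1
  fi
  actual="$(sha256sum "$archive" | awk '{ print $1 }')"
  [ "$expected" = "$actual" ]
}

# Example usage (needs network access), left as comments so that
# sourcing this snippet has no side effects:
#   wget https://releases.hashicorp.com/terraform/1.12.1/terraform_1.12.1_SHA256SUMS
#   verify_archive terraform_1.12.1_SHA256SUMS terraform_1.12.1_linux_amd64.zip \
#     && echo "checksum OK"
```

This only confirms that the archive matches the published digest; verifying the signature of the SHA256SUMS file itself is a separate step not covered here.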
  2. Create the initial main.tf file.
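cat > main.tf << EOF
terraform {
  required_providers {
    rancher2 = {
      source  = "rancher/rancher2"
      # Pin the provider version used in this walkthrough.
      version = "5.2.0"
    }
  }
}

provider "rancher2" {
  api_url    = "https://rancher.zerchin.xyz"
  access_key = "token-xxx"
  secret_key = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
  # Allows an insecure (unverified TLS) connection to Rancher; use only in test environments.
  insecure   = true
}
EOF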

Parameter notes:

  • api_url: the Rancher URL.
  • access_key & secret_key: create an API key in the Rancher UI to obtain these values.
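To keep the API key out of files that may end up in version control, the credentials can be passed in as Terraform input variables. The sketch below relies on Terraform reading TF_VAR_<name> environment variables; the variable names rancher_access_key and rancher_secret_key and the file name provider.tf are my own choices. Since a configuration can only have one default rancher2 provider block, this one would take the place of the provider block in main.tf.

```shell
# Provider block that reads the credentials from input variables.
cat > provider.tf << 'EOF'
variable "rancher_access_key" {
  type      = string
  sensitive = true
}

variable "rancher_secret_key" {
  type      = string
  sensitive = true
}

provider "rancher2" {
  api_url    = "https://rancher.zerchin.xyz"
  access_key = var.rancher_access_key
  secret_key = var.rancher_secret_key
  insecure   = true
}
EOF

# Terraform picks up TF_VAR_<name> as the value of the input variable <name>.
# Placeholder values as in the article; substitute the real API key.
export TF_VAR_rancher_access_key="token-xxx"
export TF_VAR_rancher_secret_key="xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
```

Marking the variables as sensitive keeps their values out of plan and apply output, although they are still stored in the Terraform state.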

Initialize the working directory so that Terraform downloads the rancher2 provider.

terraform init

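Output like the following indicates that initialization succeeded and the provider was installed. Note that this step only installs the provider; the Rancher credentials are first used by terraform plan or terraform apply.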

Initializing the backend...
Initializing provider plugins...
- Finding rancher/rancher2 versions matching "5.2.0"...
- Installing rancher/rancher2 v5.2.0...
- Installed rancher/rancher2 v5.2.0 (signed by a HashiCorp partner, key ID 2EEB0F9AD44A135C)
Partner and community providers are signed by their developers.
If you'd like to know more about provider signing, you can read about it here:
https://developer.hashicorp.com/terraform/cli/plugins/signing
Terraform has created a lock file .terraform.lock.hcl to record the provider
selections it made above. Include this file in your version control repository
so that Terraform can guarantee to make the same selections by default when
you run "terraform init" in the future.

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
  3. Create a MachineInventorySelectorTemplate, the selector that picks nodes for the machine pool. Here matchExpressions matches the env=dev label set earlier.
cat > pool.yaml << EOF
apiVersion: elemental.cattle.io/v1beta1
kind: MachineInventorySelectorTemplate
metadata:
  name: dev-pool-1
  namespace: fleet-default
spec:
  template:
    spec:
      selector:
        matchExpressions:
          - key: env
            operator: In
            values:
              - dev
EOF

Apply it to the Rancher local cluster.

kubectl create -f pool.yaml
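Before creating the cluster, it can help to confirm that the selector will actually find machines. The sketch below lists the MachineInventory objects whose labels satisfy the same expression as the template; it assumes kubectl points at the Rancher local cluster and that the inventories live in fleet-default, and the function name matching_inventories is my own.

```shell
# matching_inventories [SELECTOR]
# Lists registered machines whose labels match SELECTOR.
# The default mirrors the matchExpressions entry in pool.yaml.
matching_inventories() {
  local selector="${1:-env in (dev)}"
  kubectl --namespace fleet-default \
    get machineinventories.elemental.cattle.io \
    --selector "$selector" --show-labels
}

# Example usage:
#   matching_inventories                 # machines labelled env=dev
#   matching_inventories 'env in (dev,test)'
```

If the list is empty, the machine pool would have nothing to select, so the cluster would likely stay in a provisioning state until a matching machine registers.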
  4. Configure the RKE2 cluster and the Elemental machine pool by appending the following to main.tf:
resource "rancher2_cluster_v2" "demo" {
  name               = "demo"
  kubernetes_version = "v1.30.13+rke2r1"
  rke_config {
    machine_pools {
      name               = "master"
      etcd_role          = true
      control_plane_role = true
      worker_role        = true
      quantity           = 1
      machine_config {
        api_version = "elemental.cattle.io/v1beta1"
        kind        = "MachineInventorySelectorTemplate"
        name        = "dev-pool-1"
      }
      machine_labels = {
        team = "devops"
      }
    }
  }
}

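Key parameters:

  • rancher2_cluster_v2: the resource type used to create and manage the RKE2 cluster
  • name: the cluster name
  • kubernetes_version: the RKE2 version; to see which versions are currently supported, start creating a cluster in the Rancher UI and check the version list
  • rke_config: holds the cluster configuration
    • machine_pools: machine pool configuration
    • etcd_role & control_plane_role & worker_role: the roles assigned to nodes in the pool
    • quantity: the number of nodes in the pool
    • machine_config: how the pool obtains its machines; fill it in using this format, where name is the name of the MachineInventorySelectorTemplate created in the previous step
    • machine_labels: node labels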
  5. Create the cluster with Terraform.
terraform apply

The output looks like this:


Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # rancher2_cluster_v2.demo will be created
  + resource "rancher2_cluster_v2" "demo" {
      + annotations                = (known after apply)
      + cluster_registration_token = (sensitive value)
      + cluster_v1_id              = (known after apply)
      + enable_network_policy      = (known after apply)
      + fleet_namespace            = "fleet-default"
      + id                         = (known after apply)
      + kube_config                = (sensitive value)
      + kubernetes_version         = "v1.30.13+rke2r1"
      + labels                     = (known after apply)
      + name                       = "demo"
      + resource_version           = (known after apply)

      + rke_config {
          + etcd (known after apply)
          + machine_pools {
              + annotations         = (known after apply)
              + control_plane_role  = true
              + drain_before_delete = false
              + etcd_role           = true
              + labels              = (known after apply)
              + machine_labels      = {
                  + "team" = "devops"
                }
              + name                = "master"
              + paused              = false
              + quantity            = 1
              + worker_role         = true

              + machine_config {
                  + api_version = "elemental.cattle.io/v1beta1"
                  + kind        = "MachineInventorySelectorTemplate"
                  + name        = "dev-pool-1"
                }
            }
          + machine_selector_config (known after apply)
          + machine_selector_files (known after apply)
        }
    }

Plan: 1 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

Type yes here to start creating the cluster, then wait for it to come up.


rancher2_cluster_v2.demo: Creating...
rancher2_cluster_v2.demo: Still creating... [00m10s elapsed]
rancher2_cluster_v2.demo: Still creating... [00m20s elapsed]
rancher2_cluster_v2.demo: Still creating... [00m30s elapsed]
rancher2_cluster_v2.demo: Still creating... [00m40s elapsed]
rancher2_cluster_v2.demo: Still creating... [00m50s elapsed]
rancher2_cluster_v2.demo: Still creating... [01m00s elapsed]
rancher2_cluster_v2.demo: Still creating... [01m10s elapsed]
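As an alternative to answering the interactive prompt, the plan can be saved to a file, reviewed, and then applied exactly as saved. This is a small sketch using terraform plan -out and terraform apply with a plan file; the function name plan_then_apply and the default file name demo.tfplan are illustrative.

```shell
# plan_then_apply [PLAN_FILE]
# Saves the execution plan to PLAN_FILE and applies it only if planning succeeded.
plan_then_apply() {
  local plan_file="${1:-demo.tfplan}"
  # Applying a saved plan does not ask for confirmation, so inspect it
  # first (for example with "terraform show demo.tfplan") when run by hand.
  terraform plan -out="$plan_file" && terraform apply "$plan_file"
}

# Example usage:
#   plan_then_apply
```

Applying a saved plan means Terraform performs the actions that were reviewed, even if main.tf has been edited in the meantime.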

In the Rancher UI, the cluster is shown as being created.

When output like the following appears, the cluster has been created successfully.

rancher2_cluster_v2.demo: Still creating... [08m50s elapsed]
rancher2_cluster_v2.demo: Still creating... [09m00s elapsed]
rancher2_cluster_v2.demo: Still creating... [09m10s elapsed]
rancher2_cluster_v2.demo: Creation complete after 9m17s [id=fleet-default/demo]

Apply complete! Resources: 1 added, 0 changed, 0 destroyed.
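The plan output above lists a sensitive kube_config attribute on rancher2_cluster_v2.demo, so one possible next step is to expose it as a Terraform output and write it to a file. The following is a sketch only; the file names outputs.tf and demo.kubeconfig and the function name save_kubeconfig are my own.

```shell
# Expose the cluster's kubeconfig as a sensitive output.
cat > outputs.tf << 'EOF'
output "kube_config" {
  value     = rancher2_cluster_v2.demo.kube_config
  sensitive = true
}
EOF

# save_kubeconfig [FILE]
# Writes the kube_config output to FILE, readable by the owner only.
save_kubeconfig() {
  local target="${1:-demo.kubeconfig}"
  # umask 077 inside the subshell makes the new file owner-only.
  ( umask 077 && terraform output -raw kube_config > "$target" )
}

# Example usage (run "terraform apply" once more first so the new
# output is recorded in the state):
#   save_kubeconfig
#   kubectl --kubeconfig demo.kubeconfig get nodes
```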