[Airflow] Applying the git-sync sidecar to Airflow on k8s with Helm
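You can enable the git-sync sidecar through the same values.yaml file that was used when redeploying with the KubernetesExecutor. If you open that file, you will find a settings section called gitSync. Before editing it, though, let's generate an SSH key for accessing the git repo.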
$ ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
Register the generated SSH key with the repo you are going to use.
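As a rough sketch of the registration step, the helpers below print the public key to paste into the repo settings and then try an SSH login against GitHub. The function names are made up for illustration, and the key path assumes ssh-keygen wrote to its default location (~/.ssh/id_rsa); adjust it if you chose a different file, and swap the host if your repo is not on GitHub.

```shell
# Assumed location of the key pair created by ssh-keygen (its default path).
KEY_PATH="$HOME/.ssh/id_rsa"

# Print the public half of the key pair; this is the text to register
# with the repo (for example as a deploy key).
show_public_key() {
  cat "${KEY_PATH}.pub"
}

# Try to authenticate against GitHub with the private key.
# GitHub does not provide shell access, so a non-zero exit status
# after the greeting message is normal.
test_github_access() {
  ssh -i "$KEY_PATH" -T git@github.com
}
```

Run show_public_key first, register the output with the repo, and then run test_github_access to see whether the key is accepted.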
Then create a Secret object from the generated SSH key.
kubectl create secret generic airflow-ssh-secret \
--from-file=gitSshKey=/home/ysbaek/.ssh/id_rsa \
--from-file=known_hosts=/home/ysbaek/.ssh/known_hosts \
--from-file=id_rsa.pub=/home/ysbaek/.ssh/id_rsa.pub -n airflow
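Before moving on, it may be worth confirming that the secret really contains a gitSshKey entry, since the chart comments in values.yaml say the key needs to have that name. The helper functions below are a hypothetical sketch built only on standard kubectl subcommands; the namespace and secret name are the ones used in this post.

```shell
NAMESPACE=airflow
SECRET_NAME=airflow-ssh-secret

# List the keys stored in the secret (values are not printed);
# gitSshKey should appear among them.
list_secret_keys() {
  kubectl describe secret "$SECRET_NAME" -n "$NAMESPACE"
}

# Decode the private key and print only its first line (the header line),
# so the key body itself is never shown in the terminal.
show_key_header() {
  kubectl get secret "$SECRET_NAME" -n "$NAMESPACE" \
    -o jsonpath='{.data.gitSshKey}' | base64 -d | head -n 1
}
```

If show_key_header prints a line beginning with -----BEGIN, the private key was most likely stored intact.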
Next, I edited the enabled, branch, rev, subPath, credentialsSecret, and sshKeySecret entries in values.yaml. See the YAML below for an example of the changes.
Example of the modified values.yaml:
# Git sync
dags:
  persistence:
    # Annotations for dags PVC
    annotations: {}
    # Enable persistent volume for storing dags
    enabled: false
    # Volume size for dags
    size: 1Gi
    # If using a custom storageClass, pass name here
    storageClassName:
    # access mode of the persistent volume
    accessMode: ReadWriteOnce
    ## the name of an existing PVC to use
    existingClaim:
    ## optional subpath for dag volume mount
    subPath: ~
  gitSync:
    enabled: true
    # git repo clone url
    # ssh example: git@github.com:apache/airflow.git
    # https example: https://github.com/apache/airflow.git
    repo: git@github.com:{your-id}/{your-repo}.git
    branch: {branch-name}
    rev: HEAD
    depth: 1
    # the number of consecutive failures allowed before aborting
    maxFailures: 10  # if this value is not changed, airflow-triggerer-0 / airflow-scheduler get restarted
    # subpath within the repo where dags are located
    # should be "" if dags are at repo root
    subPath: "dags/"
    # if your repo needs a user name password
    # you can load them to a k8s secret like the one below
    #   ---
    #   apiVersion: v1
    #   kind: Secret
    #   metadata:
    #     name: git-credentials
    #   data:
    #     GIT_SYNC_USERNAME: <base64_encoded_git_username>
    #     GIT_SYNC_PASSWORD: <base64_encoded_git_password>
    # and specify the name of the secret below
    #
    # credentialsSecret: git-credentials
    #
    #
    # If you are using an ssh clone url, you can load
    # the ssh private key to a k8s secret like the one below
    #   ---
    #   apiVersion: v1
    #   kind: Secret
    #   metadata:
    #     name: airflow-ssh-secret
    #   data:
    #     # key needs to be gitSshKey
    #     gitSshKey: <base64_encoded_data>
    # and specify the name of the secret below
    sshKeySecret: airflow-ssh-secret
    #
    # If you are using an ssh private key, you can additionally
    # specify the content of your known_hosts file, example:
    #
    # knownHosts: |
    #    <host1>,<ip1> <key1>
    #    <host2>,<ip2> <key2>
    # interval between git sync attempts in seconds
    # high values are more likely to cause DAGs to become out of sync between different components
    # low values cause more traffic to the remote git repository
    wait: 60
    containerName: git-sync
    uid: 65533
    # When not set, the values defined in the global securityContext will be used
    securityContext: {}
    #  runAsUser: 65533
    #  runAsGroup: 0
    securityContexts:
      container: {}
    # Mount additional volumes into git-sync. It can be templated like in the following example:
    #   extraVolumeMounts:
    #     - name: my-templated-extra-volume
    #       mountPath: path
    #       readOnly: true
    extraVolumeMounts: []
    env: []
    # Supported env vars for gitsync can be found at https://github.com/kubernetes/git-sync
    # - name: ""
    #   value: ""
    resources: {}
    #  limits:
    #   cpu: 100m
    #   memory: 128Mi
    #  requests:
    #   cpu: 100m
    #   memory: 128Mi
After that, redeploy Airflow with helm.
$ helm upgrade --install airflow apache-airflow/airflow -n airflow -f values.yaml --debug
Once the deployment is complete, check the Pod status.
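One way to do that check, sketched with plain kubectl commands: list the pods in the airflow namespace, then read the logs of the git-sync container in one of them. The function names are hypothetical; the container name git-sync comes from the containerName setting in values.yaml, and the pod name is passed in because it depends on your release.

```shell
NAMESPACE=airflow

# List the Airflow pods. With gitSync enabled, pods such as the scheduler
# should report an additional container (the git-sync sidecar).
check_pods() {
  kubectl get pods -n "$NAMESPACE"
}

# Show the most recent log lines of the git-sync sidecar in the given pod.
git_sync_logs() {
  kubectl logs "$1" -c git-sync -n "$NAMESPACE" --tail=20
}
```

For example, git_sync_logs airflow-triggerer-0 shows whether the sidecar in the triggerer pod is cloning the repo successfully or failing, which is where an SSH key or known_hosts problem would likely show up.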