[Airflow] Applying the git-sync sidecar to Airflow on k8s with Helm


The values.yaml file used when redeploying with KubernetesExecutor can also be used to enable the git-sync sidecar. Open the file and you will find a section named gitSync. Before editing it, though, let's generate an SSH key for accessing the git repo.

$ ssh-keygen -t rsa -b 4096 -C "your_email@example.com"


Register the generated SSH key with the repo you are going to use.


Then create a Secret object from the generated SSH key.

kubectl create secret generic airflow-ssh-secret \
--from-file=gitSshKey=/home/ysbaek/.ssh/id_rsa \
--from-file=known_hosts=/home/ysbaek/.ssh/known_hosts \
--from-file=id_rsa.pub=/home/ysbaek/.ssh/id_rsa.pub -n airflow
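
As a declarative alternative, the same Secret can be written as a manifest, mirroring the commented example in values.yaml (where the key must be named gitSshKey). The following is a minimal sketch, not the chart's own tooling: the function name render_ssh_secret is hypothetical, and the secret name and namespace defaults are taken from the command above.

```shell
# Prints a Secret manifest that stores an SSH private key under `gitSshKey`.
# $1: path to the private key file
# $2: secret name (defaults to the name used in this post)
# $3: namespace   (defaults to the namespace used in this post)
render_ssh_secret() {
  key_file="$1"
  name="${2:-airflow-ssh-secret}"
  namespace="${3:-airflow}"
  # Values under `data` must be base64-encoded on a single line.
  encoded=$(base64 < "$key_file" | tr -d '\n')
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: ${name}
  namespace: ${namespace}
data:
  gitSshKey: ${encoded}
EOF
}

# Usage (assumes the key path from the command above):
#   render_ssh_secret /home/ysbaek/.ssh/id_rsa | kubectl apply -f -
```

Rendering the manifest first makes it easy to review what will be stored before it is applied to the cluster.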



Then I edited the enabled, branch, rev, subPath, credentialsSecret, and sshKeySecret parts of values.yaml. See the YAML below for an example of the changes.

Modified values.yaml example
# Git sync
dags:
  persistence:
    # Annotations for dags PVC
    annotations: {}
    # Enable persistent volume for storing dags
    enabled: false
    # Volume size for dags
    size: 1Gi
    # If using a custom storageClass, pass name here
    storageClassName:
    # access mode of the persistent volume
    accessMode: ReadWriteOnce
    ## the name of an existing PVC to use
    existingClaim:
    ## optional subpath for dag volume mount
    subPath: ~
  gitSync:
    enabled: true

    # git repo clone url
    # ssh example: git@github.com:apache/airflow.git
    # https example: https://github.com/apache/airflow.git
    repo: git@github.com:{your-id}/{your-repo}.git
    branch: {branch-name}
    rev: HEAD
    depth: 1
    # the number of consecutive failures allowed before aborting
    maxFailures: 10 # if this value is left unchanged, airflow-triggerer-0 / airflow-scheduler get restarted.
    # subpath within the repo where dags are located
    # should be "" if dags are at repo root
    subPath: "dags/"
    # if your repo needs a user name password
    # you can load them to a k8s secret like the one below
    #   ---
    #   apiVersion: v1
    #   kind: Secret
    #   metadata:
    #     name: git-credentials
    #   data:
    #     GIT_SYNC_USERNAME: <base64_encoded_git_username>
    #     GIT_SYNC_PASSWORD: <base64_encoded_git_password>
    # and specify the name of the secret below
    #
    # credentialsSecret: git-credentials
    #
    #
    # If you are using an ssh clone url, you can load
    # the ssh private key to a k8s secret like the one below
    #   ---
    #   apiVersion: v1
    #   kind: Secret
    #   metadata:
    #     name: airflow-ssh-secret
    #   data:
    #     # key needs to be gitSshKey
    #     gitSshKey: <base64_encoded_data>
    # and specify the name of the secret below
    sshKeySecret: airflow-ssh-secret
    #
    # If you are using an ssh private key, you can additionally
    # specify the content of your known_hosts file, example:
    #
    # knownHosts: |
    #    <host1>,<ip1> <key1>
    #    <host2>,<ip2> <key2>

    # interval between git sync attempts in seconds
    # high values are more likely to cause DAGs to become out of sync between different components
    # low values cause more traffic to the remote git repository
    wait: 60
    containerName: git-sync
    uid: 65533

    # When not set, the values defined in the global securityContext will be used
    securityContext: {}
    #  runAsUser: 65533
    #  runAsGroup: 0

    securityContexts:
      container: {}

    # Mount additional volumes into git-sync. It can be templated like in the following example:
    #   extraVolumeMounts:
    #     - name: my-templated-extra-volume
    #       mountPath: path
    #       readOnly: true
    extraVolumeMounts: []
    env: []
    # Supported env vars for gitsync can be found at https://github.com/kubernetes/git-sync
    # - name: ""
    #   value: ""

    resources: {}
    #  limits:
    #   cpu: 100m
    #   memory: 128Mi
    #  requests:
    #   cpu: 100m
    #   memory: 128Mi
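
Since only a handful of keys differ from the chart defaults, the changes could also be kept in a small override file. The sketch below assumes that approach. The function name write_gitsync_overrides and the file name gitsync-values.yaml are hypothetical, and the values written are the ones shown in the example above.

```shell
# Writes an override file containing only the gitSync keys changed in this post.
# $1: output path
# $2: SSH clone URL of the DAG repo (placeholder, supply your own)
# $3: branch name (placeholder, supply your own)
write_gitsync_overrides() {
  out_file="$1"
  repo="$2"
  branch="$3"
  cat > "$out_file" <<EOF
dags:
  gitSync:
    enabled: true
    repo: "${repo}"
    branch: "${branch}"
    rev: HEAD
    maxFailures: 10
    subPath: "dags/"
    sshKeySecret: airflow-ssh-secret
EOF
}

# Usage:
#   write_gitsync_overrides gitsync-values.yaml git@github.com:{your-id}/{your-repo}.git {branch-name}
#   helm upgrade --install airflow apache-airflow/airflow -n airflow -f values.yaml -f gitsync-values.yaml
```

When several -f files are passed, Helm merges them in order and the later file takes precedence, so the base values.yaml can stay untouched.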


After that, redeploy Airflow with helm.

$ helm upgrade --install airflow apache-airflow/airflow -n airflow -f values.yaml --debug


Once the deployment is complete, check the Pod status.
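
One way to do this check from the command line is sketched below. The helper not_ready_pods is hypothetical. The deployment name airflow-scheduler is an assumption based on the release name used above, and the container name git-sync comes from containerName in values.yaml.

```shell
# Reads `kubectl get pods --no-headers` output on stdin and prints the names of
# pods whose READY column (e.g. 2/3) shows fewer ready containers than expected.
# Pods in the Completed state (finished jobs) are skipped.
not_ready_pods() {
  awk '$3 != "Completed" { split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}

# Usage:
#   kubectl get pods -n airflow --no-headers | not_ready_pods
#   # Inspect the git-sync sidecar of the scheduler if it is not ready:
#   kubectl logs -n airflow deploy/airflow-scheduler -c git-sync --tail=20
```

If a pod keeps restarting, the git-sync container logs are usually the first place to look for clone or authentication errors.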

