
Installation

  • SeaTunnel Docker image

[Zeta local mode]

  • Dockerfile
    • Build the connectors you use into the image ahead of time (uses the default Zeta engine)
FROM openjdk:8

ENV SEATUNNEL_VERSION="2.3.9"
ENV SEATUNNEL_HOME="/opt/seatunnel"

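# Download and unpack the SeaTunnel binary release, then delete the archive to keep the layer small
RUN wget https://dlcdn.apache.org/seatunnel/${SEATUNNEL_VERSION}/apache-seatunnel-${SEATUNNEL_VERSION}-bin.tar.gz \
    && tar -xzf apache-seatunnel-${SEATUNNEL_VERSION}-bin.tar.gz \
    && mv apache-seatunnel-${SEATUNNEL_VERSION} ${SEATUNNEL_HOME} \
    && rm apache-seatunnel-${SEATUNNEL_VERSION}-bin.tar.gz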

# Download necessary driver JAR files and copy to the lib directory
RUN wget -P "$SEATUNNEL_HOME/lib/" "https://repo1.maven.org/maven2/com/mysql/mysql-connector-j/8.3.0/mysql-connector-j-8.3.0.jar" \
    && wget -P "$SEATUNNEL_HOME/lib/" "https://repo1.maven.org/maven2/org/postgresql/postgresql/42.7.3/postgresql-42.7.3.jar" \
    && wget -P "$SEATUNNEL_HOME/lib/" "https://repo1.maven.org/maven2/org/tikv/tikv-client-java/3.3.5/tikv-client-java-3.3.5.jar" \
    && wget -P "$SEATUNNEL_HOME/lib/" "https://repo1.maven.org/maven2/net/postgis/postgis-jdbc/2024.1.0/postgis-jdbc-2024.1.0.jar" \
    && wget -P "$SEATUNNEL_HOME/lib/" "https://repo1.maven.org/maven2/org/neo4j/driver/neo4j-java-driver/5.27.0/neo4j-java-driver-5.27.0.jar"

# Set the working directory
WORKDIR "$SEATUNNEL_HOME"

# Install the connector plugins listed in config/plugin_config (WORKDIR is already SEATUNNEL_HOME)
RUN sh bin/install-plugin.sh ${SEATUNNEL_VERSION}

 

 

  • Build the image
docker build -t seatunnel:2.3.9 -f Dockerfile .

 

  • Run locally with the Zeta engine, using seatunnel.streaming.conf as the example job
    • Tests a fake-data pipeline in STREAMING mode

[seatunnel.streaming.conf example]

 

env {
  parallelism = 2
  job.mode = "STREAMING"
  checkpoint.interval = 2000
}

source {
  FakeSource {
    parallelism = 2
    plugin_output = "fake"
    row.num = 16
    schema = {
      fields {
        name = "string"
        age = "int"
      }
    }
  }
}

sink {
  Console {
  }
}
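
Before the file is packaged into a ConfigMap, a quick local check can catch a missing block early. The sketch below is only a superficial text check, not a HOCON parser; `check_seatunnel_conf` is a hypothetical helper name, and it assumes each of the `env`, `source`, and `sink` blocks opens at the start of a line, as in the example above.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of SeaTunnel).
# Returns 0 only if the file exists and contains top-level
# env, source, and sink blocks, each opened at the start of a line.
check_seatunnel_conf() {
  conf="$1"
  [ -f "$conf" ] || { echo "missing file: $conf" >&2; return 1; }
  for block in env source sink; do
    grep -Eq "^${block}[[:space:]]*\{" "$conf" || {
      echo "missing top-level block: $block" >&2
      return 1
    }
  done
}
```

For example, `check_seatunnel_conf seatunnel.streaming.conf && echo "config looks complete"` prints the message only when all three blocks are found.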

 

  • Create the seatunnel-config ConfigMap
kubectl create cm seatunnel-config \
--from-file=seatunnel.streaming.conf=seatunnel.streaming.conf
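
To see what the command above would create without touching the cluster, kubectl's client-side dry run can render the manifest instead. This is a minimal sketch: `preview_seatunnel_configmap` and the `KUBECTL` override variable are hypothetical names introduced here, and it assumes `seatunnel.streaming.conf` is in the current directory.

```shell
#!/bin/sh
# KUBECTL is a hypothetical override so the helper can be pointed at another binary.
KUBECTL="${KUBECTL:-kubectl}"

# Print the ConfigMap manifest as YAML; nothing is sent to the cluster.
preview_seatunnel_configmap() {
  "$KUBECTL" create configmap seatunnel-config \
    --from-file=seatunnel.streaming.conf=seatunnel.streaming.conf \
    --dry-run=client -o yaml
}
```

Calling `preview_seatunnel_configmap` should show the job file under the `data` key `seatunnel.streaming.conf`, which is the key the pod spec refers to.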

[seatunnel.yaml configuration]

  • Runs the streaming job defined in seatunnel.streaming.conf
  • The config file is mounted into the container through volumeMounts
apiVersion: v1
kind: Pod
metadata:
  name: seatunnel
spec:
  containers:
  - name: seatunnel
    image: seatunnel:2.3.9
    command: ["/bin/sh","-c","/opt/seatunnel/bin/seatunnel.sh --config /data/seatunnel.streaming.conf -e local"]
    resources:
      limits:
        cpu: "1"
        memory: 4G
      requests:
        cpu: "1"
        memory: 2G
    volumeMounts:
      - name: seatunnel-config
        mountPath: /data/seatunnel.streaming.conf
        subPath: seatunnel.streaming.conf
  volumes:
    - name: seatunnel-config
      configMap:
        name: seatunnel-config
        items:
          - key: seatunnel.streaming.conf
            path: seatunnel.streaming.conf

 

  • Apply seatunnel.yaml
kubectl create -f seatunnel.yaml
  • Use port-forward to check the execution logs in the web UI
kubectl port-forward -n default seatunnel 8080:8080
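
Because the pod runs the job with `-e local` and the sink is `Console`, the generated rows should also show up in the container's standard output. The sketch below waits for the pod and then tails that output; the function names and the `KUBECTL` override variable are hypothetical helpers, and the 120-second timeout is an arbitrary choice.

```shell
#!/bin/sh
# KUBECTL is a hypothetical override so the helpers can be pointed at another binary.
KUBECTL="${KUBECTL:-kubectl}"

# Block until the pod defined in seatunnel.yaml reports Ready, or the timeout expires.
wait_for_seatunnel() {
  "$KUBECTL" wait --for=condition=Ready pod/seatunnel --timeout=120s
}

# Stream the container's stdout, where the Console sink output is expected to appear.
follow_seatunnel_logs() {
  "$KUBECTL" logs -f seatunnel
}
```

A typical call is `wait_for_seatunnel && follow_seatunnel_logs`; press Ctrl+C to stop following the log stream.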

 

 

 
