集群准备工作

准备工作

2.SeaTunnel 安装

(1)下载seatunnel安装包

Apache SeaTunnel

(2)解压下载好的tar.gz包

tar -zxvf /export/server/apache-seatunnel-2.3.3-bin.tar.gz -C ./

(3)查看Seatunnel使用的脚本

cd /export/server/apache-seatunnel-2.3.3目录下

install-plugin.sh                              --安装连接器脚本
seatunnel-cluster.sh                           -–集群模式启动脚本
seatunnel-cluster.sh                           --本地模式启动脚本
start-seatunnel-flink-13-connector-v2.sh       –-flink1.2-1.4版本引擎启动脚本
start-seatunnel-flink-15-connector-v2.sh       –-flink1.5-1.6版本引擎启动脚本
start-seatunnel-spark-2-connector-v2.sh        –-saprk2.x版本引擎启动脚本
start-seatunnel-spark-3-connector-v2.sh        –-saprk3.x版本引擎启动脚本

(4)下载连接器

cd /export/server/apache-seatunnel-2.3.3
./bin/install-plugin.sh

3.配置环境变量

在/etc/profile.d/seatunnel.sh 中配置环境变量

这里其实应该是修改 /etc/profile,添加配置:

export SEATUNNEL_HOME=/export/server/apache-seatunnel-2.3.3
export PATH=$PATH:$SEATUNNEL_HOME/bin

立刻生效并且验证。

source /etc/profile           #使环境变量生效
echo $SEATUNNEL_HOME  #查看变量是否生效

4.配置 SeaTunnel Engine JVM

将 JVM 选项添加到$SEATUNNEL_HOME/bin/seatunnel-cluster.sh第一行

JAVA_OPTS="-Xms2G -Xmx2G"

这个只是指定 jvm 的大小,结合实际调整。

5.配置SeaTunnel

配置$SEATUNNEL_HOME/config/seatunnel.yaml文件

eg:

seatunnel:
  engine:
    history-job-expire-minutes: 1440
    backup-count: 1
    queue-type: blockingqueue
    print-execution-info-interval: 60
    print-job-metrics-info-interval: 60
    slot-service:
      dynamic-slot: true
    checkpoint:
      interval: 10000
      timeout: 60000
      storage:
        type: hdfs
        max-retained: 3
        plugin-config:
          namespace: /tmp/seatunnel/checkpoint_snapshot
          storage.type: hdfs
          fs.defaultFS: hdfs://cdh01:8020 # Ensure that the directory has written permission

本地测试只用本地文件的方式,并且创建爱你对应的 checkpoint 文件夹

mkdir -p ~/bigdata/seatunnel/checkpoint

本地文件如下:

seatunnel:
    engine:
        backup-count: 1
        queue-type: blockingqueue
        print-execution-info-interval: 60
        slot-service:
            dynamic-slot: true
        checkpoint:
            interval: 300000
            timeout: 10000
            storage:
                type: localfile
                max-retained: 3
                plugin-config:
                    # 这里改成 linux 的对应文件路径
                    namespace: C:\ProgramData\seatunnel\checkpoint\

6.配置SeaTunnel引擎

配置 $SEATUNNEL_HOME/config/hazelcast.yaml 文件

hazelcast:
  cluster-name: seatunnel
  network:
    rest-api:
      enabled: true
      endpoint-groups:
        CLUSTER_WRITE:
          enabled: true
        DATA:
          enabled: true
    join:
      tcp-ip:
        enabled: true
        member-list:
          # 指定集群的对应集群 IP
          - ip1
          - ip2
          - ip3
    port:
      auto-increment: false
      port: 5801
  properties:
    hazelcast.invocation.max.retry.count: 20
    hazelcast.tcp.join.port.try.count: 30
    hazelcast.logging.type: log4j2
    hazelcast.operation.generic.thread.count: 50

7.配置 SeaTunnel 引擎服务器

配置 $SEATUNNEL_HOME/config/hazelcast-client.yaml 文件

cluster-name客户端必须与 SeaTunnel Engine相同。否则,SeaTunnel Engine 将拒绝客户端请求。

eg:

hazelcast-client:
  cluster-name: seatunnel

  network:
    cluster-members:
      - localhost:5801
      - localhost:5802
      - localhost:5803

8.部署SeaTunnel分布式集群

(1) 拷贝安装包和配置文件

把安装包+配置文件拷贝到其他项目。

cd /export/server
   scp -r apache-seatunnel-2.3.3/ root@cdh02:$PWD
   scp -r apache-seatunnel-2.3.3/ root@cdh03:$PWD

cd /etc/profile.d/
   scp /etc/profile.d/seatunnel.sh root@cdh02:$PWD
   scp /etc/profile.d/seatunnel.sh root@cdh03:$PWD

(2)启动SeaTunnel集群

mkdir -p $SEATUNNEL_HOME/logs  -- 如果有请忽略
nohup $SEATUNNEL_HOME/bin/seatunnel-cluster.sh 2>&1 &   -- 每个节点启动集群

日志将写入 $SEATUNNEL_HOME/logs/seatunnel-engine-server.log

fake 测试验证:

#进入安装目录
$   cd /home/dh/bigdata/seatunnel-2.3.3/backend/apache-seatunnel-2.3.3

# 启动服务
$   ./bin/seatunnel.sh --config ./config/v2.batch.config.template -e local

(3)任务提交命令

$SEATUNNEL_HOME/bin/seatunnel.sh --config $SEATUNNEL_HOME/config/v2.batch.config.template

(4)任务停止命令

在$SEATUNNEL_HOME/logs/seatunnel-engine-server.log日志中查找运行的job_id

${SEATUNNEL_HOME}/bin/seatunnel.sh -can 749188983002497026   --job_id
$SEATUNNEL_HOME/bin/stop-seatunnel-cluster.sh  -- 停止集群

参考资料

https://blog.csdn.net/jenya007/article/details/132599219