dolphinscheduler(3.1.4)

IP1 master 5678:接收来自 Worker 节点的请求,并管理任务调度和工作流执行。 5679
worker 1234:主要用于Worker节点向Master节点汇报心跳、接收任务分配等 1235:Worker节点上运行的自定义插件与外部系统进行通信
api-server 12345:Web UI 端口,用于展示 Master 节点的状态和管理界面
IP2 worker 1234:主要用于Worker节点向Master节点汇报心跳、接收任务分配等 1235:Worker节点上运行的自定义插件与外部系统进行通信
IP3 worker 1234:主要用于Worker节点向Master节点汇报心跳、接收任务分配等 1235:Worker节点上运行的自定义插件与外部系统进行通信

前置准备工作

dolphinScheduler需要依赖jdk、hadoop、zookeeper、postgresql等,部署集群前请先完成以上部署

配置用户免密及权限

创建部署用户,并且一定要配置 sudo 免密。以创建 dolphinscheduler 用户为例

1
2
3
4
5
6
7
8
9
10
11
12
# 创建用户需使用 root 登录
useradd app

# 添加密码
echo "app" | passwd --stdin dolphinscheduler

# 配置 sudo 免密
sed -i '$app ALL=(ALL) NOPASSWD: NOPASSWD: ALL' /etc/sudoers
sed -i 's/Defaults requirett/#Defaults requirett/g' /etc/sudoers

# 修改目录权限,使得部署用户对二进制包解压后的 apache-dolphinscheduler-*-bin 目录有操作权限
chown -R app:app apache-dolphinscheduler-*-bin

注:以上需要全部集群都配置

配置机器SSH免密登陆

1
2
3
4
5
su dolphinscheduler

ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys

注意: 配置完成后,可以通过运行命令 ssh localhost 判断是否成功,如果不需要输入密码就能ssh登陆则证明成功

修改相关配置

完成基础环境的准备后,需要根据你的机器环境修改配置文件。配置文件可以在目录 bin/env 中找到,他们分别是 并命名为 install_env.sh 和 dolphinscheduler_env.sh

修改 **install_env.sh** 文件

文件 install_env.sh 描述了哪些机器将被安装 DolphinScheduler 以及每台机器对应安装哪些服务。您可以在路径 bin/env/install_env.sh 中找到此文件,可通过以下方式更改env变量,export <ENV_NAME>=,配置详情如下。

注:使用hosts可能会有一些节点故障死锁问题,使用具体ip解决

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
ips=${ips:-"IP1,IP2,IP3"}

# Port of SSH protocol, default value is 22. For now we only support same port in all `ips` machine
# modify it if you use different ssh port
sshPort=${sshPort:-"22"}

# A comma separated list of machine hostname or IP would be installed Master server, it
# must be a subset of configuration `ips`.
# Example for hostnames: masters="ds1,ds2", Example for IPs: masters="192.168.8.1,192.168.8.2"
masters=${masters:-"IP1,IP2"}

# A comma separated list of machine <hostname>:<workerGroup> or <IP>:<workerGroup>.All hostname or IP must be a
# subset of configuration `ips`, And workerGroup have default value as `default`, but we recommend you declare behind the hosts
# Example for hostnames: workers="ds1:default,ds2:default,ds3:default", Example for IPs: workers="192.168.8.1:default,192.168.8.2:default,192.168.8.3:default"
workers=${workers:-"IP1:default,IP2:default,IP3:default"}

# A comma separated list of machine hostname or IP would be installed Alert server, it
# must be a subset of configuration `ips`.
# Example for hostname: alertServer="ds3", Example for IP: alertServer="192.168.8.3"
alertServer=${alertServer:-"IP3"}

# A comma separated list of machine hostname or IP would be installed API server, it
# must be a subset of configuration `ips`.
# Example for hostname: apiServers="ds1", Example for IP: apiServers="192.168.8.1"
apiServers=${apiServers:-"IP1"}

# The directory to install DolphinScheduler for all machine we config above. It will automatically be created by `install.sh` script if not exists.
# Do not set this configuration same as the current path (pwd). Do not add quotes to it if you using related path.
installPath=${installPath:-"/data/dolphinscheduler"}

# The user to deploy DolphinScheduler for all machine we config above. For now user must create by yourself before running `install.sh`
# script. The user needs to have sudo privileges and permissions to operate hdfs. If hdfs is enabled than the root directory needs
# to be created by this user
deployUser=${deployUser:-"app"}

# The root of zookeeper, for now DolphinScheduler default registry server is zookeeper.
zkRoot=${zkRoot:-"/dolphinscheduler"}

修改 **dolphinscheduler_env.sh** 文件

ps:只配置需要的就行

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

# JAVA_HOME, will use it to start DolphinScheduler server
export JAVA_HOME=${JAVA_HOME:-/data/jdk1.8.0_231}

# Database related configuration, set database type, username and password
export DATABASE=${DATABASE:-postgresql}
export SPRING_PROFILES_ACTIVE=${DATABASE}
export SPRING_DATASOURCE_URL="jdbc:postgresql://IP6:5432/dolphinscheduler"
export SPRING_DATASOURCE_USERNAME="app"
export SPRING_DATASOURCE_PASSWORD="yourpassword"

# DolphinScheduler server related configuration
export SPRING_CACHE_TYPE=${SPRING_CACHE_TYPE:-none}
export SPRING_JACKSON_TIME_ZONE=${SPRING_JACKSON_TIME_ZONE:-UTC}
export MASTER_FETCH_COMMAND_NUM=${MASTER_FETCH_COMMAND_NUM:-10}

# Registry center configuration, determines the type and link of the registry center
export REGISTRY_TYPE=${REGISTRY_TYPE:-zookeeper}
export REGISTRY_ZOOKEEPER_CONNECT_STRING=${REGISTRY_ZOOKEEPER_CONNECT_STRING:-queryServer1:2181,queryServer2:2181,queryServer3:2181}

# Tasks related configurations, need to change the configuration if you use the related tasks.
export HADOOP_HOME=${HADOOP_HOME:-/data/hadoop-3.3.0}
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-/data/hadoop-3.3.0/etc/hadoop}
export SPARK_HOME1=${SPARK_HOME1:-/data/spark}
export SPARK_HOME2=${SPARK_HOME2:-/opt/soft/spark2}
export PYTHON_HOME=${PYTHON_HOME:-/opt/soft/python}
export HIVE_HOME=${HIVE_HOME:-/opt/soft/hive}
export FLINK_HOME=${FLINK_HOME:-/opt/soft/flink}
export DATAX_HOME=${DATAX_HOME:-/opt/soft/datax}
export SEATUNNEL_HOME=${SEATUNNEL_HOME:-/opt/soft/seatunnel}
export CHUNJUN_HOME=${CHUNJUN_HOME:-/opt/soft/chunjun}

export PATH=$HADOOP_HOME/bin:$SPARK_HOME1/bin:$SPARK_HOME2/bin:$PYTHON_HOME/bin:$JAVA_HOME/bin:$HIVE_HOME/bin:$FLINK_HOME/bin:$DATAX_HOME/bin:$SEATUNNEL_HOME/bin:$CHUNJUN_HOME/bin:$PATH

安装初始化数据库

对于 PostgreSQL:

1
2
3
4
5
6
7
8
9
10
11
12
# 采用命令行工具登陆 PostgreSQL
psql
# 创建数据库
postgres=# CREATE DATABASE dolphinscheduler;
# 修改 {user} 和 {password} 为你希望的用户名和密码
postgres=# CREATE USER {user} PASSWORD {password};
postgres=# ALTER DATABASE dolphinscheduler OWNER TO {user};
# 退出 PostgreSQL
postgres=#\q
# 在终端执行如下命令,向配置文件新增登陆权限,并重载 PostgreSQL 配置,替换 {ip} 为对应的 DS 集群服务器 IP 地址段
echo "host dolphinscheduler {user} {ip} md5" >> $PGDATA/pg_hba.conf
pg_ctl reload

对于 PostgreSQL (dolphinscheduler_env.sh配置文件):

1
2
3
4
5
6
# for postgresql
export DATABASE=${DATABASE:-postgresql}
export SPRING_PROFILES_ACTIVE=${DATABASE}
export SPRING_DATASOURCE_URL="jdbc:postgresql://127.0.0.1:5432/dolphinscheduler"
export SPRING_DATASOURCE_USERNAME={user}
export SPRING_DATASOURCE_PASSWORD={password}

完成上述步骤后,您已经为 DolphinScheduler 创建一个新数据库,现在你可以通过快速的 Shell 脚本来初始化数据库

1
bash tools/bin/upgrade-schema.sh

启动 DolphinScheduler【computeServer1主机执行】

使用上面创建的部署用户运行以下命令完成部署,部署后的运行日志将存放在 logs 文件夹内

1
2
bash ./bin/install.sh

注意: 第一次部署的话,可能出现 5 次sh: bin/dolphinscheduler-daemon.sh: No such file or directory相关信息,此为非重要信息直接忽略即可

出现下图所示即为成功

image.png

默认账号密码:admin/dolphinscheduler123