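Orchestrator stores its management data in a backend MySQL or SQLite database. Taking MySQL as the example, setting up an Orchestrator environment first requires a backend MySQL database. Installing MySQL itself is not covered here. Once it is running, write the MySQL account, password, and related details into the configuration file, as follows: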
"MySQLOrchestratorHost": "127.0.0.1",
"MySQLOrchestratorPort": 3306,
"MySQLOrchestratorDatabase": "orchestrator",
"MySQLOrchestratorUser": "root",
"MySQLOrchestratorPassword": "123456",
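If installing MySQL feels like too much trouble and you only want to try Orchestrator quickly, SQLite is the recommended choice. Just add the following to the configuration file: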
"BackendDB": "sqlite",
"SQLite3DataFile": "/root/orchestrator/orchestrator.sqlite3",
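As a rough illustration of the SQLite route, the shell sketch below writes a configuration file containing only the two keys shown above. The function name, the file names, and the temporary directory are assumptions made for this example, and a real deployment will probably need further settings.

```shell
#!/bin/sh
# Write a minimal SQLite-backed config using only the two keys shown above.
# write_sqlite_config and both paths are hypothetical; adjust them as needed.
write_sqlite_config() {
    conf_file=$1   # where to write the JSON config
    data_file=$2   # where orchestrator should keep its SQLite file
    cat > "$conf_file" <<EOF
{
  "BackendDB": "sqlite",
  "SQLite3DataFile": "$data_file"
}
EOF
}

# Demo: generate the config in a scratch directory and show it.
ORCH_DEMO_DIR=$(mktemp -d)
write_sqlite_config "$ORCH_DEMO_DIR/orchestrator.conf.json" "$ORCH_DEMO_DIR/orchestrator.sqlite3"
cat "$ORCH_DEMO_DIR/orchestrator.conf.json"
```

The generated file can then be passed to orchestrator through --config.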
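4. Running commands
orchestrator runs a specific command through -c. Use orchestrator help to see the help for all commands, and orchestrator help relocate to see the help for one specific command, in this case relocate.
orchestrator provides many commands. The more important and commonly used ones are covered here; for the rest, consult the documentation or the source code.
For example, to run a command: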
./orchestrator --config=./f.json -c discover -i mysql_host_name
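Because every invocation repeats the binary path and the --config option, a small wrapper can cut down on typing. The sketch below is one possible way to do it; ORCHESTRATOR_BIN, ORCHESTRATOR_CONF, and orch are names invented for this example, not part of orchestrator.

```shell
#!/bin/sh
# Thin wrapper so the binary and config path are written only once.
# ORCHESTRATOR_BIN, ORCHESTRATOR_CONF and orch are made-up names for this sketch.
ORCHESTRATOR_BIN=${ORCHESTRATOR_BIN:-./orchestrator}
ORCHESTRATOR_CONF=${ORCHESTRATOR_CONF:-./f.json}

orch() {
    cmd=$1      # orchestrator command name, e.g. discover
    shift       # remaining arguments are passed through unchanged
    "$ORCHESTRATOR_BIN" --config="$ORCHESTRATOR_CONF" -c "$cmd" "$@"
}

# Usage (needs a real orchestrator binary and config file):
#   orch discover -i mysql_host_name
#   orch forget -i mysql_host_name
```

Keeping the binary path in a variable also makes a dry run easy: pointing ORCHESTRATOR_BIN at echo prints the command line instead of executing it.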
4.1 MySQL instance management commands
discover
forget
begin-maintenance
end-maintenance
in-maintenance
begin-downtime
end-downtime
discover
Discovers an instance together with its master and replica information, and writes what it finds to backend tables such as database_instance.
orchestrator --config=./f.json -c discover -i host_name
forget
Removes an instance's information, i.e. deletes the related records from the database_instance table.
orchestrator --config=./f.json -c forget -i host_name
begin-maintenance
Marks an instance as under maintenance by inserting a record into the database_instance_maintenance table.
orchestrator -c begin-maintenance -i instance.to.lock --duration=3h --reason="load testing; do not disturb"
end-maintenance
Marks an instance as out of maintenance by updating the related records in the database_instance_maintenance table.
orchestrator -c end-maintenance -i locked.instance
in-maintenance
Checks whether an instance is under maintenance by querying the database_instance_maintenance table.
orchestrator -c in-maintenance -i locked.instance
begin-downtime
Marks an instance as downtimed by inserting a record into the database_instance_downtime table.
orchestrator -c begin-downtime -i instance.to.downtime --duration=3h --reason="dba handling; do not do recovery"
end-downtime
Marks an instance as no longer downtimed by deleting its record from the database_instance_downtime table.
orchestrator -c end-downtime -i downtimed.instance
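The begin/end pairs above lend themselves to wrapping a task so that the maintenance flag is always released afterwards. The following is a minimal sketch under that assumption; with_maintenance and ORCHESTRATOR_BIN are invented names, and the host and script in the usage comment are placeholders.

```shell
#!/bin/sh
# Run a task while an instance is flagged as under maintenance.
# ORCHESTRATOR_BIN and with_maintenance are made-up names for this sketch.
ORCHESTRATOR_BIN=${ORCHESTRATOR_BIN:-orchestrator}

with_maintenance() {
    instance=$1; duration=$2; reason=$3
    shift 3     # what is left is the task to run
    "$ORCHESTRATOR_BIN" -c begin-maintenance -i "$instance" \
        --duration="$duration" --reason="$reason" || return 1
    "$@"
    rc=$?       # remember the task's exit code
    # Release the flag even when the task failed.
    "$ORCHESTRATOR_BIN" -c end-maintenance -i "$instance"
    return "$rc"
}

# Usage (hypothetical host and script):
#   with_maintenance db1.example.com 3h "load testing; do not disturb" ./run_load_test.sh
```

The same pattern should carry over to begin-downtime and end-downtime by swapping the command names.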
4.2 MySQL instance information query commands
find
search
clusters
clusters-alias
all-clusters-masters
topology
topology-tabulated
all-instances
which-instance
which-cluster
which-cluster-domain
which-heuristic-domain-instance
which-cluster-master
which-cluster-instances
which-cluster-osc-replicas
which-cluster-gh-ost-replicas
which-master
which-downtimed-instances
which-replicas
which-lost-in-recovery
instance-status
get-cluster-heuristic-lag
find
Searches instance names by regular expression.
orchestrator -c find -pattern "backup.*us-east"
search
Searches instance names by keyword match.
orchestrator -c search -pattern "search string"
clusters
Lists the names of all MySQL clusters, obtained through an SQL query against database_instance and related tables.
orchestrator -c clusters
clusters-alias
Lists all MySQL cluster names along with their aliases.
orchestrator -c clusters-alias
all-clusters-masters
Lists the writable master of every MySQL cluster.
orchestrator -c all-clusters-masters
topology
Prints the topology of the cluster the instance belongs to.
orchestrator -c topology -i a.topology
topology-tabulated
Prints the topology of the cluster the instance belongs to. Similar to topology, with a slightly different output format.
orchestrator -c topology-tabulated -i a.topology
all-instances
Lists all known instances.
orchestrator -c all-instances
which-instance
Prints the full information for an instance.
orchestrator -c which-instance -i instance.to.check
which-cluster
Prints the name of the cluster the MySQL instance belongs to.
orchestrator -c which-cluster -i instance.to.check
which-cluster-domain
Prints the domain name of the cluster the MySQL instance belongs to.
orchestrator -c which-cluster-domain -i instance.to.check
which-heuristic-domain-instance
Given a cluster domain name, prints the writable instance associated with it.
orchestrator -c which-heuristic-domain-instance -alias some_alias
which-cluster-master
Prints the master of the cluster the instance belongs to.
orchestrator -c which-cluster-master -i instance.to.check
which-cluster-instances
Lists all instances in the cluster the instance belongs to.
orchestrator -c which-cluster-instances -i instance.to.check
which-master
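Prints the master of the given instance, i.e. the server it replicates from. Similar to which-cluster-master, which reports the master of the whole cluster.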
orchestrator -c which-master -i plica
which-downtimed-instances
Lists the instances that are currently downtimed.
orchestrator -c which-downtimed-instances
which-replicas
Prints the replicas of an instance.
orchestrator -c which-replicas -i a.known.instance
which-lost-in-recovery
Lists downtimed instances that were lost during a failure recovery.
orchestrator -c which-lost-in-recovery
instance-status
Prints the status of an instance.
orchestrator -c instance-status -i instance.to.investigate
get-cluster-heuristic-lag
Prints the maximum replication lag of the cluster the instance belongs to.
orchestrator -c get-cluster-heuristic-lag -i instance.that.is.part.of.cluster
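Query commands like these are easy to use from monitoring scripts. As one hedged example, the sketch below turns get-cluster-heuristic-lag into a threshold check. It assumes the command prints a single non-negative integer (seconds of lag), which should be verified against your orchestrator version; lag_within and ORCHESTRATOR_BIN are names invented for this example.

```shell
#!/bin/sh
# Check whether a cluster's heuristic lag stays within a threshold.
# Assumption: the command prints one non-negative integer (seconds).
# ORCHESTRATOR_BIN and lag_within are made-up names for this sketch.
ORCHESTRATOR_BIN=${ORCHESTRATOR_BIN:-orchestrator}

lag_within() {
    instance=$1     # any instance that belongs to the cluster
    max_seconds=$2  # acceptable lag in seconds
    lag=$("$ORCHESTRATOR_BIN" -c get-cluster-heuristic-lag -i "$instance") || return 2
    case $lag in
        ''|*[!0-9]*) return 2 ;;   # unexpected output: report an error
    esac
    # Exit status 0 when lag is acceptable, 1 when it is too high.
    [ "$lag" -le "$max_seconds" ]
}

# Usage (hypothetical host):
#   lag_within instance.that.is.part.of.cluster 30 || echo "lag too high or unknown"
```

Returning 2 for unparsable output keeps "lag too high" and "could not tell" distinguishable for the caller.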
4.3 Failure recovery commands
recover
recover-lite
force-master-failover
force-master-takeover
graceful-master-takeover
replication-analysis
ack-all-recoveries
ack-cluster-recoveries
ack-instance-recoveries
relocate
recover
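Fails over a master. It only takes effect when the master is down: -i must name the master that is already down. The new master does not need to be specified; orchestrator chooses it.
orchestrator -c recover -i dead.instance --debug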
recover-lite
Fails over a master like recover, but simplifies some of the steps and is more lightweight.
orchestrator -c recover-lite -i dead.instance --debug
force-master-failover
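Forces a failover regardless of whether the master is healthy. The old master is not shut down after the switch, and the new master is chosen by orchestrator rather than specified. This operation is risky; use it with caution.
orchestrator -c force-master-failover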
force-master-takeover
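Forces a master switch regardless of whether the master is healthy. -i names any instance in the cluster and -d names the new master. Note that after the switch the old master does not replicate from the new master; that has to be set up manually.
orchestrator -c force-master-takeover -i levant.cluster -d immediate.child.of.master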
graceful-master-takeover
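Performs a planned master switch. The old master is repointed at the new master, but its replication threads are stopped, so start slave has to be run manually to resume replication.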
orchestrator -c graceful-master-takeover -i levant.cluster -d immediate.child.of.master
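The two steps described for graceful-master-takeover (the switch, then the manual start slave on the old master) could be chained roughly as follows. This is only a sketch: planned_switchover, ORCHESTRATOR_BIN, and MYSQL_BIN are invented names, the mysql client is assumed to find its credentials in its own option file, and passing the old master as -i relies on it being an instance of the relevant cluster.

```shell
#!/bin/sh
# Planned switchover followed by the manual "start slave" step.
# ORCHESTRATOR_BIN, MYSQL_BIN and planned_switchover are made-up names;
# mysql client credentials are assumed to come from its option file.
ORCHESTRATOR_BIN=${ORCHESTRATOR_BIN:-orchestrator}
MYSQL_BIN=${MYSQL_BIN:-mysql}

planned_switchover() {
    old_master=$1   # current master host
    new_master=$2   # replica to promote
    # Stop here if the takeover itself fails.
    "$ORCHESTRATOR_BIN" -c graceful-master-takeover -i "$old_master" -d "$new_master" || return 1
    # The old master now points at the new one, but replication is stopped.
    "$MYSQL_BIN" -h "$old_master" -e "START SLAVE"
}

# Usage (hypothetical hosts):
#   planned_switchover old-master.example.com new-master.example.com
```

In practice it is worth inspecting the topology between the two steps before resuming replication.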
replication-analysis
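Analyzes the known topology for potential failure events. The output format is not stable and may change in the future, so relying on this feature is not recommended.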
orchestrator -c replication-analysis
ack-all-recoveries
ack-cluster-recoveries
ack-instance-recoveries
Acknowledge existing recoveries so that they do not block failover the next time a failure occurs.
orchestrator -c ack-all-recoveries --reason="dba has taken necessary steps"
orchestrator -c ack-cluster-recoveries -i instance.in.a.cluster --reason="reason message"
orchestrator -c ack-instance-recoveries -i instance.that.failed --reason="reason message"
relocate
Adjusts the topology: the instance given by -i becomes a replica of the instance given by -d.
orchestrator -c relocate -i replica.to.relocate -d instance.that.becomes.its.master
5. Automatic failover
Orchestrator can be configured to detect master failures automatically and carry out the failover.