搭建MySQL InnoDB 集群

介绍

东西既然是别人的,自然是人家的介绍最准确喽(就知道偷懒)

MySQL InnoDB Cluster 为 MySQL 提供了完整的高可用性解决方案。通过使用 MySQL Shell中包含的 AdminAPI,您可以轻松地配置和管理一组至少三个 MySQL 服务器实例以充当 InnoDB 集群。

InnoDB Cluster 中的每个 MySQL 服务器实例都运行 MySQL Group Replication,它提供了在 InnoDB Cluster 中复制数据的机制,具有内置的故障转移功能。AdminAPI 消除了在 InnoDB 集群中直接使用 Group Replication 的需要,但有关更多信息,请参阅 解释详细信息的Group Replication。从 MySQL 8.0.27 开始,您还可以设置 InnoDB ClusterSet(请参阅 第 8 章,MySQL InnoDB ClusterSet),通过将主 InnoDB Cluster 与其在备用位置的一个或多个副本链接起来,为 InnoDB Cluster 部署提供容灾能力,例如不同的数据中心。

MySQL Router可以根据您部署的集群自动配置自己,将客户端应用程序透明地连接到服务器实例。如果服务器实例发生意外故障,集群会自动重新配置。在默认的单主模式下,InnoDB Cluster 有一个读写服务器实例——主实例。多个辅助服务器实例是主服务器的副本。如果主服务器发生故障,辅助服务器会自动提升为主服务器的角色。MySQL Router 检测到这一点并将客户端应用程序转发到新的主节点。高级用户还可以将集群配置为具有多个主节点。

下图概述了这些技术如何协同工作:

图 InnoDB Cluster 概览

三个 MySQL 服务器组合在一起作为一个高可用性集群。 其中一台服务器是读/写主实例,另外两台是只读从实例。 Group Replication 用于将数据从主实例复制到辅助实例。 MySQL Router 将客户端应用程序(在本例中为 MySQL 连接器)连接到主实例。

InnoDB Cluster 由至少三个 MySQL Server 实例组成,它提供高可用性和扩展功能。InnoDB Cluster 使用以下 MySQL 技术:

  • MySQL Shell,它是 MySQL 的高级客户端和代码编辑器。

  • MySQL Server 和Group Replication,它使一组 MySQL 实例能够提供高可用性。InnoDB Cluster 提供了一种替代的、易于使用的编程方式来使用组复制。

  • MySQL Router,一种轻量级中间件,可在您的应用程序和 InnoDB Cluster 之间提供透明路由。

旁白: mysql 的集群有两种方案(至少在8.0版本是这样)。一个是 mysql InnoDB Cluster. 一个是 mysql NDB Cluster。其中 NDB 是网络数据库,是基于内存的,这个以后有时间了再研究下,今天在这里我们只讨论下 mysql InnoDB Cluster.

前期准备

安装mysql及组件

我的虚拟器是 Centos7 系统。为了方便这里就直接采用 yum 命令来安装了

  • 配置 mysql yum 仓库

    • 下载 rpm 包

      文件地址:https://dev.mysql.com/downloads/repo/yum/

      image-20230204121628274

      因为我是 centos7 系统,所以我下载了红框内的

      下载后把文件传到服务器上或者在服务器上可以通过以下命令下载

      1
      2
      #命令:(如果不支持 wget 命令就执行 yum install wget 来安装下)
      wget https://dev.mysql.com/get/mysql80-community-release-el7-7.noarch.rpm
    • 安装配置

      执行:

      1
      sudo yum install mysql80-community-release-el7-7.noarch.rpm
  • 安装 MySQL-Server

    执行:

    1
    2
    3
    4
    5
    6
    sudo yum install mysql-community-server

    #启动mysql 命令 “systemctl start mysqld”

    #查看运行状态命令 “systemctl status mysqld”

    关于 mysql 的安装详细步骤 请看我的另外一篇博客 centos7-安装mysql

  • 安装 Mysql-Shell

    执行:

    1
    sudo yum install mysql-shell
  • 安装 MySQL-router

    执行:

    1
    sudo yum install mysql-router

mysql-router 部署后是提供给客户端来访问的,在一台机器上安装就够用了。MySQL-Server 和 Msql-Shell 需要在每台机器上安装

前期配置

参考文档:

在安装 InnoDB Cluster 的生产部署之前,请确保您打算使用的服务器实例满足以下要求。

  • InnoDB Cluster 使用组复制,因此您的服务器实例必须满足相同的要求。请参阅 组复制要求。AdminAPI 提供了 dba.checkInstanceConfiguration()验证实例是否满足组复制要求的 dba.configureInstance()方法,以及配置实例以满足要求的方法。

笔记

使用沙盒部署时,实例被配置为自动满足这些要求。

  • 用于 Group Replication 的数据,以及用于 InnoDB Cluster 的数据,必须存储在 InnoDB事务存储引擎中。使用其他存储引擎,包括临时MEMORY存储引擎,可能会导致组复制错误。InnoDB在将实例与 Group Replication 和 InnoDB Cluster一起使用之前,转换其他存储引擎中的任何表以供 使用。您可以通过在服务器实例上设置系统变量来阻止使用其他存储引擎 disabled_storage_engines ,例如:
1
disabled_storage_engines="MyISAM,BLACKHOLE,FEDERATED,ARCHIVE,MEMORY"
  • 设置集群时,任何服务器实例上都不能有入站复制通道。Group Replication (group_replication_appliergroup_replication_recovery) 自动创建的通道在正在采用的复制组上是允许的。InnoDB Cluster 不支持在使用 AdminAPI 管理的通道之外手动配置的异步复制通道。如果您正在将现有的复制拓扑迁移到 InnoDB 集群部署,并且需要在设置过程中暂时跳过此验证,则可以 force在创建集群时使用该选项来绕过它。

  • group_replication_tls_source不得设置为mysql_admin

  • 必须在要与 InnoDB Cluster 一起使用的任何实例上启用性能模式。

  • MySQL Shell 用于配置服务器以在 InnoDB Cluster 中使用的供应脚本需要访问 Python。在 Windows 上,MySQL Shell 包括 Python,不需要用户配置。在 Unix 上,Python 必须作为 shell 环境的一部分找到。要检查您的系统是否正确配置了 Python,请执行以下操作:

1
$ /usr/bin/env python

如果 Python 解释器启动,则不需要进一步的操作。如果前面的命令失败, 请在/usr/bin/python您选择的 Python 二进制文件之间创建一个软链接。有关详细信息,请参阅 支持的语言

  • 从 8.0.17 版本开始,实例必须 server_id在 InnoDB 集群中使用唯一性。当您使用该 操作时,如果集群 中的实例已使用该操作,则该操作将失败并出现错误。 *Cluster*.addInstance(*instance*)server_idinstance

  • 从 8.0.23 版本开始,实例应配置为使用并行复制应用程序。请参阅 第 7.5.6 节,“配置并行复制应用程序”

  • 在为InnoDB Cluster配置实例的过程中,配置了使用实例所需的大部分系统变量。但是 AdminAPI 没有配置 transaction_isolation 系统变量,也就是说默认为 REPEATABLE READ. 这不会影响单主集群,但如果您使用的是多主集群,那么除非您依赖REPEATABLE READ应用程序中的语义,否则我们建议使用READ COMMITTED隔离级别。请参阅组复制限制

  • 实例的相关配置选项,尤其是 Group Replication 配置选项,必须位于单个选项文件中。InnoDB Cluster 只支持服务器实例的单个选项文件,不支持使用 --defaults-extra-file 选项指定一个额外的选项文件。对于使用实例选项文件的任何 AdminAPI 操作,必须指定主文件。如果要对与 InnoDB Cluster 无关的配置选项使用多个选项文件,则必须手动配置文件,考虑到使用多个选项文件的优先规则,确保它们正确更新,并确保相关设置InnoDB Cluster 不会被额外的无法识别的选项文件中的选项错误地覆盖。

由于我这里 mysql 安装的是最新版 8.0.23。所以可以省略许多配置,即便如此还要配置如下内容

配置组复制要求

关闭其他存储引擎

vim /etc/my.cnf (集群的每个服务都要配置)

1
2
# 在【mysqld】 节点下 配置
disabled_storage_engines="MyISAM,BLACKHOLE,FEDERATED,ARCHIVE,MEMORY"

重启:

1
systemctl restart mysqld

登录 mysql 查看是否生效

1
SELECT @@GLOBAL.disabled_storage_engines;

设置服务器实例唯一标识符

设置实例唯一标识符 server-id:

1
2
登录mysql
SET GLOBAL server_id = 10;

或 vim /etc/my.cnf 配置后重启

1
2
#添加配置
server-id=10

查询 server-id

1
2
3
4
#登录MySQL
SELECT @@GLOBAL.server_id;
#或
SHOW GLOBAL VARIABLES LIKE '%server_id%';

注意:集群的每个 mysql-server 都要设置 server-id,但 server-id 的值不能相同

开启全局事务

vim /etc/my.cnf (集群的每个服务都要配置) 添加后重启

1
2
gtid_mode=ON
enforce_gtid_consistency=ON

在线运行时配置请参照官方文档:https://dev.mysql.com/doc/refman/8.0/en/replication-mode-change-online-enable-gtids.html

更多 GTID 操作请参照:https://dev.mysql.com/doc/refman/8.0/en/replication-gtids-howto.html

默认表加密

要求:对所有组成员 设置 相同的值。默认架构和表空间加密可以启用 ( ON) 或禁用(OFF,默认值),只要所有成员的设置都相同。

查看设置:

1
SELECT @@GLOBAL.default_table_encryption;

小写表名称

要求:对所有组成员 设置 相同的值。

1
2
#查看小写表名称语句
SELECT @@GLOBAL.lower_case_table_names;

如果设置为 0,则表名按指定存储并且比较区分大小写。如果设置为 1,则表名以小写形式存储在磁盘上并且比较不区分大小写。如果设置为 2,表名按给定的形式存储,但以小写形式进行比较。此选项也适用于数据库名称和表别名。有关其他详细信息,请参阅 第 9.2.3 节,“标识符区分大小写”

二进制日志依赖跟踪

设置 binlog_transaction_dependency_tracking=WRITESET_SESSION 可以提高组成员的性能,具体取决于组的工作量。当从中继日志应用事务时,Group Replication 在认证后执行自己的并行化,独立于为 设置的值 binlog_transaction_dependency_tracking。然而,价值 binlog_transaction_dependency_tracking 确实会影响将事务写入组复制成员上的二进制日志的方式。这些日志中的依赖信息用于协助从捐赠者的二进制日志进行状态传输以进行分布式恢复,每当成员加入或重新加入组时都会发生这种情况。

多线程应用程序

Group Replication 成员可以配置为多线程副本,使事务能够并行应用。从 MySQL 8.0.27 开始,所有副本默认配置为多线程。系统变量的非零值 replica_parallel_workers (来自 MySQL 8.0.26)或 slave_parallel_workers (在 MySQL 8.0.26 之前)在成员上启用多线程应用程序。MySQL 8.0.27 默认是 4 个并行应用程序线程,最多可以指定 1024 个并行应用程序线程。对于多线程副本,还需要以下设置,这是 MySQL 8.0.27 的默认设置:

设置 replica_parallel_workers=0slave_parallel_workers=0 禁用并行执行并为副本提供单个应用程序线程而没有协调程序线程。使用该设置,replica_parallel_type orslave_parallel_typereplica_preserve_commit_order or slave_preserve_commit_order 选项无效并被忽略。从 MySQL 8.0.27 开始,如果在副本上使用 GTID 时禁用并行执行,则副本实际上使用一个并行工作程序,以利用在不访问文件位置的情况下重试事务的方法。但是,此行为不会为用户改变任何内容。

配置并行应用程序

vim /etc/my.cnf 添加 (集群的每个服务都要配置)

1
2
3
4
binlog_transaction_dependency_tracking=WRITESET
slave_preserve_commit_order=ON
slave_parallel_type=LOGICAL_CLOCK
transaction_write_set_extraction=XXHASH64

文档:https://dev.mysql.com/doc/mysql-shell/8.0/en/configuring-parallel-applier.html

配置主机名

修改 localhost.localdomain 为自己定义的主机名 【配置完后需要重启

vim /etc/hostname

1
host01

并在集群各主机中配置 ip 解析

vim /etc/host

1
2
3
4
127.0.0.1   localhost host01 localhost4 localhost4.localdomain4
::1 localhost host01 localhost6 localhost6.localdomain6
192.168.205.135 host02
192.168.205.136 host03

备注:我这里有三台主机 192.168.205.134(作为 master),192.168.205.135(作为 Slave),192.168.205.136(作为 Slave)

以上为 192.168.205.134 的配置


192.168.205.135 主机的配置:

vim /etc/hostname

1
host02

vim /etc/host

1
2
3
4
127.0.0.1   localhost host02 localhost4 localhost4.localdomain4
::1 localhost host02 localhost6 localhost6.localdomain6
192.168.205.134 host01
192.168.205.136 host03

192.168.205.136 主机的配置:

vim /etc/hostname

1
host03

vim /etc/host

1
2
3
4
127.0.0.1   localhost host03 localhost4 localhost4.localdomain4
::1 localhost host03 localhost6 localhost6.localdomain6
192.168.205.134 host01
192.168.205.135 host02

部署集群

  • 创建集群配置账户

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    #进入MySQL-shell
    MySQL JS > dba.configureInstance('root@localhost:3306', {clusterAdmin: "'config-account'@'%'"});
    Please provide the password for 'root@localhost:3306': ********
    Save password for 'root@localhost:3306'? [Y]es/[N]o/Ne[v]er (default No): y
    Configuring local MySQL instance listening at port 3306 for use in an InnoDB cluster...

    This instance reports its own address as localhost:3306
    Clients and other cluster members will communicate with it through this address by default. If this is not correct, the report_host MySQL system variable should be changed.
    Password for new account: ********
    Confirm password: ********

    applierWorkerThreads will be set to the default value of 4.

    The instance 'localhost:3306' is valid to be used in an InnoDB cluster.

    Cluster admin user 'config-account'@'%' created.
    The instance 'localhost:3306' is already ready to be used in an InnoDB cluster.

    Successfully enabled parallel appliers.
    MySQL JS >

    须在各个集群主机上分别执行

    官方文档:https://dev.mysql.com/doc/mysql-shell/8.0/en/creating-user-accounts-for-admin-api.html

  • 创建一个 innoDB 集群

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    # 通过 \connect "root@host01:3306" 连接mysql 之后
    MySQL host01:3306 ssl JS > var cluster = dba.createCluster('myCluster');
    A new InnoDB Cluster will be created on instance 'host01:3306'.

    Validating instance configuration at host01:3306...

    ERROR: New account(s) with proper source address specification to allow remote connection from all instances must be created to manage the cluster.
    Dba.createCluster: User 'root' can only connect from 'localhost'. (RuntimeError)

    # 修改root 可远程访问。后 再次尝试
    MySQL host01:3306 ssl JS > var cluster = dba.createCluster('myCluster');
    A new InnoDB Cluster will be created on instance 'host01:3306'.

    Validating instance configuration at host01:3306...
    ERROR: The account 'root'@'%' is missing privileges required to manage an InnoDB cluster:
    GRANT CLONE_ADMIN, CONNECTION_ADMIN, GROUP_REPLICATION_ADMIN, PERSIST_RO_VARIABLES_ADMIN, REPLICATION_APPLIER, REPLICATION_SLAVE_ADMIN, ROLE_ADMIN, SYSTEM_VARIABLES_ADMIN ON *.* TO 'root'@'%' WITH GRANT OPTION;
    For more information, see the online documentation.
    Dba.createCluster: The account 'root'@'%' is missing privileges required to manage an InnoDB cluster. (RuntimeError)
    MySQL host01:3306 ssl JS >
    # 切换 sql 模式
    MySQL host01:3306 ssl JS > \sql
    Switching to SQL mode... Commands end with ;
    Fetching global names for auto-completion... Press ^C to stop.
    #执行授权
    MySQL host01:3306 ssl SQL > GRANT CLONE_ADMIN, CONNECTION_ADMIN, GROUP_REPLICATION_ADMIN, PERSIST_RO_VARIABLES_ADMIN, REPLICATION_APPLIER, REPLICATION_SLAVE_ADMIN, ROLE_ADMIN, SYSTEM_VARIABLES_ADMIN ON *.* TO 'root'@'%' WITH GRANT OPTION;
    Query OK, 0 rows affected (0.0026 sec)
    MySQL host01:3306 ssl SQL > flush privileges;
    Query OK, 0 rows affected (0.0020 sec)
    # 再次创建集群 报错
    MySQL host01:3306 ssl SQL > var cluster = dba.createCluster('myCluster');
    ERROR: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'var cluster = dba.createCluster('myCluster')' at line 1
    # 切换为 js 模式
    MySQL host01:3306 ssl SQL > \js
    Switching to JavaScript mode...
    MySQL host01:3306 ssl JS > var cluster = dba.createCluster('myCluster');
    A new InnoDB Cluster will be created on instance 'host01:3306'.

    Validating instance configuration at host01:3306...

    This instance reports its own address as host01:3306

    Instance configuration is suitable.
    NOTE: Group Replication will communicate with other members using 'host01:3306'. Use the localAddress option to override.

    Creating InnoDB Cluster 'myCluster' on 'host01:3306'...

    Adding Seed Instance...
    Cluster successfully created. Use Cluster.addInstance() to add MySQL instances.
    At least 3 instances are needed for the cluster to be able to withstand up to
    one server failure.

    MySQL host01:3306 ssl JS >
    ## 成功!!

  • 添加实例到 InnoDB 集群

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    35
    36
    37
    38
    39
    40
    41
    42
    43
    44
    45
    46
    47
    48
    49
    50
    51
    52
    53
    54
    55
    56
    57
    58
    59
    60
    MySQL  host01:3306 ssl  JS > cluster.addInstance('config-account@host02:3306');
    ERROR: Unable to connect to the target instance 'host02:3306'. Please verify the connection settings, make sure the instance is available and try again.
    Cluster.addInstance: Could not open connection to 'host02:3306': Can't connect to MySQL server on 'host02:3306' (113) (MySQL Error 2003)
    MySQL host01:3306 ssl JS >
    ## 连接不上 host02 实例。 关闭 host02 的防火墙后再试
    MySQL host01:3306 ssl JS > var cluster = dba.getCluster('myCluster');

    ## 发现还是不行 把账户 改成了 root 后成功了(原来这里不能用 配置账户)
    MySQL host01:3306 ssl JS > cluster.addInstance('root@host02:3306');

    WARNING: A GTID set check of the MySQL instance at 'host02:3306' determined that it contains transactions that do not originate from the cluster, which must be discarded before it can join the cluster.

    host02:3306 has the following errant GTIDs that do not exist in the cluster:
    8fa6a86a-a2ad-11ed-a202-000c29919b55:1-4

    WARNING: Discarding these extra GTID events can either be done manually or by completely overwriting the state of host02:3306 with a physical snapshot from an existing cluster member. To use this method by default, set the 'recoveryMethod' option to 'clone'.

    Having extra GTID events is not expected, and it is recommended to investigate this further and ensure that the data can be removed prior to choosing the clone recovery method.

    Please select a recovery method [C]lone/[A]bort (default Abort): c
    Validating instance configuration at host02:3306...

    This instance reports its own address as host02:3306

    Instance configuration is suitable.
    NOTE: Group Replication will communicate with other members using 'host02:3306'. Use the localAddress option to override.

    A new instance will be added to the InnoDB Cluster. Depending on the amount of
    data on the cluster this might take from a few seconds to several hours.

    Adding instance to the cluster...

    Monitoring recovery process of the new cluster member. Press ^C to stop monitoring and let it continue in background.
    Clone based state recovery is now in progress.

    NOTE: A server restart is expected to happen as part of the clone process. If the
    server does not support the RESTART command or does not come back after a
    while, you may need to manually start it back.

    * Waiting for clone to finish...
    NOTE: host02:3306 is being cloned from host01:3306
    ** Stage DROP DATA: Completed
    ** Clone Transfer
    FILE COPY ############################################################ 100% Completed
    PAGE COPY ############################################################ 100% Completed
    REDO COPY ############################################################ 100% Completed

    NOTE: host02:3306 is shutting down...

    * Waiting for server restart... ready
    * host02:3306 has restarted, waiting for clone to finish...
    ** Stage RESTART: Completed
    * Clone process has finished: 74.70 MB transferred in about 1 second (~74.70 MB/s)

    State recovery already finished for 'host02:3306'

    The instance 'host02:3306' was successfully added to the cluster.

    MySQL host01:3306 ssl JS >

    在添加另外一个实例的时候出错了

    导致这个情况的原因可能是因为,我第三台机是通过虚拟机克隆的第二台。所以 uuid 也相同了

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
     MySQL  host01:3306 ssl  JS > cluster.addInstance('root@host03:3306');

    WARNING: A GTID set check of the MySQL instance at 'host03:3306' determined that it contains transactions that do not originate from the cluster, which must be discarded before it can join the cluster.

    host03:3306 has the following errant GTIDs that do not exist in the cluster:
    8fa6a86a-a2ad-11ed-a202-000c29919b55:1-4

    WARNING: Discarding these extra GTID events can either be done manually or by completely overwriting the state of host03:3306 with a physical snapshot from an existing cluster member. To use this method by default, set the 'recoveryMethod' option to 'clone'.

    Having extra GTID events is not expected, and it is recommended to investigate this further and ensure that the data can be removed prior to choosing the clone recovery method.

    Please select a recovery method [C]lone/[A]bort (default Abort): c
    Validating instance configuration at host03:3306...

    This instance reports its own address as host03:3306

    Instance configuration is suitable.
    ERROR: RuntimeError: Cannot add an instance with the same server UUID (8fa6a86a-a2ad-11ed-a202-000c29919b55) of an active member of the cluster 'host02:3306'. Please change the server UUID of the instance to add, all members must have a unique server UUID.
    Cluster.addInstance: Cannot add an instance with the same server UUID (8fa6a86a-a2ad-11ed-a202-000c29919b55) of an active member of the cluster 'host02:3306'. Please change the server UUID of the instance to add, all members must have a unique server UUID. (RuntimeError)
    #提示 uuid 重复了

    修改uuid

    vim /var/lib/mysql/auto.cnf 注意这里不是 /etc/my.cnf 文件

    1
    2
    3
    [auto]
    server-uuid=ca36cab1-a394-11ed-bd39-000c29c44934
    # 修改完就好了

恢复集群

昨天休息的时候把虚拟器直接关闭了,导致今天通过 router 连不上数据库。在确定 mysql-server ,mysql-router 都运行正常后怀疑是集群没起来。通过查询官方文档得知服务异常断开(比如我这种一下把虚拟机全部关闭了)会导致这个情况。

官方文档:https://dev.mysql.com/doc/mysql-shell/8.0/en/reboot-outage.html

恢复命令:

1
var cluster = dba.rebootClusterFromCompleteOutage()

我虚拟机的执行过程:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
MySQL  host01:3306 ssl  JS >  dba.rebootClusterFromCompleteOutage('myCluster');
Restoring the Cluster 'myCluster' from complete outage...

Cluster instances: 'host01:3306' (OFFLINE), 'host02:3306' (UNREACHABLE), 'host03:3306' (UNREACHABLE)
WARNING: One or more instances of the Cluster could not be reached and cannot be rejoined nor ensured to be OFFLINE: 'host02:3306', 'host03:3306'. Cluster may diverge and become inconsistent unless all instances are either reachable or certain to be OFFLINE and not accepting new transactions. You may use the 'force' option to bypass this check and proceed anyway.
ERROR: Could not determine if Cluster is completely OFFLINE
Dba.rebootClusterFromCompleteOutage: Could not determine if Cluster is completely OFFLINE (RuntimeError)

#上方警告提示后面说了 可以通过强制参数(force)来启动

MySQL host01:3306 ssl JS > var cluster = dba.rebootClusterFromCompleteOutage("myCluster",{force: true})
Restoring the Cluster 'myCluster' from complete outage...

Cluster instances: 'host01:3306' (OFFLINE), 'host02:3306' (UNREACHABLE), 'host03:3306' (UNREACHABLE)
WARNING: One or more instances of the Cluster could not be reached and cannot be rejoined nor ensured to be OFFLINE: 'host02:3306', 'host03:3306'. Cluster may diverge and become inconsistent unless all instances are either reachable or certain to be OFFLINE and not accepting new transactions. You may use the 'force' option to bypass this check and proceed anyway.
Waiting for instances to apply pending received transactions...
Validating instance configuration at host01:3306...

This instance reports its own address as host01:3306

Instance configuration is suitable.
* Waiting for seed instance to become ONLINE...
host01:3306 was restored.
The Cluster was successfully rebooted.


注意:

  • 生产环境请参照官方文档 选取 超集的事务 机器来执行
  • 在恢复之前确保机器之间可以相互访问(开启端口 或 关闭防火墙)

我在本地测试的时候遇到了这个问题

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
#在我执行强制恢复集群命令后查看状态如下:
MySQL host01:3306 ssl JS > cluster.status()
{
"clusterName": "myCluster",
"defaultReplicaSet": {
"name": "default",
"primary": "host01:3306",
"ssl": "REQUIRED",
"status": "OK_NO_TOLERANCE_PARTIAL",
"statusText": "Cluster is NOT tolerant to any failures. 2 members are not active.",
"topology": {
"host01:3306": {
"address": "host01:3306",
"memberRole": "PRIMARY",
"mode": "R/W",
"readReplicas": {},
"replicationLag": "applier_queue_applied",
"role": "HA",
"status": "ONLINE",
"version": "8.0.32"
},
"host02:3306": {
"address": "host02:3306",
"memberRole": "SECONDARY",
"mode": "n/a",
"readReplicas": {},
"role": "HA",
"shellConnectError": "MySQL Error 2003: Could not open connection to 'host02:3306': Can't connect to MySQL server on 'host02:3306' (113)",
"status": "(MISSING)"
},
"host03:3306": {
"address": "host03:3306",
"memberRole": "SECONDARY",
"mode": "n/a",
"readReplicas": {},
"role": "HA",
"shellConnectError": "MySQL Error 2003: Could not open connection to 'host03:3306': Can't connect to MySQL server on 'host03:3306' (113)",
"status": "(MISSING)"
}
},
"topologyMode": "Single-Primary"
},
"groupInformationSourceMember": "host01:3306"
}
MySQL host01:3306 ssl JS > cluster.re
rejoinInstance() removeInstance() removeRouterMetadata() rescan() resetRecoveryAccountsPassword()
#此处尝试 重新加入实例
MySQL host01:3306 ssl JS > cluster.rejoinInstance("host02:3306")
ERROR: Unable to connect to the target instance 'host02:3306'. Please verify the connection settings, make sure the instance is available and try again.
Cluster.rejoinInstance: Could not open connection to 'host02:3306': Can't connect to MySQL server on 'host02:3306' (113) (MySQL Error 2003)
#上面提示不能连接 是我忘记了关闭防火墙
MySQL host01:3306 ssl JS > cluster.rejoinInstance("host02:3306")
Validating instance configuration at host02:3306...

This instance reports its own address as host02:3306

Instance configuration is suitable.
Rejoining instance 'host02:3306' to cluster 'myCluster'...
#在我关闭防火墙后 再次执行重加入实例命令 然后就卡住了。。。。。。。。。。。。。。
Re-creating recovery account...
NOTE: User 'mysql_innodb_cluster_11'@'%' already existed at instance 'host01:3306'. It will be deleted and created again with a new password.
y^H

^C^C^C^C^C^C^C




^C^C
^C
^C^C
clear
^C^C^C^C
## 此处命令行卡住了 无法退出。于是我重启了虚拟机
## 重启后 一直无法登录,登上去 就 掉下来,登上去后就掉下来。
## 我平躺了会儿后准备重装系统时 再次登录成功
连接断开
连接主机...
连接主机成功
###在执行恢复命令后发现一切正常了
MySQL host01:3306 ssl JS > var cluster = dba.rebootClusterFromCompleteOutage('myCluster')
Restoring the Cluster 'myCluster' from complete outage...

Cluster instances: 'host01:3306' (OFFLINE), 'host02:3306' (OFFLINE), 'host03:3306' (OFFLINE)
Waiting for instances to apply pending received transactions...
Validating instance configuration at host01:3306...

This instance reports its own address as host01:3306

Instance configuration is suitable.
* Waiting for seed instance to become ONLINE...
host01:3306 was restored.
Validating instance configuration at host02:3306...

This instance reports its own address as host02:3306

Instance configuration is suitable.
Rejoining instance 'host02:3306' to cluster 'myCluster'...

Re-creating recovery account...
NOTE: User 'mysql_innodb_cluster_11'@'%' already existed at instance 'host01:3306'. It will be deleted and created again with a new password.

* Waiting for the Cluster to synchronize with the PRIMARY Cluster...
** Transactions replicated ############################################################ 100%

The instance 'host02:3306' was successfully rejoined to the cluster.

Validating instance configuration at host03:3306...

This instance reports its own address as host03:3306

Instance configuration is suitable.
Rejoining instance 'host03:3306' to cluster 'myCluster'...

Re-creating recovery account...
NOTE: User 'mysql_innodb_cluster_12'@'%' already existed at instance 'host01:3306'. It will be deleted and created again with a new password.

* Waiting for the Cluster to synchronize with the PRIMARY Cluster...
** Transactions replicated ############################################################ 100%

The instance 'host03:3306' was successfully rejoined to the cluster.

The Cluster was successfully rebooted.

MySQL host01:3306 ssl JS > cluster.status()
{
"clusterName": "myCluster",
"defaultReplicaSet": {
"name": "default",
"primary": "host01:3306",
"ssl": "REQUIRED",
"status": "OK",
"statusText": "Cluster is ONLINE and can tolerate up to ONE failure.",
"topology": {
"host01:3306": {
"address": "host01:3306",
"memberRole": "PRIMARY",
"mode": "R/W",
"readReplicas": {},
"replicationLag": "applier_queue_applied",
"role": "HA",
"status": "ONLINE",
"version": "8.0.32"
},
"host02:3306": {
"address": "host02:3306",
"memberRole": "SECONDARY",
"mode": "R/O",
"readReplicas": {},
"replicationLag": "applier_queue_applied",
"role": "HA",
"status": "ONLINE",
"version": "8.0.32"
},
"host03:3306": {
"address": "host03:3306",
"memberRole": "SECONDARY",
"mode": "R/O",
"readReplicas": {},
"replicationLag": "applier_queue_applied",
"role": "HA",
"status": "ONLINE",
"version": "8.0.32"
}
},
"topologyMode": "Single-Primary"
},
"groupInformationSourceMember": "host01:3306"
}
MySQL host01:3306 ssl JS >

测试

  • 测试创建数据库

    • 在主数据库上创建一个数据库

      image-20230203161224546

    • 在从数据上查询

      image-20230203161336986

    我是把实例添加到集群后就开始测试的。测试还未给实例创建管理员账户(狗头)

  • 创建数据表和插入数据

    idea 控制台 日志:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    sql> create table if not exists sys_role
    (
    role_id bigint not null
    primary key,
    role_name varchar(30) null
    )
    [2023-02-04 14:38:09] completed in 205 ms
    sql> insert into sys_role (role_id,role_name) VALUE ('2323232',"SYS_CONFIG")
    [2023-02-04 14:40:38] 1 row affected in 7 ms

    登录从服务器查询

    image-20230204161911356

MySQL路由

配置 mysql-router 连接 MySQL 实例实现自我配置

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
#这里我用的是 config-account 账户来连接mysql。也可以配置其他用户
[root@host01 ~]# mysqlrouter -B "config-account@host01:3306"
Error: You are bootstrapping as a superuser.
This will make all the result files (config etc.) privately owned by the superuser.
Please use --user=username option to specify the user that will be running the router.
Use --user=root if this really should be the superuser.
[root@host01 ~]# mysqlrouter -B "config-account@host01:3306" --user=root
Please enter MySQL password for config-account:
# Bootstrapping system MySQL Router instance...

- Creating account(s) (only those that are needed, if any)
- Verifying account (using it to run SQL queries that would be run by Router)
- Storing account in keyring
- Adjusting permissions of generated files
- Creating configuration /etc/mysqlrouter/mysqlrouter.conf

Existing configuration backed up to '/etc/mysqlrouter/mysqlrouter.conf.bak'

# MySQL Router configured for the InnoDB Cluster 'myCluster'

After this MySQL Router has been started with the generated configuration

$ /etc/init.d/mysqlrouter restart
or
$ systemctl start mysqlrouter
or
$ mysqlrouter -c /etc/mysqlrouter/mysqlrouter.conf

InnoDB Cluster 'myCluster' can be reached by connecting to:

## MySQL Classic protocol

- Read/Write Connections: localhost:6446
- Read/Only Connections: localhost:6447

## MySQL X protocol

- Read/Write Connections: localhost:6448
- Read/Only Connections: localhost:6449

[root@host01 ~]#

1
2
3
4
5
6
7
#如果使用其他用户来连接 MySQL 实例 所需最低权限如下
GRANT CREATE USER ON *.* TO 'bootstrapuser'@'%' WITH GRANT OPTION;
GRANT SELECT, INSERT, UPDATE, DELETE, EXECUTE ON mysql_innodb_cluster_metadata.* TO 'bootstrapuser'@'%';
GRANT SELECT ON mysql.user TO 'bootstrapuser'@'%';
GRANT SELECT ON performance_schema.replication_group_members TO 'bootstrapuser'@'%';
GRANT SELECT ON performance_schema.replication_group_member_stats TO 'bootstrapuser'@'%';
GRANT SELECT ON performance_schema.global_variables TO 'bootstrapuser'@'%';

启动 mysql-router

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
[root@host01 /]# sudo systemctl start mysqlrouter.service
Job for mysqlrouter.service failed because the control process exited with error code. See "systemctl status mysqlrouter.service" and "journalctl -xe" for details.
[root@host01 /]# systemctl status mysqlrouter.service
● mysqlrouter.service - MySQL Router
Loaded: loaded (/usr/lib/systemd/system/mysqlrouter.service; disabled; vendor preset: disabled)
Active: failed (Result: start-limit) since 五 2023-02-03 20:42:30 CST; 12s ago
Process: 93503 ExecStart=/usr/bin/.mysqlrouter (code=exited, status=203/EXEC)
Main PID: 93503 (code=exited, status=203/EXEC)

2月 03 20:42:30 host01 systemd[1]: mysqlrouter.service: main process exited, code=exited, status=203/EXEC
2月 03 20:42:30 host01 systemd[1]: Failed to start MySQL Router.
2月 03 20:42:30 host01 systemd[1]: Unit mysqlrouter.service entered failed state.
2月 03 20:42:30 host01 systemd[1]: mysqlrouter.service failed.
2月 03 20:42:30 host01 systemd[1]: mysqlrouter.service holdoff time over, scheduling restart.
2月 03 20:42:30 host01 systemd[1]: Stopped MySQL Router.
2月 03 20:42:30 host01 systemd[1]: start request repeated too quickly for mysqlrouter.service
2月 03 20:42:30 host01 systemd[1]: Failed to start MySQL Router.
2月 03 20:42:30 host01 systemd[1]: Unit mysqlrouter.service entered failed state.
2月 03 20:42:30 host01 systemd[1]: mysqlrouter.service failed.
# 启动报错了 查找博文 发现了是 配置引导项 --bootstrap 时的用户 使用 root 所导致的

于是我修改了文件
vim /usr/lib/systemd/system/mysqlrouter.service
#User=mysqlrouter
#Group=mysqlrouter
User=root
Group=root

再次执行 systemctl start mysqlrouter.service 已正常

连接访问

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
# 虚拟机上 mysql-shell 访问
[root@host02 ~]# mysqlsh root@host01:6446
Please provide the password for 'root@host01:6446': ********
Save password for 'root@host01:6446'? [Y]es/[N]o/Ne[v]er (default No): y
MySQL Shell 8.0.32

Copyright (c) 2016, 2023, Oracle and/or its affiliates.
Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type '\help' or '\?' for help; '\quit' to exit.
Creating a session to 'root@host01:6446'
Fetching schema names for auto-completion... Press ^C to stop.
Your MySQL connection id is 1184
Server version: 8.0.32 MySQL Community Server - GPL
No default schema selected; type \use <schema> to set one.
MySQL host01:6446 ssl JS > \sql
Switching to SQL mode... Commands end with ;
Fetching global names for auto-completion... Press ^C to stop.
MySQL host01:6446 ssl SQL > select @@port
-> ;
+--------+
| @@port |
+--------+
| 3306 |
+--------+
1 row in set (0.0007 sec)

idea 连接测试

image-20230203211919002

注意事项

  • 集群组要复制的每个数据表必须要有主键
  • 集群组内服务之间网络必须要好
  • 所有机器账户授权需要给到位。否则在添加实例的时候会提示密码错误!
  • 机器停机后需要恢复集群,否则会造成通过 router 无法连接数据库的情况

总结

官方文档内容较多,想找个东西好麻烦!

其他

一些命令

  • 设置系统变量

  • 设置全局系统变量:

    1
    2
    SET GLOBAL max_connections = 1000;
    SET @@GLOBAL.max_connections = 1000;
  • 将全局系统变量持久化到 mysqld-auto.cnf文件(并设置运行时值):

    1
    2
    SET PERSIST max_connections = 1000;
    SET @@PERSIST.max_connections = 1000;
  • 将全局系统变量持久化到 mysqld-auto.cnf文件(不设置运行时值):

    1
    2
    SET PERSIST_ONLY back_log = 1000;
    SET @@PERSIST_ONLY.back_log = 1000;
  • 设置会话系统变量:

    1
    2
    3
    SET SESSION sql_mode = 'TRADITIONAL';
    SET @@SESSION.sql_mode = 'TRADITIONAL';
    SET @@sql_mode = 'TRADITIONAL';
  • 查看变量语句

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    SHOW [GLOBAL | SESSION] VARIABLES
    [LIKE 'pattern' | WHERE expr]

    ## 例如:
    mysql> SHOW VARIABLES;

    SHOW VARIABLES LIKE 'max_join_size';
    SHOW SESSION VARIABLES LIKE 'max_join_size';

    SHOW VARIABLES LIKE '%size%';
    SHOW GLOBAL VARIABLES LIKE '%size%';

    SHOW VARIABLES受依赖于版本的显示宽度限制。对于具有未完全显示的非常长的值的变量, SELECT用作解决方法。例如:

    1
    SELECT @@GLOBAL.innodb_data_file_path;
  • other commands

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    14
    15
    16
    17
    18
    19
    20
    21
    22
    23
    24
    25
    26
    27
    28
    29
    30
    31
    32
    33
    34
    #查看二进制日志是否开启
    SELECT @@GLOBAL.log_bin;
    #已记录副本更新
    SELECT @@GLOBAL.log_replica_updates;
    #查看二进制日志格式
    SELECT @@GLOBAL.binlog_format;
    #查看二进制日志校验
    SELECT @@GLOBAL.binlog_checksum;
    #查看GTID事务(全局事务)
    SELECT @@GLOBAL.ENFORCE_GTID_CONSISTENCY;
    SELECT @@GLOBAL.GTID_MODE;
    #查看默认表加密
    SELECT @@GLOBAL.default_table_encryption;
    #查看小写表名称
    SELECT @@GLOBAL.lower_case_table_names;
    #查看二进制日志依赖跟踪
    SELECT @@GLOBAL.binlog_transaction_dependency_tracking;
    #查看多线程应用程序
    SELECT @@GLOBAL.replica_parallel_workers;(来自 MySQL 8.0.26)
    SELECT @@GLOBAL.slave_parallel_workers;(在 MySQL 8.0.26 之前)
    #查看分离的XA事务配置
    SELECT @@GLOBAL.xa_detach_on_prepare;
    #查看性能模式
    SELECT @@GLOBAL.performance_schema;
    SHOW GLOBAL VARIABLES like 'performance_schema';
    #查看副本维持提交顺序
    SELECT @@GLOBAL.slave_preserve_commit_order;
    SHOW GLOBAL VARIABLES LIKE '%slave_preserve_commit_order%';
    #查看副本并行类型
    SELECT @@GLOBAL.slave_parallel_type;
    SHOW GLOBAL VARIABLES LIKE '%slave_parallel_type%';
    #用于散列事务期间提取的写入的算法
    SELECT @@GLOBAL.transaction_write_set_extraction;
    SHOW GLOBAL VARIABLES LIKE '%transaction_write_set_extraction%';

参考文献:

官方文档:

https://dev.mysql.com/doc/mysql-shell/8.0/en/mysql-innodb-cluster.html

https://dev.mysql.com/doc/mysql-shell/8.0/en/creating-user-accounts-for-admin-api.html

https://dev.mysql.com/doc/mysql-shell/8.0/en/mysql-innodb-cluster.html

https://dev.mysql.com/doc/mysql-shell/8.0/en/mysqlsh.html

https://dev.mysql.com/doc/mysql-router/8.0/en/mysql-router-installation.html

https://dev.mysql.com/doc/mysql-shell/8.0/en/reboot-outage.html

博客:https://blog.csdn.net/wudi53433927/article/details/128157252