
Deploying a MySQL InnoDB Cluster with AdminAPI: A Hands-On Guide

InnoDB Cluster provides a complete high-availability solution for MySQL. Using the AdminAPI that ships with MySQL Shell, you can easily configure and administer a cluster of at least three MySQL Server instances. In an InnoDB Cluster deployment, MySQL Router handles read/write splitting, while Group Replication handles replication between the primary and the secondaries.

InnoDB Cluster Characteristics

  • By default, an InnoDB Cluster runs in single-primary mode, in which a single primary instance accepts read/write (R/W) queries and all remaining instances accept only read (R/O) queries. In multi-primary mode, every member accepts both reads and writes (see the sketch after this list)
  • InnoDB Cluster instances are identified by their server_uuid variable
  • An InnoDB Cluster needs at least three instances to tolerate the failure of one instance; adding more instances increases its fault tolerance, up to a maximum of 9 instances
  • Every MySQL server instance in an InnoDB Cluster runs MySQL Group Replication, which provides the mechanism for replicating data within the cluster, offers built-in failover, and even lets you change the cluster topology while the cluster is online
  • When the cluster loses quorum, you can no longer process write transactions or change the topology (adding, rejoining, or removing instances). However, if one online instance containing the InnoDB Cluster metadata remains, the cluster's quorum can be restored
  • If all instances in the cluster run MySQL Server 8.0.15 or later, the cluster topology can be changed while the cluster is online
  • As of MySQL 8.0.17, InnoDB Cluster integrates the MySQL Clone plugin to provision joining instances automatically. In earlier versions, Group Replication only offered asynchronous replication to synchronize a new instance with the cluster's primary, so for clusters with a large transaction history a new instance could take a very long time to recover all transactions before being able to join
  • MySQL Router can configure itself from an InnoDB Cluster automatically, in a process called bootstrapping, so no manual routing configuration is needed; MySQL Router transparently connects client applications to the cluster, providing routing and load balancing for client connections
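
For reference, the topology mode can be inspected and switched online through the AdminAPI. A minimal sketch, assuming a MySQL Shell JS session and the cluster name 'testCluster' used later in this guide:

var cluster = dba.getCluster('testCluster')
cluster.switchToMultiPrimaryMode()              // every member becomes R/W
cluster.switchToSinglePrimaryMode('a1:3306')    // back to single-primary, with a1 as PRIMARY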

MySQL Instance Configuration

1. Per-instance MySQL configuration (/etc/my.cnf)

[mysqld]
server_id = 1                   # must be unique across the MySQL instances of the cluster
log_timestamps = SYSTEM         # the default timestamp format is UTC, which is 8 hours behind Beijing time
default-authentication-plugin = caching_sha2_password

datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

# Enable the binary log with base name "binlog"; the skip-log-bin option disables binlog
log_bin = binlog
binlog_format = ROW

sync_binlog = 10
innodb_flush_log_at_trx_commit = 2

# Buffer size settings
innodb_buffer_pool_size = 128M
join_buffer_size = 128M
sort_buffer_size = 2M
read_rnd_buffer_size = 2M

max_connections = 1000
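
After restarting mysqld, a quick sanity check (an ad-hoc query, not part of the original steps) confirms the settings above took effect:

SELECT @@server_id, @@log_bin, @@binlog_format,
       @@sync_binlog, @@innodb_flush_log_at_trx_commit;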

MySQL Configuration Checks

1. The hostname-to-IP mapping of every node must be configured, otherwise adding instances to the group fails!!

cat /etc/hosts
192.168.31.11 a1
192.168.31.12 a2
192.168.31.13 a3
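
If hostname resolution cannot be relied on, each instance can also pin the address it reports to the group via the report_host system variable (the same variable the configuration check below refers to; the value shown is an assumption for node a1):

[mysqld]
report_host = a1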

2. Grant the Group Replication administration privileges on each MySQL instance's account, and verify that each instance meets the Group Replication requirements

-- Grant the privileges related to Group Replication administration to 'root'@'%'
MySQL  localhost:33060+ ssl  SQL > GRANT CLONE_ADMIN, CONNECTION_ADMIN, GROUP_REPLICATION_ADMIN, PERSIST_RO_VARIABLES_ADMIN, REPLICATION_APPLIER, REPLICATION_SLAVE_ADMIN, ROLE_ADMIN, SYSTEM_VARIABLES_ADMIN ON *.* TO 'root'@'%' WITH GRANT OPTION;
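
-- Alternatively, AdminAPI can create a dedicated administration account instead of
-- widening root (a sketch using the documented clusterAdmin option; the account
-- name 'icadmin' is an assumption):
MySQL  localhost:33060+ ssl  JS > dba.configureInstance('root@a1:3306', {clusterAdmin: "'icadmin'@'%'"})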

-- Verify that the current instance meets the Group Replication requirements
MySQL  localhost:33060+ ssl  JS > dba.checkInstanceConfiguration()
-- NOTE: Some configuration options need to be fixed:
-- +----------------------------------------+---------------+----------------+--------------------------------------------------+
-- | Variable                               | Current Value | Required Value | Note                                             |
-- +----------------------------------------+---------------+----------------+--------------------------------------------------+
-- | binlog_transaction_dependency_tracking | COMMIT_ORDER  | WRITESET       | Update the server variable                       |
-- | enforce_gtid_consistency               | OFF           | ON             | Update read-only variable and restart the server |
-- | gtid_mode                              | OFF           | ON             | Update read-only variable and restart the server |
-- +----------------------------------------+---------------+----------------+--------------------------------------------------+

-- If the requirements are not met, run dba.configureLocalInstance() or dba.configureInstance() to automatically configure the current instance so that it does
MySQL  localhost:33060+ ssl  JS > dba.configureLocalInstance()
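
-- The same changes can also be applied by hand, mirroring the table above (a sketch;
-- the two entries flagged as read-only are persisted with SET PERSIST_ONLY and take
-- effect once mysqld is restarted):
MySQL  localhost:33060+ ssl  SQL > SET PERSIST binlog_transaction_dependency_tracking = 'WRITESET';
MySQL  localhost:33060+ ssl  SQL > SET PERSIST_ONLY enforce_gtid_consistency = ON;
MySQL  localhost:33060+ ssl  SQL > SET PERSIST_ONLY gtid_mode = ON;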

-- Verify the instance configuration again
MySQL  localhost:33060+ ssl  JS > dba.checkInstanceConfiguration()
-- Validating local MySQL instance listening at port 3306 for use in an InnoDB cluster...
--
-- This instance reports its own address as a1:3306
-- Clients and other cluster members will communicate with it through this address by default. If this is not correct, the report_host MySQL system variable should be changed.
--
-- Checking whether existing tables comply with Group Replication requirements...
-- No incompatible tables detected
--
-- Checking instance configuration...
-- Instance configuration is compatible with InnoDB cluster
--
-- The instance 'a1:3306' is valid to be used in an InnoDB cluster.
--
-- {
--     "status": "ok"
-- }

Creating an InnoDB Cluster

// Automatically creates the schema mysql_innodb_cluster_metadata, which stores the InnoDB Cluster metadata
// The current instance becomes the seed of the InnoDB Cluster and, in single-primary mode, its Primary node
MySQL  localhost:33060+ ssl  JS > var cluster = dba.createCluster('testCluster', { memberWeight: 50, communicationStack: "xcom", replicationAllowedHost:'192.168.31.0/24' })
> // A new InnoDB Cluster will be created on instance 'a1:3306'.
>
> // Validating instance configuration at localhost:3306...
>
> // This instance reports its own address as a1:3306
>
> // Instance configuration is suitable.
> // NOTE: Group Replication will communicate with other members using 'a1:33061'. Use the localAddress option to override.
>
> // Creating InnoDB Cluster 'testCluster' on 'a1:3306'...
>
> // Adding Seed Instance...
> // Cluster successfully created. Use Cluster.addInstance() to add MySQL instances.
> // At least 3 instances are needed for the cluster to be able to withstand up to
> // one server failure.


// When adding another server instance that already contains data, first use cluster.checkInstanceState() to verify that the existing data does not prevent the instance from joining the cluster
// memberWeight (default 50, an integer from 0 to 100) sets the group_replication_member_weight system variable on the instance; during failover elections, a higher memberWeight makes the instance more likely to be elected Primary
// If several instances have the same memberWeight, they are ordered lexicographically by server UUID and the first (lowest) one is chosen
MySQL  localhost:33060+ ssl  JS > cluster.addInstance('root@192.168.31.12:3306', { memberWeight: 50 })
MySQL  localhost:33060+ ssl  JS > cluster.addInstance('root@192.168.31.13:3306', { memberWeight: 60 })
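
// To confirm how memberWeight and the other options landed on each instance,
// cluster.options() lists the configured Group Replication variables per member
// ({all: true} includes every variable, not only the changed ones):
MySQL  localhost:33060+ ssl  JS > cluster.options({all: true})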

// Remove a specified instance
MySQL  localhost:33060+ ssl  JS > cluster.removeInstance("root@192.168.31.12:3306")

Managing an InnoDB Cluster

1. Manually set the InnoDB Cluster primary instance to 'root@a1:3306'

MySQL  localhost:33060+ ssl  JS > cluster.setPrimaryInstance('root@a1:3306')
// Setting instance 'a1:3306' as the primary instance of cluster 'testCluster'...
//
// Instance 'a3:3306' was switched from PRIMARY to SECONDARY.
// Instance 'a2:3306' remains SECONDARY.
// Instance 'a1:3306' was switched from SECONDARY to PRIMARY.
//
// WARNING: The cluster internal session is not the primary member anymore. For cluster management operations please obtain a fresh cluster handle using dba.getCluster().
//
// The instance 'a1:3306' was successfully elected as primary.

2. Dissolve the InnoDB Cluster (remove the InnoDB Cluster configuration data)

// Connect to a read/write instance of the cluster, e.g. the primary of a single-primary cluster, and run cluster.dissolve(). This removes all metadata and configuration associated with the cluster and disables Group Replication on the instances; it does not delete any data that was replicated between them
MySQL  localhost:33060+ ssl  JS > cluster.dissolve({force: true})

3. When all MySQL instances of the InnoDB Cluster have been restarted, run the following command to restore the cluster

MySQL  localhost:33060+ ssl  JS > dba.rebootClusterFromCompleteOutage()

4. Check Group Replication lag

-- List the group members
SELECT * FROM performance_schema.replication_group_members;
-- +---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
-- | CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION | MEMBER_COMMUNICATION_STACK |
-- +---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+
-- | group_replication_applier | 54b99d9f-f883-11ed-a4af-52540000e2d7 | a2          |        3306 | ONLINE       | SECONDARY   | 8.0.30         | XCom                       |
-- | group_replication_applier | db2603d4-fb8f-11ed-b875-525400aae54f | a3          |        3306 | ONLINE       | PRIMARY     | 8.0.30         | XCom                       |
-- | group_replication_applier | fbd00ef4-f780-11ed-bfa3-525400993357 | a1          |        3306 | ONLINE       | SECONDARY   | 8.0.30         | XCom                       |
-- +---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+----------------------------+

-- Check each member's transaction certification and apply progress (i.e. replication lag)
SELECT * FROM performance_schema.replication_group_member_stats\G
-- *************************** 1. row ***************************
--                               CHANNEL_NAME: group_replication_applier
--                                    VIEW_ID: 16853427634617072:16
--                                  MEMBER_ID: 54b99d9f-f883-11ed-a4af-52540000e2d7
--                COUNT_TRANSACTIONS_IN_QUEUE: 0
--                 COUNT_TRANSACTIONS_CHECKED: 100195
--                   COUNT_CONFLICTS_DETECTED: 0
--         COUNT_TRANSACTIONS_ROWS_VALIDATING: 79955
--         TRANSACTIONS_COMMITTED_ALL_MEMBERS: 7a8a7a5b-fdec-11ed-8af6-525400993357:1-78:1000076-1367945,
-- 7a8a8ad2-fdec-11ed-8af6-525400993357:1-9
--             LAST_CONFLICT_FREE_TRANSACTION: 7a8a7a5b-fdec-11ed-8af6-525400993357:1434768
-- COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 63664
--          COUNT_TRANSACTIONS_REMOTE_APPLIED: 36532
--          COUNT_TRANSACTIONS_LOCAL_PROPOSED: 0
--          COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0
-- *************************** 2. row ***************************
--                               CHANNEL_NAME: group_replication_applier
--                                    VIEW_ID: 16853427634617072:16
--                                  MEMBER_ID: db2603d4-fb8f-11ed-b875-525400aae54f
--                COUNT_TRANSACTIONS_IN_QUEUE: 0
--                 COUNT_TRANSACTIONS_CHECKED: 100215
--                   COUNT_CONFLICTS_DETECTED: 0
--         COUNT_TRANSACTIONS_ROWS_VALIDATING: 79975
--         TRANSACTIONS_COMMITTED_ALL_MEMBERS: 7a8a7a5b-fdec-11ed-8af6-525400993357:1-78:1000076-1367945,
-- 7a8a8ad2-fdec-11ed-8af6-525400993357:1-9
--             LAST_CONFLICT_FREE_TRANSACTION: 7a8a7a5b-fdec-11ed-8af6-525400993357:1447920
-- COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 0
--          COUNT_TRANSACTIONS_REMOTE_APPLIED: 2
--          COUNT_TRANSACTIONS_LOCAL_PROPOSED: 100213
--          COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0
-- *************************** 3. row ***************************
--                               CHANNEL_NAME: group_replication_applier
--                                    VIEW_ID: 16853427634617072:16
--                                  MEMBER_ID: fbd00ef4-f780-11ed-bfa3-525400993357
--                COUNT_TRANSACTIONS_IN_QUEUE: 0
--                 COUNT_TRANSACTIONS_CHECKED: 99877
--                   COUNT_CONFLICTS_DETECTED: 0
--         COUNT_TRANSACTIONS_ROWS_VALIDATING: 79637
--         TRANSACTIONS_COMMITTED_ALL_MEMBERS: 7a8a7a5b-fdec-11ed-8af6-525400993357:1-78:1000076-1351795,
-- 7a8a8ad2-fdec-11ed-8af6-525400993357:1-9
--             LAST_CONFLICT_FREE_TRANSACTION: 7a8a7a5b-fdec-11ed-8af6-525400993357:1426233
-- COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: 63500
--          COUNT_TRANSACTIONS_REMOTE_APPLIED: 36377
--          COUNT_TRANSACTIONS_LOCAL_PROPOSED: 2
--          COUNT_TRANSACTIONS_LOCAL_ROLLBACK: 0
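
-- For lag, the two queue counters matter most; a compact ad-hoc view of the
-- backlog per member (not from the original post):
-- COUNT_TRANSACTIONS_IN_QUEUE: transactions waiting for certification
-- COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE: certified but not yet applied locally
SELECT MEMBER_ID,
       COUNT_TRANSACTIONS_IN_QUEUE                AS certifier_queue,
       COUNT_TRANSACTIONS_REMOTE_IN_APPLIER_QUEUE AS applier_queue
FROM performance_schema.replication_group_member_stats;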

Monitoring an InnoDB Cluster

-- Verify the GTID state of an instance in relation to the cluster
cluster.checkInstanceState('root@192.168.31.12:3306')

cluster.describe()
{
    "clusterName": "testCluster",
    "defaultReplicaSet": {
        "name": "default",
        "topology": [
            {
                "address": "a1:3306",
                "label": "a1:3306",
                "role": "HA"
            },
            {
                "address": "a2:3306",
                "label": "a2:3306",
                "role": "HA"
            }
        ],
        "topologyMode": "Single-Primary"
    }
}

-- Check the InnoDB Cluster status
cluster.status()
{
    "clusterName": "testCluster",
    "defaultReplicaSet": {
        "name": "default",
        "primary": "a1:3306",
        "ssl": "REQUIRED",
        "status": "OK_NO_TOLERANCE",
        "statusText": "Cluster is NOT tolerant to any failures.",
        "topology": {
            "a1:3306": {
                "address": "a1:3306",
                "memberRole": "PRIMARY",
                "mode": "R/W",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.30"
            },
            "a2:3306": {
                "address": "a2:3306",
                "memberRole": "SECONDARY",
                "mode": "R/O",
                "readReplicas": {},
                "replicationLag": "applier_queue_applied",
                "role": "HA",
                "status": "ONLINE",
                "version": "8.0.30"
            }
        },
        "topologyMode": "Single-Primary"
    },
    "groupInformationSourceMember": "a1:3306"
}
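
-- When more detail is needed (Group Replication protocol version, transaction
-- counters, recovery progress), cluster.status() accepts an extended option (0-3):
cluster.status({extended: 2})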

-- The coordinator thread dispatches transaction events to the 4 worker threads
mysql> show processlist;
-- +----+------------------------+-----------+------+--------------------------------------------+------+----------------------------------------------------------+----------------------------------+
-- | Id | User                   | Host      | db   | Command                                    | Time | State                                                    | Info                             |
-- +----+------------------------+-----------+------+--------------------------------------------+------+----------------------------------------------------------+----------------------------------+
-- | 45 | system user            |           | NULL | Connect                                    |   10 | waiting for handler commit                               | Group replication applier module |
-- | 47 | system user            |           | NULL | Query                                      |    3 | Replica has read all relay log; waiting for more updates | NULL                             |
-- | 48 | system user            |           | NULL | Query                                      |    4 | Waiting for an event from Coordinator                    | NULL                             |
-- | 49 | system user            |           | NULL | Connect                                    |   10 | Waiting for an event from Coordinator                    | NULL                             |
-- | 50 | system user            |           | NULL | Connect                                    |   10 | Waiting for an event from Coordinator                    | NULL                             |
-- | 51 | system user            |           | NULL | Connect                                    |   10 | Waiting for an event from Coordinator                    | NULL                             |
-- | 55 | mysql_innodb_cluster_2 | a1:45320  | NULL | Group Replication Data Stream subscription |   10 | Connection delegated to Group Replication                | NULL                             |
-- +----+------------------------+-----------+------+--------------------------------------------+------+----------------------------------------------------------+----------------------------------+
-- 9 rows in set (0.00 sec)

Group Replication Performance Tuning

Group Replication performance | A comprehensive look at MySQL 8.0 Group Replication
Group Replication performance and limitations

1. Increasing the number of applier threads reduces replication lag

set global replica_parallel_workers = 8;
# restart Group Replication for the change to take effect
stop group_replication; 
start group_replication; 

# replica_worker threads are the appliers; replica_sql is the coordinator thread
use performance_schema;
SELECT THREAD_ID, name FROM threads;
#> thread/sql/replica_sql 
#> thread/sql/replica_worker
#> thread/sql/replica_worker
#> ...

Since version 8.0.23, instances support and enable parallel replication applier worker threads:
dba.configureInstance('root@a2:3306', {applierWorkerThreads: 16, restart: true}) # the restart option did not seem to take effect; the MySQL instance had to be restarted manually
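
# Verify the additional applier workers are actually doing work (an ad-hoc check,
# not from the original post):
SELECT WORKER_ID, SERVICE_STATE, LAST_APPLIED_TRANSACTION
FROM performance_schema.replication_applier_status_by_worker
WHERE CHANNEL_NAME = 'group_replication_applier';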

2. Disk I/O of MySQL internal threads

# Show the full names of MySQL's internal threads
select thread_os_id, name from threads;
# Show per-thread CPU and memory usage (8662 is the mysqld PID)
top -Hp 8662
# Show per-thread disk I/O
pidstat -dt -p 8662 1

# MySQL threads with the most disk I/O on a replica during replication (values in kB/s)
4071.93     ib_pg_flush_co(thread/innodb/page_flush_coordinator_thread)
1417.98     ib_log_writer(thread/innodb/log_writer_thread)
643.42      gr_apply(thread/group_rpl/THD_applier_module_receiver)
133.33      rpl_rca_wkr-0(thread/sql/replica_worker)
...
...


The coordinator thread rpl_rca_sql-0 (thread/sql/replica_sql) ran at nearly 100% utilization, causing replication lag!! Moreover, show processlist showed worker threads in the state "waiting for handler commit". What can cause this??

# With sync_binlog=N (N>1) and innodb_flush_log_at_trx_commit=2, RDS for MySQL write operations reach their highest performance
# (the trade-off: an OS crash may lose the last N binlog commit groups and up to ~1 s of redo)
[mysqld]
sync_binlog = 2                         # how often the server synchronizes the binary log to disk; valid values 0, 1, N
innodb_flush_log_at_trx_commit = 2      # valid values 0, 1, 2

See also: innodb_flush_log_at_trx_commit and sync_binlog parameters explained

MySQL Router

MySQL Router configures itself automatically based on the deployed cluster and transparently connects client applications to the MySQL server instances.
In the default single-primary mode, an InnoDB Cluster has a single read/write server instance, the primary, while the secondary instances are replicas of it. If the primary fails, a secondary is automatically promoted to the primary role; MySQL Router detects this and forwards client applications to the new primary.

1. Bootstrap MySQL Router so that it configures itself automatically to run with the InnoDB Cluster

mysqlrouter --bootstrap 192.168.31.13:3306 --directory /opt/myrouter --user root
#> Please enter MySQL password for root:
#> # Bootstrapping MySQL Router instance at '/opt/myrouter'...
#> Fetching Cluster Members
#> trying to connect to mysql-server at a1:3306
#> - Creating account(s) (only those that are needed, if any)
#> - Verifying account (using it to run SQL queries that would be run by Router)
#> - Storing account in keyring
#> - Adjusting permissions of generated files
#> - Creating configuration /opt/myrouter/mysqlrouter.conf
#> 
#> # MySQL Router configured for the InnoDB Cluster 'testCluster'
#> After this MySQL Router has been started with the generated configuration
#> 
#>     $ mysqlrouter -c /opt/myrouter/mysqlrouter.conf
#> 
#> InnoDB Cluster 'testCluster' can be reached by connecting to:
#> 
#> ## MySQL Classic protocol
#> - Read/Write Connections: localhost:6446
#> - Read/Only Connections:  localhost:6447
#> 
#> ## MySQL X protocol
#> - Read/Write Connections: localhost:6448
#> - Read/Only Connections:  localhost:6449

2. Start MySQL Router. Client applications can connect to the cluster primary for read/write operations via MySQL classic protocol port 6446; the read-only port for the secondaries is 6447

# Start Router with the generated configuration
bash /opt/myrouter/start.sh

# Stop Router
bash /opt/myrouter/stop.sh
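
# Quick check that routing behaves as described (a sketch; assumes the root account
# and a Router running locally):
mysql -h 127.0.0.1 -P 6446 -u root -p -e "SELECT @@hostname"    # lands on the primary (R/W)
mysql -h 127.0.0.1 -P 6447 -u root -p -e "SELECT @@hostname"    # lands on a secondary (R/O)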

3. View the cluster's registered Routers through MySQL Shell

MySQL  a2:33060+ ssl  performance_schema  JS > cluster.listRouters()
{
    "clusterName": "testCluster", 
    "routers": {
        "dev::": {
            "hostname": "dev", 
            "lastCheckIn": null, 
            "roPort": "6447", 
            "roXPort": "6449", 
            "rwPort": "6446", 
            "rwXPort": "6448", 
            "version": "8.0.30"
        }
    }
}

Q&A

1. dba.createCluster() fails with the following errors

> # 2023-05-23 17:40:47.982223 [Error] [MY-011735] Plugin group_replication reported: '[GCS] The group communication engine failed to test connectivity to the local group communication engine on a1:33061. This may be due to one or more invalid configuration settings. Double-check your group replication local address, firewall, SE Linux and TLS configurations and try restarting Group Replication on this server.'
> # 2023-05-23 17:40:48.065324 [Error] [MY-011735] Plugin group_replication reported: '[GCS] The member was unable to join the group. Local port: 33061'

# Option 1: disable SELinux; less secure and not particularly recommended
setenforce 0

# Option 2: allow the Group Replication communication port in SELinux (recommended)
yum install -y policycoreutils-python           # install the SELinux policy core Python utilities
semanage port -a -t mysqld_port_t -p tcp 33061

semanage port -l                # List records of the port object type
semanage port -l | grep mysqld
> # mysqld_port_t                  tcp      33061, 1186, 3306, 63132-63164
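
# The error text also mentions firewalls: if firewalld is running, the Group Replication
# port must be opened there as well (a sketch, assuming the default zone)
firewall-cmd --permanent --add-port=33061/tcp
firewall-cmd --reload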

2. After all of the cluster's MySQL instances have been restarted and every member's Group Replication MEMBER_STATE is OFFLINE, fetching the cluster directly fails with the following error:

// Check the online state of the group members (InnoDB Cluster nodes)
MySQL  localhost:33060+ ssl  SQL > SELECT * FROM performance_schema.replication_group_members;

MySQL  localhost:33060+ ssl  JS > var cluster = dba.getCluster('testCluster');
> // Dba.getCluster: This function is not available through a session to a standalone instance (metadata exists, instance belongs to that metadata, but GR is not active) (MYSQLSH 51314)

// The cluster can be recovered with the following command
MySQL  localhost:33060+ ssl  JS > dba.rebootClusterFromCompleteOutage('testCluster')
