
4. Keepalived high availability for nginx load balancing



keepalived health checkers:

HTTP_GET        //keepalived checks the health of backend real servers with an HTTP GET

SSL_GET (https)  //the same check, for backends that speak HTTPS

TCP_CHECK       //a plain TCP connect check

The following demonstrates health checking based on TCP_CHECK.

# man keepalived.conf    //see the TCP_CHECK configuration section

 

# TCP healthchecker
TCP_CHECK
{
# ======== generic connection options
# Optional IP address to connect to.
# The default is the realserver IP     //defaults to the real server's IP
connect_ip <IP ADDRESS>     //optional, may be omitted
# Optional port to connect to
# The default is the realserver port
connect_port <PORT>         //optional, may be omitted
# Optional interface to use to
# originate the connection
bindto <IP ADDRESS>
# Optional source port to
# originate the connection from
bind_port <PORT>
# Optional connection timeout in seconds.
# The default is 5 seconds
connect_timeout <INTEGER>
# Optional fwmark to mark all outgoing
# checker packets with
fwmark <INTEGER>

 

# Optional random delay to start the initial check
# for maximum N seconds.
# Useful to scatter multiple simultaneous
# checks to the same RS. Enabled by default, with
# the maximum at delay_loop. Specify 0 to disable
warmup <INT>
# Retry count to make additional checks if check
# of an alive server fails. Default: 1
retry <INT>
# Delay in seconds before retrying. Default: 1
delay_before_retry <INT>
} #TCP_CHECK

 

# cd /etc/keepalived

# vim keepalived.conf   //must be configured on both keepalived nodes

 

virtual_server 192.168.184.150 80 {    //this definition can be merged here
    delay_loop 6
    lb_algo wrr
    lb_kind DR
    net_mask 255.255.0.0
    protocol TCP
    sorry_server 127.0.0.1 80

    real_server 192.168.184.143 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
        }
    }

    real_server 192.168.184.144 80 {
        weight 2
        TCP_CHECK {
            connect_timeout 3
        }
    }
}

# systemctl restart keepalived

# systemctl status keepalived

● keepalived.service - LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2018-12-13 23:11:06 CST; 1min 32s ago
  Process: 6233 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 6234 (keepalived)
   CGroup: /system.slice/keepalived.service
           ├─6234 /usr/sbin/keepalived -D
           ├─6235 /usr/sbin/keepalived -D
           └─6236 /usr/sbin/keepalived -D

Dec 13 23:11:11 node1 Keepalived_healthcheckers[6235]: Check on service [192.168.184.144]:80 failed after 1 retry.
Dec 13 23:11:11 node1 Keepalived_healthcheckers[6235]: Removing service [192.168.184.144]:80 from VS [192.168.184.150]:80
Dec 13 23:11:11 node1 Keepalived_healthcheckers[6235]: Remote SMTP server [127.0.0.1]:25 connected.
Dec 13 23:11:11 node1 Keepalived_healthcheckers[6235]: SMTP alert successfully sent.
Dec 13 23:11:14 node1 Keepalived_vrrp[6236]: Sending gratuitous ARP on eth0 for 192.168.184.150
Dec 13 23:11:14 node1 Keepalived_vrrp[6236]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eth0 for 192.168.184.150
Dec 13 23:11:14 node1 Keepalived_vrrp[6236]: Sending gratuitous ARP on eth0 for 192.168.184.150
Dec 13 23:11:14 node1 Keepalived_vrrp[6236]: Sending gratuitous ARP on eth0 for 192.168.184.150
Dec 13 23:11:14 node1 Keepalived_vrrp[6236]: Sending gratuitous ARP on eth0 for 192.168.184.150
Dec 13 23:11:14 node1 Keepalived_vrrp[6236]: Sending gratuitous ARP on eth0 for 192.168.184.150    //gratuitous ARP broadcasts announcing that the VIP has been added
You have new mail in /var/spool/mail/root

Examples:

HTTP_GET {
    url {
      path /
      status_code 200
    }
    connect_timeout 3
    nb_get_retry 3
    delay_before_retry 3
}

 

TCP_CHECK {
    connect_timeout 3
}

 

HA Services:

nginx

 

 

VRRP priority arithmetic when a tracked check fails (base priority plus the negative weight; see the sketch below):

100: -25

96: -20 79 --> 99 --> 79
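
A minimal sketch of how this weight adjustment is usually wired up in keepalived: a vrrp_script periodically checks nginx and, while it fails, lowers the VRRP priority so the backup takes over. The script name chk_nginx and the check command are illustrative assumptions; VI_1 and the priorities follow the notes above.

vrrp_script chk_nginx {
    script "killall -0 nginx"   # exits non-zero when no nginx process exists
    interval 2
    weight -25                  # subtract 25 from the priority while the check fails
}

vrrp_instance VI_1 {
    ...
    priority 100                # 100 - 25 = 75, dropping below a 96-priority backup
    track_script {
        chk_nginx
    }
}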

 

Blog assignment:

keepalived providing high availability for ipvs

nginx

 

active/active

 

Linux HA Cluster

 

LB, HA, HP, hadoop

LB:

Transport layer: lvs

Application layer: nginx, haproxy, httpd, perlbal, ats, varnish

HA:

vrrp: keepalived

AIS: heartbeat, OpenAIS, corosync/pacemaker, cman/rgmanager (conga); RHCS

 

HA:

Failure scenarios:

Hardware failures:

design flaws

wear-out from prolonged use

human-caused damage

... ...

Software failures:

design flaws

bugs

operator error

...

 

A = MTBF / (MTBF + MTTR)

MTBF: Mean Time Between Failures

MTTR: Mean Time To Repair

0 < A < 1, usually expressed as a percentage:

90%, 95%, 99%

99.9%, 99.99%, 99.999%
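
For example, with MTBF = 10,000 hours and MTTR = 1 hour, A = 10000 / 10001 ≈ 99.99% ("four nines"), which corresponds to roughly 53 minutes of downtime per year.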

 

Provide redundancy:

network partition: vote system

Isolation:

STONITH: shoot the other node in the head; node-level fencing

Fence: resource-level fencing

 

failover domain:

fda: node1, node5

fdb: node2, node5

fdc: node3, node5

fdd: node4, node5

 

Resource constraints:

Location constraints: a resource's preference for particular nodes;

Colocation constraints: whether resources prefer to run on the same node;

Order constraints: startup ordering dependencies among multiple resources;

 

vote system:

majority rule: quorum

> total/2 (e.g. a 5-node cluster needs at least 3 votes to have quorum)

with quorum: holds the majority of votes

without quorum: does not hold the majority of votes

 

Two nodes (or any even number of nodes):

Ping node

qdisk

 

 

failover

failback

 

Messaging Layer:

heartbeat

v1

v2

v3

corosync

cman

 

Cluster Resource Manager (CRM):

heartbeat v1: haresources (configuration interface: the haresources config file)

heartbeat v2: crm (runs a crmd daemon (5560/tcp) on every node; command-line interface: crmsh; GUI: hb_gui)

heartbeat v3, pacemaker (configuration interfaces: crmsh, pcs; GUIs: hawk (SUSE), LCMC, pacemaker-gui)

rgmanager (configuration interfaces: cluster.conf, system-config-cluster, conga (web GUI), cman_tool, clustat)

 

Combinations:

heartbeat v1 (haresources)

heartbeat v2 (crm)

heartbeat v3 + pacemaker

corosync + pacemaker

corosync v1 + pacemaker (plugin)

corosync v2 + pacemaker (standalone service)

 

cman + rgmanager

corosync v1 + cman + pacemaker

 

RHCS: Red Hat Cluster Suite

RHEL5: cman + rgmanager + conga (ricci/luci)

RHEL6: cman + rgmanager + conga (ricci/luci)

corosync + pacemaker

corosync + cman + pacemaker

RHEL7: corosync + pacemaker

 

Resource Agent:

service: scripts under the /etc/ha.d/haresources.d/ directory;

LSB: scripts under the /etc/rc.d/init.d/ directory;

OCF: Open Cluster Framework

provider:

STONITH:

Systemd:

 

Resource types:

primitive: primary (native) resource; only one instance may run in the cluster;

clone: cloned resource; multiple instances may run in the cluster;

anonymous clones, globally unique clones, stateful clones (active/passive)

multi-state (master/slave): a special kind of clone; a multi-state resource;

group: group resource;

started or stopped as a unit;

resource monitoring

internal dependencies:

 

Resource attributes:

priority: resource priority;

target-role: started, stopped, master;

is-managed: whether the cluster is allowed to manage this resource;

resource-stickiness: how strongly the resource prefers to stay on its current node;

allow-migrate: whether the resource may be migrated;

 

Constraints: score

Location constraints: a resource's preference for particular nodes;

scores lie in (-oo, +oo):

any value + infinity = infinity

any value + negative infinity = negative infinity

infinity + negative infinity = negative infinity

Colocation constraints: whether resources prefer to run on the same node;

(-oo, +oo)

Order constraints: startup ordering dependencies among multiple resources;

(-oo, +oo)

Mandatory
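
A rough illustration of how such scores are expressed with pcs (the resource names webip/webserver and node name node1 are taken from examples elsewhere in these notes; the exact scores are assumptions):

pcs constraint location webip prefers node1=100               # location: prefer node1 with score 100
pcs constraint colocation add webserver with webip INFINITY   # colocation: always run on the same node
pcs constraint order webip then webserver                     # order: start the VIP before httpd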

 

Installation and configuration:

CentOS 7: corosync v2 + pacemaker

corosync v2: vote system

pacemaker: runs as a standalone service

 

Full-lifecycle cluster management tools:

pcs: agent(pcsd)

crmsh: agentless (pssh)

 

Prerequisites for configuring a cluster:

(1) time synchronization;

(2) nodes can reach one another using the hostnames currently in use;

(3) decide whether a quorum (arbitration) device will be used;

 

web service:

vip: 172.16.100.91

httpd

 

Review: AIS HA

Messaging Layer:

heartbeat v1, v2, v3

corosync v1, v2(votequorum)

OpenAIS

CRM:

pacemaker

configuration interfaces: crmsh (agentless), pssh

pcs (agent), pcsd

conga(ricci/luci)

 

group, constraint

 

rgmanager(cman)

resource group:

failover domain

 

Configuration:

Global properties: property, stonith-enabled, etc.;

Highly available services: resources, defined through RAs

 

RA:

LSB: /etc/rc.d/init.d/

systemd: /etc/systemd/system/multi-user.target.wants/

services that are in the enabled state;

OCF: [provider]

heartbeat

pacemaker

linbit

service

stonith

 

Available HA cluster solutions:

heartbeat v1

heartbeat v2

heartbeat v3 + pacemaker X

corosync + pacemaker

cman + rgmanager

corosync + cman + pacemaker

 

corosync + pacemaker

keepalived

 

HA Cluster(2)

 

Heartbeat message transmission:

Unicast: udpu

Multicast: udp

Broadcast

 

Multicast addresses: used to identify an IP multicast domain; IANA reserves class D addresses for multicast: 224.0.0.0-239.255.255.255

Permanent multicast addresses: 224.0.0.0-224.0.0.255

Transient multicast addresses: 224.0.1.0-238.255.255.255

Administratively scoped (local) multicast addresses: 239.0.0.0-239.255.255.255

 

Sample configuration file:

 

totem {
    version: 2

    crypto_cipher: aes128
    crypto_hash: sha1
    secauth: on

    interface {
        ringnumber: 0
        bindnetaddr: 172.16.0.0
        mcastaddr: 239.185.1.31
        mcastport: 5405
        ttl: 1
    }
}

nodelist {
    node {
        ring0_addr: 172.16.100.67
        nodeid: 1
    }
    node {
        ring0_addr: 172.16.100.68
        nodeid: 2
    }
    node {
        ring0_addr: 172.16.100.69
        nodeid: 3
    }
}

logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: no
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}

quorum {
    provider: corosync_votequorum
}

 

HA Web Service:

vip: 172.16.100.92, ocf:heartbeat:IPaddr

httpd: systemd

nfs shared storage: ocf:heartbeat:Filesystem
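
A rough sketch of how these three resources could be defined with pcs on CentOS 7. The resource names webip, webstore and webserver follow the scoring example further below; the NFS export path and the group name webservice are assumptions:

pcs resource create webip ocf:heartbeat:IPaddr ip=172.16.100.92 op monitor interval=30s
pcs resource create webstore ocf:heartbeat:Filesystem device="nfs-server:/export/web" directory="/var/www/html" fstype="nfs" op monitor interval=30s
pcs resource create webserver systemd:httpd op monitor interval=30s
pcs resource group add webservice webip webstore webserver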

 

HA Cluster working models:

A/P: two-node cluster; active/passive;

no-quorum-policy={stop|ignore|suicide|freeze} (see the note below)

A/A: dual-active (active/active) model

N-M: N nodes running M services, N>M;

N-N: N nodes running N services;
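
On a two-node cluster, for instance, the surviving node can never hold more than half of the total votes after its peer fails, so the quorum policy is usually relaxed; a hypothetical pcs invocation:

pcs property set no-quorum-policy=ignore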

 

network partition:

split-brain: very dangerous when block-level shared storage is in use;

vote quorum:

with quorum: > total/2

without quorum: <= total/2

stop

ignore

suicide

freeze

 

CAP:

C: consistency

A: availability

P: partition tolerance

 

Score example for the three resources webip, webstore, webserver (per-resource scores are summed per node to decide placement):

node1: 100 + 0 + 0

node2: 0 + 0 + 0

node3: 0 + 0 + 0

node2: 50 + 50 + 50

 

Start order: A --> B --> C

Stop order: C --> B --> A

 

pcs:

    cluster
        auth
        setup

    resource
        describe
        list
        create
        delete

    constraint
        colocation
        order
        location

    property
        list
        set

    status

    config
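
A hypothetical end-to-end pcs workflow on CentOS 7 (the cluster name mycluster and node name node2 are assumptions; node1 appears in the logs above):

pcs cluster auth node1 node2 -u hacluster        # authenticate pcsd on every node
pcs cluster setup --name mycluster node1 node2   # generate and distribute corosync.conf
pcs cluster start --all                          # start corosync and pacemaker on all nodes
pcs property set stonith-enabled=false           # only acceptable when no fencing device exists
pcs status                                       # verify cluster and resource state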

 

Blog assignment:

(1) Manual configuration, multicast: corosync + pacemaker + crmsh; configure a highly available MySQL cluster whose datadir points to an NFS-exported path;

(2) pcs/pcsd, unicast: corosync + pacemaker; configure a highly available web cluster;

 

Unicast configuration example:

Some environments may not support multicast. In that case Corosync should be configured to use unicast; the following is part of a Corosync configuration file that uses unicast:

 

totem {
    #...

    interface {
        ringnumber: 0
        bindnetaddr: 192.168.42.0
        broadcast: yes
        mcastport: 5405
    }
    interface {
        ringnumber: 1
        bindnetaddr: 10.0.42.0
        broadcast: yes
        mcastport: 5405
    }
    transport: udpu
}

nodelist {
    node {
        ring0_addr: 192.168.42.1
        ring1_addr: 10.0.42.1
        nodeid: 1
    }
    node {
        ring0_addr: 192.168.42.2
        ring1_addr: 10.0.42.2
        nodeid: 2
    }
}

 

If broadcast is set to yes, the cluster heartbeat is carried over broadcast. When this parameter is used, mcastaddr must not be set.

 

The transport option determines how the cluster communicates. To disable multicast entirely, set the unicast transport udpu. This requires that every node be listed in nodelist, which means the cluster membership must be decided before the HA cluster is configured. The default is udp; the supported transport types also include udpu and iba.

 

Under nodelist you can define settings that apply to a single node. These options may only appear inside a node block, i.e. they can only be set for servers that belong to the cluster, and they should only include parameters that differ from the defaults. Every server must have ring0_addr configured.


Original source: https://www.cnblogs.com/hanshanxiaoheshang/p/10117152.html
