Two-node hot standby for an nginx load balancer
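First:
When nginx is used as a load balancer, its traffic model resembles LVS-NAT. As the number of cluster nodes grows, nginx can become the network bottleneck, because every reply packet has to pass back through it. A single 400 MHz processor can handle 100 Mbps of connections, so in general the network is more likely than the LVS Director to become the bottleneck. In that situation, LVS-DR is a more reliable choice of load balancer than nginx.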
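Feasibility of nginx + keepalived:
Keepalived is high-availability software for Linux that implements VRRP backup routing. A service built on Keepalived can hand the IP address over between the master and backup servers almost instantly and transparently when one of them fails. On Sina's dynamic application platform, Keepalived together with LVS has proven very stable in production.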
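Nginx is an HTTP server built on the epoll model of the Linux 2.6 kernel. Unlike Apache's process-forking model, Nginx uses a Master + Slave multi-process model (nginx's own documentation calls the Slave processes workers) and manages its child processes very reliably. Under this model the Master process never handles requests itself; it only dispatches work, which keeps the Master highly reliable. All control signals for the Slave processes come from the Master, and any Slave task that times out is terminated by the Master, so this is a non-blocking task model. On the Sina blog platform, after nearly 8 months of operation, there has been no service interruption caused by the master process exiting or a child process hanging.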
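In production, the loss caused by any machine going down must be kept to a minimum. Traditionally, servers are placed directly behind a layer 4/7 switch so that a server or server-software failure does not interrupt service. Under the current business model there are many high-concurrency workloads: small JS files, fast dynamic interfaces, Nginx layer-7 services. All of them want every socket operation to finish as quickly as possible to reduce user wait time. Because the layer 4/7 switches serve many products across the whole Sina site, they often become a constraint for high-concurrency services. This led to the idea of using Keepalived + Nginx for two-node cross hot standby, with the public IPs served through DNS round-robin. The approach applies to any environment that needs high-concurrency service: the fewer socket hops in the path, the faster data reaches the user's desktop.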
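1. Server IP liveness detection:
Server IP liveness detection is done by Keepalived itself. The two servers are configured as Keepalived master and backup for each other, so if either machine fails the other takes over its IP.
2. Application service liveness detection:
A healthy service needs not only a live server but also a live application. Apache servers have in the past stopped answering HTTP because of hung processes, which is a consequence of Apache's process model. Under Nginx's process model, the service can be considered healthy as long as the Nginx processes are alive, so checking process liveness is enough to check service liveness. The health of the Slave processes is handled by Nginx's own Master process, and the Master process can be monitored by a dedicated script on the server: as soon as the Nginx Master process is found to have exited abnormally, Nginx is restarted immediately. This scheme has been running on the Sina blog system for nearly half a year.
3. Online server maintenance:
Keepalived's service IP is managed through its configuration file, and its own process determines whether the server is alive. To maintain a server online, simply stop the Keepalived process on that machine and the other server takes over all of its services.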
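The master-process check from point 2 could look roughly like the sketch below. It is not the script used at Sina, which the source does not show. The pid file path, the restart command and the names NGINX_PID_FILE, NGINX_START_CMD, master_alive and check_once are all my own assumptions, and the `--run` switch is only there so that the functions can be loaded without side effects.

```shell
#!/bin/bash
# Watchdog sketch: restart nginx when its master process is gone.
# NGINX_PID_FILE and NGINX_START_CMD are assumed values; adjust to your install.
NGINX_PID_FILE="${NGINX_PID_FILE:-/var/run/nginx.pid}"
NGINX_START_CMD="${NGINX_START_CMD:-/etc/init.d/nginxd start}"

# Returns 0 only if the pid file exists and names a live process.
# Run as root: kill -0 also fails for processes the caller may not signal.
master_alive() {
    [ -r "$1" ] || return 1
    watchdog_pid=$(cat "$1")
    case "$watchdog_pid" in
        ''|*[!0-9]*) return 1 ;;          # empty or non-numeric content
    esac
    kill -0 "$watchdog_pid" 2>/dev/null   # signal 0 only tests for existence
}

# Run one check; start nginx if the master is missing.
check_once() {
    if master_alive "$NGINX_PID_FILE"; then
        return 0
    fi
    echo "nginx master not running, restarting" >&2
    $NGINX_START_CMD
}

# Act only when asked to, e.g. from cron: <path to this script> --run
if [ "${1:-}" = "--run" ]; then
    check_once
fi
```

A pid file can outlive its process, which is why the sketch tests the pid itself with `kill -0` rather than only checking that the file exists.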
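The feasibility discussion above is reposted from another blog; what follows are my configuration notes based on that plan. I have not yet worked out how keepalived prevents split-brain, so for now I personally think heartbeat is more reliable for two-node hot standby. A heartbeat configuration for hot standby appears later in this article.
Option 1: keepalived for hot standby of the nginx load balancer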
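Keepalived provides a robust health-checking mechanism for LVS clusters. It implements a multi-layer (L3, L4, L5/7) fault-tolerant health-check framework. When a server in the Server Pool goes down, it notifies the kernel through a socket to remove that server from the Server Pool, further improving the High Availability of the Linux Virtual Server project. It also provides an independent VRRPv2 stack to handle director failover promptly, so it covers both LVS cluster node health checks and LVS director failover.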
Here we use only keepalived's VRRP function, so that the IP is handed over between the master and backup servers when one of them fails.
1 Installation
./configure --prefix=/usr/local/keepalived
make
make install
2 Configuration file
vi /usr/local/keepalived/etc/f
Master configuration file
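vrrp_instance VI_INET1 {
state MASTER #(MASTER on the primary, BACKUP on the standby)
interface eth0 #(network interface monitored for HA)
mcast_src_ip 192.168.7.191 #(source address for VRRP multicast; use each node's own address, never the virtual_ipaddress)
track_interface { # other interfaces whose state is monitored
  eth1
}
virtual_router_id 53 #(must be identical on master and backup)
priority 200 #(must differ between master and backup; a higher value means higher priority, so the master gets the larger value)
advert_int 5 #(VRRP multicast advertisement interval in seconds)
authentication {
auth_type PASS #(VRRP authentication type)
auth_pass yourpass #(VRRP password)
}
virtual_ipaddress {
192.168.7.100 #(VRRP HA virtual address)
}
}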
Slave (backup) configuration file
vrrp_instance VI_INET1 {
state BACKUP
interface eth0
track_interface {
  eth1
}
virtual_router_id 53
priority 100
advert_int 5
authentication {
auth_type PASS
auth_pass yourpass
}
virtual_ipaddress {
192.168.7.100
}
}
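Since the master and backup files differ only in a few values, one way to keep them in sync is to generate the block from a small shell function. The sketch below is only an illustration: render_vrrp_instance and its argument order are my own invention, and the second instance (VI_INET2, virtual_router_id 54, 192.168.7.101) is a hypothetical example of the cross hot standby idea described earlier, where each node is master for one address and backup for the other.

```shell
#!/bin/bash
# Print one vrrp_instance block. The function name and argument order are
# illustrative only; they are not part of keepalived.
render_vrrp_instance() {
    # $1 name, $2 state (MASTER|BACKUP), $3 priority, $4 virtual_router_id, $5 VIP
    cat <<EOF
vrrp_instance $1 {
    state $2
    interface eth0
    virtual_router_id $4
    priority $3
    advert_int 5
    authentication {
        auth_type PASS
        auth_pass yourpass
    }
    virtual_ipaddress {
        $5
    }
}
EOF
}

# Node A: master for the first address, backup for the (hypothetical) second one.
render_vrrp_instance VI_INET1 MASTER 200 53 192.168.7.100
render_vrrp_instance VI_INET2 BACKUP 100 54 192.168.7.101
```

Node B would call the same function with the two roles and priorities swapped. Each instance needs its own virtual_router_id, and DNS round-robin would then point at both virtual addresses.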
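track_interface lists the network interfaces that keepalived should monitor; if any one of them fails, keepalived treats the whole router as failed.
Startup
Check the IP addresses before starting. Note: do not use ifconfig for this, because it does not show the virtual address that keepalived adds; use ip addr show instead.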
[root@real1 ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:0c:29:c8:9b:3c brd ff:ff:ff:ff:ff:ff
inet 192.168.7.191/24 brd 192.168.7.255 scope global eth0
/usr/local/keepalived/sbin/keepalived -D -f /usr/local/keepalived/etc/f
After starting
[root@real1 ~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast qlen 1000
link/ether 00:0c:29:c8:9b:3c brd ff:ff:ff:ff:ff:ff
inet 192.168.7.191/24 brd 192.168.7.255 scope global eth0
inet 192.168.7.100/32 scope global eth0
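When the master fails, the backup stops receiving the advertisements that the master sends to the VRRP multicast address 224.0.0.18 (the VRRP default), and takes over the address 192.168.7.100.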
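How long the backup waits before taking over follows from the VRRPv2 specification (RFC 3768): the master is declared down after 3 × advert_int plus a skew time of (256 − priority) / 256 seconds. The sketch below only evaluates that formula; master_down_interval is a name I made up, and keepalived's real timing may differ slightly in detail.

```shell
#!/bin/bash
# Master_Down_Interval per RFC 3768 (VRRPv2), in seconds:
#   3 * advert_int + (256 - priority) / 256
master_down_interval() {
    # $1 = advert_int in seconds, $2 = priority of the backup node
    awk -v adv="$1" -v prio="$2" \
        'BEGIN { printf "%.3f\n", 3 * adv + (256 - prio) / 256 }'
}

# Backup node from the configuration above: advert_int 5, priority 100.
master_down_interval 5 100   # prints 15.609
```

With these settings the takeover therefore starts about 15.6 seconds after the master goes silent; a smaller advert_int shortens this at the cost of more multicast traffic.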
Do not forget to add the following to the iptables configuration:
-I INPUT -s <master/backup server IP> -d 224.0.0.18 -j ACCEPT
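To confirm from a script which node currently holds the virtual address, the output of ip addr show can be filtered. This is a small sketch of my own: has_vip is a hypothetical helper, and it compares the whole address so that 192.168.7.10 does not match 192.168.7.100.

```shell
#!/bin/bash
# Succeeds if the given address appears as an inet entry in `ip addr show`
# output read from standard input.
has_vip() {
    # $1 = virtual IP to look for, e.g. 192.168.7.100
    awk -v vip="$1" '
        $1 == "inet" { split($2, parts, "/"); if (parts[1] == vip) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Typical use on a live node (eth0 and the address are the values used above):
#   ip addr show dev eth0 | has_vip 192.168.7.100 && echo "this node holds the VIP"
```

Running the check on both nodes at the same time is also a crude way to spot split-brain: the address should be present on exactly one of them.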
An HA configuration from an English-language source:
Using keepalived to failover routers
vrrpd is a router failover daemon protocol. While keepalived uses it to failover LVS, vrrpd can be used independently of LVS to failover a pair of routers.
Graeme Fowler graeme (at) graemef (dot) net 11 Sep 2007
config for the ACTIVE router looks like:
# f for HA "routers"
global_defs {
notification_email {
recipient@mail.domain
}
notification_email_from root@fqdn.of.machine
smtp_server 1.2.3.4
smtp_connect_timeout 60
router_id router_1
}
vrrp_script check_running {
script "/usr/local/bin/check_running"
interval 10
weight 10
}
vrrp_script always_succeed {
script "/bin/date"
interval 10
weight 10
}
vrrp_script always_fail {
script "/usr/local/bin/always_fail"
interval 10
weight 10
}
vrrp_instance ROUTER_1 {
state MASTER
smtp_alert
interface eth0
virtual_router_id 101
priority 100
advert_int 3
authentication {
auth_type PASS
auth_pass whatever
}
virtual_ipaddress {
1.1.1.1
}
track_script {
check_running weight 20
}
}
...the corresponding config for the BACKUP looks like:
# f for HA "routers"
global_defs {
notification_email {
recipient@mail.domain
}
notification_email_from root@fqdn.of.machine
smtp_server 1.2.3.4
smtp_connect_timeout 60
router_id router_2
}
vrrp_script check_running {
script "/usr/local/bin/check_running"
interval 10
weight 10
}
vrrp_script always_succeed {
script "/bin/date"
interval 10
weight 10
}
vrrp_script always_fail {
script "/usr/local/bin/always_fail"
interval 10
weight 10
}
vrrp_instance ROUTER_1 {
state BACKUP
smtp_alert
interface eth0
virtual_router_id 101
priority 90
advert_int 3
authentication {
auth_type PASS
auth_pass whatever
}
virtual_ipaddress {
1.1.1.1
}
track_script {
check_running weight 20
}
}
...it differs in the "priority" value of the VRRP instance (90 instead of 100) and in "state" (BACKUP instead of MASTER), and there are cosmetic differences to the name (router_id).
The "check_running" script is simply a wrapper round:
killall -0 procname
if the result code ($?) is 0, it exits with 0. If not, it exits with 1.
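The quoted author does not show the wrapper itself, so the following is only my guess at what such a script might look like. The function name and the convention of passing the process name as the first argument are assumptions; with this convention the vrrp_script line would name the process after the script path.

```shell
#!/bin/bash
# Sketch of a check_running-style wrapper around `killall -0`.
check_running() {
    # killall -0 sends no signal; it only reports whether a process
    # with the given name exists.
    if killall -0 "$1" >/dev/null 2>&1; then
        return 0
    fi
    return 1
}

# Run the check when a process name is passed on the command line,
# and hand the result back to keepalived as the exit status.
if [ $# -gt 0 ]; then
    check_running "$1"
    exit $?
fi
```

Any failure, including a missing killall binary, is reported as 1, so keepalived treats an unverifiable state the same as a dead process.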
If it exits with 1, the effective priority in the VRRP announcement ends up 20 lower than when the check passes. This makes sure that the critical process on this machine is up, and if it isn't then we play a smaller part in the VRRP adverts (these are derived from a pair of frontend mail servers).
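The arithmetic behind this can be checked with a few lines of shell. The sketch assumes keepalived's documented rule for a positive weight, namely that the weight is added to the configured priority while the tracked script succeeds; effective_priority is a name made up for this illustration.

```shell
#!/bin/bash
# Effective VRRP priority for a positive track_script weight:
# the weight is added only while the check script succeeds.
effective_priority() {
    # $1 = configured priority, $2 = positive weight,
    # $3 = exit status of the check script (0 means healthy)
    if [ "$3" -eq 0 ]; then
        echo $(( $1 + $2 ))
    else
        echo "$1"
    fi
}

effective_priority 100 20 0   # active router, check passing -> 120
effective_priority 100 20 1   # active router, check failing -> 100
effective_priority 90 20 0    # backup router, check passing -> 110
```

With the values from the two configurations, a failing check on the active router leaves it at 100 while the healthy backup advertises 110, so the backup wins the election even though the active machine itself is still up.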
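As for the script that checks nginx status: use the ready-made init script below, test at run time whether the status is running, and schedule it with crontab.
nginx init script, placed at /etc/init.d/nginxd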
#!/bin/bash
# nginx Startup script for the Nginx HTTP Server
# this script create it by jackbillow at 2007.10.15.
# it is v.0.0.2 version.
# if you find any errors on this scripts,please contact jackbillow.
# and send mail to jackbillow at gmail dot com.
#
# chkconfig: - 85 15
# description: Nginx is a high-performance web and proxy server.
# It has a lot of features, but it's not for everyone.
# processname: nginx
# pidfile: /usr/local/nginx/logs/nginx.pid
# config: /usr/local/nginx/f
nginxd=/usr/local/nginx/sbin/nginx
nginx_config=/usr/local/nginx/f
nginx_pid=/var/run/nginx.pid
RETVAL=0
prog="nginx"
# Source function library.
. /etc/rc.d/init.d/functions
# Source networking configuration.
. /etc/sysconfig/network
# Check that networking is up.
[ "${NETWORKING}" = "no" ] && exit 0
[ -x "$nginxd" ] || exit 0
# Start nginx daemons functions.
start() {
# Refuse to start a second instance if the pid file already exists.
if [ -e "$nginx_pid" ];then
echo "nginx already running"
exit 1
fi
echo -n $"Starting $prog: "
daemon $nginxd -c ${nginx_config}
RETVAL=$?
echo
[ $RETVAL = 0 ] && touch /var/lock/subsys/nginx
return $RETVAL
}
# Stop nginx daemons functions.