首页 > 代码库 > 【Nginx】负载均衡-加权轮询策略剖析
【Nginx】负载均衡-加权轮询策略剖析
转自:江南烟雨
本文介绍的是客户端请求在多个后端服务器之间的均衡,注意与客户端请求在多个nginx进程之间的均衡相区别。
如果Nginx是以反向代理的形式配置运行,那么对请求的实际处理需要转发到后端服务器运行,如果后端服务器有多台,如何选择一台合适的后端服务器来处理当前请求,就是本文要说的负载均衡。这两种均衡互不冲突并且能同时生效。
负载均衡模块简介
负载均衡模块Load-balance是辅助模块,主要为Upstream模块服务,目标明确且单一:如何从多台后端服务器中选择出一台合适的服务器来处理当前请求
nginx负载均衡模块ngx_http_upstream_module允许定义一组服务器,这组服务器可以被proxy_pass,fastcgi_pass和memcached_pass这些指令引用。
配置例子:
upstream backend { server backend1.example.com weight=5; server backend2.example.com:8080; server unix:/tmp/backend3; server backup1.example.com:8080 backup; server backup2.example.com:8080 backup;} server { location / { proxy_pass http://backend; }}
负载均衡策略
nginx的负载均衡策略可以划分为两大类:内置策略和扩展策略。内置策略包含加权轮询和ip hash,在默认情况下这两种策略会编译进nginx内核,只需在nginx配置中指明参数即可。扩展策略有很多,如fair、通用hash、consistent hash等,默认不编译进nginx内核,是第三方模块。
nginx 的 upstream目前支持 4 种方式的分配 :
1)轮询(默认)
每个请求按时间顺序逐一分配到不同的后端服务器,如果后端服务器down掉,能自动剔除。
2)weight
指定轮询几率,weight和访问比率成正比,用于后端服务器性能不均的情况。
2)ip_hash
每个请求按访问ip的hash结果分配,这样每个访客固定访问一个后端服务器,可以解决session的问题。
3)fair(第三方)
按后端服务器的响应时间来分配请求,响应时间短的优先分配。
4)url_hash(第三方)
Nginx 默认采用round_robin加权算法。如果要选择其他的负载均衡算法,必须在upstream的配置上下文中通过配置指令明确指定(该配置项最好放在其他server指令等的前面,以便检查server的配置选项是否合理)。比如采用Ip_hash的upstream配置如下所示:
upstream load_balance{ ip_hash; server localhost:8001; server localhost:8002;}
当 整个http配置块被Nginx解析完毕之后,会调用各个http模块对应的初始函数。对于模块ngx_http_upstream_module而言, 对应的main配置初始函数是ngx_http_upstream_init_main_conf(),在这个函数中有这样一段代码:
for (i = 0; i < umcf->upstreams.nelts; i++) { init = uscfp[i]->peer.init_upstream ? uscfp[i]->peer.init_upstream: ngx_http_upstream_init_round_robin; if (init(cf, uscfp[i]) != NGX_OK) { return NGX_CONF_ERROR; }}
默 认采用加权轮询策略的原因就是在于上述代码中的init赋值一行。如果用户没有做任何策略选择,那么执行的策略初始函数为 ngx_http_upstream_init_round_robin,也就是加权轮询策略。否则的话执行的是 uscfp[i]->peer.init_upstream指针函数,如果有配置执行ip_hash ,那么就是ngx_http_upstream_init_ip_hash()。
全局准备工作
//函数:初始化服务器负载均衡表 //参数://us:ngx_http_upstream_main_conf_t结构体中upstreams数组元素ngx_int_tngx_http_upstream_init_round_robin(ngx_conf_t *cf, ngx_http_upstream_srv_conf_t *us){ ngx_url_t u; ngx_uint_t i, j, n, w; ngx_http_upstream_server_t *server; ngx_http_upstream_rr_peers_t *peers, *backup; //回调指针设置 us->peer.init = ngx_http_upstream_init_round_robin_peer; //服务器数组指针不为空 if (us->servers) { server = us->servers->elts; n = 0; w = 0; //遍历所有服务器 for (i = 0; i < us->servers->nelts; i++) { //是后备服务器,跳过 if (server[i].backup) { continue; } //服务器地址数量统计 n += server[i].naddrs; //总的权重计算 w += server[i].naddrs * server[i].weight; } if (n == 0) { ngx_log_error(NGX_LOG_EMERG, cf->log, 0, "no servers in upstream \"%V\" in %s:%ui", &us->host, us->file_name, us->line); return NGX_ERROR; } //为非后备服务器分配空间 peers = ngx_pcalloc(cf->pool, sizeof(ngx_http_upstream_rr_peers_t) + sizeof(ngx_http_upstream_rr_peer_t) * (n - 1)); if (peers == NULL) { return NGX_ERROR; } //非后备服务器列表头中各属性设置 peers->single = (n == 1); peers->number = n; peers->weighted = (w != n); peers->total_weight = w; peers->name = &us->host; n = 0; //后备服务器列表中各服务器项设置 for (i = 0; i < us->servers->nelts; i++) { for (j = 0; j < server[i].naddrs; j++) { if (server[i].backup) { continue; } peers->peer[n].sockaddr = server[i].addrs[j].sockaddr; peers->peer[n].socklen = server[i].addrs[j].socklen; peers->peer[n].name = server[i].addrs[j].name; peers->peer[n].max_fails = server[i].max_fails; peers->peer[n].fail_timeout = server[i].fail_timeout; peers->peer[n].down = server[i].down; peers->peer[n].weight = server[i].weight; peers->peer[n].effective_weight = server[i].weight; peers->peer[n].current_weight = 0; n++; } } //非后备服务器列表挂载的位置 us->peer.data =http://www.mamicode.com/ peers; /* backup servers */ //后备服务器 n = 0; w = 0; for (i = 0; i < us->servers->nelts; i++) { if (!server[i].backup) { continue; } //后备服务器地址数量统计 n += server[i].naddrs; //后备服务器总权重计算 w += server[i].naddrs * server[i].weight; } if (n == 0) { return NGX_OK; } //后备服务器列表地址空间分配 backup = ngx_pcalloc(cf->pool, sizeof(ngx_http_upstream_rr_peers_t) + sizeof(ngx_http_upstream_rr_peer_t) * (n - 1)); if (backup == NULL) { return NGX_ERROR; } peers->single = 0; //后备服务器列表头中各属性设置 backup->single = 0; backup->number = n; backup->weighted = (w != n); backup->total_weight = w; backup->name = &us->host; n = 0; //后备服务器列表中各服务器项设置 for (i = 0; i < us->servers->nelts; i++) { for (j = 0; j < server[i].naddrs; j++) { if (!server[i].backup) { continue; } backup->peer[n].sockaddr = server[i].addrs[j].sockaddr; backup->peer[n].socklen = server[i].addrs[j].socklen; backup->peer[n].name = server[i].addrs[j].name; backup->peer[n].weight = server[i].weight; backup->peer[n].effective_weight = server[i].weight; backup->peer[n].current_weight = 0; backup->peer[n].max_fails = server[i].max_fails; backup->peer[n].fail_timeout = server[i].fail_timeout; backup->peer[n].down = server[i].down; n++; } } //后备服务器挂载 peers->next = backup; return NGX_OK; } //us参数中服务器指针为空,例如用户直接在proxy_pass等指令后配置后端服务器地址 /* an upstream implicitly defined by proxy_pass, etc. */ if (us->port == 0) { ngx_log_error(NGX_LOG_EMERG, cf->log, 0, "no port in upstream \"%V\" in %s:%ui", &us->host, us->file_name, us->line); return NGX_ERROR; } ngx_memzero(&u, sizeof(ngx_url_t)); u.host = us->host; u.port = us->port; //IP地址解析 if (ngx_inet_resolve_host(cf->pool, &u) != NGX_OK) { if (u.err) { ngx_log_error(NGX_LOG_EMERG, cf->log, 0, "%s in upstream \"%V\" in %s:%ui", u.err, &us->host, us->file_name, us->line); } return NGX_ERROR; } n = u.naddrs; peers = ngx_pcalloc(cf->pool, sizeof(ngx_http_upstream_rr_peers_t) + sizeof(ngx_http_upstream_rr_peer_t) * (n - 1)); if (peers == NULL) { return NGX_ERROR; } peers->single = (n == 1); peers->number = n; peers->weighted = 0; peers->total_weight = n; peers->name = &us->host; for (i = 0; i < u.naddrs; i++) { peers->peer[i].sockaddr = u.addrs[i].sockaddr; peers->peer[i].socklen = u.addrs[i].socklen; peers->peer[i].name = u.addrs[i].name; peers->peer[i].weight = 1; peers->peer[i].effective_weight = 1; peers->peer[i].current_weight = 0; peers->peer[i].max_fails = 1; peers->peer[i].fail_timeout = 10; } us->peer.data =http://www.mamicode.com/ peers; /* implicitly defined upstream has no backup servers */ return NGX_OK;}
针对一个客户端请求的初始化工作
//函数://功能:针对每个请求选择后端服务器前做一些初始化工作ngx_int_tngx_http_upstream_init_round_robin_peer(ngx_http_request_t *r, ngx_http_upstream_srv_conf_t *us){ ngx_uint_t n; ngx_http_upstream_rr_peer_data_t *rrp; rrp = r->upstream->peer.data; if (rrp == NULL) { rrp = ngx_palloc(r->pool, sizeof(ngx_http_upstream_rr_peer_data_t)); if (rrp == NULL) { return NGX_ERROR; } r->upstream->peer.data =http://www.mamicode.com/ rrp; } rrp->peers = us->peer.data; rrp->current = 0; //n取值为:非后备服务器和后备服务器列表中个数较大的那个值 n = rrp->peers->number; if (rrp->peers->next && rrp->peers->next->number > n) { n = rrp->peers->next->number; } //如果n小于一个指针变量所能表示的范围 if (n <= 8 * sizeof(uintptr_t)) { //直接使用已有的指针类型的data变量做位图(tried是位图,用来标识在一轮选择中,各个后端服务器是否已经被选择过) rrp->tried = &rrp->data; rrp->data = http://www.mamicode.com/0; } else { //否则从内存池中申请空间 n = (n + (8 * sizeof(uintptr_t) - 1)) / (8 * sizeof(uintptr_t)); rrp->tried = ngx_pcalloc(r->pool, n * sizeof(uintptr_t)); if (rrp->tried == NULL) { return NGX_ERROR; } } //回调函数设置 r->upstream->peer.get = ngx_http_upstream_get_round_robin_peer; r->upstream->peer.free = ngx_http_upstream_free_round_robin_peer; r->upstream->peer.tries = rrp->peers->number;#if (NGX_HTTP_SSL) r->upstream->peer.set_session = ngx_http_upstream_set_round_robin_peer_session; r->upstream->peer.save_session = ngx_http_upstream_save_round_robin_peer_session;#endif return NGX_OK;}
对后端服务器进行一次选择
对后端服务器做一次选择的逻辑在函数ngx_http_upstream_get_round_robin_peer内,流程图如下:
代码如下:
//函数://功能:对后端服务器做一次选择ngx_int_tngx_http_upstream_get_round_robin_peer(ngx_peer_connection_t *pc, void *data){ ngx_http_upstream_rr_peer_data_t *rrp = data; ngx_int_t rc; ngx_uint_t i, n; ngx_http_upstream_rr_peer_t *peer; ngx_http_upstream_rr_peers_t *peers; ngx_log_debug1(NGX_LOG_DEBUG_HTTP, pc->log, 0, "get rr peer, try: %ui", pc->tries); /* ngx_lock_mutex(rrp->peers->mutex); */ pc->cached = 0; pc->connection = NULL; //如果只有一台后端服务器,Nginx直接选择并返回 if (rrp->peers->single) { peer = &rrp->peers->peer[0]; if (peer->down) { goto failed; } } else { //有多台后端服务器 /* there are several peers */ //按照各台服务器的当前权值进行选择 peer = ngx_http_upstream_get_peer(rrp); if (peer == NULL) { goto failed; } ngx_log_debug2(NGX_LOG_DEBUG_HTTP, pc->log, 0, "get rr peer, current: %ui %i", rrp->current, peer->current_weight); } //设置连接的相关属性 pc->sockaddr = peer->sockaddr; pc->socklen = peer->socklen; pc->name = &peer->name; /* ngx_unlock_mutex(rrp->peers->mutex); */ if (pc->tries == 1 && rrp->peers->next) { pc->tries += rrp->peers->next->number; } return NGX_OK; //选择失败,转向后备服务器failed: peers = rrp->peers; if (peers->next) { /* ngx_unlock_mutex(peers->mutex); */ ngx_log_debug0(NGX_LOG_DEBUG_HTTP, pc->log, 0, "backup servers"); rrp->peers = peers->next; pc->tries = rrp->peers->number; n = (rrp->peers->number + (8 * sizeof(uintptr_t) - 1)) / (8 * sizeof(uintptr_t)); for (i = 0; i < n; i++) { rrp->tried[i] = 0; } rc = ngx_http_upstream_get_round_robin_peer(pc, rrp); if (rc != NGX_BUSY) { return rc; } /* ngx_lock_mutex(peers->mutex); */ } /* all peers failed, mark them as live for quick recovery */ for (i = 0; i < peers->number; i++) { peers->peer[i].fails = 0; } /* ngx_unlock_mutex(peers->mutex); */ pc->name = peers->name; //如果后备服务器也选择失败,则返回NGX_BUSY return NGX_BUSY;}
后端服务器权值计算在函数ngx_http_upstream_get_peer中:
//按照当前各服务器权值进行选择static ngx_http_upstream_rr_peer_t *ngx_http_upstream_get_peer(ngx_http_upstream_rr_peer_data_t *rrp){ time_t now; uintptr_t m; ngx_int_t total; ngx_uint_t i, n; ngx_http_upstream_rr_peer_t *peer, *best; now = ngx_time(); best = NULL; total = 0; for (i = 0; i < rrp->peers->number; i++) { //计算当前服务器的标记位在位图中的位置 n = i / (8 * sizeof(uintptr_t)); m = (uintptr_t) 1 << i % (8 * sizeof(uintptr_t)); //已经选择过,跳过 if (rrp->tried[n] & m) { continue; } //当前服务器对象 peer = &rrp->peers->peer[i]; //当前服务器已宕机,排除 if (peer->down) { continue; } //根据指定一段时间内最大失败次数做判断 if (peer->max_fails && peer->fails >= peer->max_fails && now - peer->checked <= peer->fail_timeout) { continue; } peer->current_weight += peer->effective_weight; total += peer->effective_weight; if (peer->effective_weight < peer->weight) { peer->effective_weight++; } if (best == NULL || peer->current_weight > best->current_weight) { best = peer; } } if (best == NULL) { return NULL; } //所选择的服务器在服务器列表中的位置 i = best - &rrp->peers->peer[0]; rrp->current = i; n = i / (8 * sizeof(uintptr_t)); m = (uintptr_t) 1 << i % (8 * sizeof(uintptr_t)); //位图相应位置置位 rrp->tried[n] |= m; best->current_weight -= total; best->checked = now; return best;}
整个加权轮询的流程图如下: