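Nginx Caching, Hotlink Protection, and URL Rewriting
When nginx acts as a reverse proxy, it can cache responses from the upstream locally and build later responses to requests for the same content directly from that cache.
nginx keeps cache data in two places:
Shared memory: stores the keys and the metadata of cached objects
Disk space: stores the response data
- Usage: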
Syntax: proxy_cache_path path [levels=levels] [use_temp_path=on|off] keys_zone=name:size [inactive=time] [max_size=size] [manager_files=number] [manager_sleep=time] [manager_threshold=time] [loader_files=number] [loader_sleep=time] [loader_threshold=time] [purger=on|off] [purger_files=number] [purger_sleep=time] [purger_threshold=time];
Default: none
Context: http
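proxy_cache zone|off: selects the shared memory zone used for caching; the same zone can be referenced in several places. The cache honors the caching-related headers of the upstream response, such as "Expires", "Cache-Control: no-cache", "Cache-Control: max-age=XXX", "private" and "no-store". Older nginx releases ignore the "Vary" response header when caching; releases from 1.7.7 onward take it into account. To keep private information out of the cache, the upstream can mark per-user responses with "no-cache" or "max-age=0", or nginx can be given a proxy_cache_key that includes user-specific data such as $cookie_xxx, although the latter approach can be risky on a public cache. Responses carrying any of the following headers or flags are not cached: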
Set-Cookie
Cache-Control containing "no-cache", "no-store", "private", or a "max-age" with a non-numeric or 0 value
Expires with a time in the past
X-Accel-Expires: 0
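proxy_cache_key: sets the string used as the key when storing and looking up cache entries. Variables may be used in it, but a poorly chosen key can cause the same content to be cached several times. Including user-specific data in the key prevents one user's private content from being returned to another user.
proxy_cache_lock: when enabled, on a cache miss only one of several identical requests is passed to the upstream at a time; the others wait for the entry to appear in the cache or for the lock timeout to expire.
proxy_cache_lock_timeout: how long a request waits on the proxy_cache_lock lock.
proxy_cache_min_uses: the number of times a response must be requested before it is cached.
proxy_cache_path: defines the directory that holds cached responses and the shared memory zone (keys_zone=name:size) that holds the keys and metadata of cached objects. Its optional parameters include:
levels: the length of the subdirectory name at each level; valid values are 1 or 2, levels are separated by colons, and at most 3 levels are allowed.
inactive: the time after which an entry that has not been accessed is removed from the cache.
max_size: the upper limit on the cache size; when it is exceeded, the cache manager removes the least recently used data.
loader_files: the maximum number of files whose metadata the cache loader loads in one iteration.
loader_sleep: how long the cache loader pauses between iterations.
loader_threshold: the maximum duration of one cache loader iteration.
For example:
proxy_cache_path /data/nginx/cache/one levels=1 keys_zone=one:10m;
proxy_cache_path /data/nginx/cache/two levels=2:2 keys_zone=two:100m;
proxy_cache_path /data/nginx/cache/three levels=1:1:2 keys_zone=three:1000m;
proxy_cache_use_stale: determines in which cases of failed communication with the upstream (such as error, timeout or http_500) nginx may answer the client with a stale cached response. Its format is:
proxy_cache_use_stale error | timeout | invalid_header | updating | http_500 | http_502 | http_503 | http_504 | http_404 | off
proxy_cache_valid [code ...] time: sets the caching time for different response codes, for example: proxy_cache_valid 200 302 10m;
proxy_cache_methods GET | HEAD | POST ...: the request methods for which caching is enabled.
proxy_cache_bypass string ...: defines the conditions under which nginx does not take the response from the cache, for example: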
proxy_cache_bypass $cookie_nocache $arg_nocache $arg_comment;
proxy_cache_bypass $http_pragma $http_authorization;
http {
    proxy_cache_path /data/nginx/cache levels=1:2 keys_zone=STATIC:10m
                     inactive=24h max_size=1g;
    server {
        location / {
            proxy_pass http://www.magedu.com;
            proxy_set_header Host $host;
            proxy_cache STATIC;
            proxy_cache_valid 200 1d;
            proxy_cache_valid 301 302 10m;
            proxy_cache_valid any 1m;
            proxy_cache_use_stale error timeout invalid_header updating
                                  http_500 http_502 http_503 http_504;
        }
    }
}
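The directives described above can be combined in a single location. The following is a minimal sketch rather than a tuned configuration: the zone name STATIC and the upstream URL are taken from the example above, while the cookie and argument name "nocache", the min_uses threshold, and the lock timeout are illustrative assumptions.

```nginx
location / {
    proxy_pass   http://www.magedu.com;
    proxy_cache  STATIC;

    # Key used to store and look up entries; this is nginx's documented default value.
    proxy_cache_key $scheme$proxy_host$request_uri;

    # Cache a response only after it has been requested 3 times (assumed threshold).
    proxy_cache_min_uses 3;

    # On a miss, let one request populate the entry; the others wait up to 5s (assumed).
    proxy_cache_lock         on;
    proxy_cache_lock_timeout 5s;

    # When the hypothetical "nocache" cookie or query argument is set and non-zero,
    # skip the cache lookup and do not store the response either.
    proxy_cache_bypass $cookie_nocache $arg_nocache;
    proxy_no_cache     $cookie_nocache $arg_nocache;
}
```

With levels=1:2, as in the proxy_cache_path line above, the nginx documentation gives /data/nginx/cache/c/29/b7f54b2df7773722d382f4809d65029c as an example of a cached file path: the file name is the MD5 hash of the key, and the subdirectory names are taken from its trailing characters.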
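- Compression
nginx can compress responses before sending them to the client, which saves bandwidth and speeds up delivery. The gzip module is normally built into nginx by default, so it can be enabled directly.
http {
    gzip on;
    gzip_http_version 1.0;
    gzip_comp_level 2;
    gzip_types text/plain text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript application/javascript application/json;
    gzip_disable msie6;
}
The gzip_proxied directive controls whether responses to proxied requests (requests that arrive with a "Via" header) are compressed, depending on the response. For example, "expired" enables compression when the response carries an "Expires" header whose value disables caching. Other accepted values are "no-cache", "no-store", "private", "no_last_modified", "no_etag", "auth" and "any", while "off" disables compression for all proxied requests.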
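As an illustration of gzip_proxied, the sketch below compresses responses to proxied requests only when the response is marked as not cacheable or the request carries credentials. The particular combination of values is an assumption chosen for the example, not a recommendation from the original text.

```nginx
http {
    gzip on;

    # For requests that arrive through another proxy, compress only responses that
    # intermediate caches would not store anyway, or that belong to authorized requests.
    gzip_proxied expired no-cache no-store private auth;

    # Add "Vary: Accept-Encoding" so caches keep compressed and plain copies apart.
    gzip_vary on;
}
```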
- Configuration example
Reverse proxy with an upstream group and caching enabled:
http {
    include mime.types;
    default_type application/octet-stream;
    sendfile on;
    keepalive_timeout 65;
    proxy_cache_path /nginx/cache/first levels=1:2 keys_zone=first:10m max_size=512m;
    upstream websrv {
        server 172.16.100.11 weight=1;
        server 172.16.100.12 weight=1;
        server 127.0.0.1:8080 backup;
    }
    server {
        listen 80;
        server_name www.magedu.com;
        add_header X-Via $server_addr;
        add_header X-Cache-Status $upstream_cache_status;
        location / {
            proxy_pass http://websrv;
            proxy_cache first;
            proxy_cache_valid 200 1d;
            proxy_cache_valid 301 302 10m;
            proxy_cache_valid any 1m;
            index index.html index.htm;
            if ($request_method ~* "PUT") {
                proxy_pass http://172.16.100.12;
                break;
            }
        }
        error_page 500 502 503 504 /50x.html;
        location = /50x.html {
            root html;
        }
    }
    server {
        listen 8080;
        server_name localhost;
        root /nginx/htdocs;
        index index.html;
    }
}
Add the response headers:
add_header X-Via $server_addr;
add_header X-Cache-Status $upstream_cache_status;
Configure the cache:
proxy_cache_path /nginx/cache/first levels=1:2 keys_zone=first:10m max_size=512m;
Enable it:
proxy_cache first;
proxy_cache_valid 200 1d;
proxy_cache_valid 301 302 10m;
proxy_cache_valid any 1m;
- Enabling the nginx log file cache:
Set the access log format and the error log level:
http {
    # "combined" is predefined by nginx and cannot be declared again, so this
    # format uses an illustrative name, combined_custom.
    log_format combined_custom '$remote_addr - $remote_user [$time_local] '
                               '"$request" $status $body_bytes_sent '
                               '"$http_referer" "$http_user_agent"';
    access_log /var/log/nginx/access.log combined_custom;
    error_log /var/log/nginx/error.log crit;
    ...
}
Log in an Apache-like format:
log_format main '$remote_addr - $remote_user [$time_local] '
                '"$request" $status $body_bytes_sent "$http_referer" '
                '"$http_user_agent" "$http_x_forwarded_for"';
access_log /var/log/nginx/access.log main;
Enable the log file cache:
http {
    ...
    open_log_file_cache max=1000 inactive=20s min_uses=2 valid=1m;
    ...
}
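open_log_file_cache matters mainly when the log file name contains variables, because nginx otherwise opens and closes such a file on every log write. The sketch below assumes a per-host log file layout; the path pattern and the shortened format are illustrative and not part of the original configuration.

```nginx
http {
    log_format main '$remote_addr - $remote_user [$time_local] '
                    '"$request" $status $body_bytes_sent';

    server {
        listen 80;

        # The variable in the file name makes nginx resolve the file at write time.
        access_log /var/log/nginx/$host.access.log main;

        # Keep up to 1000 descriptors open; drop those unused for 20s, cache only
        # files used at least twice, and re-check that a file still exists every minute.
        open_log_file_cache max=1000 inactive=20s min_uses=2 valid=1m;
    }
}
```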
- URL rewriting
Domain redirect:
server
{
    listen 80;
    server_name jump.magedu.com;
    index index.html index.php;
    root /www/htdocs;
    rewrite ^/ http://www.magedu.com/;
}
Domain mirroring:
server
{
    listen 80;
    server_name mirror.magedu.com;
    index index.html index.php;
    root /www/htdocs;
    rewrite ^/(.*)$ http://www.magedu.com/$1 last;
}
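Because a rewrite whose replacement starts with "http://" sends a redirect to the client, the same redirect can also be written with return. The following is an alternative sketch, not the original author's configuration; it assumes a permanent (301) redirect is wanted, whereas the rewrite rules above produce a temporary one.

```nginx
server {
    listen      80;
    server_name mirror.magedu.com;

    # $request_uri keeps the original path and query string, so
    # /a/b?x=1 is redirected to http://www.magedu.com/a/b?x=1.
    return 301 http://www.magedu.com$request_uri;
}
```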
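- Hotlink protection
A simple hotlink protection configuration:
location ~* \.(gif|jpg|png|swf|flv)$ {
    valid_referers none blocked www.magedu.com;
    if ($invalid_referer) {
        rewrite ^/ http://www.magedu.com/403.html;
        # return 404;
    }
}
First line: gif|jpg|png|swf|flv
applies hotlink protection to files with the gif, jpg, png, swf and flv extensions.
Second line: www.magedu.com
lists www.magedu.com as an allowed referer; "none" additionally allows requests without a Referer header, and "blocked" allows Referer values that a proxy or firewall has stripped. The if block means that a request whose referer is not in the allowed list is redirected to an error page. Returning 404 directly also works.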
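A variant of the configuration above, sketched under the assumption that subdomains should be allowed as well and that a plain 403 response is preferred over a redirect:

```nginx
location ~* \.(gif|jpg|png|swf|flv)$ {
    # server_names allows every name listed in server_name;
    # *.magedu.com allows any subdomain as the referring host.
    valid_referers none blocked server_names *.magedu.com;

    # $invalid_referer is "1" when the Referer matches none of the values above.
    if ($invalid_referer) {
        return 403;
    }
}
```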
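- Conditions in if statements
Regular expression matching:
~: true when the value matches the given regular expression; the match is case-sensitive.
~*: true when the value matches the given regular expression; the match is case-insensitive.
!~: true when the value does not match the given regular expression; the match is case-sensitive.
!~*: true when the value does not match the given regular expression; the match is case-insensitive.
File and directory tests:
-f, !-f: tests whether the path exists and is a file;
-d, !-d: tests whether the path exists and is a directory;
-e, !-e: tests whether the path exists, as either a file or a directory;
-x, !-x: tests whether the file at the path exists and is executable;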
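The operators above can be seen together in a short sketch. The /api/ location and the user agent pattern are hypothetical and only serve to show the syntax.

```nginx
server {
    listen      80;
    server_name www.magedu.com;
    root        /www/htdocs;

    location / {
        # !-e: the requested path is neither an existing file nor a directory.
        if (!-e $request_filename) {
            return 404;
        }
    }

    location /api/ {
        # ~*: case-insensitive regular expression match on the User-Agent header.
        if ($http_user_agent ~* "(curl|wget)") {
            return 403;
        }
    }
}
```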
- Rate limiting with if
Limit the rate for a specific path:
server {
    server_name www.magedu.com;
    location /downloads/ {
        limit_rate 20k;
        root /web/downloads/;
    }
    ..
}
Limit the rate for search engine bots:
if ($http_user_agent ~ Google|Yahoo|MSN|baidu) {
    limit_rate 20k;
}
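limit_rate applies from the first byte of the response. If only large transfers should be throttled, it can be paired with limit_rate_after; the 1m threshold below is an assumed value for illustration.

```nginx
location /downloads/ {
    root /web/downloads/;

    # Send the first megabyte at full speed, then limit each connection to 20 KB/s.
    limit_rate_after 1m;
    limit_rate       20k;
}
```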
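- Commonly used nginx variables
The following are some of the commonly used nginx built-in variables; they are often used in if statements for conditional checks.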
$arg_PARAMETER This variable contains the value of the GET request variable PARAMETER if present in the query string.
$args This variable contains the query string in the URL, for example foo=123&bar=blahblah if the URL is http://example1.com/?foo=123&bar=blahblah
$binary_remote_addr The address of the client in binary form.
$body_bytes_sent The bytes of the body sent.
$content_length This variable is equal to line Content-Length in the header of request.
$content_type This variable is equal to line Content-Type in the header of request.
$document_root This variable is equal to the value of directive root for the current request.
$document_uri The same as $uri.
$host This variable contains the value of the 'Host' value in the request header, or the name of the server processing if the 'Host' value is not available.
$http_HEADER The value of the HTTP header HEADER when converted to lowercase and with "dashes" converted to "underscores", for example, $http_user_agent, $http_referer.
$is_args Evaluates to "?" if $args is set, returns "" otherwise.
$request_uri This variable is equal to the *original* request URI as received from the client including the args. It cannot be modified. Look at $uri for the post-rewrite/altered URI. Does not include host name. Example: "/foo/bar.php?arg=baz".
$scheme The HTTP scheme (that is http, https). Evaluated only on demand, for example: rewrite ^(.+)$ $scheme://example.com$1 redirect;
$server_addr This variable contains the server address. It is advisable to indicate addresses correctly in the listen directive and use the bind parameter so that a system call is not made every time this variable is accessed.
$server_name The name of the server.
$server_port This variable is equal to the port of the server, to which the request arrived.
$server_protocol This variable is equal to the protocol of request, usually this is HTTP/1.0 or HTTP/1.1.
$uri This variable is equal to current URI in the request (without arguments, those are in $args.) It can differ from $request_uri which is what is sent by the browser. Examples of how it can be modified are internal redirects, or with the use of index. Does not include host name. Example: "/foo/bar.html"
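A short sketch of how several of these variables fit together. The X-Debug-* header names and the /old/ and /new/ paths are made up for the example.

```nginx
server {
    listen      80;
    server_name www.magedu.com;

    # Expose a few variables as response headers to inspect their values.
    add_header X-Debug-Host $host;
    add_header X-Debug-Uri  $uri;

    location /old/ {
        # $is_args expands to "?" only when a query string is present,
        # so the redirect target never ends in a dangling "?".
        return 301 $scheme://$host/new/$is_args$args;
    }
}
```

A request for http://www.magedu.com/old/?a=1 would be redirected to http://www.magedu.com/new/?a=1 under this configuration.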