GRE Network Details

I. Naming conventions for OpenStack network devices:

1. TenantA's router and the name of the Linux network namespace qrouter

root@controller:~# neutron --os-tenant-name TenantA --os-username UserA --os-password password --os-auth-url=http://localhost:5000/v2.0 router-list --field id --field name
+--------------------------------------+-----------+
| id                                   | name      |
+--------------------------------------+-----------+
| 680944ad-679c-4fe8-ae4b-258cd8ac337f | tenant-R1 |
+--------------------------------------+-----------+
root@network:~# ip netns
qdhcp-7c22bbd9-166c-4610-9a3d-3b8b92c77518
qrouter-680944ad-679c-4fe8-ae4b-258cd8ac337f

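That is, the qrouter namespace is named qrouter- followed by the ID of the tenant's virtual router.

2. TenantA's network and the name of the Linux network namespace qdhcp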

root@controller:~# neutron --os-tenant-name TenantA --os-username UserA --os-password password --os-auth-url=http://localhost:5000/v2.0 net-list  --field id --field name
+--------------------------------------+-------------+
| id                                   | name        |
+--------------------------------------+-------------+
| 7c22bbd9-166c-4610-9a3d-3b8b92c77518 | tenantA-Net |
| c8699820-7c6d-4441-9602-3425f2c630ec | Ext-Net     |
+--------------------------------------+-------------+
root@network:~# ip netns
qdhcp-7c22bbd9-166c-4610-9a3d-3b8b92c77518
qrouter-680944ad-679c-4fe8-ae4b-258cd8ac337f

Likewise, the qdhcp namespace is named qdhcp- followed by the ID of the tenant's virtual network.
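The two namespace naming rules can be written down as a minimal Python sketch. The helper names are hypothetical and exist only for illustration; the IDs are the ones shown in the router-list and net-list output.

```python
def qrouter_namespace(router_id: str) -> str:
    """Namespace of a Neutron router: 'qrouter-' followed by the router ID."""
    return "qrouter-" + router_id


def qdhcp_namespace(network_id: str) -> str:
    """Namespace of a network's DHCP port: 'qdhcp-' followed by the network ID."""
    return "qdhcp-" + network_id


# IDs taken from the router-list and net-list output in this section.
print(qrouter_namespace("680944ad-679c-4fe8-ae4b-258cd8ac337f"))
print(qdhcp_namespace("7c22bbd9-166c-4610-9a3d-3b8b92c77518"))
```

Note that Ext-Net (c8699820-7c6d-4441-9602-3425f2c630ec) has no qdhcp namespace in the ip netns output, so the rule only describes networks that actually have one.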

3. TenantA's network ports and the names of the other network devices

root@controller:~# neutron --os-tenant-name TenantA --os-username UserA --os-password password --os-auth-url=http://localhost:5000/v2.0 port-list
+--------------------------------------+------+-------------------+----------------------------------------------------------------------------------+
| id                                   | name | mac_address       | fixed_ips                                                                        |
+--------------------------------------+------+-------------------+----------------------------------------------------------------------------------+
| 1653ec91-ad7d-40d9-b777-f74aec697026 |      | fa:16:3e:51:a2:97 | {"subnet_id": "c37d8ed0-372e-4b24-9ba2-897c38c6ddbf", "ip_address": "10.0.0.9"}  |
| 2df7c3ed-dfbb-480d-9cd3-fdefa079e66a |      | fa:16:3e:da:41:49 | {"subnet_id": "c37d8ed0-372e-4b24-9ba2-897c38c6ddbf", "ip_address": "10.0.0.3"}  |
| 81388454-30e0-45e4-b3dd-b7b2e8dbf067 |      | fa:16:3e:f7:e6:9c | {"subnet_id": "c37d8ed0-372e-4b24-9ba2-897c38c6ddbf", "ip_address": "10.0.0.1"}  |
| d7233b80-9d4b-4ef6-a60d-19b3be661069 |      | fa:16:3e:75:e0:5a | {"subnet_id": "c37d8ed0-372e-4b24-9ba2-897c38c6ddbf", "ip_address": "10.0.0.10"} |
+--------------------------------------+------+-------------------+----------------------------------------------------------------------------------+

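The virtual machine with IP address 10.0.0.9 (ID bec0b963-99c0-4a56-ae04-936d47e173eb) uses port 1653ec91-ad7d-40d9-b777-f74aec697026. The network devices attached to it (tap, qbr, qvb, and qvo) are each named by taking the device prefix and appending the first 11 characters of the port ID.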
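As a minimal sketch of this rule, the following Python derives the four device names from a port ID. The function and constant names are hypothetical; only the 11-character rule and the port ID come from the text.

```python
PORT_PREFIX_LEN = 11  # number of port-ID characters kept in each device name


def compute_node_devices(port_id: str) -> dict:
    """Derive the per-port device names seen on a compute node."""
    suffix = port_id[:PORT_PREFIX_LEN]
    return {kind: kind + suffix for kind in ("tap", "qbr", "qvb", "qvo")}


# Port of the VM with IP address 10.0.0.9.
names = compute_node_devices("1653ec91-ad7d-40d9-b777-f74aec697026")
for kind, name in names.items():
    print(kind, "->", name)
```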

Verification:

The libvirt XML definition file /var/lib/nova/instances/<instance-id>/libvirt.xml shows the qbr and tap devices.

<interface type="bridge">
      <mac address="fa:16:3e:51:a2:97"/>
      <model type="virtio"/>
      <driver name="qemu"/>
      <source bridge="qbr1653ec91-ad"/>  <!-- Linux bridge that the VM's TAP device is attached to -->
      <target dev="tap1653ec91-ad"/>  <!-- interface that the VM is connected to -->
</interface>
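The same fields can be pulled out of the interface definition programmatically. This is a minimal sketch using Python's standard xml.etree.ElementTree; the function name is hypothetical, and the XML string is the interface element from libvirt.xml with the explanatory annotations left out.

```python
import xml.etree.ElementTree as ET

# The <interface> element from libvirt.xml, without the annotations.
INTERFACE_XML = """
<interface type="bridge">
  <mac address="fa:16:3e:51:a2:97"/>
  <model type="virtio"/>
  <driver name="qemu"/>
  <source bridge="qbr1653ec91-ad"/>
  <target dev="tap1653ec91-ad"/>
</interface>
"""


def bridge_and_tap(interface_xml: str) -> tuple:
    """Return (bridge name, tap device name, MAC address) of one interface."""
    iface = ET.fromstring(interface_xml.strip())
    return (
        iface.find("source").get("bridge"),
        iface.find("target").get("dev"),
        iface.find("mac").get("address"),
    )


print(bridge_and_tap(INTERFACE_XML))
```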

brctl show confirms that qbr connects qvb and tap:

root@compute1:~# brctl show
bridge name     bridge id               STP enabled     interfaces
qbr1653ec91-ad          8000.22ca68904e2f       no              qvb1653ec91-ad
                                                        tap1653ec91-ad
qbrd7233b80-9d          8000.964cf783c9e1       no              qvbd7233b80-9d
                                                        tapd7233b80-9d
virbr0          8000.000000000000       yes

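Similarly, qr- plus the ID prefix of the port holding the internal gateway IP 10.0.0.1 gives the name of the qr device in the qrouter namespace.

qg- plus the ID prefix of the port holding the router gateway IP 10.1.101.80 gives the name of the qg device in the qrouter namespace.

tap plus the ID prefix of the port holding the internal network's DHCP address 10.0.0.3 gives the name of the device in the qdhcp namespace.

The following commands verify this: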

root@controller:~# neutron port-list
+--------------------------------------+------+-------------------+-------------------------------------------------------------------------------------+
| id                                   | name | mac_address       | fixed_ips                                                                           |
+--------------------------------------+------+-------------------+-------------------------------------------------------------------------------------+
| 1653ec91-ad7d-40d9-b777-f74aec697026 |      | fa:16:3e:51:a2:97 | {"subnet_id": "c37d8ed0-372e-4b24-9ba2-897c38c6ddbf", "ip_address": "10.0.0.9"}     |
| 2df7c3ed-dfbb-480d-9cd3-fdefa079e66a |      | fa:16:3e:da:41:49 | {"subnet_id": "c37d8ed0-372e-4b24-9ba2-897c38c6ddbf", "ip_address": "10.0.0.3"}     |
| 81388454-30e0-45e4-b3dd-b7b2e8dbf067 |      | fa:16:3e:f7:e6:9c | {"subnet_id": "c37d8ed0-372e-4b24-9ba2-897c38c6ddbf", "ip_address": "10.0.0.1"}     |
| accd8dbf-0f16-4aec-b797-bbb33abcdc83 |      | fa:16:3e:97:ee:cb | {"subnet_id": "ef86e785-8cec-486a-b67f-dcbba5311293", "ip_address": "10.100.0.103"} |
| bfe7eaa4-26bc-4fe9-9da2-550abf44beaa |      | fa:16:3e:e1:00:41 | {"subnet_id": "2c4155c9-5a2e-471c-a4d8-40a86b45ab0a", "ip_address": "10.1.101.83"}  |
| d7233b80-9d4b-4ef6-a60d-19b3be661069 |      | fa:16:3e:75:e0:5a | {"subnet_id": "c37d8ed0-372e-4b24-9ba2-897c38c6ddbf", "ip_address": "10.0.0.10"}    |
| eb60f9c4-2ddb-49ee-8b78-2fc2564a7600 |      | fa:16:3e:78:39:e9 | {"subnet_id": "2c4155c9-5a2e-471c-a4d8-40a86b45ab0a", "ip_address": "10.1.101.80"}  |
| f6812a11-c4ce-4880-8566-2206afcc612a |      | fa:16:3e:9e:75:a2 | {"subnet_id": "2c4155c9-5a2e-471c-a4d8-40a86b45ab0a", "ip_address": "10.1.101.82"}  |
+--------------------------------------+------+-------------------+-------------------------------------------------------------------------------------+
root@network:~# ip netns exec qrouter-680944ad-679c-4fe8-ae4b-258cd8ac337f ifconfig
lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

qg-eb60f9c4-2d Link encap:Ethernet  HWaddr fa:16:3e:78:39:e9  
          inet addr:10.1.101.80  Bcast:10.1.101.255  Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:fe78:39e9/64 Scope:Link
          UP BROADCAST RUNNING  MTU:1500  Metric:1
          RX packets:31953 errors:0 dropped:0 overruns:0 frame:0
          TX packets:372 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:4158911 (4.1 MB)  TX bytes:40876 (40.8 KB)

qr-81388454-30 Link encap:Ethernet  HWaddr fa:16:3e:f7:e6:9c  
          inet addr:10.0.0.1  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:fef7:e69c/64 Scope:Link
          UP BROADCAST RUNNING  MTU:1500  Metric:1
          RX packets:882 errors:0 dropped:0 overruns:0 frame:0
          TX packets:832 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:93440 (93.4 KB)  TX bytes:96206 (96.2 KB)
root@network:~# ip netns exec qdhcp-7c22bbd9-166c-4610-9a3d-3b8b92c77518 ifconfig
lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:6 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:3456 (3.4 KB)  TX bytes:3456 (3.4 KB)

tap2df7c3ed-df Link encap:Ethernet  HWaddr fa:16:3e:da:41:49  
          inet addr:10.0.0.3  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:feda:4149/64 Scope:Link
          UP BROADCAST RUNNING  MTU:1500  Metric:1
          RX packets:117 errors:0 dropped:0 overruns:0 frame:0
          TX packets:48 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:11176 (11.1 KB)  TX bytes:5865 (5.8 KB)
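Going the other way, a device name seen inside a namespace can be matched back to its Neutron port by comparing the part after the device prefix with the start of each port ID. A minimal sketch, with hypothetical helper names and a port list copied from the neutron port-list output:

```python
# Port IDs copied from the neutron port-list output.
PORT_IDS = [
    "1653ec91-ad7d-40d9-b777-f74aec697026",  # 10.0.0.9
    "2df7c3ed-dfbb-480d-9cd3-fdefa079e66a",  # 10.0.0.3
    "81388454-30e0-45e4-b3dd-b7b2e8dbf067",  # 10.0.0.1
    "eb60f9c4-2ddb-49ee-8b78-2fc2564a7600",  # 10.1.101.80
]

DEVICE_PREFIXES = ("qr-", "qg-", "tap")


def port_for_device(device: str, port_ids=PORT_IDS):
    """Find the port whose ID starts with the suffix of a namespace device name."""
    for prefix in DEVICE_PREFIXES:
        if device.startswith(prefix):
            suffix = device[len(prefix):]
            matches = [p for p in port_ids if p.startswith(suffix)]
            return matches[0] if matches else None
    return None  # not a Neutron-managed device name (for example 'lo')


for dev in ("qg-eb60f9c4-2d", "qr-81388454-30", "tap2df7c3ed-df", "lo"):
    print(dev, "->", port_for_device(dev))
```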

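II. System environment

The environment is the one described in "OpenStack三个节点icehouse-gre模式部署" (a three-node OpenStack Icehouse deployment in GRE mode).

1. Network devices in the system: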

root@controller:~# nova list --all-tenant
+--------------------------------------+-------+--------+------------+-------------+-----------------------+
| ID                                   | Name  | Status | Task State | Power State | Networks              |
+--------------------------------------+-------+--------+------------+-------------+-----------------------+
| f467ba96-09c4-4eb7-b79c-5391f326c7d1 | vm001 | ACTIVE | -          | Running     | tenantA-Net=10.0.0.10 |
| bec0b963-99c0-4a56-ae04-936d47e173eb | vm002 | ACTIVE | -          | Running     | tenantA-Net=10.0.0.9  |
+--------------------------------------+-------+--------+------------+-------------+-----------------------+
root@controller:~# neutron net-list   
+--------------------------------------+-------------+----------------------------------------------------+
| id                                   | name        | subnets                                            |
+--------------------------------------+-------------+----------------------------------------------------+
| 7c22bbd9-166c-4610-9a3d-3b8b92c77518 | tenantA-Net | c37d8ed0-372e-4b24-9ba2-897c38c6ddbf 10.0.0.0/24   |
| c8699820-7c6d-4441-9602-3425f2c630ec | Ext-Net     | 2c4155c9-5a2e-471c-a4d8-40a86b45ab0a 10.1.101.0/24 |
+--------------------------------------+-------------+----------------------------------------------------+
root@controller:~# neutron subnet-list
+--------------------------------------+------+---------------+-------------------------------------------------+
| id                                   | name | cidr          | allocation_pools                                |
+--------------------------------------+------+---------------+-------------------------------------------------+
| 2c4155c9-5a2e-471c-a4d8-40a86b45ab0a |      | 10.1.101.0/24 | {"start": "10.1.101.80", "end": "10.1.101.100"} |
| c37d8ed0-372e-4b24-9ba2-897c38c6ddbf |      | 10.0.0.0/24   | {"start": "10.0.0.2", "end": "10.0.0.254"}      |
+--------------------------------------+------+---------------+-------------------------------------------------+
root@controller:~# neutron router-list
+--------------------------------------+-----------+-----------------------------------------------------------------------------+
| id                                   | name      | external_gateway_info                                                       |
+--------------------------------------+-----------+-----------------------------------------------------------------------------+
| 680944ad-679c-4fe8-ae4b-258cd8ac337f | tenant-R1 | {"network_id": "c8699820-7c6d-4441-9602-3425f2c630ec", "enable_snat": true} |
+--------------------------------------+-----------+-----------------------------------------------------------------------------+
root@controller:~# neutron port-list
+--------------------------------------+------+-------------------+------------------------------------------------------------------------------------+
| id                                   | name | mac_address       | fixed_ips                                                                          |
+--------------------------------------+------+-------------------+------------------------------------------------------------------------------------+
| 1653ec91-ad7d-40d9-b777-f74aec697026 |      | fa:16:3e:51:a2:97 | {"subnet_id": "c37d8ed0-372e-4b24-9ba2-897c38c6ddbf", "ip_address": "10.0.0.9"}    |
| 2df7c3ed-dfbb-480d-9cd3-fdefa079e66a |      | fa:16:3e:da:41:49 | {"subnet_id": "c37d8ed0-372e-4b24-9ba2-897c38c6ddbf", "ip_address": "10.0.0.3"}    |
| 81388454-30e0-45e4-b3dd-b7b2e8dbf067 |      | fa:16:3e:f7:e6:9c | {"subnet_id": "c37d8ed0-372e-4b24-9ba2-897c38c6ddbf", "ip_address": "10.0.0.1"}    |
| bfe7eaa4-26bc-4fe9-9da2-550abf44beaa |      | fa:16:3e:e1:00:41 | {"subnet_id": "2c4155c9-5a2e-471c-a4d8-40a86b45ab0a", "ip_address": "10.1.101.83"} |
| d7233b80-9d4b-4ef6-a60d-19b3be661069 |      | fa:16:3e:75:e0:5a | {"subnet_id": "c37d8ed0-372e-4b24-9ba2-897c38c6ddbf", "ip_address": "10.0.0.10"}   |
| eb60f9c4-2ddb-49ee-8b78-2fc2564a7600 |      | fa:16:3e:78:39:e9 | {"subnet_id": "2c4155c9-5a2e-471c-a4d8-40a86b45ab0a", "ip_address": "10.1.101.80"} |
| f6812a11-c4ce-4880-8566-2206afcc612a |      | fa:16:3e:9e:75:a2 | {"subnet_id": "2c4155c9-5a2e-471c-a4d8-40a86b45ab0a", "ip_address": "10.1.101.82"} |
+--------------------------------------+------+-------------------+------------------------------------------------------------------------------------+
root@network:~# ip netns
qdhcp-7c22bbd9-166c-4610-9a3d-3b8b92c77518
qrouter-680944ad-679c-4fe8-ae4b-258cd8ac337f

There is one external network, Ext-Net. Its subnet is 2c4155c9-5a2e-471c-a4d8-40a86b45ab0a, with CIDR 10.1.101.0/24 and an allocation pool from 10.1.101.80 to 10.1.101.100.

There is one tenant network, tenantA-Net (TenantA's network, ID 7c22bbd9-166c-4610-9a3d-3b8b92c77518, corresponding to qdhcp-7c22bbd9-166c-4610-9a3d-3b8b92c77518). Its subnet is c37d8ed0-372e-4b24-9ba2-897c38c6ddbf, with CIDR 10.0.0.0/24 and an allocation pool from 10.0.0.2 to 10.0.0.254.

TenantA has one private router, tenant-R1 (ID 680944ad-679c-4fe8-ae4b-258cd8ac337f, corresponding to qrouter-680944ad-679c-4fe8-ae4b-258cd8ac337f).
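As a quick sanity check on the two allocation pools, the sketch below uses Python's standard ipaddress module to confirm that each pool lies inside its subnet and to count the addresses it offers. The function name is hypothetical; the CIDRs and pool boundaries are the ones from neutron subnet-list.

```python
import ipaddress


def pool_size(cidr: str, start: str, end: str) -> int:
    """Check that an allocation pool lies inside its subnet and count its addresses."""
    network = ipaddress.ip_network(cidr)
    first, last = ipaddress.ip_address(start), ipaddress.ip_address(end)
    if first not in network or last not in network or first > last:
        raise ValueError("allocation pool is not a valid range inside " + cidr)
    return int(last) - int(first) + 1


print(pool_size("10.1.101.0/24", "10.1.101.80", "10.1.101.100"))  # Ext-Net subnet: 21
print(pool_size("10.0.0.0/24", "10.0.0.2", "10.0.0.254"))         # tenantA-Net subnet: 253
```

Of the 21 external addresses, the port list shows 10.1.101.80, 10.1.101.82 and 10.1.101.83 already in use.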

2. Ports in the system

root@controller:~# neutron port-list
+--------------------------------------+------+-------------------+------------------------------------------------------------------------------------+
| id                                   | name | mac_address       | fixed_ips                                                                          |
+--------------------------------------+------+-------------------+------------------------------------------------------------------------------------+
| 1653ec91-ad7d-40d9-b777-f74aec697026 |      | fa:16:3e:51:a2:97 | {"subnet_id": "c37d8ed0-372e-4b24-9ba2-897c38c6ddbf", "ip_address": "10.0.0.9"}    |
| 2df7c3ed-dfbb-480d-9cd3-fdefa079e66a |      | fa:16:3e:da:41:49 | {"subnet_id": "c37d8ed0-372e-4b24-9ba2-897c38c6ddbf", "ip_address": "10.0.0.3"}    |
| 81388454-30e0-45e4-b3dd-b7b2e8dbf067 |      | fa:16:3e:f7:e6:9c | {"subnet_id": "c37d8ed0-372e-4b24-9ba2-897c38c6ddbf", "ip_address": "10.0.0.1"}    |
| bfe7eaa4-26bc-4fe9-9da2-550abf44beaa |      | fa:16:3e:e1:00:41 | {"subnet_id": "2c4155c9-5a2e-471c-a4d8-40a86b45ab0a", "ip_address": "10.1.101.83"} |
| d7233b80-9d4b-4ef6-a60d-19b3be661069 |      | fa:16:3e:75:e0:5a | {"subnet_id": "c37d8ed0-372e-4b24-9ba2-897c38c6ddbf", "ip_address": "10.0.0.10"}   |
| eb60f9c4-2ddb-49ee-8b78-2fc2564a7600 |      | fa:16:3e:78:39:e9 | {"subnet_id": "2c4155c9-5a2e-471c-a4d8-40a86b45ab0a", "ip_address": "10.1.101.80"} |
| f6812a11-c4ce-4880-8566-2206afcc612a |      | fa:16:3e:9e:75:a2 | {"subnet_id": "2c4155c9-5a2e-471c-a4d8-40a86b45ab0a", "ip_address": "10.1.101.82"} |
+--------------------------------------+------+-------------------+------------------------------------------------------------------------------------+
root@network:~# ip netns
qdhcp-7c22bbd9-166c-4610-9a3d-3b8b92c77518
qrouter-680944ad-679c-4fe8-ae4b-258cd8ac337f
root@network:~# ip netns exec qrouter-680944ad-679c-4fe8-ae4b-258cd8ac337f ifconfig
lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

qg-eb60f9c4-2d Link encap:Ethernet  HWaddr fa:16:3e:78:39:e9  
          inet addr:10.1.101.80  Bcast:10.1.101.255  Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:fe78:39e9/64 Scope:Link
          UP BROADCAST RUNNING  MTU:1500  Metric:1
          RX packets:32619 errors:0 dropped:0 overruns:0 frame:0
          TX packets:374 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:4280629 (4.2 MB)  TX bytes:40960 (40.9 KB)

qr-81388454-30 Link encap:Ethernet  HWaddr fa:16:3e:f7:e6:9c  
          inet addr:10.0.0.1  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:fef7:e69c/64 Scope:Link
          UP BROADCAST RUNNING  MTU:1500  Metric:1
          RX packets:1012 errors:0 dropped:0 overruns:0 frame:0
          TX packets:914 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:106266 (106.2 KB)  TX bytes:108626 (108.6 KB)

root@network:~# ip netns exec qdhcp-7c22bbd9-166c-4610-9a3d-3b8b92c77518 ifconfig
lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:6 errors:0 dropped:0 overruns:0 frame:0
          TX packets:6 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:3456 (3.4 KB)  TX bytes:3456 (3.4 KB)

tap2df7c3ed-df Link encap:Ethernet  HWaddr fa:16:3e:da:41:49  
          inet addr:10.0.0.3  Bcast:10.0.0.255  Mask:255.255.255.0
          inet6 addr: fe80::f816:3eff:feda:4149/64 Scope:Link
          UP BROADCAST RUNNING  MTU:1500  Metric:1
          RX packets:126 errors:0 dropped:0 overruns:0 frame:0
          TX packets:50 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:12344 (12.3 KB)  TX bytes:6595 (6.5 KB)

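neutron port-list shows 7 ports in total.

Ports 1653ec91-ad7d-40d9-b777-f74aec697026 (10.0.0.9) and d7233b80-9d4b-4ef6-a60d-19b3be661069 (10.0.0.10) hold the private IP addresses of virtual machines vm002 and vm001 (they are the ports behind the VMs' tap devices).

Ports f6812a11-c4ce-4880-8566-2206afcc612a (10.1.101.82) and bfe7eaa4-26bc-4fe9-9da2-550abf44beaa (10.1.101.83) are two floating IPs.

Ports 81388454-30e0-45e4-b3dd-b7b2e8dbf067 (10.0.0.1) and eb60f9c4-2ddb-49ee-8b78-2fc2564a7600 (10.1.101.80) are the network ports on the qrouter. In TenantA's network environment they serve, respectively, as qr-81388454-30, the gateway of the subnet (c37d8ed0-372e-4b24-9ba2-897c38c6ddbf, CIDR 10.0.0.0/24), and as qg-eb60f9c4-2d, the channel to the external network. [Each additional router gets its own qrouter namespace with its own qr and qg devices.]

Port 2df7c3ed-dfbb-480d-9cd3-fdefa079e66a (10.0.0.3) is the network port tap2df7c3ed-df in qdhcp. It provides DHCP service for the subnet (c37d8ed0-372e-4b24-9ba2-897c38c6ddbf, CIDR 10.0.0.0/24) in TenantA's network environment, dynamically assigning private IP addresses. [Each additional network with DHCP gets its own qdhcp namespace with its own tap device.]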
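The classification above can be summarised by grouping the seven ports by subnet. This is a minimal sketch with hand-copied data; the role strings are just labels for the descriptions above, and the function name is hypothetical.

```python
from collections import defaultdict

TENANT_SUBNET = "c37d8ed0-372e-4b24-9ba2-897c38c6ddbf"    # 10.0.0.0/24
EXTERNAL_SUBNET = "2c4155c9-5a2e-471c-a4d8-40a86b45ab0a"  # 10.1.101.0/24

# (first 8 characters of the port ID, subnet, IP address, role)
PORTS = [
    ("1653ec91", TENANT_SUBNET, "10.0.0.9", "vm002 tap port"),
    ("2df7c3ed", TENANT_SUBNET, "10.0.0.3", "qdhcp tap port"),
    ("81388454", TENANT_SUBNET, "10.0.0.1", "qrouter qr port"),
    ("bfe7eaa4", EXTERNAL_SUBNET, "10.1.101.83", "floating IP"),
    ("d7233b80", TENANT_SUBNET, "10.0.0.10", "vm001 tap port"),
    ("eb60f9c4", EXTERNAL_SUBNET, "10.1.101.80", "qrouter qg port"),
    ("f6812a11", EXTERNAL_SUBNET, "10.1.101.82", "floating IP"),
]


def ports_by_subnet(ports):
    """Group (ip, role) pairs by the subnet the port belongs to."""
    grouped = defaultdict(list)
    for _short_id, subnet, ip, role in ports:
        grouped[subnet].append((ip, role))
    return dict(grouped)


for subnet, members in ports_by_subnet(PORTS).items():
    print(subnet, len(members), members)
```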

3. Linux bridges and OVS bridges on the network node:

root@network:~# brctl show                  
bridge name     bridge id               STP enabled     interfaces
root@network:~# ovs-vsctl show
1c921779-83ff-4493-8def-df53783ebae2
    Bridge br-ex
        Port "qg-eb60f9c4-2d"
            Interface "qg-eb60f9c4-2d"
                type: internal
        Port "eth2"
            Interface "eth2"
        Port br-ex
            Interface br-ex
                type: internal
    Bridge br-int
        fail_mode: secure
        Port br-int
            Interface br-int
                type: internal
        Port "tap2df7c3ed-df"
            tag: 10
            Interface "tap2df7c3ed-df"
                type: internal
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port "qr-81388454-30"
            tag: 10
            Interface "qr-81388454-30"
                type: internal
    Bridge br-tun
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
        Port "gre-0a00011f"
            Interface "gre-0a00011f"
                type: gre
                options: {in_key=flow, local_ip="10.0.1.21", out_key=flow, remote_ip="10.0.1.31"}
        Port "gre-0a000129"
            Interface "gre-0a000129"
                type: gre
                options: {in_key=flow, local_ip="10.0.1.21", out_key=flow, remote_ip="10.0.1.41"}
        Port br-tun
            Interface br-tun
                type: internal
    ovs_version: "2.0.2"

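The network node runs no virtual machines, so it has no Linux bridges.

The OVS bridge br-int carries the qrouter's qr port and the qdhcp's tap port.

The OVS bridge br-ex carries the qrouter's qg port and is connected to the physical NIC eth2.

The OVS bridge br-tun only patches into br-int and builds the tunnel plane.

4. Linux bridges and OVS bridges on the compute node: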

root@compute1:~# virsh list
 Id    Name                           State
----------------------------------------------------
 2     instance-00000029              running
 3     instance-00000028              running

root@compute1:~# brctl show 
bridge name     bridge id               STP enabled     interfaces
qbr1653ec91-ad          8000.22ca68904e2f       no              qvb1653ec91-ad
                                                        tap1653ec91-ad
qbrd7233b80-9d          8000.964cf783c9e1       no              qvbd7233b80-9d
                                                        tapd7233b80-9d
virbr0          8000.000000000000       yes
root@compute1:~# ovs-vsctl show   // ovs-vsctl queries and updates the ovs-vswitchd configuration
14b9e1b3-2d80-4380-92b0-f585cf9f74f7
    Bridge br-tun   // OVS tunnel bridge br-tun
        Port "gre-0a000129"  // port attached to a GRE tunnel
            Interface "gre-0a000129"
                type: gre
                options: {in_key=flow, local_ip="10.0.1.31", out_key=flow, remote_ip="10.0.1.41"} // a GRE tunnel is point-to-point: the local end is 10.0.1.31, the remote end is 10.0.1.41
        Port "gre-0a000115"  // port attached to a GRE tunnel
            Interface "gre-0a000115"
                type: gre
                options: {in_key=flow, local_ip="10.0.1.31", out_key=flow, remote_ip="10.0.1.21"}  // a GRE tunnel is point-to-point: the local end is 10.0.1.31, the remote end is 10.0.1.21
        Port br-tun
            Interface br-tun
                type: internal
        Port patch-int  // port patch-int, which connects to bridge br-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
    Bridge br-int   // OVS integration bridge br-int
        fail_mode: secure
        Port "qvod7233b80-9d"  // port connecting to the Linux bridge that a VM NIC's TAP device is attached to
            tag: 1
            Interface "qvod7233b80-9d"
        Port "qvo1653ec91-ad"   // port connecting to the Linux bridge that a VM NIC's TAP device is attached to
            tag: 1
            Interface "qvo1653ec91-ad"
        Port patch-tun  // port connecting to br-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}  // peer of the patch-int port on bridge br-tun
        Port br-int
            Interface br-int
                type: internal
    ovs_version: "2.0.2"

root@compute1:~# ovs-ofctl show br-tun  // ovs-ofctl queries and manages OpenFlow switches and controllers
OFPT_FEATURES_REPLY (xid=0x2): dpid:0000d63ebd331948
n_tables:254, n_buffers:256
capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
actions: OUTPUT SET_VLAN_VID SET_VLAN_PCP STRIP_VLAN SET_DL_SRC SET_DL_DST SET_NW_SRC SET_NW_DST SET_NW_TOS SET_TP_SRC SET_TP_DST ENQUEUE
 1(patch-int): addr:9a:0f:cb:ab:46:7a // the ID of port patch-int is 1
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 2(gre-0a000115): addr:e2:01:f1:7d:a5:af // the ID of port gre-0a000115 is 2
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 3(gre-0a000129): addr:8e:b1:ce:5f:51:9b
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 LOCAL(br-tun): addr:d6:3e:bd:33:19:48
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0
 

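The compute node is running 2 virtual machines.

The output shows the Linux bridge qbr1653ec91-ad with its ports qvb1653ec91-ad and tap1653ec91-ad.

The OVS bridge br-int carries the qvo ports.

The OVS bridge br-tun only patches into br-int and builds the tunnel plane.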
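The GRE port names in these listings appear to encode the remote tunnel endpoint: in every case shown, the eight hex digits after gre- equal the remote_ip option written in hexadecimal. A minimal sketch that decodes them, with a hypothetical function name:

```python
import ipaddress


def gre_port_remote_ip(port_name: str) -> str:
    """Decode 'gre-<8 hex digits>' into the dotted-quad remote IP."""
    prefix, _, hex_ip = port_name.partition("-")
    if prefix != "gre" or len(hex_ip) != 8:
        raise ValueError("unexpected tunnel port name: " + port_name)
    return str(ipaddress.IPv4Address(int(hex_ip, 16)))


# Port names taken from the ovs-vsctl show output on both nodes.
for name in ("gre-0a000115", "gre-0a00011f", "gre-0a000129"):
    print(name, "->", gre_port_remote_ip(name))
```

Decoded this way, gre-0a000115 on compute1 points at 10.0.1.21 (the network node), and gre-0a00011f on the network node points back at 10.0.1.31 (compute1).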

III. Data flows of the virtual machines

A typical Neutron-OVS-GRE network layout has two compute nodes, Compute-01 and Compute-02, and one network node.

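1. Overview of the network devices

tap: the interface through which the VM connects to qbr; it sits on qbr. Its counterpart is the virtual NIC inside the VM.

qbr: a Linux bridge.

qvb: veth pair, bridge side. qvb and qvo are a pair of interfaces that connect qbr and OVS; qvb is the end on the qbr side.

qvo: veth pair, openvswitch side. qvb and qvo are a pair of interfaces that connect qbr and OVS; qvo is the end on the OVS side.

qr: l3 agent managed port, the router-side port.

qg: l3 agent managed port, the gateway-side port.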

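2. VM access to the external network (traffic goes from the compute node to the network node to the external network)

Suppose virtual machine VM-003 on the physical compute node Compute-02 sends packets from its NIC eth0 to the external physical router gateway 10.1.101.254. The data flow is as follows:

The packet passes in turn through the tap device, the Linux bridge qbr, and the qvb/qvo veth pair, and reaches the OVS bridge br-int on the compute node, where it is tagged with a VLAN ID. br-int hands the packet over its patch port to the OVS bridge br-tun on Compute-02, which converts the VLAN ID into a tunnel ID. The packet then crosses the tunnel formed between br-tun on Compute-02 and br-tun on the network node (this leg goes over the physical NICs). The network node's br-tun converts the tunnel ID back into a VLAN ID and delivers the packet to the network node's br-int. From br-int the packet enters the Linux network namespace qrouter through the qr device, and the qrouter forwards it to the qg device on br-ex (in this step the router's NAT table translates the fixed IP address into the floating IP address), which puts the packet on the OVS bridge br-ex. Finally, br-ex sends the packet through the network node's external physical NIC eth2 to the external router gateway.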

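3. Traffic between virtual machines on the compute nodes

(1) Communication between VMs in the same subnet on the same host:

Because br-int is a virtual layer-2 switch, two VMs such as TenantA's vm001 and vm002 can communicate directly through br-int without going through br-tun.

(2) Communication between VMs in the same subnet on different hosts:

A packet sent by a VM on Compute1 passes through qbr to br-int, where it is tagged with a VLAN ID. At br-tun the VLAN ID is converted into a tunnel ID, and the packet leaves through the GRE tunnel to the compute2 node, where br-tun converts the tunnel ID back into a local VLAN ID and br-int delivers the packet to the destination VM.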
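The VLAN ID to tunnel ID translation can be modelled as a per-node lookup table. The sketch below is only an illustration: the VLAN tags are the tag values from the ovs-vsctl listings (1 on compute1, 10 on the network node), tunnel ID 2 comes from the br-tun flows on compute1, and the assumption that the network node maps its tag 10 to the same tunnel ID 2 is mine. It uses compute1 and the network node because those are the two nodes whose tags appear in the listings; the same translation would apply between two compute nodes.

```python
# Local VLAN tag -> tunnel ID, per node.
# compute1: tag 1 and tunnel ID 2 are taken from the listings.
# network:  tag 10 is from the listings; mapping it to tunnel ID 2 is an
#           assumption (both tags belong to the same tenant network).
VLAN_TO_TUNNEL = {
    "compute1": {1: 2},
    "network": {10: 2},
}


def send_over_gre(src_node: str, dst_node: str, src_vlan: int) -> dict:
    """Follow one frame: local VLAN -> tunnel ID -> VLAN on the receiving node."""
    # Sender's br-tun: strip the VLAN tag and set the tunnel ID.
    tunnel_id = VLAN_TO_TUNNEL[src_node][src_vlan]
    # Receiver's br-tun: map the tunnel ID back to its own local VLAN tag.
    tunnel_to_vlan = {t: v for v, t in VLAN_TO_TUNNEL[dst_node].items()}
    return {"tunnel_id": tunnel_id, "dst_vlan": tunnel_to_vlan[tunnel_id]}


print(send_over_gre("compute1", "network", 1))
print(send_over_gre("network", "compute1", 10))
```

The point of the model is that VLAN tags are local to each node, while the tunnel ID is what the two ends agree on.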

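(3) A VM sends a DHCP request

On the compute node the packet goes from br-int to br-tun, through the GRE tunnel to br-tun on the network node, and then through br-int to qdhcp. qdhcp replies with the VM's fixed IP address, and the reply returns along the same path.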

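4. The network devices on the compute node and on the network node

Compute node:

(1) The tap device connected to the VM

Each VM has a virtual NIC eth0, which is connected to a TAP device on the host. The TAP device is attached directly to a Linux bridge qbr, and qbr is connected to br-int. Ideally the tap device would connect to br-int directly, as the green box in the original figure shows. However, OpenStack relies on iptables rules on the TAP device to implement security groups, and those rules do not work when the TAP device is attached directly to the OVS bridge br-int. OpenStack therefore uses a workaround and inserts an extra Linux bridge layer. As a result, two layer-2 bridges, the OVS br-int and the Linux bridge, appear side by side.

Neutron uses iptables rules on the tap device to implement security groups.

View the iptables rules for the tap device of virtual machine vm002: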

root@compute1:~# iptables -S |grep tap1653ec91-ad
-A neutron-openvswi-FORWARD -m physdev --physdev-out tap1653ec91-ad --physdev-is-bridged -j neutron-openvswi-sg-chain
-A neutron-openvswi-FORWARD -m physdev --physdev-in tap1653ec91-ad --physdev-is-bridged -j neutron-openvswi-sg-chain
-A neutron-openvswi-INPUT -m physdev --physdev-in tap1653ec91-ad --physdev-is-bridged -j neutron-openvswi-o1653ec91-a
-A neutron-openvswi-sg-chain -m physdev --physdev-out tap1653ec91-ad --physdev-is-bridged -j neutron-openvswi-i1653ec91-a
-A neutron-openvswi-sg-chain -m physdev --physdev-in tap1653ec91-ad --physdev-is-bridged -j neutron-openvswi-o1653ec91-a

OpenStack Neutron implements security groups in the neutron-openvswi-sg-chain chain.

With the default security group:

neutron-openvswi-i1653ec91-a controls traffic entering the VM

root@compute1:~# iptables -S |grep neutron-openvswi-i1653ec91-a
-N neutron-openvswi-i1653ec91-a
-A neutron-openvswi-i1653ec91-a -m state --state INVALID -j DROP
-A neutron-openvswi-i1653ec91-a -m state --state RELATED,ESTABLISHED -j RETURN
-A neutron-openvswi-i1653ec91-a -p udp -m udp -m multiport --dports 1:65535 -j RETURN
-A neutron-openvswi-i1653ec91-a -s 10.0.0.10/32 -j RETURN
-A neutron-openvswi-i1653ec91-a -p icmp -j RETURN
-A neutron-openvswi-i1653ec91-a -p tcp -m tcp -m multiport --dports 1:65535 -j RETURN
-A neutron-openvswi-i1653ec91-a -s 10.0.0.3/32 -p udp -m udp --sport 67 --dport 68 -j RETURN
-A neutron-openvswi-i1653ec91-a -j neutron-openvswi-sg-fallback
-A neutron-openvswi-sg-chain -m physdev --physdev-out tap1653ec91-ad --physdev-is-bridged -j neutron-openvswi-i1653ec91-a

neutron-openvswi-o1653ec91-a controls traffic leaving the VM

root@compute1:~# iptables -S |grep neutron-openvswi-o1653ec91-a
-N neutron-openvswi-o1653ec91-a
-A neutron-openvswi-INPUT -m physdev --physdev-in tap1653ec91-ad --physdev-is-bridged -j neutron-openvswi-o1653ec91-a
-A neutron-openvswi-o1653ec91-a -p udp -m udp --sport 68 --dport 67 -j RETURN
-A neutron-openvswi-o1653ec91-a -j neutron-openvswi-s1653ec91-a
-A neutron-openvswi-o1653ec91-a -p udp -m udp --sport 67 --dport 68 -j DROP
-A neutron-openvswi-o1653ec91-a -m state --state INVALID -j DROP
-A neutron-openvswi-o1653ec91-a -m state --state RELATED,ESTABLISHED -j RETURN
-A neutron-openvswi-o1653ec91-a -j RETURN
-A neutron-openvswi-o1653ec91-a -j neutron-openvswi-sg-fallback
-A neutron-openvswi-sg-chain -m physdev --physdev-in tap1653ec91-ad --physdev-is-bridged -j neutron-openvswi-o1653ec91-a
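The per-port chain names in these listings follow a pattern similar to the device names, except that only the first 10 characters of the port ID are kept. A minimal sketch of that pattern, inferred from the iptables output above rather than taken from Neutron's code (the function and key names are hypothetical):

```python
CHAIN_PREFIX = "neutron-openvswi-"
CHAIN_ID_LEN = 10  # port-ID characters kept in a chain name, as seen in the listings


def security_group_chains(port_id: str) -> dict:
    """Derive the per-port iptables chain names for one Neutron port."""
    short = port_id[:CHAIN_ID_LEN]
    return {
        "ingress": CHAIN_PREFIX + "i" + short,  # traffic entering the VM
        "egress": CHAIN_PREFIX + "o" + short,   # traffic leaving the VM
        "s_chain": CHAIN_PREFIX + "s" + short,  # chain the egress chain jumps to
    }


for role, chain in security_group_chains("1653ec91-ad7d-40d9-b777-f74aec697026").items():
    print(role, "->", chain)
```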

Add a security group rule that allows TCP port 22

root@controller:~# neutron --os-tenant-name TenantA --os-username UserA --os-password password --os-auth-url=http://localhost:5000/v2.0 security-group-rule-create --protocol tcp --port-range-min 22 --port-range-max 22 --direction ingress default
Created a new security_group_rule:
+-------------------+--------------------------------------+
| Field             | Value                                |
+-------------------+--------------------------------------+
| direction         | ingress                              |
| ethertype         | IPv4                                 |
| id                | be3d6a06-be6b-4f51-b1a5-294ad2a0a261 |
| port_range_max    | 22                                   |
| port_range_min    | 22                                   |
| protocol          | tcp                                  |
| remote_group_id   |                                      |
| remote_ip_prefix  |                                      |
| security_group_id | 8bd8fb6b-7141-4900-8321-390cc1a5d999 |
| tenant_id         | 60a10cd7a61b493d910eabd353c07567     |
+-------------------+--------------------------------------+

The iptables rules for the tap devices then change as follows:

root@compute1:~# iptables -S | grep 22
-A FORWARD -d 192.168.122.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -s 192.168.122.0/24 -i virbr0 -j ACCEPT
-A neutron-openvswi-i1653ec91-a -p tcp -m tcp --dport 22 -j RETURN
-A neutron-openvswi-id7233b80-9 -p tcp -m tcp --dport 22 -j RETURN
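To make the link between the security-group rule and the resulting iptables line explicit, the sketch below renders an ingress rule in the iptables -S style seen in these listings. It only mimics the two rule shapes visible above (a single port uses --dport, a range uses multiport --dports); it is not Neutron's own code, and the function name is hypothetical.

```python
def ingress_rule(chain: str, protocol: str, port_min: int, port_max: int) -> str:
    """Render an ingress security-group rule in the iptables -S style shown above."""
    if port_min == port_max:
        match = "-m {0} --dport {1}".format(protocol, port_min)
    else:
        match = "-m {0} -m multiport --dports {1}:{2}".format(protocol, port_min, port_max)
    return "-A {0} -p {1} {2} -j RETURN".format(chain, protocol, match)


# The rule created with --port-range-min 22 --port-range-max 22.
print(ingress_rule("neutron-openvswi-i1653ec91-a", "tcp", 22, 22))
# The all-ports TCP rule from the default security group listing.
print(ingress_rule("neutron-openvswi-i1653ec91-a", "tcp", 1, 65535))
```

Because the rule was added to the default security group, it shows up in the ingress chain of every port that uses that group, which is why both vm001's and vm002's chains gained a line.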

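(2) OVS integration bridge br-int

br-int is a virtual bridge created by Open vSwitch; in operation it acts as a virtual switch. Its ports (on a compute node, the qvo devices that lead to the VMs' tap devices) connect the VMs on the host to one switching layer. Through the tunneling done by the local OVS bridge br-tun, the br-int bridges of all nodes in the OpenStack deployment are then joined into one larger virtual switch, BR-INT {compute-01-br-int + compute-02-br-int ...}.

Each network created with neutron net-create gets its own VLAN ID on br-int; see the tag value of each Port in the ovs-vsctl show output. The VLAN ID is local to a node: in the listings above the same tenant network has tag 1 on compute1 and tag 10 on the network node.

br-int handles the VLAN ID of traffic entering and leaving the VMs.

(3) OVS tunnel bridge br-tun

br-tun is a virtual bridge created by OVS. Downward it connects directly to br-int and serves as the entry and exit point for network data. Upward it connects, through a tunneling protocol (GRE here), to the br-tun bridges on the other nodes, forming a flat communication/tunnel layer. If the abstraction layer built from all the br-int bridges is defined as a virtual layer-2 network, then the layer built from all the br-tun bridges is a virtual layer-3 network.

br-tun uses OpenFlow rules to convert between VLAN IDs and tunnel IDs.

The OpenFlow rule tables below show how the two IDs are converted: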

root@compute1:~# ovs-ofctl show br-tun
OFPT_FEATURES_REPLY (xid=0x2): dpid:0000d63ebd331948
n_tables:254, n_buffers:256
capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
actions: OUTPUT SET_VLAN_VID SET_VLAN_PCP STRIP_VLAN SET_DL_SRC SET_DL_DST SET_NW_SRC SET_NW_DST SET_NW_TOS SET_TP_SRC SET_TP_DST ENQUEUE
 1(patch-int): addr:9a:0f:cb:ab:46:7a // the ID of port patch-int is 1
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 2(gre-0a000115): addr:e2:01:f1:7d:a5:af // the ID of port gre-0a000115 is 2
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 3(gre-0a000129): addr:8e:b1:ce:5f:51:9b // the ID of port gre-0a000129 is 3
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
 LOCAL(br-tun): addr:d6:3e:bd:33:19:48
     config:     0
     state:      0
     speed: 0 Mbps now, 0 Mbps max
OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0
root@compute1:~# ovs-ofctl dump-flows br-tun
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=99058.105s, table=0, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=1,in_port=3 actions=resubmit(,2) // traffic arriving on port 3 (gre-0a000129) is resubmitted to table 2
 cookie=0x0, duration=164986.43s, table=0, n_packets=303, n_bytes=29712, idle_age=7626, hard_age=65534, priority=1,in_port=1 actions=resubmit(,1) // traffic arriving on port 1 (patch-int) is resubmitted to table 1
 cookie=0x0, duration=164981.72s, table=0, n_packets=188, n_bytes=28694, idle_age=7626, hard_age=65534, priority=1,in_port=2 actions=resubmit(,2) // traffic arriving on port 2 (gre-0a000115) is resubmitted to table 2
 cookie=0x0, duration=164986.109s, table=0, n_packets=4, n_bytes=320, idle_age=65534, hard_age=65534, priority=0 actions=drop
 cookie=0x0, duration=164985.783s, table=1, n_packets=257, n_bytes=25328, idle_age=7626, hard_age=65534, priority=1,dl_dst=00:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,20) // resubmitted to table 20
 cookie=0x0, duration=164985.31s, table=1, n_packets=46, n_bytes=4384, idle_age=7631, hard_age=65534, priority=1,dl_dst=01:00:00:00:00:00/01:00:00:00:00:00 actions=resubmit(,21)
 cookie=0x0, duration=164979.109s, table=2, n_packets=188, n_bytes=28694, idle_age=7626, hard_age=65534, priority=1,tun_id=0x2 actions=mod_vlan_vid:1,resubmit(,10) // traffic arriving from the tunnels with tunnel ID 2 is tagged with VLAN ID 1 and resubmitted to table 10
 cookie=0x0, duration=164984.991s, table=2, n_packets=8, n_bytes=648, idle_age=65534, hard_age=65534, priority=0 actions=drop
 cookie=0x0, duration=164984.676s, table=3, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=0 actions=drop
 cookie=0x0, duration=164984.395s, table=10, n_packets=188, n_bytes=28694, idle_age=7626, hard_age=65534, priority=1 actions=learn(table=20,hard_timeout=300,priority=1,NXM_OF_VLAN_TCI[0..11],NXM_OF_ETH_DST[]=NXM_OF_ETH_SRC[],load:0->NXM_OF_VLAN_TCI[],load:NXM_NX_TUN_ID[]->NXM_NX_TUN_ID[],output:NXM_OF_IN_PORT[]),output:1 // learns a rule into table 20, then outputs on port 1 (patch-int)
 cookie=0x0, duration=164984.067s, table=20, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=0 actions=resubmit(,21) // resubmitted to table 21
 cookie=0x0, duration=164979.293s, table=21, n_packets=36, n_bytes=3576, idle_age=7631, hard_age=65534, dl_vlan=1 actions=strip_vlan,set_tunnel:0x2,output:3,output:2 // strips the VLAN ID, sets tunnel ID 2, and sends the packet out through GRE ports 3 and 2
 cookie=0x0, duration=164983.75s, table=21, n_packets=10, n_bytes=808, idle_age=65534, hard_age=65534, priority=0 actions=drop
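The two table=1 flows split traffic on a single bit of the destination MAC: the mask 01:00:00:00:00:00 selects the least significant bit of the first octet, which is 0 for unicast and 1 for multicast or broadcast addresses. A minimal sketch of that decision, with hypothetical function names:

```python
def is_group_address(mac: str) -> bool:
    """True when the least significant bit of the first octet is set (multicast or broadcast)."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x01)


def table1_next_table(dl_dst: str) -> int:
    """Mirror table 1 of br-tun: unicast goes to table 20, multicast/broadcast to table 21."""
    return 21 if is_group_address(dl_dst) else 20


print(table1_next_table("fa:16:3e:51:a2:97"))  # unicast MAC of vm002's port
print(table1_next_table("ff:ff:ff:ff:ff:ff"))  # broadcast, e.g. an ARP request
```

Unicast frames go to table 20, where rules learned by table 10 can send them to one specific GRE port; anything without a learned rule falls through to table 21 and is flooded to all GRE ports.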

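Network node:

(1) OVS tunnel bridge br-tun

It plays the same role as br-tun on the compute nodes: a tunnel layer that connects to the other physical nodes. The only difference is that this br-tun connects to the network node's br-int, which differs considerably from the br-int on a compute node.

(2) OVS integration bridge br-int

br-int is a virtual bridge created by OVS and also acts as a virtual switch. It mainly carries two kinds of devices: tap devices and qr devices.

The Linux network namespaces qdhcp and qrouter are created by the DHCP agent and the L3 agent respectively, to isolate and manage each tenant's virtual networks and routing.

The tap device on br-int, whose IP address is xxx.xxx.xxx.3 in this environment (10.0.0.3), works with a dnsmasq process to provide DHCP and dynamically assigns private IP addresses to newly created VMs.

The qr device on br-int, whose IP address is usually xxx.xxx.xxx.1, together with the qg device on br-ex forms the qrouter. The qrouter routes traffic for the tenant network and, through qg, links the tenant's internal virtual network with the external physical network.

(3) OVS external bridge br-ex

br-ex is a virtual bridge created by OVS. It carries the qg device port and is the key channel between the tenant network and the external network. br-ex is also connected to a physical NIC (eth2 in the original figure), which leads to the Internet.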

http://docs.openstack.org/admin-guide-cloud/content/under_the_hood_openvswitch.html
