VXLAN BGP EVPN - 3 - Inter VNI routing
Introduction
In this post, we will concentrate on the BGP EVPN Control plane and data plane needed to accomplish inter-VNI routing.
Topology
Configuration
I’m gonna assume you’ve followed the series until now and the below steps need to be performed after the completion of configs from the previous blogs in this series.
Since we intend to perform inter-VNI routing, we need at least two L2VNIs in our fabric. Let’s create a new L2VNI on Leaf 102.
Leaf 102
vlan 200
  name vlan200-VNI10020
  vn-segment 10020
evpn
  vni 10020 l2
    rd auto
    route-target import auto
    route-target export auto
interface nve1
  member vni 10020
    mcast-group 224.1.1.10
interface Ethernet1/4
  switchport access vlan 200
Now let’s perform the configuration needed for inter-VNI routing
- Define a VLAN for L3VNI and assign a VNI to it.
- Map L3VNI to the tenant VRF.
- Define an L3VNI SVI for VXLAN traffic forwarding
- Associate L3VNI to the VRF in the NVE interface.
Leaf 101 (do the same on Leaf 102)
vlan 99
  name PRD_Tenant
  vn-segment 10099
vrf context PRD_Tenant
  vni 10099
  rd auto
  address-family ipv4 unicast
    route-target both auto
    route-target both auto evpn
interface Vlan99
  no shutdown
  mtu 9216
  vrf member PRD_Tenant
  ip forward
interface nve1
  member vni 10099 associate-vrf
Let’s also enable a couple of add-on features that make forwarding in our fabric more efficient.
Anycast GW
This makes all clients (behind different leaves) believe they are connected to a single switch, since we use the same IP and virtual MAC for the SVI on all leaves.
Leaf 101 and 102 (on Leaf 102, VNI 10010 maps to VLAN 100, so apply this to interface Vlan100 there)
interface Vlan10
  no shutdown
  vrf member PRD_Tenant
  ip address 192.168.11.1/24
  fabric forwarding mode anycast-gateway
Leaf 102
interface Vlan200
  no shutdown
  vrf member PRD_Tenant
  ip address 192.168.22.1/24
  fabric forwarding mode anycast-gateway
ARP suppression
This feature lets the local VTEP answer ARP requests itself (from its ARP suppression cache) instead of flooding each request across the fabric.
Leaf 101
interface nve1
  member vni 10010
    suppress-arp
Leaf 102
interface nve1
  member vni 10010
    suppress-arp
  member vni 10020
    suppress-arp
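To make the ARP suppression idea concrete, here is a minimal Python sketch of the decision a VTEP makes when an ARP request arrives. It is a toy model, not NX-OS code: the `SuppressionCache` class and its method names are made up for illustration, and the sample entry reuses Alice's MAC-IP as Leaf 102 sees it in VLAN 100.

```python
from typing import Dict, Optional, Tuple


class SuppressionCache:
    """Toy per-VTEP ARP suppression cache keyed by (vlan, ip). Hypothetical model."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[int, str], str] = {}

    def learn(self, vlan: int, ip: str, mac: str) -> None:
        # Populated from local ARP snooping or from remote EVPN MAC-IP routes.
        self._entries[(vlan, ip)] = mac

    def handle_arp_request(self, vlan: int, target_ip: str) -> Tuple[str, Optional[str]]:
        # Cache hit: the VTEP answers on behalf of the target host.
        mac = self._entries.get((vlan, target_ip))
        if mac is not None:
            return ("reply-locally", mac)
        # Cache miss: fall back to flooding the request across the fabric.
        return ("flood", None)


cache = SuppressionCache()
cache.learn(100, "192.168.11.11", "0000.000a.11ce")

print(cache.handle_arp_request(100, "192.168.11.11"))
print(cache.handle_arp_request(100, "192.168.11.99"))
```

The first lookup is a hit, so the request never leaves the leaf; the second is a miss and would still be flooded.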
Inter-VNI Control plane
There are many moving parts here so I’ve numbered the above workflow for ease of understanding.
Let’s assume Alice (192.168.11.11, 0000.000a.11ce) with a default GW of 192.168.11.1 comes online. When a client comes online, it generally announces its presence using Gratuitous ARP (GARP). We have already learned how the local VTEP learns the MAC address and exchanges it with the remote VTEP in the previous blog, so we will jump directly to the IP learning here.
1. Leaf 101 snoops the GARP and stores Alice's MAC-IP binding in its ARP table.
leaf-101# sh ip arp vrf PRD_Tenant
<snipped>
IP ARP Table for context PRD_Tenant
Total number of entries: 1
Address Age MAC Address Interface Flags
192.168.11.11 00:12:15 0000.000a.11ce Vlan10
2a. The Host Mobility Manager (HMM) learns the MAC-IP as a local route in its local host DB, then sends it to the L2RIB.
- Observe that the route is learned as a /32 in the DB. The DB also holds the MAC, the SVI, and the local interface.
- We used the MAC VRF in the L2RIB in the previous blog for intra-VNI. Now we will use the IP VRF within the L2RIB to store MAC-IP info from the local host DB.
leaf-101# show fabric forwarding ip local-host-db vrf PRD_Tenant
HMM host IPv4 routing table information for VRF PRD_Tenant
Status: *-valid, x-deleted, D-Duplicate, DF-Duplicate and frozen,
c-cleaned in 00:07:35
Host MAC Address SVI Flags Physical Interface
* 192.168.11.11/32 0000.000a.11ce Vlan10 0x420201 Ethernet1/3
- Observe the contents of the IP VRF below.
leaf-101# show l2route mac-ip topology 10 detail
Flags -(Rmac):Router MAC (Stt):Static (L):Local (R):Remote (V):vPC link
(Dup):Duplicate (Spl):Split (Rcv):Recv(D):Del Pending (S):Stale (C):Clear
(Ps):Peer Sync (Ro):Re-Originated (Orp):Orphan
Topology    Mac Address    Host IP         Prod   Flags   Seq No   Next-Hops
----------- -------------- --------------- ------ ------- -------- ---------
10          0000.000a.11ce 192.168.11.11   HMM    L,      0        Local
L3-Info: 10099
2b. We have enabled ARP suppression per VNI on the VTEPs, so the ARP suppression cache is also updated with the MAC-IP information.
leaf-101# sh ip arp suppression-cache detail
<snipped>
Ip Address Age Mac Address Vlan Physical-ifindex Flags Remote Vtep Addrs
192.168.11.11 00:16:22 0000.000a.11ce 10 Ethernet1/3 L
2c. HMM installs the MAC-IP information from the ARP table into the L3RIB as a /32 host route.
leaf-101# show ip route 192.168.11.11 vrf PRD_Tenant
<snipped>
192.168.11.11/32, ubest/mbest: 1/0, attached
*via 192.168.11.11, Vlan10, [190/0], 06:19:12, hmm
3. Leaf 101 installs the MAC-IP route from the L2RIB into BGP L2VPN EVPN. Here we apply the necessary BGP path attributes (e.g., RT) before sending it out as a Type 2 MAC + IP route to iBGP neighbors.
Pay attention to the following fields in the BGP update:
- MAC address = 00:00:00:0a:11:ce
- IP = 192.168.11.11
- RD = 172.16.50.101:32777
- L2VNI = 10010
- L3VNI = 10099
- Encap type = VXLAN
- Router MAC = 50:01:00:00:1b:08 (a remote VTEP uses this as the inner frame's destination MAC when it VXLAN-encapsulates routed traffic towards Leaf 101)
leaf-101# sh bgp l2vpn evpn 192.168.11.11
BGP routing table information for VRF default, address family L2VPN EVPN
Route Distinguisher: 172.16.50.101:32777 (L2VNI 10010)
BGP routing table entry for [2]:[0]:[0]:[48]:[0000.000a.11ce]:[32]:[192.168.11.11]/272, version 6
Paths: (1 available, best #1)
Flags: (0x000102) (high32 00000000) on xmit-list, is not in l2rib/evpn
Advertised path-id 1
Path type: local, path is valid, is best path, no labeled nexthop
AS-Path: NONE, path locally originated
172.16.100.101 (metric 0) from 0.0.0.0 (172.16.50.101)
Origin IGP, MED not set, localpref 100, weight 32768
Received label 10010 10099
Extcommunity: RT:65501:10010 RT:65501:10099 ENCAP:8 Router MAC:5001.0000.1b08
Path-id 1 advertised to peers:
172.16.50.11 172.16.50.12
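The bracketed prefix in the output above packs the Type 2 route fields into one string. As a reading aid, the sketch below splits that string into named fields, converts the MAC to the colon notation used in the field list, and rebuilds the auto-derived route targets as ASN:VNI. The function names are hypothetical, and the parser only handles the IPv4 MAC-IP form shown in this post.

```python
import re
from typing import Dict

# Field order as displayed: [route-type]:[ESI]:[eth-tag]:[mac-len]:[mac]:[ip-len]:[ip]/len
_TYPE2_KEY = re.compile(
    r"\[(?P<route_type>\d+)\]:\[(?P<esi>[^\]]+)\]:\[(?P<eth_tag>\d+)\]:"
    r"\[(?P<mac_len>\d+)\]:\[(?P<mac>[0-9a-fA-F.]+)\]:"
    r"\[(?P<ip_len>\d+)\]:\[(?P<ip>[0-9.]+)\]/(?P<prefix_len>\d+)"
)


def parse_type2_key(key: str) -> Dict[str, str]:
    """Split a displayed EVPN Type 2 prefix into its named fields."""
    match = _TYPE2_KEY.fullmatch(key.strip())
    if match is None:
        raise ValueError(f"not a recognised Type 2 MAC-IP key: {key!r}")
    return match.groupdict()


def dotted_to_colon_mac(mac: str) -> str:
    """Convert 0000.000a.11ce style to 00:00:00:0a:11:ce style."""
    digits = mac.replace(".", "").lower()
    if len(digits) != 12:
        raise ValueError(f"unexpected MAC format: {mac!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def auto_route_target(asn: int, vni: int) -> str:
    """Auto RT as seen in this lab's output: ASN:VNI."""
    return f"{asn}:{vni}"


fields = parse_type2_key("[2]:[0]:[0]:[48]:[0000.000a.11ce]:[32]:[192.168.11.11]/272")
print(fields["route_type"], dotted_to_colon_mac(fields["mac"]), fields["ip"])
print(auto_route_target(65501, 10010), auto_route_target(65501, 10099))
```

The two route targets it prints line up with the Extcommunity line in the output: one for the L2VNI and one for the L3VNI.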
4. Leaf 102 receives the route in its BGP process without modification.
5a. Next, the route is imported into the BGP RIB based on the EVPN import RT, and then into the L2RIB (IP VRF).
- The RD may change during this import. Since VNI 10010 maps to VLAN 10 on Leaf 101 but VLAN 100 on Leaf 102, the received RD 172.16.50.101:32777 becomes 172.16.50.102:32867 (32767 + 100).
- Observe that the route also appears under L3VNI 10099 in the BGP EVPN table. The RD is rewritten here too, as RD = BGP RID : VRF ID = 172.16.50.102:3.
leaf-102# sh bgp l2vpn evpn 192.168.11.11
<snipped>
Route Distinguisher: 172.16.50.102:32867 (L2VNI 10010)
BGP routing table entry for [2]:[0]:[0]:[48]:[0000.000a.11ce]:[32]:[192.168.11.11]/272, version 7
Paths: (1 available, best #1)
Flags: (0x000212) (high32 00000000) on xmit-list, is in l2rib/evpn, is not in HW
Advertised path-id 1
Path type: internal, path is valid, is best path, no labeled nexthop, in rib
Imported from 172.16.50.101:32777:[2]:[0]:[0]:[48]:[0000.000a.11ce]:[32]:[192.168.11.11]/272
AS-Path: NONE, path sourced internal to AS
172.16.100.101 (metric 81) from 172.16.50.11 (172.16.50.11)
Origin IGP, MED not set, localpref 100, weight 0
Received label 10010 10099
Extcommunity: RT:65501:10010 RT:65501:10099 ENCAP:8 Router MAC:5001.0000.1b08
Originator: 172.16.50.101 Cluster list: 172.16.50.11
Path-id 1 not advertised to any peer
Route Distinguisher: 172.16.50.102:3 (L3VNI 10099)
BGP routing table entry for [2]:[0]:[0]:[48]:[0000.000a.11ce]:[32]:[192.168.11.11]/272, version 8
Paths: (1 available, best #1)
Flags: (0x000202) (high32 00000000) on xmit-list, is not in l2rib/evpn, is not in HW
Advertised path-id 1
Path type: internal, path is valid, is best path, no labeled nexthop
Imported from 172.16.50.101:32777:[2]:[0]:[0]:[48]:[0000.000a.11ce]:[32]:[192.168.11.11]/272
AS-Path: NONE, path sourced internal to AS
172.16.100.101 (metric 81) from 172.16.50.11 (172.16.50.11)
Origin IGP, MED not set, localpref 100, weight 0
Received label 10010 10099
Extcommunity: RT:65501:10010 RT:65501:10099 ENCAP:8 Router MAC:5001.0000.1b08
Originator: 172.16.50.101 Cluster list: 172.16.50.11
Path-id 1 not advertised to any peer
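The RD values in this output follow the arithmetic described in step 5a. Here is a small sketch, assuming the auto-RD rules as this post describes them (router ID with 32767 + VLAN ID for an L2VNI, router ID with the VRF ID for the L3VNI). The helper names are illustrative only.

```python
L2VNI_RD_BASE = 32767  # offset added to the VLAN ID, as described in this post


def l2vni_auto_rd(router_id: str, vlan_id: int) -> str:
    """Auto RD for an L2VNI: <BGP router ID>:<32767 + VLAN ID>."""
    return f"{router_id}:{L2VNI_RD_BASE + vlan_id}"


def l3vni_auto_rd(router_id: str, vrf_id: int) -> str:
    """Auto RD for the tenant VRF (L3VNI): <BGP router ID>:<VRF ID>."""
    return f"{router_id}:{vrf_id}"


# Leaf 101 maps VNI 10010 to VLAN 10, Leaf 102 maps it to VLAN 100.
print(l2vni_auto_rd("172.16.50.101", 10))
print(l2vni_auto_rd("172.16.50.102", 100))
# VRF ID 3 is taken from the L3VNI RD shown in the output above.
print(l3vni_auto_rd("172.16.50.102", 3))
```

Because the RD depends on the local VLAN ID and router ID, the same MAC-IP route carries a different RD on each leaf, while the route targets stay the same.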
MAC-IP in L2RIB
leaf-102# show l2route mac-ip topology 100 detail
Flags -(Rmac):Router MAC (Stt):Static (L):Local (R):Remote (V):vPC link
(Dup):Duplicate (Spl):Split (Rcv):Recv(D):Del Pending (S):Stale (C):Clear
(Ps):Peer Sync (Ro):Re-Originated (Orp):Orphan
Topology    Mac Address    Host IP         Prod   Flags   Seq No   Next-Hops
----------- -------------- --------------- ------ ------- -------- ------------------------------
100         0000.000a.11ce 192.168.11.11   BGP    --      0        172.16.100.101 (Label: 10010)
Sent To: ARP
encap-type:1
5b. The route is now installed into the VRF’s L3RIB from the BGP RIB above, since we mapped L3VNI 10099 to the VRF.
leaf-102# show ip route 192.168.11.11 vrf PRD_Tenant
<snipped>
192.168.11.11/32, ubest/mbest: 1/0
*via 172.16.100.101%default, [200/0], 06:04:38, bgp-65501, internal, tag 65501, segid: 10099 tunnelid: 0xac106465 encap: VXLAN
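A side note on the tunnelid field in this output: it looks like the next-hop VTEP address written as a 32-bit hex number. The snippet below is just a quick way to check that reading with Python's standard `ipaddress` module; the function name is made up and this is not an NX-OS tool.

```python
import ipaddress


def tunnel_id_to_ip(tunnel_id_hex: str) -> str:
    """Interpret a 32-bit hex tunnel ID as a dotted-quad IPv4 address."""
    return str(ipaddress.IPv4Address(int(tunnel_id_hex, 16)))


print(tunnel_id_to_ip("0xac106465"))
```

It prints Leaf 101's VTEP address, the same next hop shown in the route.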
6. Since ARP suppression is enabled on the VTEP switches, the L2RIB updates the ARP suppression cache.
leaf-102# show ip arp suppression-cache detail
<snipped>
192.168.11.11 06:13:39 0000.000a.11ce 100 (null) R 172.16.100.101
192.168.22.22 00:16:27 0000.0000.7011 200 Ethernet1/4 L
Inter-VNI Data plane
Check that you can ping Tom (192.168.22.22) from Alice.
alice#ping 192.168.22.22
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.22.22, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 7/15/44 ms
We are using symmetric IRB in our setup, so the data plane works as follows:
- Switching: Since Alice and Tom are in different subnets, Alice sends the ICMP request to the anycast GW on Leaf 101 (VLAN 10).
- Routing: Leaf 101 receives the frame and finds that 192.168.22.22 is learned via BGP (as seen in the RIB) with Leaf 102 as the next hop. It encapsulates the packet in VXLAN with L3VNI = 10099 and routes it to Leaf 102. Observe that the inner frame's destination MAC is Leaf 102's router MAC.
- Routing: Leaf 102 removes the VXLAN header and, because VNI 10099 maps to the tenant VRF, makes a routing decision using that VRF's RIB.
- Switching: Finally, the frame is switched out of VLAN 200.
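The steps above can be sketched as two lookups, one on each leaf. This toy model walks the reply direction (Tom to Alice) because every value it needs is visible in the outputs earlier in this post: Leaf 102's route to 192.168.11.11, Leaf 101's router MAC, and the L3VNI. The class and function names are hypothetical and nothing here reflects how NX-OS is implemented internally.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RemoteHostRoute:
    next_hop_vtep: str  # remote VTEP IP (BGP next hop)
    l3vni: int          # VNI put in the VXLAN header for routed traffic
    router_mac: str     # remote VTEP's router MAC (from the EVPN route)


@dataclass(frozen=True)
class VxlanPacket:
    outer_dst_ip: str
    vni: int
    inner_dst_mac: str
    inner_dst_ip: str


def ingress_encapsulate(vrf_rib: Dict[str, RemoteHostRoute], dst_ip: str) -> VxlanPacket:
    """Ingress leaf: route in the VRF, then encapsulate with the L3VNI."""
    route = vrf_rib[dst_ip + "/32"]
    return VxlanPacket(route.next_hop_vtep, route.l3vni, route.router_mac, dst_ip)


def egress_forward(packet: VxlanPacket,
                   l3vni_to_vrf: Dict[int, str],
                   local_hosts: Dict[str, Dict[str, str]]) -> str:
    """Egress leaf: map the VNI to a VRF, route again, return the egress SVI."""
    vrf = l3vni_to_vrf[packet.vni]
    return local_hosts[vrf][packet.inner_dst_ip]


# Leaf 102's view of Alice, taken from its VRF route and the BGP update.
leaf102_rib = {"192.168.11.11/32": RemoteHostRoute("172.16.100.101", 10099, "5001.0000.1b08")}
# Leaf 101's view: L3VNI 10099 belongs to PRD_Tenant, Alice sits behind Vlan10.
leaf101_l3vni_to_vrf = {10099: "PRD_Tenant"}
leaf101_local_hosts = {"PRD_Tenant": {"192.168.11.11": "Vlan10"}}

packet = ingress_encapsulate(leaf102_rib, "192.168.11.11")
print(packet)
print(egress_forward(packet, leaf101_l3vni_to_vrf, leaf101_local_hosts))
```

Note that neither leaf needs the other side's L2VNI for routed traffic: only the L3VNI crosses the fabric, which is what makes the IRB symmetric.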
ICMP request encapsulated by Leaf 101
ICMP reply encapsulated by Leaf 102