In this post I’ll demonstrate how to build a simple OpenStack lab with OpenDaylight-managed virtual networking and integrate it with a Cisco IOS-XE data centre gateway using EVPN.
For the last 5 years OpenStack has been the training ground for a lot of emerging DC SDN solutions. The OpenStack integration use case was one of the most compelling and easiest to implement, thanks to the limited and suboptimal implementation of the native networking stack. Today, in 2017, features like L2 population, local ARP responder, L2 gateway integration, distributed routing and service function chaining have all become available in vanilla OpenStack and no longer require a proprietary SDN controller. Admittedly, some of these features are still not (and may never be) implemented in the most optimal way (e.g. DVR). This is where new open-source SDN controllers, the likes of OVN and Dragonflow, step in to provide scalable, elegant and efficient implementations of these advanced networking features. However, one major feature still remains outside of the scope of many of these new open-source SDN projects, and that is data centre gateway (DC-GW) integration. Let me start by explaining why you would need this feature in the first place.
Optimal forwarding of North-South traffic
OpenStack Neutron and VMware NSX, both being pure software solutions, rely on a special type of node to forward traffic between VMs and hosts outside of the data centre. This node acts as a L2/L3 gateway for all North-South traffic and is often implemented as either a VM or a network namespace. This kind of solution gives software developers greater independence from the underlying networking infrastructure which makes it easier for them to innovate and introduce new features.
However, from the traffic forwarding point of view, having a gateway/network node is not a good solution at all. There is no technological reason for a packet to have to traverse this node when it ends up on a DC-GW anyway. In fact, this solution introduces additional complexity which needs to be properly managed (i.e. designed, configured and troubleshot) and a potential bottleneck for high-throughput traffic flows.
It’s clear that the most optimal way to forward traffic is directly from a compute node to a DC-GW. The only question is how this optimal forwarding can be achieved. The SDN controller needs to be able to exchange reachability information with the DC-GW using a common protocol understood by most of the existing routing stacks. One such protocol, becoming very common in DC environments, is BGP, which has two address families we can potentially use:
- VPNv4/6 allows routes to be exchanged, with the dataplane using MPLSoGRE encapsulation. This should be considered a “legacy” approach, since for a very long time DC-GWs did not have VXLAN encap/decap capabilities.
- EVPN with VXLAN-based overlays. EVPN makes it possible to exchange both L2 and L3 information under the same AF, which means we have the flexibility of doing not only a L3 WAN integration, but also a L2 data centre interconnect with just a single protocol.
In OpenStack specifically, the BGPVPN project was created to provide a pluggable driver framework for 3rd-party BGP implementations. Apart from the reference BaGPipe driver (BaGPipe is an ExaBGP fork with a lightweight implementation of BGP VPNs), which relies on the default openvswitch ML2 mechanism driver, only Nuage, OpenDaylight and OpenContrail have contributed their drivers to this project. In this post I will focus on OpenDaylight and show how to install containerised OpenStack with OpenDaylight and integrate it with a Cisco CSR using EVPN.
OpenDaylight integration with OpenStack
Historically, OpenDaylight has had multiple projects implementing custom OpenStack networking drivers:
- VTN (Virtual Tenant Networking) - spearheaded by NEC, this was the first project to provide an OpenStack networking implementation
- GBP (Group Based Policy) - a project led by Cisco, one of the first (if not THE first) commercial implementations of intent-based networking
- NetVirt - currently a default Neutron plugin from ODL, developed jointly by Brocade (RIP), RedHat, Ericsson, Intel and many others.
NetVirt provides several common Neutron services including L2 and L3 forwarding, ACL and NAT, as well as advanced services like L2 gateway, QoS and SFC. To do that it assumes full control over an OVS switch inside each compute node and implements the above services inside a single
br-int OVS bridge. L2/L3 forwarding tables are built based on the tenant IP/MAC addresses allocated by Neutron and the current network topology. For a high-level overview of NetVirt’s forwarding pipeline you can refer to this document.
It helps to think of an ODL-managed OpenStack as a big chassis switch. NetVirt plays the role of a supervisor, managing the control plane and compiling the RIB based on the information received from Neutron. Each compute node running an OVS is a linecard with VMs connected to its ports. Unlike the distributed architectures of OVN and Dragonflow, compute nodes do not contain any control plane elements and each OVS gets its FIB programmed directly by the supervisor. The DC underlay is the backplane, interconnecting all linecards and the supervisor.
OpenDaylight BGP VPN service architecture
In order to provide BGP VPN functionality, NetVirt employs the use of three service components:
- FIB service - maintains L2/L3 forwarding tables and reacts to topology changes
- BGP manager - provides a translation of information sent to and received from an external BGP stack (Quagga BGP)
- VPN Manager - ties together the above two components, creates VRFs and keeps track of RD/RT values
In order to exchange BGP updates with an external DC-GW, NetVirt requires a BGP stack with EVPN and VPNv4/6 capabilities. Ideally, ODL’s internal BGP stack could have been used for that, however it didn’t meet all the performance requirements (injecting/withdrawing thousands of prefixes at the same time). Instead, an external Quagga fork with EVPN add-ons is connected to the BGP manager via a high-speed Apache Thrift interface. This interface defines the format of data to be exchanged between Quagga (a.k.a. QBGP) and NetVirt’s BGP Manager in order to do two things:
- Configure BGP settings like ASN and BGP neighbors
- Read/Write RIB entries inside QBGP
The BGP session is established between QBGP and the external DC-GW; however, the next-hop values installed by NetVirt and advertised by QBGP are the IPs of the respective compute nodes, so that traffic is sent directly via the most optimal path.
Enough of the theory, let’s have a look at how to configure an L3VPN between QBGP (advertising ODL’s distributed router subnets) and an IOS-XE DC-GW using EVPN route type 5 or, more specifically, the Interface-less IP-VRF-to-IP-VRF model.
My lab environment is still based on a pair of nested VMs running the containerised Kolla OpenStack I’ve described in my earlier post. A few months ago an OpenDaylight role was added to kolla-ansible, so it is now possible to install OpenDaylight-integrated OpenStack automatically. However, there is no option to install QBGP, so I had to augment the default Kolla and kolla-ansible repositories to include a QBGP Dockerfile template and a QBGP ansible role. So the first step is to download my latest automated installer and make sure the
enable_opendaylight global variable is set to "yes".
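For reference, the relevant kolla-ansible settings would look something like this (a sketch; the neutron_plugin_agent value is my assumption of what the ODL role expects, so check the kolla-ansible documentation for your release):

```yaml
# /etc/kolla/globals.yml (fragment)
enable_opendaylight: "yes"
neutron_plugin_agent: "opendaylight"   # assumption: ODL replaces the default OVS agent
```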
At the time of writing I was relying on a couple of recent bug fixes in OpenDaylight, so I had to modify the default ODL role to install the latest master-branch ODL build. Make sure the download link inside the role points to the latest distribution zip file.
The next few steps are similar to what I’ve described in my Kolla lab post: create a pair of VMs, build all Kolla containers, push them to a local Docker registry and finally deploy OpenStack using kolla-ansible playbooks.
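The workflow boils down to four stages; only the 4-deploy.sh name appears later in the post, so the other script names below are illustrative placeholders:

```
controller$ ./1-create.sh   # create the pair of nested VMs (name assumed)
controller$ ./2-build.sh    # build all Kolla container images (name assumed)
controller$ ./3-push.sh     # push images to the local Docker registry (name assumed)
controller$ ./4-deploy.sh   # deploy OpenStack with kolla-ansible playbooks
```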
The 4-deploy.sh script will also create a simple init.sh script inside the controller VM that can be used to set up a test topology with a single VM connected to a private network.
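I can’t reproduce the exact contents of init.sh here, but a hypothetical version built from standard OpenStack CLI commands might look like this (all names, the image and the flavor are assumptions; the 10.0.0.0/24 subnet matches the 10.0.0.1 gateway used in the ping test later in the post):

```shell
# Hypothetical sketch of init.sh: one private network, a router
# and a single test VM (names/image/flavor are assumptions).
cat > /tmp/init.sh <<'EOF'
#!/bin/sh
openstack network create demo-net
openstack subnet create --network demo-net --subnet-range 10.0.0.0/24 demo-subnet
openstack router create demo-router
openstack router add subnet demo-router demo-subnet
openstack server create --image cirros --flavor m1.tiny --network demo-net test-vm
EOF
chmod +x /tmp/init.sh
```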
Of course, another option to build a lab is to follow the official Kolla documentation to create your own custom test environment.
Assuming the test topology was set up with no issues and the test VM can ping its default gateway
10.0.0.1, we can start configuring BGP VPNs. Unfortunately, we won’t be able to use the OpenStack BGPVPN API/CLI, since ODL requires an extra parameter (an L3 VNI for symmetric IRB) which is not available in the OpenStack BGPVPN API, but we can still configure everything directly through ODL’s API. My interface of choice is always REST, since it’s easier to build into a fully programmatic plugin, so even though all of the steps below can be accomplished through the karaf console CLI, I’ll be using cURL to send and retrieve data from ODL’s REST API.
1. Source admin credentials and setup ODL’s REST variables
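A sketch of this step, assuming the Kolla defaults (kolla-ansible generates admin-openrc.sh; 8181 and admin:admin are ODL’s defaults, while the controller IP is my placeholder):

```shell
# Source the admin credentials generated by kolla-ansible and define helper
# variables for ODL's REST API.
. /etc/kolla/admin-openrc.sh 2>/dev/null || true   # OpenStack admin credentials
ODL_URL="http://192.168.133.100:8181/restconf"     # assumed ODL controller address
ODL_AUTH="admin:admin"                             # ODL's default credentials
```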
2. Configure local BGP settings and BGP peering with DC-GW
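A sketch of what the BGP bootstrap might look like. The restconf path and attribute names are assumptions based on NetVirt’s ebgp model, and all IPs/ASNs are placeholders; the curl call is commented out since it needs a live ODL instance:

```shell
# Local BGP settings (AS number, router-id) plus the DC-GW as an eBGP neighbour.
cat > /tmp/bgp-config.json <<'EOF'
{
  "bgp": {
    "as-id": { "local-as": 100, "router-id": "169.254.1.1" },
    "neighbors": [ { "address": "169.254.1.2", "remote-as": 65000 } ]
  }
}
EOF
# curl -s -u "$ODL_AUTH" -X PUT -H "Content-Type: application/json" \
#      -d @/tmp/bgp-config.json "$ODL_URL/config/ebgp:bgp"
```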
3. Define L3VPN instance and associate it with OpenStack
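A sketch of the L3VPN definition via the neutronvpn:createL3VPN RPC (the RPC comes from ODL’s neutronvpn model; the UUIDs, RD/RT and L3 VNI values are placeholders). Note the l3vni attribute, which is exactly the parameter missing from the OpenStack BGPVPN API:

```shell
# Define the L3VPN instance with its RD, import/export RTs and L3 VNI.
cat > /tmp/l3vpn.json <<'EOF'
{
  "input": {
    "l3vpn": [
      {
        "id": "<vpn-uuid>",
        "name": "EVPN1",
        "route-distinguisher": ["100:1"],
        "import-RT": ["100:1"],
        "export-RT": ["100:1"],
        "l3vni": 5000,
        "tenant-id": "<admin-project-uuid>"
      }
    ]
  }
}
EOF
# curl -s -u "$ODL_AUTH" -X POST -H "Content-Type: application/json" \
#      -d @/tmp/l3vpn.json "$ODL_URL/operations/neutronvpn:createL3VPN"
```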
4. Inject prefixes into the L3VPN by associating the previously created L3VPN with the Neutron router
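A sketch of the router association call, which makes all of the router’s subnets part of the VPN (RPC name from ODL’s neutronvpn model; UUIDs are placeholders):

```shell
# Associate the L3VPN with the Neutron router so its subnets get advertised.
cat > /tmp/assoc-router.json <<'EOF'
{
  "input": {
    "vpn-id": "<vpn-uuid>",
    "router-id": "<neutron-router-uuid>"
  }
}
EOF
# curl -s -u "$ODL_AUTH" -X POST -H "Content-Type: application/json" \
#      -d @/tmp/assoc-router.json "$ODL_URL/operations/neutronvpn:associateRouter"
```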
5. Configure DC-GW VTEP IP
ODL cannot automatically extract the VTEP IP from updates received from the DC-GW, so we need to configure it explicitly through ODL’s API.
6. DC-GW configuration
That is all that needs to be configured on ODL. Although I would consider the DC-GW side to be outside of the scope of this post, for the sake of completeness I’m including the relevant configuration.
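I can’t reproduce the original listing verbatim, but a minimal interface-less RT5 sketch for IOS-XE would contain the following building blocks (VRF name, VNI, AS numbers and IPs are placeholders; whether the stitching keyword is required depends on the platform and release, so refer to the Cisco guide for the authoritative version):

```
! Interface-less IP-VRF: the L3 VNI maps straight to the VRF, no IRB interface
vrf definition EVPN1
 rd 100:1
 address-family ipv4
  route-target export 100:1 stitching
  route-target import 100:1 stitching
 exit-address-family
!
interface nve1
 no shutdown
 host-reachability protocol bgp
 source-interface Loopback0
 member vni 5000 vrf EVPN1
!
router bgp 65000
 neighbor 169.254.1.1 remote-as 100
 address-family l2vpn evpn
  neighbor 169.254.1.1 activate
 exit-address-family
 address-family ipv4 vrf EVPN1
  advertise l2vpn evpn
  redistribute connected
 exit-address-family
```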
For a detailed explanation of how EVPN RT5 is configured on a Cisco CSR, refer to the following guide.
There are several things that can be checked to verify that the DC-GW integration is working. One of the first steps would be to check whether the BGP session with the CSR is up. This can be done from the CSR side, however it’s also possible to check from the QBGP side by attaching to QBGP’s interactive shell on the controller node.
From here, we can check that the BGP session has been established.
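For example (the container name and the vtysh-style shell are assumptions; QBGP follows Quagga’s CLI conventions):

```
controller$ docker exec -it qbgp vtysh     # container name is an assumption
qbgp# show ip bgp summary                  # the DC-GW neighbour should show as Established
```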
We can also check the contents of the EVPN RIB compiled by QBGP.
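The command below follows the Quagga/FRR convention; the exact syntax in the EVPN fork may differ slightly:

```
qbgp# show bgp l2vpn evpn    # EVPN RIB, including type-5 routes from the DC-GW
```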
Finally, we can verify that the prefix
22.214.171.124/24 advertised from the DC-GW is being passed by QBGP and accepted by NetVirt’s FIB Manager.
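One way to do this is to dump NetVirt’s FIB over restconf and look for the prefix (the odl-fib:fibEntries path is an assumption based on NetVirt’s FIB manager model; substitute your controller IP and credentials):

```
controller$ curl -s -u admin:admin http://<odl-ip>:8181/restconf/config/odl-fib:fibEntries | python -m json.tool
```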
The last output confirms that the prefix is being received and accepted by ODL. A similar check can be done on the CSR side.
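On IOS-XE the equivalent checks would be (the VRF name is a placeholder):

```
csr# show bgp l2vpn evpn         ! EVPN table, including RT5 routes learned from QBGP
csr# show ip route vrf EVPN1     ! prefixes imported into the IP VRF
```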
This confirms that the control plane information has been successfully exchanged between NetVirt and Cisco CSR.
At the time of writing, there was an open bug in the ODL master branch that prevented the forwarding entries from being installed in the OVS datapath. Once the bug is fixed I will update this post with the dataplane verification, a.k.a. the ping test.
OpenDaylight is a pretty advanced OpenStack SDN platform. Its functionality includes clustering, site-to-site federation (without EVPN) and L2/L3 EVPN DC-GW integration for both IPv4 and IPv6. It is yet another example of how an open-source platform can match even the most advanced proprietary SDN solutions from incumbent vendors. This is all thanks to the companies involved in OpenDaylight development. I also want to say special thanks to Vyshakh Krishnan, Kiran N Upadhyaya and Dayavanti Gopal Kamath from Ericsson for helping me clear up some of the questions I posted on netvirt-dev mailing list.