EVPN has become one of the most popular and widely adopted technologies for managing overlay networks in the data centre over the past few years. One recurring topic I keep encountering is how to connect multiple data centres together in a secure, resilient and scalable manner, which is commonly known as Data Centre Interconnect (DCI).
Project: data centre redesign
I’m currently working with a customer on a redesign of the data centres they use to provide compute and storage services to their own customers. The design must securely segment multiple customers within a single data centre and stretch workloads across data centres for high-availability failover and scaling.
The design will use a folded 3-stage Clos, also known as a spine and leaf topology. The underlay will use eBGP and the overlay networks will use a VXLAN data plane with an EVPN control plane. Using BGP for both the underlay and the overlay simplifies the configuration and reduces the number of routing protocols that need to be troubleshot.
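To make the topology concrete, here is a minimal Python sketch of a folded 3-stage Clos in which every leaf connects to every spine. The device names and counts are hypothetical, and I am assuming a single link (and therefore a single eBGP underlay session) between each spine and leaf pair, which is not something the customer design specifies.

```python
from itertools import product


def clos_links(spines, leaves):
    """Return every spine-leaf link in a folded 3-stage Clos.

    Each leaf connects to each spine; leaves do not connect to other
    leaves and spines do not connect to other spines.
    """
    return list(product(spines, leaves))


# Hypothetical device names, for illustration only.
spines = ["spine1", "spine2"]
leaves = ["leaf1", "leaf2", "leaf3", "leaf4"]

links = clos_links(spines, leaves)

# Assumption: one eBGP underlay session per physical link.
underlay_sessions = len(links)
print(underlay_sessions)  # 8
```

The number of underlay sessions is simply spines multiplied by leaves, so adding a leaf adds one session per spine.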
There are three main DCI design goals for this project which are:
- Limit the blast radius of any failures to a single data centre.
- Control which overlay networks are advertised to other data centres.
- Support stretching of networks across different network vendors (single vendor per DC).
There are a few approaches to connecting EVPN data centres together and, in my view, they can be summarised as one of two design options:
- Single control plane
- Distributed control plane
Single control plane
The single EVPN control plane design is the most straightforward and is supported by all network vendors that implement EVPN. It essentially stretches the same fabric to your remote data centres by peering the EVPN address family between border leaf nodes. This requires the underlay routes to be advertised to all other data centres so that the leaf nodes can reach the remote VTEP loopbacks.
This option is likely to be fine for most environments and allows you to easily stretch overlay networks between data centres, as the EVPN routes are advertised to all leaf nodes across your data centres. It is the simplest, best-documented and best-supported approach to EVPN DCI.
However, because every node in the EVPN domain receives updates for hosts and networks from all other nodes, regardless of whether it participates in those networks, this option may not suit very large environments: the number of EVPN routes grows with every site and can eventually exhaust the leaf nodes’ routing tables.
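A rough back-of-the-envelope model shows why this matters. The sketch below assumes one EVPN MAC/IP (Type-2) route per host and ignores every other route type, and the host counts per site are invented for illustration, so treat the result as an order of magnitude rather than a sizing figure.

```python
def routes_per_leaf_single_domain(hosts_per_dc):
    """Estimate host routes held by every leaf in a single EVPN domain.

    Simplifying assumption: one Type-2 route per host. With a single
    control plane each leaf learns the hosts of every data centre,
    whether or not it participates in their networks.
    """
    return sum(hosts_per_dc.values())


# Hypothetical host counts per data centre.
hosts_per_dc = {"dc1": 20000, "dc2": 15000, "dc3": 10000}

total_routes = routes_per_leaf_single_domain(hosts_per_dc)
print(total_routes)  # 45000
```

Every additional data centre adds its full host count to every leaf in every other data centre, which is the scaling concern described above.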
Distributed control plane
This distributed EVPN control plane design involves breaking up the EVPN control planes and explicitly defining which overlay networks should be interconnected between domains.
Using this approach, we can reduce the number of EVPN routes that are advertised outside of a data centre, which reduces the volume of control plane updates and allows us to scale further. By breaking up the EVPN control plane into many smaller domains, we also limit the blast radius of any possible control plane failure. In this design, the border leaf nodes act as gateways for all overlay networks in a data centre, which removes the need for all VTEPs to be globally reachable.
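The gateway behaviour can be sketched in a few lines of Python. This is a conceptual model only, not any vendor's implementation: the `EvpnRoute` type, the `export_to_dci` function, the VNIs and the addresses are all hypothetical. It shows the two ideas from the paragraph above: only explicitly stretched networks leave the data centre, and remote sites see the border leaf as the next hop rather than the individual local VTEPs.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EvpnRoute:
    vni: int        # overlay network identifier
    mac: str        # host MAC address
    next_hop: str   # VTEP address that traffic is tunnelled to


def export_to_dci(routes, stretched_vnis, border_vtep):
    """Model a border leaf acting as a gateway between EVPN domains.

    Only routes in explicitly stretched VNIs are exported, and each is
    re-advertised with the border leaf as next hop, so the local leaf
    VTEPs never need to be reachable from other data centres.
    """
    return [
        replace(route, next_hop=border_vtep)
        for route in routes
        if route.vni in stretched_vnis
    ]


# Hypothetical routes learned inside one data centre.
local_routes = [
    EvpnRoute(vni=10010, mac="00:00:5e:00:53:01", next_hop="10.1.0.11"),
    EvpnRoute(vni=10010, mac="00:00:5e:00:53:02", next_hop="10.1.0.12"),
    EvpnRoute(vni=10020, mac="00:00:5e:00:53:03", next_hop="10.1.0.11"),
]

exported = export_to_dci(local_routes, stretched_vnis={10010}, border_vtep="10.1.0.1")
print(len(exported))                  # 2
print({r.next_hop for r in exported})  # {'10.1.0.1'}
```

VNI 10020 stays local to the data centre, and the two exported routes both point at the border leaf.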
This option, although appealing on paper, is generally regarded as more complicated and is still not supported by all network vendors. It is more commonly documented with a change of data plane, such as VXLAN within the DC and MPLS in the DCI, rather than VXLAN to VXLAN, which has traditionally not been possible due to split horizon. The feature is usually referred to as VXLAN to VXLAN stitching and, at the time of writing, Arista is the one vendor I know of that supports it (Multi-Domain EVPN using VTEP to VTEP bridging).
Organisations may not already have MPLS in the DCI, or may be unwilling to introduce it because of the cost and expertise required, so environments with IP-only connectivity between sites have been forced to stick with a single EVPN control plane design. Environments that have VPLS or other Ethernet WAN services do have the option of trunking decapsulated traffic across the DCI, but this adds configuration and complexity, as any inter-VLAN traffic requires either that both the source and destination VLANs be present at both data centres or that VRFs be peered over a separate VLAN.
Best DCI design for the customer
Now that we have discussed the two most common DCI design options we can refer back to the design goals for this customer project to decide the most appropriate design. Based on the three design goals, the DCI design which best matches these requirements is a distributed control plane.
A distributed control plane increases the number of failure domains, which meets the goal of limiting the blast radius. It also allows granular control over which overlay networks are advertised out of each DC, so that only the desired networks are stretched and unnecessary routing updates are reduced. Finally, a distributed control plane facilitates integration between mixed network vendors by allowing each vendor environment to operate as its own EVPN domain and to see all announcements from remote data centres summarised at the border nodes, without needing to be aware of the vendors used at remote data centres.
I hope this has been useful to understand the different DCI deployment designs and which one might be right for your environment. Do you want to know more about such a design or do you have related questions? Please get in touch with us!