Note: This post started as an answer I gave on the Cisco Support Forum. This version is slightly expanded with pictures and examples.
In this post I will examine the roles of three very important protocols that exist in the ACI environment.
I will explain:
- that IS-IS is the underlying routing protocol that is used by the leaves and spines to learn where they sit in the topology in relation to each other
- how Leaf switches use COOP to report local station information to the Spine (Oracle) switches
- how BGP and MP-BGP are used to redistribute routes from external sources to leaf switches.
Let me start with a picture. Imagine a simple two-leaf, two-spine topology with HostA attached to Leaf1 and HostB attached to Leaf2.
- Leaf1 has a VTEP address of 10.0.1.101
- Leaf2 has a VTEP address of 10.0.1.102
- Spine1 has a VTEP address of 10.0.1.201
- Spine2 has a VTEP address of 10.0.1.202
- HostA has a MAC address of A and an IP address of 192.168.1.1 and is attached to port 1/5 on Leaf1
- HostB has a MAC address of B and an IP address of 192.168.1.2 and is attached to port 1/6 on Leaf2
The leaves and spines will exchange IS-IS routing updates with each other so that Leaf1 sees that it has two equally good paths to reach Leaf2, and Leaf2 sees that it has two equally good paths to reach Leaf1.
Leaf1# show ip route vrf overlay-1 10.0.1.102
IP Route Table for VRF "overlay-1"
10.0.1.102/32, ubest/mbest: 2/0
*via 10.0.1.201, eth1/51.2, [115/3], 6d20h, isis-isis_infra, L1
*via 10.0.1.202, eth1/52.2, [115/3], 6d20h, isis-isis_infra, L1
For now, that’s all we need to know about IS-IS – it is the routing protocol used by the VTEPs to learn how to reach the other VTEPs.
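To make the "two equally good paths" idea concrete, here is a minimal Python sketch of the example topology. It is not how IS-IS is implemented: real IS-IS floods link-state information and runs an SPF calculation over link metrics, whereas this sketch simply assumes every fabric link has the same cost and counts hops. The names `LINKS`, `hop_counts` and `equal_cost_next_hops` are my own, purely for illustration.

```python
from collections import deque

# Fabric links from the example topology: every leaf connects to every spine.
LINKS = [
    ("Leaf1", "Spine1"), ("Leaf1", "Spine2"),
    ("Leaf2", "Spine1"), ("Leaf2", "Spine2"),
]


def build_adjacency(links):
    """Turn a list of bidirectional links into a neighbour map."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def hop_counts(adj, source):
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adj[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def equal_cost_next_hops(adj, source, target):
    """Neighbours of source that sit on a shortest path to target."""
    dist_to_target = hop_counts(adj, target)
    best = dist_to_target[source]
    return sorted(n for n in adj[source] if dist_to_target[n] == best - 1)


adj = build_adjacency(LINKS)
print(equal_cost_next_hops(adj, "Leaf1", "Leaf2"))  # ['Spine1', 'Spine2']
```

Both spines come back as next hops, which matches the two `*via` entries in the route table output above.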
Now think about the hosts.
This is where COOP comes in.
Suppose Leaf1 learns about HostA because HostA sent an ARP request seeking the MAC address of 192.168.1.2 (which you know is HostB, but that’s not relevant at the moment). Leaf1 looks at that ARP request and, just like a normal switch, learns that MAC A is present on port 1/5. But the leaf is a bit more clever than that: it looks INSIDE the payload of the ARP packet, learns that HostA also has an IP address of 192.168.1.1, and records all this information in its Local Station Table.
Leaf1# show endpoint interface ethernet 1/5
VLAN/Domain     Encap VLAN    MAC/IP Address        Interface
65              vlan-2051     a036.9f86.e94e L      eth1/5
Tenant1:VRF1    vlan-2051     192.168.1.1 L         eth1/5
AND THEN Leaf1 reports this information to one of the spine switches (chosen at random) using the Council Of Oracles Protocol (COOP). The spine switch (oracle) that was chosen then relays this information to all the other spines (oracles) so that every spine (oracle) has a complete record of every endpoint in the system.
The spines (oracles) record the information learned via COOP in the Global Proxy Table, and this information is used to resolve unknown destination MAC/IP addresses when traffic is sent to the Proxy address.
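The learn-then-report flow described above can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the class names (`Endpoint`, `Spine`, `Leaf`), the method names and the table layout are all hypothetical, and nothing here reflects the real COOP message format. It only shows the sequence: the leaf records the endpoint locally, reports it to one randomly chosen spine, and that spine relays it to the others.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    mac: str
    ip: str
    vtep: str   # VTEP of the leaf that learned the endpoint
    port: str   # local port on that leaf


class Spine:
    """A spine acting as an oracle (hypothetical model, not real COOP)."""

    def __init__(self, name):
        self.name = name
        self.peers = []                # the other spines (oracles)
        self.global_proxy_table = {}   # MAC or IP -> Endpoint

    def store(self, ep):
        self.global_proxy_table[ep.mac] = ep
        self.global_proxy_table[ep.ip] = ep

    def receive_report(self, ep):
        # Store the report, then relay it so every oracle holds the same record.
        self.store(ep)
        for peer in self.peers:
            peer.store(ep)


class Leaf:
    def __init__(self, name, vtep, spines):
        self.name = name
        self.vtep = vtep
        self.spines = spines
        self.local_station_table = {}  # MAC or IP -> Endpoint

    def learn_from_arp(self, port, sender_mac, sender_ip):
        # The source MAC gives the port binding; the ARP payload gives the IP.
        ep = Endpoint(sender_mac, sender_ip, self.vtep, port)
        self.local_station_table[sender_mac] = ep
        self.local_station_table[sender_ip] = ep
        # Report to one spine chosen at random.
        random.choice(self.spines).receive_report(ep)
        return ep


spine1, spine2 = Spine("Spine1"), Spine("Spine2")
spine1.peers, spine2.peers = [spine2], [spine1]
leaf1 = Leaf("Leaf1", "10.0.1.101", [spine1, spine2])
leaf1.learn_from_arp("eth1/5", "a036.9f86.e94e", "192.168.1.1")

for spine in (spine1, spine2):
    print(spine.name, spine.global_proxy_table["192.168.1.1"].vtep)
```

Whichever spine receives the report, both end up knowing that 192.168.1.1 lives behind VTEP 10.0.1.101.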
Note that all of this happens without anything to do with BGP.
But to round off the COOP story, we can assume that at some stage Leaf2 (a citizen) will also learn HostB’s MAC and IP and will likewise report this information to one of the spines (oracles), chosen at random, using COOP.
Spine1# show coop internal info repo ep | egrep -i "mac|real|-"
EP mac : A0:36:9F:86:E9:4E
MAC Tunnel : 10.0.1.101
Real IPv4 EP : 192.168.1.1
EP mac : A0:36:9F:61:88:FD
MAC Tunnel : 10.0.1.102
Real IPv4 EP : 192.168.1.2
So COOP is used solely for the purpose of distributing endpoint information to spine switches (oracles). As far as I know, spine switches never use COOP to distribute end host information to leaf switches.
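As an illustration of how the spine's records can answer the question "which leaf owns this address?", the sketch below parses the filtered output shown above into a simple lookup table. The parsing rules are my own assumption based only on the three field labels visible in that output (`EP mac`, `MAC Tunnel`, `Real IPv4 EP`); the full command output contains more fields, and the function name `build_proxy_table` is hypothetical.

```python
COOP_OUTPUT = """\
EP mac : A0:36:9F:86:E9:4E
MAC Tunnel : 10.0.1.101
Real IPv4 EP : 192.168.1.1
EP mac : A0:36:9F:61:88:FD
MAC Tunnel : 10.0.1.102
Real IPv4 EP : 192.168.1.2
"""


def build_proxy_table(text):
    """Map each endpoint MAC and IP to the VTEP (tunnel) of its leaf."""
    table = {}
    mac = tunnel = None
    for line in text.splitlines():
        # Split on the FIRST colon only, because MAC addresses contain colons.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "EP mac":
            mac, tunnel = value, None      # start of a new endpoint record
        elif key == "MAC Tunnel" and mac is not None:
            tunnel = value
            table[mac] = tunnel
        elif key == "Real IPv4 EP" and tunnel is not None:
            table[value] = tunnel
    return table


proxy_table = build_proxy_table(COOP_OUTPUT)
print(proxy_table.get("192.168.1.2"))   # 10.0.1.102
print(proxy_table.get("192.168.1.99"))  # None: no such endpoint has been reported
```

A lookup for HostB's IP returns Leaf2's VTEP, which is the kind of answer a spine needs when traffic for an unknown destination is sent to the Proxy address.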
So where does BGP fit in?
BGP is not needed until an external router is connected. So now imagine that Leaf2 has had a router connected and has learned some routes from that external router for a particular VRF for a particular Tenant.
How can Leaf2 pass this information on to Leaf1, where HostA is trying to send packets to one of these external networks? For Leaf2 to be able to pass routing information on to Leaf1 and keep that information exclusive to the same VRF, we need a routing protocol that is capable of exchanging routing information for multiple VRFs across an underlay network.
That is exactly what MP-BGP was invented for: carrying routing information for multiple VRFs across MPLS underlay networks. In the case of ACI, BGP is configured by choosing an Autonomous System number and nominating one of the spine switches to be a route reflector. Beyond that, MP-BGP is self-configuring; you don’t need to do anything else to make it work!
(Although you will have to configure your Tenant to exchange routes with the external router.)
Leaf1# show ip route vrf Tenant1:VRF1
192.168.1.0/24, ubest/mbest: 1/0, attached, direct, pervasive
*via 10.0.1.102%overlay-1, [1/0], 04:43:32, static, tag 4294967295
192.168.1.10/32, ubest/mbest: 1/0, attached, pervasive
*via 192.168.1.10, vlan25, [1/0], 03:52:23, local, local
220.127.116.11/8, ubest/mbest: 1/0
*via 10.0.1.102%overlay-1, [200/5], 00:11:41, bgp-1, internal, tag 1
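To show why a VRF-aware protocol matters here, the following sketch models a spine route reflector passing routes from Leaf2 to Leaf1 while keeping each route inside its own VRF. It is a simplification under my own assumptions: real MP-BGP keeps VRFs apart with route distinguishers and route targets, while this sketch just uses the VRF name as the key. The second VRF (`Tenant2:VRF1`) and both prefixes are made up for illustration, as are the class and method names.

```python
from collections import defaultdict


class Leaf:
    def __init__(self, name, vtep, vrfs):
        self.name = name
        self.vtep = vtep
        self.vrfs = set(vrfs)              # VRFs deployed on this leaf
        self.routes = defaultdict(dict)    # vrf -> {prefix: next-hop VTEP}

    def receive(self, vrf, prefix, next_hop):
        # Import the route only if this leaf has the VRF it belongs to.
        if vrf in self.vrfs:
            self.routes[vrf][prefix] = next_hop


class RouteReflector:
    """A spine nominated as route reflector (hypothetical model)."""

    def __init__(self):
        self.clients = []

    def advertise(self, sender, vrf, prefix):
        # Reflect the route to every other client, with the sender's VTEP as next hop.
        for client in self.clients:
            if client is not sender:
                client.receive(vrf, prefix, sender.vtep)


leaf1 = Leaf("Leaf1", "10.0.1.101", ["Tenant1:VRF1"])
leaf2 = Leaf("Leaf2", "10.0.1.102", ["Tenant1:VRF1", "Tenant2:VRF1"])
spine_rr = RouteReflector()
spine_rr.clients = [leaf1, leaf2]

# Leaf2 learned these (hypothetical) prefixes from external routers in two VRFs.
spine_rr.advertise(leaf2, "Tenant1:VRF1", "172.16.0.0/16")
spine_rr.advertise(leaf2, "Tenant2:VRF1", "10.99.0.0/16")

print(dict(leaf1.routes))
# {'Tenant1:VRF1': {'172.16.0.0/16': '10.0.1.102'}}
```

Leaf1 ends up with only the Tenant1:VRF1 route, pointing at Leaf2's VTEP across the overlay, in the same spirit as the `bgp-1, internal` entry in the route table above.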
aka Chris Welsh