ACI Version mismatch Alert. Don’t use v5 on APIC and v14 on Leaves


No Problem

First of all – if you follow best practices, THERE IS NO PROBLEM

This problem I am about to describe is NOT a deficiency in the Cisco software, just an incompatibility between versions that you might not notice.

The Problem

If you are stuck with some first-generation switches in your ACI fabric, you might be tempted to upgrade your APIC to version 5.x – maybe even attempt to upgrade your leaf switches to the companion v15.x.

But of course, the first-generation switches (that DON’T have a -EX or -FX or -FX2  at the end of the model number) don’t support version 15.x firmware. But you knew that already from reading the release notes right!

Now if you DO decide to ignore my advice, then most things may well continue as normal. But I accidentally discovered a corner case that turns a filter based on port 22 into a filter based on unspecified. (=all traffic)

So, any contract that has a filter based on port 22, when pushed to the switches is transformed into a filter on unspecified. I.E. ALL TRAFFIC.

Now let me clarify “when pushed to the switches

Any EXISTING contracts and filters (for port 22) for existing stable EPGs will continue to work.

But if you create a filter for port 22 and use it or provide/consume a contract to an EPG using a filter on port 22, or create a new attachment on a 1st gen switch that causes policy for the filter to be pushed, this is what will happen!

Let’s say you create a filter called MgmtServices_Fltr and add two entries. One for port 22 and one for port 23 (Destination ports of course)

Note that the GUI show ssh rather than port 22 which you entered when you created the filter.  This fact is indeed the crux of the problem.

Now say you create a contract called MgmtServices_Ct, and allocate the MgmtServices_Fltr, to the contract.

Have the contract Provided/Consumed by two EPGs that have endpoints on one of your 1st gen switches.

Check out the MgmtServices_Fltr, in the object browser to learn the fwdId value (you’ll need this later)

Now check the entries of the filter with the ID you just determined on the Gen1 switch.

apic1# fabric 2201 show zoning-filter filter 161
----------------------------------------------------------------
 Node 2201 (Leaf2201)
----------------------------------------------------------------
+----------+-------+--------+-------------+------+-------------+----------+-------------+-------------+-------------+-------------+-------+-------------+-------------+----------+
| FilterId |  Name | EtherT |    ArpOpc   | Prot | ApplyToFrag | Stateful |  SFromPort  |   SToPort   |  DFromPort  |   DToPort   |  Prio |   Icmpv4T   |   Icmpv6T   | TcpRules |
+----------+-------+--------+-------------+------+-------------+----------+-------------+-------------+-------------+-------------+-------+-------------+-------------+----------+
|   161    | 161_1 |   ip   | unspecified | tcp  |      no     |   yes    | unspecified | unspecified | unspecified | unspecified | proto | unspecified | unspecified |          |
|   161    | 161_0 |   ip   | unspecified | tcp  |      no     |   yes    | unspecified | unspecified |      23     |      23     | dport | unspecified | unspecified |          |
+----------+-------+--------+-------------+------+-------------+----------+-------------+-------------+-------------+-------------+-------+-------------+-------------+----------+

WOW – your port 22 filter has been magically transformed to allow all traffic!

So what’s going on?

To understand what the problem is, you’ll need to look at one the changes made to the APIC GUI between v4 and v5.  It’s not listed in the Release Notes (although given the consequences, it should be.)

Start with a visit to https://developer.cisco.com/site/apic-mim-ref-api/ and check out the details for the object vz:Entry for APIC version 4.2. Or just trust that I have it right below.

Then check out the same thing for v5.x (Note: At the time of writing, the https://developer.cisco.com/site/apic-mim-ref-api/ v5.0(1) Model did NOT reflect what I found on a real APIC, as shown below from v5.0(2h) – so the change may have come between v5.0(1) and v5.0(2))

I think you can spot the difference. I’ve made it pretty obvious.

What you may not have realised is that when the filter information gets pushed to the leaves, it is the textual Constant value (i.e., the ssh) that gets pushed in the filter, rather than the numeric value (stupid idea in my opinion, but I didn’t write the code so my opinion doesn’t count)

When the switches still running v14 (the switch equivalent of APIC v4) code see the textual ssh, they look up the list of constants from the first list above and don’t find it, so use the default instead.

Conclusion

This is a bad thing. This will happen again if there is ever another port added to the list of constants. Cisco should do something about it.

What should Cisco do?

The way I see it, Cisco should do both of these things to avoid further problems in the future.

  1. Have the APIC always send filters as port numbers. Why it is any different I’ll never understand.
  2. Not have the default as unspecified(0) – instead make it 65535 – at least that would change the filter to allow only one port through.

Side Issue

I first discussed this in a Facebook post where Daniel Pita picked up an error in the GUI related to this change (and had it filed as bug CSCvv49124 – visible only to internals).  If you try to edit the filter later in the filter view, you see red boxes around the letters SSH, and if you try to edit it and select SSH from the drop down, it won’t let you!

So, I hope I save someone from grief with this post, and maybe even spur Cisco on to improving their code.

RedNectar

And thanks to Daniel for his help. You should check his blog

About RedNectar Chris Welsh

Professional IT Instructor. All things TCP/IP, Cisco or Data Centre
This entry was posted in ACI, Cisco, Data Center, Data Centre. Bookmark the permalink.

2 Responses to ACI Version mismatch Alert. Don’t use v5 on APIC and v14 on Leaves

  1. Roger says:

    Hi Chris,

    I am running a home lab with 2x 9336PQ Spines, 2x 93108TC-EX leafs, and APIC-L2 on a Cisco UCS C220 M2 server. I’d like to run everything on ACI 5 – but as noted the PQ spine isnt supported – can only do 4. I’m thinking that if APIC and Leafs can get to 5x that the spines **shouldn’t** be an issue? I don’t plan on running Multi-Site or doing any L3Outs off the Spines.

    • Hi Roger,
      If it’s a home lab I think you should give it a try – I wouldn’t recommend it in a production network. You may get some weird errors that will be a challenge to debug, but that’s what home networks are for. But you can be pretty sure that if it works on your v5 home network, it would also work in production (no guarantees of course). And good luck.

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.