Cisco ACI Best Practices Quick Summary
Where
2.2(2q): Fabric > Access Policies > Global Policies > Fabric Wide Setting Policy
3.0(2) – latest: System > System Settings > Fabric Wide Setting
Options/Notes
This works only on -EX, -FX, or later leaf nodes. For older leaf node models, enable
Limit IP Learning to Subnet on all bridge domains instead.
Endpoint IP Aging
By default, each endpoint (one MAC address and one or more IP addresses) has only one aging timer,
which is called the endpoint retention timer. IP Aging enables each IP address to maintain its own timer so
that it can age out individually. Without this feature, as long as a MAC address remains active in
the fabric, every IP address learned for that MAC address remains learned, even if that IP address is
no longer originating traffic.
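The difference between the single endpoint retention timer and per-IP aging can be illustrated with a small toy model. This is only a sketch of the behavior described above, not how the fabric implements it: the `Endpoint` class, the `surviving_ips` function, and the 900-second retention value are all hypothetical, and the MAC address is assumed to be active whenever any of its IP addresses sources traffic.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Endpoint:
    """Toy model of one endpoint: a MAC address plus the IPs learned for it."""
    mac: str
    # Maps each IP address to the time (in seconds) it last sourced traffic.
    ip_last_seen: dict[str, float] = field(default_factory=dict)

    def mac_last_seen(self) -> float:
        # Assumption: the MAC counts as active whenever any of its IPs is active.
        return max(self.ip_last_seen.values())


def surviving_ips(ep: Endpoint, now: float, retention: float, ip_aging: bool) -> set[str]:
    """Return the IP addresses still learned at time `now`."""
    if now - ep.mac_last_seen() > retention:
        return set()  # the whole endpoint has aged out
    if not ip_aging:
        # One timer for the endpoint: every IP stays while the MAC is active.
        return set(ep.ip_last_seen)
    # Per-IP timers: each IP ages out on its own.
    return {ip for ip, seen in ep.ip_last_seen.items() if now - seen <= retention}


ep = Endpoint("00:00:00:00:00:01", {"10.0.0.1": 1000.0, "10.0.0.2": 0.0})
print(surviving_ips(ep, now=1000.0, retention=900.0, ip_aging=False))
print(surviving_ips(ep, now=1000.0, retention=900.0, ip_aging=True))
```

With IP Aging off, the stale address 10.0.0.2 stays learned because the MAC address is still active; with IP Aging on, only 10.0.0.1 remains.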
Where
2.1(1) – 2.3(1): Fabric > Access Policies > Global Policies > IP Aging Policy
3.0(1) – latest: System > System Settings > Endpoint Controls > IP Aging
The best practice is to enable this option (potentially also with "Enable MCP PDU per VLAN") on leaf node
ports that are connected to external Layer 2 networks that may introduce loops.
Where
1.1(1) – 3.1(2): Fabric > Access Policies > Global Policies > MCP Instance Policy default
3.2(1) – latest: Fabric > Access Policies > Policies > Global > MCP Instance Policy default
Options/Notes
The “Enable MCP PDU per VLAN” option (available after 2.0(2)) enables MCP to send packets on a
per-VLAN basis. Otherwise, these packets will only be sent on untagged VLANs and loops will be
detected only on those VLANs. Per VLAN MCP has a scalability limit of 256 VLANs per interface.
Cisco ACI has a per leaf node scalability limit of 2,000 logical ports (VLANs x ports).
If your scale might exceed these limits, be cautious about enabling MCP,
especially per-VLAN MCP, because handling the MCP PDUs per VLAN can be CPU intensive.
See the Verified Scalability Guide for up-to-date scalability numbers for each firmware version.
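The two limits quoted above lend themselves to a quick arithmetic check before enabling per-VLAN MCP. The following is a minimal sketch that assumes you already have a per-port VLAN count for a leaf node; the function name, the input format, and the port names are hypothetical, and the constants should be replaced with the numbers from the Verified Scalability Guide for your release.

```python
MAX_VLANS_PER_INTERFACE = 256      # per-VLAN MCP limit quoted in the text
MAX_LOGICAL_PORTS_PER_LEAF = 2000  # logical ports (VLANs x ports) quoted in the text


def mcp_scale_report(vlans_per_port: dict) -> dict:
    """Summarize how one leaf node compares with the quoted limits.

    `vlans_per_port` maps a port name to the number of VLANs deployed on it.
    """
    # Each (port, VLAN) pair counts as one logical port.
    logical_ports = sum(vlans_per_port.values())
    ports_over = sorted(
        port for port, count in vlans_per_port.items()
        if count > MAX_VLANS_PER_INTERFACE
    )
    return {
        "logical_ports": logical_ports,
        "within_leaf_limit": logical_ports <= MAX_LOGICAL_PORTS_PER_LEAF,
        "ports_over_vlan_limit": ports_over,
    }


print(mcp_scale_report({"eth1/1": 300, "eth1/2": 100}))
```

In this example the leaf node is within the logical port limit, but eth1/1 carries more VLANs than per-VLAN MCP supports on a single interface.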
Rogue Endpoint Control identifies an endpoint (MAC/IP address) as rogue when the same endpoint is
learned on different interfaces multiple times within the configured interval. The misbehaving rogue
endpoint is pinned to the interface on which it was last learned to prevent further moves, and it will
be deleted after the configured hold interval. This protects the Cisco ACI fabric from constantly having to
update the devices in the fabric regarding the new endpoint location, allowing for a more stable Cisco ACI
environment.
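The detection logic described above can be sketched as a sliding-window move counter. This is an illustration under stated assumptions, not the fabric's implementation: the `RogueDetector` class, its `interval` and `max_moves` parameters, and the values used below are hypothetical, and deletion after the hold interval is left out for brevity.

```python
from collections import deque


class RogueDetector:
    """Toy detector: too many moves inside `interval` seconds marks an endpoint rogue."""

    def __init__(self, interval: float, max_moves: int):
        self.interval = interval
        self.max_moves = max_moves
        self._moves = {}     # endpoint -> deque of recent move timestamps
        self._location = {}  # endpoint -> interface where it was last learned
        self.rogue = {}      # endpoint -> interface it is pinned to

    def learn(self, endpoint: str, interface: str, now: float) -> bool:
        """Record a learning event; return True if the endpoint is rogue."""
        if endpoint in self.rogue:
            return True  # pinned: further moves are ignored
        previous = self._location.get(endpoint)
        self._location[endpoint] = interface
        if previous is None or previous == interface:
            return False  # first learn or a refresh, not a move
        moves = self._moves.setdefault(endpoint, deque())
        moves.append(now)
        # Drop moves that fall outside the detection interval.
        while moves and now - moves[0] > self.interval:
            moves.popleft()
        if len(moves) >= self.max_moves:
            self.rogue[endpoint] = interface  # pin to the last-learned interface
            return True
        return False


detector = RogueDetector(interval=60.0, max_moves=3)
for t, port in enumerate(["eth1/1", "eth1/2", "eth1/1", "eth1/2"]):
    print(t, port, detector.learn("00:00:00:00:00:aa", port, float(t)))
```

Because the check is keyed on the endpoint, a single flapping MAC or IP address is isolated without affecting other endpoints on the same port or bridge domain.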
Both options also raise a fault, which can be forwarded as a syslog message or SNMP trap, if configured.
The best practice is to enable Rogue Endpoint Control, which acts per endpoint instead of per port or
bridge domain as with EP Loop Protection.
3.0(1) – latest: System > System Settings > Endpoint Controls > Ep Loop Protection
Options/Notes
When Rogue Endpoint Control is enabled, EP Loop Protection does not take effect. Choose one
or the other after understanding the pros and cons of each option to mitigate the impact of loops.
See the Cisco ACI Design Guide and Cisco ACI Endpoint Learning for details.
When enabling Rogue Endpoint Control or EP Loop Protection in an existing fabric, ensure that
no loops or flaps are currently occurring in the fabric. Otherwise, the error actions will take
place immediately.
The best practice is not to enable this option when the default gateway for endpoints is not the bridge
domain SVI.
Where
Tenant > Networking > Bridge Domains > Policy > L3 Configurations
Options/Notes
When the default gateway for endpoints is not the bridge domain switch virtual interface (SVI), the
bridge domain only does switching. If Unicast Routing is enabled in this case and IP addresses are
learned on the bridge domain, this configuration may lead to packet forwarding issues. See the Cisco
ACI Endpoint Learning white paper for details.
L2 Unknown Unicast
L2 Unknown Unicast determines whether the bridge domain floods packets that are destined to an
unknown MAC address (Flood) or sends them to a spine node for a COOP database lookup (Hardware
Proxy).
The best practice is to set this option to Flood in either of the following scenarios:
Where
Tenant > Networking > Bridge Domains > Policy > General
ARP Flooding
ARP Flooding determines whether the bridge domain floods ARP requests all the time (Enabled) or
looks up the target IP address in the ARP header and performs unicast routing (Disabled).
The best practice is to set this option to Enabled when there are clustered servers, firewalls, or load
balancers so that GARP is flooded.
Where
Tenant > Networking > Bridge Domains > Policy > General
Options/Notes
See the Cisco ACI Design Guide for details.
QoS Settings
DSCP Translation
DSCP Translation translates Cisco ACI QoS classes into DSCP values in the outer IP header of VXLAN
packets so that the classes are preserved when traffic traverses pods or sites. Without
this option, Cisco ACI QoS classes are carried as CoS in the outer Dot1Q header, which has a
higher risk of being changed or removed in the IPN/ISN.
The best practice is to enable DSCP Translation and assign DSCP classes that are not used in IPN/ISN to
Cisco ACI QoS classes, which ensures that those DSCP values are not overwritten by IPN/ISN.
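Before applying a translation policy, it can help to compare the planned mapping against the DSCP values already in use in the IPN/ISN. The check below is a minimal sketch: the function name, the QoS class labels, and the sample DSCP values are hypothetical, and the list of DSCP values used in the IPN/ISN is assumed to come from your own inventory.

```python
def check_dscp_mapping(aci_class_to_dscp: dict, ipn_used_dscp: set) -> list:
    """Return a list of problems found in a planned QoS-class-to-DSCP mapping."""
    problems = []
    for qos_class, dscp in sorted(aci_class_to_dscp.items()):
        if not 0 <= dscp <= 63:
            # DSCP is a 6-bit field, so valid values are 0 through 63.
            problems.append(f"{qos_class}: DSCP {dscp} is outside 0-63")
        elif dscp in ipn_used_dscp:
            # The IPN/ISN might rewrite or reclassify this value.
            problems.append(f"{qos_class}: DSCP {dscp} is already used in the IPN/ISN")
    return problems


planned = {"level1": 46, "level2": 18}  # hypothetical class labels and values
print(check_dscp_mapping(planned, ipn_used_dscp={0, 46}))
```

An empty result means that none of the planned values collide with values the IPN/ISN already uses.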
Where
Tenant > infra > Policies > Protocol > DSCP class-CoS translation policy for L3 traffic
Options/Notes
DSCP Translation and Preserve CoS cannot be used at the same time.
Preserve CoS also translates Cisco ACI QoS classes, along with the original CoS from the ingress
leaf node, into DSCP. However, Preserve CoS uses non-configurable internal DSCP mappings,
so you cannot choose which DSCP values are trusted and left untouched in the IPN/ISN.
DSCP Translation lets you map DSCP values of your choice to Cisco ACI QoS classes,
with the trade-off of not being able to preserve the original CoS.
If your Cisco ACI fabric uses neither Cisco ACI Multi-Pod nor Cisco ACI Multi-Site, you may
use Preserve CoS.
The best practice is to enable this option with zero active fabric ports as the threshold.
Where
1.2(2) – 3.2(1): Fabric > Access Policies > Policies > Global > Port Tracking
Options/Notes
If all of your non-Cisco ACI devices are connected to two or more leaf nodes for redundancy with
an appropriate failover mechanism, such as vPC, you may configure more than zero as the
threshold.
If all of your APICs are connected to two leaf nodes for redundancy, you may enable the Include
APIC ports option.
Where
Prior to 4.0(1): Admin > AAA > AES Encryption Passphrase and Keys for Config Export (and Import)
From 4.0(1): System > System Settings > Global AES Passphrase Encryption Settings
Options/Notes
If you forget the passphrase, reconfigure AES Encryption with a new passphrase and export the
configuration again.
VLAN Pool
A VLAN pool determines which VXLAN ID (VNID) is assigned to each VLAN. For example, VLAN 10 from VLAN
pool A and VLAN 10 from VLAN pool B are assigned different VNIDs. AEPs represent a group of interfaces
on Cisco ACI switches. The Cisco APICs decide which VLAN pool to use for which VLAN on which interface
based on domains, such as a physical domain, that tie a VLAN pool to AEPs.
The best practice is to configure the minimum number of VLAN pools to avoid overlapping VLAN ranges.
Options/Notes
When multiple VLAN pools with overlapping VLAN ID ranges are tied to the same AEP, VNID
assignments may be nondeterministic and cause various issues with endpoint learning, STP BPDU
flooding, and so on.
Ultimately, one or two VLAN pools for the entire fabric may be enough if you do not need features
that require different VLAN pools, such as per-port-VLAN.
Consider Enforce EPG VLAN Validation under System > System Settings > Fabric Wide Setting
(available starting in the 3.2(6) release), which prevents two domains containing overlapping VLAN
pools from being associated to the same EPG. If you are familiar with the VNID assignment logic
and need to use overlapping VLAN pools on purpose, you do not need this validation. Otherwise,
we recommend that you enable this option.
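Overlapping ranges are easy to miss once several pools exist, so a quick audit can be useful. The following is a minimal sketch that assumes you have already exported each pool's encapsulation blocks as inclusive (start, end) VLAN ID pairs; the function name, the pool names, and the input format are hypothetical.

```python
from itertools import combinations


def overlapping_vlans(pools: dict) -> dict:
    """Find VLAN ID ranges shared by any two pools.

    `pools` maps a pool name to a list of inclusive (start, end) VLAN ranges.
    Returns {(pool_a, pool_b): [(start, end), ...]} for pools that overlap.
    """
    overlaps = {}
    for (name_a, ranges_a), (name_b, ranges_b) in combinations(sorted(pools.items()), 2):
        shared = []
        for lo_a, hi_a in ranges_a:
            for lo_b, hi_b in ranges_b:
                # Two inclusive ranges intersect when max(starts) <= min(ends).
                lo, hi = max(lo_a, lo_b), min(hi_a, hi_b)
                if lo <= hi:
                    shared.append((lo, hi))
        if shared:
            overlaps[(name_a, name_b)] = sorted(shared)
    return overlaps


pools = {"poolA": [(10, 20)], "poolB": [(15, 30)], "poolC": [(100, 200)]}
print(overlapping_vlans(pools))  # {('poolA', 'poolB'): [(15, 20)]}
```

Any pair reported here deserves a closer look if both pools can reach the same AEP or EPG through their domains.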
The best practice is to set this metric to 62 or lower as opposed to the maximum 63, which is the default.
Where
Prior to 5.0(1): Fabric > Fabric Policies > Policies > Pod > ISIS Policy Default > ISIS metric for
redistributed routes
From 5.0(1): System > System Settings > ISIS Policy > ISIS metric for redistributed routes
Options/Notes
When a spine node reboots or newly joins a fabric, the node tries to advertise ISIS redistributed
routes with a higher metric until it stabilizes and completes the policy download from the
Cisco APIC. This is known as "overload mode." If the ISIS Redistribution Metric is kept at
the default value of 63, which is the maximum, the overload functionality is ineffective because the
metric for overload and non-overload is the same. This can result in longer convergence
times after a spine node reboots in a Cisco ACI Multi-Pod setup. With a lower value, leaf nodes
can prefer other stable spine nodes to reach the other pods.
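The effect of the metric on spine selection can be shown with a small comparison. This sketch assumes, based on the description above, that a spine node in overload mode advertises the maximum metric of 63 and that a leaf node prefers the lowest advertised metric; the function and spine names are hypothetical.

```python
OVERLOAD_METRIC = 63  # maximum metric, assumed to be advertised in overload mode


def preferred_spines(spine_overloaded: dict, redistribution_metric: int) -> list:
    """Return the spine nodes a leaf would prefer (lowest advertised metric)."""
    advertised = {
        spine: OVERLOAD_METRIC if overloaded else redistribution_metric
        for spine, overloaded in spine_overloaded.items()
    }
    best = min(advertised.values())
    return sorted(spine for spine, metric in advertised.items() if metric == best)


spines = {"spine1": True, "spine2": False}  # spine1 has just rebooted
print(preferred_spines(spines, redistribution_metric=63))  # ['spine1', 'spine2']
print(preferred_spines(spines, redistribution_metric=32))  # ['spine2']
```

With the default of 63, the rebooting spine node is indistinguishable from the stable one; with any lower value, only the stable spine node is preferred.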
COOP Group
Setting COOP Group to Strict enables the Cisco ACI switch nodes to use MD5 authentication for all COOP
communication, which ensures that switch nodes exchange COOP database information only
with other switches in the same fabric.
Options/Notes
The MD5 token is automatically updated every hour by the Cisco APICs and is sent to the switches
managed by the Cisco APICs.