F5 Failover in AWS

F5 requires IMDSv1 to initiate failover between two F5 devices. IMDSv1 is susceptible to SSRF vulnerabilities, as indicated in the AWS documentation. If IMDSv1 is disabled in the AWS environment for security reasons, F5 failover will not be seamless, and the F5 logs will contain errors like this:

err logger[15542]: /usr/libexec/aws/aws-failover-tgactive.sh (traffic-group-1): Instance sanity check failed with error:

F5 is tracking support for IMDSv2 in AWS internally under ID 968657.
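
To see which metadata mode an instance is in before testing failover, a small probe can help. The following is a minimal sketch and not F5 tooling: it sends a token-less (IMDSv1-style) request to the standard link-local metadata address and interprets the HTTP status. When IMDSv2 is enforced, AWS answers token-less requests with HTTP 401. The function names and return labels are hypothetical, and the sketch assumes Python 3 is available on an EC2 instance in the same environment.

```python
import urllib.error
import urllib.request

# Standard EC2 link-local metadata endpoint.
IMDS_URL = "http://169.254.169.254/latest/meta-data/instance-id"


def classify_imds_status(status):
    """Interpret the HTTP status of a token-less (IMDSv1-style) request."""
    if status == 200:
        return "imdsv1-allowed"
    if status == 401:
        return "imdsv2-required"
    return "unknown"


def probe_imdsv1(url=IMDS_URL, timeout=2.0):
    """Send one token-less GET (hypothetical helper, run it on the instance)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return classify_imds_status(resp.status)
    except urllib.error.HTTPError as err:
        # Non-2xx answers (such as 401) arrive as HTTPError.
        return classify_imds_status(err.code)
    except OSError:
        # Covers URLError, timeouts, and unreachable networks.
        return "unreachable"
```

If probe_imdsv1() reports "imdsv2-required" on an instance configured like the F5 devices, the failover errors shown above are to be expected.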

F5 Code Upgrade Steps

This is a rough template of F5 code upgrade steps that may help with your maintenance work.

  1. Before performing any F5 code upgrade, make sure that the “Service Check Date” on the device is AFTER the License Check Date for the new code version, as listed in SOL7727.
  2. Upload the new code to the partition that you prefer on the F5.
  3. Copy the configuration to the new code version's location with cpcfg. Example: cpcfg HD1.2

    Although “cpcfg HD1.x” has worked most of the time, I would recommend backing up the .UCS file to a remote location and also saving a copy in “/shared/tmp/<UCS File>”. After saving the UCS file in “/shared/tmp/”, you can use “load /sys ucs <path/to/UCS> no-license” to load the configuration, as noted in SOL12880.

  4. Reboot. This will take about 5-10 minutes for hotfix updates and about 15-20 minutes when migrating to a new major code version.

The recommended maintenance window is about 1 hour. This could change depending on any application-level testing that you would like to include in your maintenance window.
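
For a change ticket, it can help to write out the exact commands ahead of time. The sketch below only assembles command strings for the steps above and does not run anything on the device. The “cpcfg” and “load /sys ucs ... no-license” forms come from the steps themselves; the “tmsh save /sys ucs” backup command is my assumption for how the UCS copy gets created, and the function name and file name are hypothetical.

```python
def upgrade_commands(target_volume, ucs_name):
    """Build the command strings for one device (nothing is executed)."""
    ucs_path = f"/shared/tmp/{ucs_name}"
    return {
        # Assumed way to create the UCS backup before touching anything.
        "backup": f"tmsh save /sys ucs {ucs_path}",
        # Copy the running configuration to the new volume.
        "copy_config": f"cpcfg {target_volume}",
        # Fallback if cpcfg does not carry the configuration over.
        "fallback_restore": f"tmsh load /sys ucs {ucs_path} no-license",
    }


for step, command in upgrade_commands("HD1.2", "pre-upgrade.ucs").items():
    print(f"{step}: {command}")
```

Remember that the UCS file should also be copied to a remote location; the sketch only covers the on-box copy.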

Reference:

F5 Code Upgrade – 10.x to 11.x

F5 v11.x Device Trust Group

A week ago, I was upgrading an HA F5 pair from 11.5.1 to 11.5.3 and noticed a default “device_trust_group” in sync-only mode in the GUI. I did not create it; it just showed up, and there wasn’t a way to delete it. Apparently, it had always existed in the background but was only exposed in the GUI in later 11.x versions. Based on my experience, it wasn’t exposed in the GUI in 11.5.1 but was exposed from 11.5.6.

Device_Trust_Group

Reference: DevCentral

F5 Upgrade from v10 to v11 – Lessons Learned

Pre-Maintenance Checks:

Make sure that the F5 is running in “Volume Partition” mode. “lvscan” within bash should provide output like this:

config # lvscan
ACTIVE '/dev/vg-db-sda/dat.share.1' [30.00 GB] normal
ACTIVE '/dev/vg-db-sda/dat.log.1' [7.00 GB] normal
ACTIVE '/dev/vg-db-sda/dat.swapvol.1' [1.00 GB] normal
ACTIVE '/dev/vg-db-sda/set.1.root' [392.00 MB] normal
ACTIVE '/dev/vg-db-sda/set.1._usr' [2.48 GB] normal
ACTIVE '/dev/vg-db-sda/set.1._config' [3.00 GB] normal
ACTIVE '/dev/vg-db-sda/set.1._var' [3.00 GB] normal
ACTIVE '/dev/vg-db-sda/set.2.root' [256.00 MB] normal
ACTIVE '/dev/vg-db-sda/set.2._usr' [1.34 GB] normal
ACTIVE '/dev/vg-db-sda/set.2._config' [512.00 MB] normal
ACTIVE '/dev/vg-db-sda/set.2._var' [3.00 GB] normal
ACTIVE '/dev/vg-db-sda/set.3.root' [256.00 MB] normal
ACTIVE '/dev/vg-db-sda/set.3._usr' [1.34 GB] normal
ACTIVE '/dev/vg-db-sda/set.3._config' [512.00 MB] normal
ACTIVE '/dev/vg-db-sda/set.3._var' [3.00 GB] normal
ACTIVE '/dev/vg-db-sda/dat.maint.1' [300.00 MB] normal
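
If you have several devices to check, the lvscan output can be scanned with a short script instead of by eye. This is a minimal sketch under one assumption of mine: that the presence of “set.N.root” logical volumes, as in the sample above, is a good enough sign of volume mode. The function names are hypothetical, and the text is expected to be captured lvscan output pasted or piped in from the device.

```python
import re

# Matches paths such as /dev/vg-db-sda/set.1.root from the sample output.
SET_ROOT = re.compile(r"/dev/[\w-]+/set\.(\d+)\.root")


def boot_sets(lvscan_output):
    """Return the sorted software-set numbers found in ACTIVE lvscan lines."""
    sets = set()
    for line in lvscan_output.splitlines():
        if not line.strip().startswith("ACTIVE"):
            continue
        match = SET_ROOT.search(line)
        if match:
            sets.add(int(match.group(1)))
    return sorted(sets)


def looks_like_volume_mode(lvscan_output):
    """Heuristic only: at least one set.N.root logical volume is present."""
    return bool(boot_sets(lvscan_output))
```

For the sample output above, boot_sets would report sets 1, 2 and 3. An empty result means the output should be inspected manually before going any further.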

Make sure there are no HTTP Classes in your configuration other than the default “httpclass”. You can check this under Local Traffic ›› Profiles : Protocol : HTTP Class in the F5 GUI.

Make sure that there are no spaces in profile names (see SOL15144).

As a precaution, go through your configuration and remove any unwanted/unused configuration elements like a “Test Virtual Server” or configuration from the past that is not in use at the moment.

Upload the new code version to the F5 LTM via the GUI, SCP, or any other preferred method.

Maintenance:

Before performing any F5 code upgrade, make sure that the “Service Check Date” on the device is AFTER the License Check Date for the new code version, as listed in SOL7727.

If it is not, the maintenance must include a license re-activation step before proceeding with the code upgrade. This step takes about 10-20 minutes.
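
The date comparison is easy to get wrong under time pressure, so here is a minimal sketch of it. It assumes the license file contains a line of the form “Service check date : YYYYMMDD” (that format is my assumption, so verify it against your own device), and that you look up the License Check Date for the target version in SOL7727 yourself. The function names are hypothetical. Following the rule above, anything that is not strictly after the License Check Date is flagged for re-activation.

```python
import re
from datetime import date, datetime

# Assumed license-file line format: "Service check date : YYYYMMDD".
SERVICE_LINE = re.compile(r"Service check date\s*:\s*(\d{8})")


def service_check_date(license_text):
    """Extract the service check date from the license file text."""
    match = SERVICE_LINE.search(license_text)
    if match is None:
        raise ValueError("no 'Service check date' line found")
    return datetime.strptime(match.group(1), "%Y%m%d").date()


def needs_reactivation(license_text, license_check_date):
    """True when the service check date is not after the License Check Date."""
    return not service_check_date(license_text) > license_check_date
```

For example, needs_reactivation(text, date(2015, 7, 1)) would flag a device whose license text shows 20150610. Both dates here are made up for illustration.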

Copy the configuration to the new code version's location with cpcfg. Example: cpcfg HD1.2

Although “cpcfg HD1.x” has worked most of the time, I would recommend backing up the .UCS file to a remote location and also saving a copy in “/shared/tmp/<UCS File>”. After saving the UCS file in “/shared/tmp/”, you can use “load /sys ucs <path/to/UCS> no-license” to load the configuration, as noted in SOL12880.

Reboot. It will take about 20 minutes for the device to load the new configuration and come back up. If you are running an HA pair, upgrade the Standby F5 first. It will take a few minutes for the Standby F5 to become “Active”, so be patient.

A conservative estimate for the maintenance window is about 1 hour. I would recommend giving yourself 90 minutes if you are not familiar with F5 code upgrades. Downtime can be minimized if you have the BIG-IP F5 devices in a High Availability Active/Standby pair.