Juniper SRX Cluster Failover Tuning

If you check the Juniper configuration guide for SRX firewall clustering, you will find a default example of redundancy-group weight values. Those values are fine if you have one uplink towards the outside and multiple inside interfaces on that firewall.

set chassis cluster redundancy-group 0 node 0 priority 100
set chassis cluster redundancy-group 0 node 1 priority 1
set chassis cluster redundancy-group 1 node 0 priority 100
set chassis cluster redundancy-group 1 node 1 priority 1
set chassis cluster redundancy-group 1 interface-monitor ge-0/0/5 weight 255
set chassis cluster redundancy-group 1 interface-monitor ge-0/0/4 weight 255
set chassis cluster redundancy-group 1 interface-monitor ge-5/0/5 weight 255
set chassis cluster redundancy-group 1 interface-monitor ge-5/0/4 weight 255

This is the one: https://www.juniper.net/documentation/en_US/junos/topics/topic-map/security-chassis-cluster-verification.html

But if!

If you end up with multiple outside interfaces that give you redundant Internet or WAN access, you probably don’t want a failover to the secondary SRX box to occur when you lose just one of those two uplinks. If that’s the case, follow this article to get your SRX cluster to behave as it should.

[Figure: Juniper SRX cluster failover]

In our example above, we have an SRX345 cluster of two nodes interconnected through interfaces ge-0/0/1 and ge-0/0/3, which carry the cluster fabric and session sync traffic.

ge-0/0/4 and ge-0/0/5 are two different outside interfaces, both leading towards the Internet through different ISPs. Note that ge-5/0/4 and ge-5/0/5 are those same interfaces on the secondary box, automatically renamed to 5/0/… when the cluster is formed.

ge-0/0/6 is our only inside interface on those boxes, connected as an access VLAN interface to the L2 switch.

The story goes like this:

If you lose interface ge-0/0/4 or ge-0/0/5 of the primary node, you still want to leave that node active because those two interfaces are two redundant exits towards the Internet.

On the other side, when you lose ge-0/0/6 (the LAN side of the network), you should fail over immediately in order to keep your LAN segment up and running, connected to the secondary Juniper firewall.

If the connection from the LAN to the secondary SRX fails, the secondary box drops out of production, and if we lose the primary box at that time we are out of service. But we are out of service as soon as we lose both connections to the LAN anyway, so that’s ok.

If we lose the connection from the LAN to the primary box, it is normal to expect the failover to happen immediately so that the secondary box can start forwarding traffic for that LAN.

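Failover occurs when the monitoring threshold of redundancy-group 1 (the group that monitors interfaces) on the primary box gets to 0. That threshold starts at 255 and is diminished by the weight value of each failed object (interface) that we are monitoring. Once it reaches 0, Junos sets the node's priority for that group to 0 and the group fails over to the other node.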
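The arithmetic behind this can be sketched in a few lines of Python. This is only an illustrative model, not something that runs on the SRX: the names `remaining_threshold` and `triggers_failover` are made up for the example, and the starting value of 255 is the redundancy-group threshold described in Juniper's chassis cluster documentation. It uses the weights of the primary node's interfaces from the default example at the top, where any single failure is enough to trigger a failover.

```python
# Illustrative model of SRX interface monitoring; all names are hypothetical.
INITIAL_THRESHOLD = 255  # starting threshold of a redundancy group in Junos


def remaining_threshold(weights, failed):
    """Return the threshold left after subtracting the weight of each failed interface."""
    lost = sum(weights[name] for name in failed)
    # Floor at 0 to keep the model easy to read.
    return max(INITIAL_THRESHOLD - lost, 0)


def triggers_failover(weights, failed):
    """A redundancy group fails over once its threshold reaches 0."""
    return remaining_threshold(weights, failed) == 0


# Monitored interfaces of the primary node in the default example.
default_weights = {"ge-0/0/4": 255, "ge-0/0/5": 255}

print(remaining_threshold(default_weights, []))           # 255
print(triggers_failover(default_weights, ["ge-0/0/4"]))   # True
```

With every weight at 255, losing either uplink uses up the whole threshold, which is exactly the behaviour we want to avoid when the uplinks are redundant.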

set chassis cluster redundancy-group 0 node 0 priority 250
set chassis cluster redundancy-group 0 node 1 priority 200
set chassis cluster redundancy-group 1 node 0 priority 250
set chassis cluster redundancy-group 1 node 1 priority 200
set chassis cluster redundancy-group 1 gratuitous-arp-count 4
set chassis cluster redundancy-group 1 interface-monitor ge-0/0/4 weight 150
set chassis cluster redundancy-group 1 interface-monitor ge-0/0/5 weight 150
set chassis cluster redundancy-group 1 interface-monitor ge-0/0/6 weight 255
set chassis cluster redundancy-group 1 interface-monitor ge-5/0/4 weight 150
set chassis cluster redundancy-group 1 interface-monitor ge-5/0/5 weight 150
set chassis cluster redundancy-group 1 interface-monitor ge-5/0/6 weight 255

commit

In our configuration, the primary box has priority 250 and the secondary 200, which only decides which node is preferred as primary. The interface weights are subtracted from the redundancy group's monitoring threshold, which starts at 255. The outside interfaces each carry a weight of 150. When one of them fails, the threshold drops to 105, which is still more than 0, so there is no failover. If for some reason both outside interfaces of the primary SRX are down, their combined weight of 300 exceeds 255, the threshold gets to 0 and the failover occurs.

If, on the other side, the primary SRX loses the connection to the switch (inside), the weight of interface ge-0/0/6 is 255, so that single failure takes the threshold to 0 and the SRX fails over immediately. Note that a weight of 254 would not be enough here: it would leave the threshold at 1 and the primary would stay active.
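To double-check a set of weights before committing it, you can list every combination of failed interfaces and see which ones use up the threshold. The sketch below is a minimal model under two assumptions: the redundancy group starts with Junos' documented threshold of 255, and the inside interface carries a weight of 255, the value needed for a single failure to exhaust that threshold. The names `fails_over` and `failover_table` are hypothetical and only cover the primary node's interfaces.

```python
from itertools import combinations

INITIAL_THRESHOLD = 255  # starting threshold of a redundancy group in Junos

# Weights of the primary node's monitored interfaces (inside set to 255).
weights = {"ge-0/0/4": 150, "ge-0/0/5": 150, "ge-0/0/6": 255}


def fails_over(failed):
    """True when the failed interfaces' weights use up the whole threshold."""
    return sum(weights[name] for name in failed) >= INITIAL_THRESHOLD


def failover_table():
    """Map every non-empty combination of failed interfaces to a failover verdict."""
    table = {}
    for size in range(1, len(weights) + 1):
        for failed in combinations(sorted(weights), size):
            table[failed] = fails_over(failed)
    return table


for failed, verdict in failover_table().items():
    print(", ".join(failed), "->", "failover" if verdict else "stay primary")
```

Only the two single-uplink rows come out as "stay primary". If the inside weight were 254 instead of 255, the ge-0/0/6 row would also read "stay primary", because 254 is still below the threshold of 255.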

