Brocade Virtual Traffic Manager with IP Transparency on multiple Ethernet interfaces and Linux source routing

Brocade Virtual Traffic Manager

Brocade Virtual Traffic Manager was formerly known as Riverbed SteelApp, and before that as the Stingray Load Balancer.

I used the ZeusTM103Linux-x86_64.tgz installer and deployed it on CentOS 6.7 with SELinux disabled.

This post describes a fairly complex solution for a rather unusual design.

These days most modern web applications should be X-Forwarded-For aware. Simply put, web apps should have the ability to read HTTP request headers to find out who the caller is.

Most, if not all, load balancers can inject an X-Forwarded-For header, containing the sender's IP, into the header portion of the request.
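
For illustration, this is roughly what a back-end would see once a load balancer has injected the header (the request and addresses here are hypothetical):

GET / HTTP/1.1
Host: nfsec.co.uk
X-Forwarded-For: 77.213.126.200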

Unfortunately, not all web apps are so modern, and a minority of them cannot read request headers, or simply cannot extract the X-Forwarded-For header injected by the load balancer from the HTTP packets.

With this architecture, the back-end server views the client's request as originating from the Brocade Virtual Traffic Manager (load balancer) machine, not from the remote client. This can be a disadvantage if the back-end server performs access control based on the client's IP address, or if the server wishes to log the remote IP address.

Brocade vTM injects an X-Cluster-Client-IP header into the header portion to help the server identify what the client IP is. But, as we said, to make use of X-Cluster-Client-IP the application would need to be rewritten, which in most cases is not achievable.
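
One workaround that avoids touching the application itself: where the back-end app sits behind nginx (as the back-ends in this post do), the standard realip module can substitute the injected header for the source address at the web-server layer. A minimal sketch, assuming 172.22.33.140 is the address the vTM connects from (adjust to the real source address):

# nginx http{} context: trust the vTM and take the client IP from its header
set_real_ip_from 172.22.33.140;
real_ip_header   X-Cluster-Client-IP;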

If the customer nevertheless wishes to know who the caller is...

In situations where these workarounds are not appropriate, the Traffic Manager can spoof the source IP address of the server-side connection so that it appears to originate from the client’s remote IP address. This capability is known as IP transparency.

For the purpose of this post I will use IP transparency, but in a more complex design, and explain how it all works.

The crucial things to know about IP transparency before you begin are:


IP Transparency can be used selectively.

For example, if the Traffic Manager was balancing traffic for a Web farm and a mail farm, you might want SMTP traffic to be IP transparent, but not require that the Web traffic is transparent.

Transparency is enabled on a per-pool basis; you can configure a Web pool that is not transparent, and an SMTP pool that is transparent. The Web pool and the SMTP pool can balance traffic onto the same back-end nodes, or different nodes.

IP Transparency is available by default.

All variants of the Traffic Manager (appliance image, virtual appliance, cloud service, or software) can use native IP transparency functionality on Linux or UNIX hosts under the following conditions:

  • The Traffic Manager software is installed and running as the root user.
  • The host operating system uses a kernel at version 2.6.24 or later.
  • The host operating system uses iptables at version 1.4.11 or later (versions of iptables earlier than 1.4.11 are also supported provided the --transparent option is available).

For kernel versions between 2.6.18 and 2.6.24, Brocade provides support for IP Transparency through an additional dedicated kernel module.
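
Both conditions are quick to check on the host:

uname -r      # kernel version - want 2.6.24 or later
iptables -V   # iptables version - want 1.4.11 or later (or --transparent support)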

Routing Configuration

Each server node that receives transparent traffic must route its responses back through the Traffic Manager that sent it. This is achieved by configuring the default route on the node to be one of the back-end IP addresses of the Traffic Manager. When the server node replies to a request, it will address its response to the remote client IP. Provided the routing is configured correctly, the originating Traffic Manager will intercept this traffic, terminate the connection and process it as normal.
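
On a CentOS back-end node this boils down to something like the following; the TrafficIP 172.22.33.140 is the one used in the scenarios later in this post:

ip route replace default via 172.22.33.140   # takes effect immediately
# to persist on CentOS 6, set GATEWAY=172.22.33.140 in /etc/sysconfig/network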

Local Routing Problems

If you use IP transparency, clients on the same network as the back-end nodes will not be able to balance traffic through the Traffic Manager to those nodes.

This is because the back-end server nodes always attempt to reply to the source IP address of the connection. If the source IP address (the client’s IP) resides on a local network, the server nodes will attempt to contact the client directly rather than routing via the Traffic Manager system that originated the connection. The connection will appear to hang.
In this case, it might be appropriate to segment the back-end network so that, from the server nodes’ perspective, the clients appear to reside on a separate subnet, which must be reached via the default gateway (the Traffic Manager system).
Alternatively, a TrafficScript rule could selectively set the source IP address of the server connection to an IP on the Traffic Manager system if the client lies on the same network as the server nodes.[1]

[1]: The script I will use is shown below:

## TrafficScript Rule ##
## internal_transparancy_avoidance-web

$client = request.getRemoteIP();   # the remote client's IP
$return = "172.22.33.140";         # IP on the Traffic Manager to present instead
$net    = "172.22.33.0/24";        # subnet shared with the back-end nodes

# If the client sits on the same subnet as the nodes, do not spoof its IP;
# present the Traffic Manager's own IP so replies route back through it.
if ( string.ipmaskmatch( $client, $net ) ) {
    request.setRemoteIP( $return );
}

Scenario1 - Multiple Sites via Single IP with SSL-Offloading and Multiple Back-ends on the same subnet

We want to pass traffic destined for multiple sites (multiple back-end pools) through the load balancer; the sites are on the same subnet, but we also want to SSL-offload and retain IP transparency.

We pointed all of our domain names to the same public IP.
nfsec.co.uk, hack.cx, evillain.com => 31.33.7.104
dev.nfsec.co.uk, dev.hack.cx, dev.evillain.com => 31.33.7.104

We have Virtual Servers configured with a TrafficScript rule that reads the Host header and directs traffic to the appropriate pools, as sketched below.
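
A minimal sketch of such a rule - the rule name and the exact matching are illustrative only, while the pool names follow this scenario:

## TrafficScript Rule ##
## select_pool_by_host (hypothetical name)

$host = http.getHostHeader();

if ( string.regexmatch( $host, "^dev\\." ) ) {
    pool.use( "DEVELOPMENT" );   # dev.nfsec.co.uk, dev.hack.cx, dev.evillain.com
} else if ( string.regexmatch( $host, "^test\\." ) ) {
    pool.use( "TEST" );
} else {
    pool.use( "PRODUCTION" );    # nfsec.co.uk, hack.cx, evillain.com
}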

The pools (DEVELOPMENT, PRODUCTION, TEST) have IP transparency enabled.

Every server in the back-end has its default gateway configured to point at the TrafficIP 172.22.33.140.

Scenario2 - Multiple Sites to Multiple Back-ends on Multiple Subnets

We want to pass traffic destined for a couple of back-end pools, which are on separate subnets, through the load balancer, and retain IP Transparency.

We pointed admin.nfsec.co.uk to another public IP 31.33.7.106 which is NAT'ed to 172.22.22.140.

We could use public DNS translation to resolve it to 31.33.7.106. Let's say it works, but we won't use it (or we will block it); instead we will use local DNS translation to resolve admin.nfsec.co.uk.local to 172.22.22.140, because it is a test API call to the ADMIN portal and for safety reasons we can only access it locally.
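
Purely for illustration, the local DNS translation could be modelled on a test client with a hosts-file entry instead of an internal DNS zone (hypothetical setup):

# /etc/hosts on an internal test client
172.22.22.140   admin.nfsec.co.uk.local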

Another of our back-end pools, ADMIN, has IP Transparency enabled, and its back-end server has its default gateway configured as 172.22.22.140.

Scenario3 - Initiate traffic from back-end pool to another back-end pool on the same subnet using the same single TrafficIP for IP Transparency - transparency avoidance

We want to initiate traffic from one back-end pool to another back-end pool on the same subnet. Both back-end pools share the same TrafficIP and have IP Transparency enabled. The traffic is sent to a URL with an FQDN (test.nfsec.co.uk, test.hack.cx, test.evillain.com). The traffic first hits the Cisco ASA 5515-X, and because the "Translate DNS replies" rule is enabled, it is sent back down the ASA interface: the ASA translated the DNS request and learned that it has a NAT for this connection flow. The traffic then hits the Brocade vTM. Here we have to use a TrafficScript rule to spoof the source IP for the destination pool. In other words, we use the TrafficScript rule to perform "transparency avoidance" and deliberately break the rule described earlier in the Local Routing Problems paragraph.
Otherwise, without that rule, we would send a request to the TEST back-end pool with the source IP of the PRODUCTION back-end pool. If that happened, the returning traffic would not traverse back via the Brocade vTM TrafficIP but would be sent directly to the PRODUCTION pool server, and the traffic would hang. Because we used the TrafficScript rule, the connection flow is valid.

Scenario4 - Standard single 1:1 NAT and IP Transparency

This one is self-explanatory.

Scenario5 - Standard load balancing without IP Transparency

Similar to Scenario3, we want to make an API call from the PRODUCTION web servers to the TEST pool, however this time using local DNS translation to test.nfsec.co.uk.local (172.22.33.160) via another VIP TrafficIP on the load balancer, and this time without IP Transparency enabled.

Traffic Flow Diagram illustrating all scenarios

The diagram below illustrates all five scenarios, including the one where we have IP Transparency disabled.

[Diagram: Linux Stingray IP Transparency]

Linux Source Routing

We used multiple Ethernet interfaces on our Brocade Virtual Traffic Manager because we have back-end servers in multiple subnets.

We want to be able to connect to our back-end servers using public IPs, so NATs on the ASA need to be set up pointing directly at the back-end servers' private IPs. We also want the back-end servers to be able to ping each other.

Back-end Servers Access - NATs

LB NAT:       31.33.7.200 => 172.22.33.101  
DEV1 NAT:     31.33.7.201 => 172.22.33.20  
ADMIN NAT:    31.33.7.202 => 172.22.22.10  
DATABASE NAT: 31.33.7.203 => 172.22.44.200  

Note: Because the default gateway is set to 172.22.33.254, traffic from the ETH1 and ETH2 subnets will route out via this GW; the reverse path forwarding route the NAT expects will not be created, so the NAT will not work.

Solution: enable the FIX SOURCE ROUTING configuration shown below and point each back-end server's GW at the IP of the LB ETH1 and ETH2 interfaces (Traffic IPs) respectively.

Change: Now the green routing configuration (arrows on the diagram) takes effect. When back-end servers want to ping each other, the traffic will not be answered locally via the LB interfaces (default GW); instead it will be FORWARDED to the ASA and routed there to the directly connected interfaces (WEB, ADMIN, DATABASE). As a result you will see something like the ping output shown further below.

Configuration:

# table 1: traffic arriving on eth0 (WEB subnet) leaves via the WEB gateway
ip route add 172.22.33.0/24 dev eth0 table 1
ip route add default via 172.22.33.254 dev eth0 table 1
# table 2: traffic arriving on eth1 (DATABASE subnet) leaves via its own gateway
ip route add 172.22.44.0/24 dev eth1 table 2
ip route add default via 172.22.44.254 dev eth1 table 2
# table 3: traffic arriving on eth2 (ADMIN subnet) leaves via its own gateway
ip route add 172.22.22.0/24 dev eth2 table 3
ip route add default via 172.22.22.254 dev eth2 table 3

# select the routing table based on the inbound interface
ip rule add iif eth0 table 1
ip rule add iif eth1 table 2
ip rule add iif eth2 table 3
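
To verify that the rules and tables took effect:

ip rule show            # should list the three iif rules
ip route show table 2   # should show the 172.22.44.0/24 route and its default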

Note: You need to add this to /etc/rc.d/rc.router so it persists across reboots.
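
One way to wire that up on CentOS 6, assuming /etc/rc.d/rc.router is a custom script containing the commands above:

chmod +x /etc/rc.d/rc.router
echo '/etc/rc.d/rc.router' >> /etc/rc.d/rc.local   # run it at the end of boot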

Also, don't forget to update the kernel parameters:
net.ipv4.ip_forward = 1
net.ipv4.conf.all.forwarding = 1
net.ipv4.conf.all.accept_redirects = 1
net.ipv4.conf.all.send_redirects = 1
net.ipv4.conf.eth0.send_redirects = 1
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
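
On CentOS 6 these would normally live in /etc/sysctl.conf; apply them without a reboot with:

sysctl -p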

Ping from DATABASE to DEV1

[root@DATABASE ~]# ping 172.22.33.20
PING 172.22.33.20 (172.22.33.20) 56(84) bytes of data.  
64 bytes from 172.22.33.20: icmp_seq=1 ttl=63 time=1.00 ms  
From 172.22.44.101: icmp_seq=2 Redirect Host(New nexthop: 172.22.44.254)  
64 bytes from 172.22.33.20: icmp_seq=2 ttl=63 time=0.836 ms  
From 172.22.44.101: icmp_seq=3 Redirect Host(New nexthop: 172.22.44.254)  
64 bytes from 172.22.33.20: icmp_seq=3 ttl=63 time=0.858 ms  
From 172.22.44.101: icmp_seq=4 Redirect Host(New nexthop: 172.22.44.254)  
64 bytes from 172.22.33.20: icmp_seq=4 ttl=63 time=0.912 ms  
From 172.22.44.101: icmp_seq=5 Redirect Host(New nexthop: 172.22.44.254)  

Conclusion: without the source routing configuration, NAT and network visibility of the back-end servers would not be possible.

We want IP Transparency to work.

All the above now works fine.

NAT from ASA to LB ETH0 Traffic IP 172.22.33.104 works fine too.

But when we wanted to create our website VIP NATs from the ASA to the LB ETH1 Traffic IP and ETH2 Traffic IP, the ASA started complaining that the rpf-check failed. The NAT failed because the ASA could not find a reverse path forwarding route for this traffic flow.

Even though the ASA had seen ARP entries for these IPs on the appropriate interfaces, the NAT could not work.

Why? Because the load balancer's default GW was set up via ETH0.

[root@nfsec-lb1 ~]# route -n
Kernel IP routing table  
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface  
172.22.22.0     0.0.0.0         255.255.255.0   U     0      0        0 eth2  
172.22.33.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0  
172.22.44.0     0.0.0.0         255.255.255.0   U     0      0        0 eth1  
169.254.0.0     0.0.0.0         255.255.0.0     U     1002   0        0 eth0  
169.254.0.0     0.0.0.0         255.255.0.0     U     1003   0        0 eth1  
169.254.0.0     0.0.0.0         255.255.0.0     U     1004   0        0 eth2  
0.0.0.0         172.22.33.254   0.0.0.0         UG    0      0        0 eth0  

So how are we now going to find a solution, if direct 1:1 NAT does not work?

Solution: We will create the NATs from the ASA to LB ETH0 and raise the Traffic IP VIPs on LB ETH0. Our IP Transparency Traffic IPs, 172.22.22.140 and 172.22.44.140, will not be attached to any back-end pool. Instead they will just be there - virtually created, unassigned - acting purely as the GW for their back-end servers.

Change: Doing this raises those IPs on the LB interfaces, so that they exist and allow traffic to flow.
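
You can confirm from the LB shell that the unassigned Traffic IPs really are raised on the interfaces:

ip addr show | grep -E '172\.22\.(22|44)\.140'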

Our WEBSITE VIPs will be as follows:

WEBSITE VIP NATs  
NAT: 31.33.7.104 => 172.22.33.104  
NAT: 31.33.7.106 => 172.22.33.105 (NOT 172.22.22.140 ETH1)  
NAT: 31.33.7.108 => 172.22.33.106 (NOT 172.22.44.140 ETH2)  
RAISED ON LB WEB ETH0

Explanation: How will the connections flow?

Example1: Let's analyse this example.

  1. The user calls dev.nfsec.co.uk.
  2. The ASA picks up the call and FORWARDs it to its NAT 172.22.33.104 on the WEB Gi0/1 interface.
  3. The LB receives the call and, based on the HTTP Host header, its TrafficScript rule directs the connection to the back-end pool DEVELOPMENT, where the DEV1 server is.
  4. DEV1 receives the call, and the originating IP will be your IP.
  5. DEV1 responds along exactly the same path back to the sender.

Example2: Let's analyse another example.

  1. The user calls db.hack.co.uk.
  2. The ASA picks up the call and FORWARDs it to its NAT 172.22.33.106 on the WEB Gi0/1 interface.
  3. The LB receives the call on the ETH0 VIP 172.22.33.106 and, based on the HTTP Host header, directs the connection to the back-end pool DATABASE, where the DATABASE server is 172.22.44.200. As you can see, the LB passes traffic directly to the back-end server and not to the dummy (GW) TrafficIP 172.22.44.140.
  4. DATABASE receives the call, and the originating IP will be your IP.
  5. DATABASE responds to the sender IP via its GW 172.22.44.140.
  6. The response from DATABASE arrives on LB ETH2 and is sent back to the ASA via the LB default GW 172.22.33.254 (eth0) to complete the NAT rpf-check.
  7. The traffic flow with the sender is complete. (You can also verify this on the node itself; see the check below.)
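
Before looking at the web logs, you can confirm on the DATABASE node that connections really arrive carrying the client's source address (the interface name here is an assumption; adjust it to the node's own):

tcpdump -nn -i eth0 tcp port 80   # source IPs should be the remote clients, not the LB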

Overall Result: What will you see on the back-end DATABASE server?

IP Transparency enabled:

[root@DATABASE ~]# cat /var/log/nginx/access.log
77.213.126.200 - - [20/Mar/2016:13:15:48 +0000] "GET / HTTP/1.1" 304 0 "-"  
77.213.126.200 - - [20/Mar/2016:13:15:55 +0000] "GET / HTTP/1.1" 304 0 "-"  

IP Transparency disabled:

[root@DATABASE ~]# cat /var/log/nginx/access.log
77.213.126.200 - - [20/Mar/2016:13:15:48 +0000] "GET / HTTP/1.1" 304 0 "-"  
77.213.126.200 - - [20/Mar/2016:13:15:55 +0000] "GET / HTTP/1.1" 304 0 "-"  
172.22.44.101 - - [20/Mar/2016:13:10:46 +0000] "GET / HTTP/1.1" 304 0 "-"  
172.22.44.101 - - [20/Mar/2016:13:10:47 +0000] "GET / HTTP/1.1" 304 0 "-"  



Final conclusion: To make every one of the above scenarios possible, we have to configure routing on our load balancer to pass traffic from the directly connected Ethernet subnets to the Cisco ASA, which then routes it appropriately. We called it Linux Source Routing.

Without source routing we would end up with asymmetric routing, with traffic traversing only the load balancer and everything being sent via the default gateway.

[Diagram: Linux Source Routing]


by lo3k