14-SEP-2012: You Can't Get There From Here (Policy Based Routing)

Which way do I go ? Well, you see... the problem is... you can't get there from here

Occasionally in the course of server management it becomes desirable, or sometimes even necessary, to configure a UNIX or Linux system with network interfaces on multiple networks.

On the surface, the problem appears straightforward, and it may indeed be as simple as it sounds depending on what you do after configuring the network interface. Often, however, after configuring the network interface and reaching that early success, an enterprising system administrator will say to themselves "Excellent, I now have an address on this network; now I want to be able to reach it from other networks ! I know, I'll add a route !" and off they go.

The problem is that routing -- standard routing -- is based solely on the destination of the packet. Read that sentence carefully and you will notice several things. The first, and most obvious, is that the destination address alone is used to determine which route to take -- this seems obvious when stated plainly, but the subtlety is often missed. The second thing to notice is that it is the destination of the packet -- there are no other qualifiers that might indicate the packet is part of some higher-level stream, since indeed it may not be.
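
This destination-only behavior can be modeled in a few lines. The following is a toy sketch of longest-prefix route selection, not the kernel's actual implementation; the routing entries use example addresses from the documentation ranges:

```python
# A toy model of destination-only, longest-prefix route selection.
# The entries and addresses are illustrative examples, not taken
# from any real system.
import ipaddress

ROUTES = [
    ("192.0.2.0/24", "eth0", None),        # directly connected network
    ("198.51.100.0/24", "eth1", None),     # directly connected network
    ("0.0.0.0/0", "eth0", "192.0.2.1"),    # the default route
]

def lookup(dst, src=None):
    """Return the most-specific route matching dst.

    src is accepted but never examined -- exactly the behavior of
    ordinary destination-based routing."""
    dst = ipaddress.ip_address(dst)
    best = None
    for prefix, dev, gateway in ROUTES:
        net = ipaddress.ip_network(prefix)
        if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, dev, gateway)
    return best

# A destination off both local networks always matches the default
# route, no matter which source address the packets carry:
print(lookup("203.0.113.50", src="198.51.100.10")[1])  # eth0
```

Notice that passing a different `src` changes nothing: the source address simply never participates in the comparison.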

You see, when most people add the route they think they want they are thinking of things like TCP sessions or connections being tracked and handled that way. But that's not what is being specified by that route.

As an example, let's say we have a machine that starts out like this (the addresses here are examples from the documentation ranges; substitute your own):

  server# ifconfig eth0 192.0.2.10 netmask 255.255.255.0 broadcast 192.0.2.255
  server# route add default gw 192.0.2.1

Then we add our second network interface

  server# ifconfig eth1 198.51.100.10 netmask 255.255.255.0 broadcast 198.51.100.255

And then from our client (and presuming that we have a route through the network to 198.51.100.0/24) we try to reach the box (server) on the IP 198.51.100.10 from another host, let's say 203.0.113.50. The results can vary: it works, it works sometimes, or it doesn't work at all. The reason for this lies in our host's routing table.

When we send the packets for our TCP session from our client to 198.51.100.10 they take a particular path through the network and end up coming in on the server's "eth1" interface. Since the packets are associated with a TCP socket with the destination address of 198.51.100.10, the packets going back will have the source address set to 198.51.100.10. However, when the packets from that TCP session are sent back from the server to the client, only the destination (203.0.113.50) is used to select a route. Thus, the packets will leave the server via its default route and the network interface "eth0". This can lead to the return traffic taking a different path back to the client than the packets from the client to the server.

This may happen to work -- for example, if no reverse-path/egress filtering is done by the router on that interface, and if no stateful firewall sits along one path but not the other.

It also means that if the default gateway becomes unavailable, you will not be able to receive packets from the system (unless you happen to be connected to one of the broadcast domains it is on).

This is very likely undesirable behavior.

To correct this the enterprising administrator will likely do something similar to:

  server# route add default gw 198.51.100.1

They then think that, because they have added a second default gateway via "eth1", packets will start going out that interface if they have a source address matching the IP of "eth1". But this is wrong. Again, normal routing is based only on the destination address of packets. Routing entries are processed from most-specific to least-specific, and two default routes are equally specific, so packets may end up leaving the system via either "eth0" or "eth1" essentially at random (this is indeed what happens on Solaris).
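
The tie is easy to demonstrate. In this small sketch (illustrative documentation-range addresses, not a real configuration), both default routes match any off-link destination with the same prefix length of zero, and the source address is never consulted to break the tie:

```python
# Two default routes are equally specific (/0): destination-based
# longest-prefix matching has no grounds to prefer one over the other.
# Addresses are illustrative documentation-range examples.
import ipaddress

defaults = [
    (ipaddress.ip_network("0.0.0.0/0"), "eth0", "192.0.2.1"),
    (ipaddress.ip_network("0.0.0.0/0"), "eth1", "198.51.100.1"),
]

dst = ipaddress.ip_address("203.0.113.50")
matches = [route for route in defaults if dst in route[0]]

# Both routes match, with identical prefix lengths -- the system has
# to break the tie some other way, often arbitrarily.
print(len(matches), matches[0][0].prefixlen == matches[1][0].prefixlen)  # 2 True
```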

So how do we solve this problem ? How do we get packets to do what we want ? Well, first we have to define the problem. Up until now we've only defined the symptoms and the current behavior.

The problem is that packets are going out the "wrong" interface. How do we define which interface is the right interface ? It's easy -- the right interface is the interface with the IP of the source address of the packet. Looked at this way we can see that we want our routing to be source-based instead of destination-based. This is implemented by using the technique of Policy Based Routing.

So how does one actually implement this Policy Based Routing ("PBR") ? It depends on the platform. On Linux one would use the IP Advanced Routing features, on Solaris one would use "ipf".

To implement the above example using PBR on Linux, first we would remove the extraneous default route entry we just added, because it's wrong:

  server# route del default gw 198.51.100.1

The way Linux Advanced Routing handles policy based routing is through the use of multiple routing tables. This allows for very flexible, but very clearly defined, routes to be configured.

Just creating additional routing tables isn't sufficient, however, since we need to actually tell the Linux routing system when to use which routing table. This is done with rules.

Also, we can't simply get rid of the "default gateway" entry in the "main" routing table (the default routing table, which the "route" command manipulates) because it is used to determine the source IP address when creating sockets that have not explicitly specified a source.

Alright, so we have our two concepts: routing tables (of which we can have several) and rule entries (of which we can also have several). To actually convey the configuration changes to the system we use the "ip" command.

First we create our new routing tables. Routing tables are identified by a number (names can be associated with these numbers for convenience, but for clarity here we will just use the numbers). To do this we would do something like:

  server# echo "Configuring eth0"
  server# ip route add 192.0.2.0/24 dev eth0 table 100
  server# ip route add default via 192.0.2.1 table 100
  server# echo "Configuring eth1"
  server# ip route add 198.51.100.0/24 dev eth1 table 101
  server# ip route add default via 198.51.100.1 table 101

Second we create our rules. Our rules implement our policy. Our policy is to classify routes based on their source address. Our rules would thus be something like:

  server# ip rule add from 192.0.2.10 table 100
  server# ip rule add from 198.51.100.10 table 101
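
Conceptually, the lookup now happens in two steps: the rules are scanned in order, the first matching rule selects a routing table, and an ordinary destination-based lookup then runs inside that table. A minimal Python sketch of that sequence, using illustrative documentation-range addresses and the table numbers 100 and 101 from above:

```python
# Sketch of policy based routing: a rule matching the packet's SOURCE
# address selects a table; longest-prefix matching on the DESTINATION
# then runs within that table. Addresses are illustrative examples.
import ipaddress

TABLES = {
    100: [("192.0.2.0/24", "eth0", None),
          ("0.0.0.0/0", "eth0", "192.0.2.1")],
    101: [("198.51.100.0/24", "eth1", None),
          ("0.0.0.0/0", "eth1", "198.51.100.1")],
}

# Rules are evaluated in order; the first match selects a table.
RULES = [("192.0.2.10", 100), ("198.51.100.10", 101)]

def pbr_lookup(dst, src):
    dst = ipaddress.ip_address(dst)
    for rule_src, table in RULES:
        if src == rule_src:                        # policy: match on source
            best = None
            for prefix, dev, gateway in TABLES[table]:
                net = ipaddress.ip_network(prefix) # ordinary LPM in the table
                if dst in net and (best is None or net.prefixlen > best[0].prefixlen):
                    best = (net, dev, gateway)
            return best
    return None  # a real system would fall through to the main table

# Replies sourced from the eth1 address now leave via eth1:
print(pbr_lookup("203.0.113.50", "198.51.100.10")[1])  # eth1
```

The same destination now takes a different route depending on the source address, which is exactly the policy we set out to implement.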

We can then verify that our routes are being used by using the "ip route get" command:

  server# ip route get 203.0.113.50 from 192.0.2.10
  203.0.113.50 from 192.0.2.10 via 192.0.2.1 dev eth0
      cache   mtu 1500 advmss 1460
  server# ip route get 203.0.113.50 from 198.51.100.10
  203.0.113.50 from 198.51.100.10 via 198.51.100.1 dev eth1
      cache   mtu 1500 advmss 1460

It worked.