PMTU-D: remember, your load balancer is broken
From: Marc Slemko <marcs () znep com>
Date: Tue, 13 Jun 2000 17:04:19 -0600 (MDT)
This is your monthly PMTU-D horkage rant.
Chances are that if you are using a load balancer for TCP connections,
then it does not properly handle Path MTU Discovery. Examples of devices
like the ones I am talking about that do not, last I knew, handle this
properly are localdirectors and arrowpoints. F5 claimed that they fixed
their big/ip product to do this properly some time ago (remember when they
broke NSI's whois service this way?), but I haven't seen it in action yet
or know what version is required, and their support channels don't seem to
know much about it when asked and give nonsensical answers like "it is
built into the BSD/OS system that our product is built on".
I would love to know about any such load balancers that actually do
handle this right.
For an explanation of PMTU-D, see http://users.worldgate.ca/~marcs/mtu/
What happens with most load balancers is that when the server behind them
tries to use PMTU-D, the ICMP "can't fragment" that may come back from a
router between the server and the client will not make it to the
load-balanced server because the load balancer will throw it away.
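The result is that most users with a path MTU that is less than min(client
MTU, server MTU) will be unable to receive data from the server: the
server keeps sending full-size packets with "don't fragment" set, and they
keep getting dropped at the router with the smaller MTU.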
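
To make that condition concrete, here is a rough sketch in Python. The
function name and the example MTU values are made up for illustration, and
it assumes the server sends packets as large as min(client MTU, server MTU)
with "don't fragment" set and never sees the ICMP message.

```python
def is_black_holed(client_mtu: int, server_mtu: int, path_mtu: int) -> bool:
    """Return True if full-size packets from the server cannot get through.

    Assumption: the server sends packets up to min(client_mtu, server_mtu)
    with DF set and never learns of the smaller path MTU, because the ICMP
    "can't fragment" messages are thrown away before they reach it.
    """
    largest_packet = min(client_mtu, server_mtu)
    return path_mtu < largest_packet


# A client behind a link with a smaller MTU than either end host.
print(is_black_holed(client_mtu=1500, server_mtu=1500, path_mtu=1492))
# A plain path where nothing along the way is smaller than the endpoints.
print(is_black_holed(client_mtu=1500, server_mtu=1500, path_mtu=1500))
```

Small packets (the handshake, a short request) still fit, which is why
the connection opens fine and only the response goes missing.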
The fix is to bitch at your vendor to fix their broken system and to
tell them to hire someone that knows something about how TCP works.
If you are a vendor, then make sure your load balancing software
works right. What it needs to do is either send the "can't fragment"
on to just the backend servers that have connections from the remote
IP, or to flood it to all of them.
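
A minimal sketch of that dispatch decision, in Python, might look like
the following. It is not any vendor's implementation. The connection
table (remote IP mapped to a set of backends), the function names and the
addresses are all hypothetical; only the ICMP layout (type 3, code 4,
followed by the original IP header) is standard.

```python
import socket
import struct

ICMP_DEST_UNREACH = 3
ICMP_FRAG_NEEDED = 4  # "fragmentation needed and DF set"


def backends_for_icmp(icmp: bytes, conn_table: dict, all_backends: list) -> list:
    """Pick the backends that should receive a "can't fragment" message.

    icmp is the ICMP message starting at its type byte. conn_table is a
    hypothetical stand-in for the balancer's state: remote IP -> backends.
    """
    if len(icmp) < 8 + 20:
        return []  # too short to hold the quoted IP header
    if (icmp[0], icmp[1]) != (ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED):
        return []  # not a "can't fragment" message
    inner = icmp[8:]  # the quoted original packet, sent by a backend
    if inner[0] >> 4 != 4:
        return []  # only IPv4 is handled in this sketch
    # The original packet was headed to the client, so its destination
    # address is the remote IP we need to look up.
    remote_ip = socket.inet_ntoa(inner[16:20])
    matches = conn_table.get(remote_ip)
    if matches:
        return sorted(matches)  # just the backends talking to that IP
    return list(all_backends)  # no state for that IP: flood to all


def sample_frag_needed(server_ip: str, client_ip: str, mtu: int = 1400) -> bytes:
    """Build a fake ICMP type 3 code 4 message for trying the above."""
    icmp_header = struct.pack("!BBHHH", 3, 4, 0, 0, mtu)  # checksum left 0
    ip_header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 1500, 0, 0x4000, 64,
                            socket.IPPROTO_TCP, 0,
                            socket.inet_aton(server_ip),
                            socket.inet_aton(client_ip))
    tcp_start = struct.pack("!HHI", 80, 40000, 0)  # first 8 bytes of TCP
    return icmp_header + ip_header + tcp_start


table = {"198.51.100.7": {"10.0.0.2"}}
backends = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
msg = sample_frag_needed("192.0.2.10", "198.51.100.7")
print(backends_for_icmp(msg, table, backends))  # targeted delivery
print(backends_for_icmp(msg, {}, backends))     # fallback: flood
```

Flooding is cruder but needs no per-connection lookup; either way the
backend that owns the connection gets to see the message.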
The workaround for the person using such load balancers is to
disable PMTU-D on your backend servers. This is your only option
if the vendor of your load balancer doesn't care or takes a while
to release a fix.
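
How you disable PMTU-D depends entirely on the backend's operating
system. As one example only, and assuming a Linux backend, it can be
turned off per socket with the IP_MTU_DISCOVER socket option (Linux also
has a system-wide net.ipv4.ip_no_pmtu_disc sysctl). The numeric constants
below are the Linux values; the function name is made up for illustration.

```python
import socket
import sys

# Values from Linux's <linux/in.h>. These are Linux-specific assumptions;
# other systems use different knobs entirely.
IP_MTU_DISCOVER = 10
IP_PMTUDISC_DONT = 0  # never set DF on outgoing packets


def disable_pmtud(sock: socket.socket) -> None:
    """Turn off Path MTU Discovery on one socket (Linux only)."""
    sock.setsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER, IP_PMTUDISC_DONT)


if sys.platform.startswith("linux"):
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    disable_pmtud(s)
    # Read the option back to confirm the kernel accepted it.
    print(s.getsockopt(socket.IPPROTO_IP, IP_MTU_DISCOVER))
    s.close()
```

With DF cleared, routers along the path can fragment oversized packets
instead of dropping them, so the lost ICMP messages stop mattering.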
If you have complaints from a small subset of clients that can open a TCP
connection to your load balanced IP but can't receive any response to their
request, this could be what is up.
(yes, www.slashdot.org seems to be broken in this way as I type. Oh well,
slashdot isn't always a good thing...)
This has been your monthly PMTU-D horkage rant.