Implement proper multihoming support, second attempt.
This patch makes our HTTP client multihoming-aware, so if one server fails for
whatever reason (including timeout), we'll fall back to the next. It's a bit
more complex than the first attempt, but we're hopefully not breaking SSL
connection (incl. checkin) anymore.
Also includes one patch from upstream, in that timeouts are converted from
Java's exception hierarchy to our own exceptions.
Here's an example tcpdump from a fake checkin server with both AAAA and A records,
where the IPv6 connectivity is deliberately broken to demonstrate the effects of
this patch:
11:49:28.202620 IP6 2620:0:105f:a:223:76ff:fe8d:3a3c.37109 > 2001:700:300:1880::2.80: S 24035192:24035192(0) win 5760 <mss 1440,sackOK,timestamp 1110775 0,[|tcp]>
11:49:31.211370 IP6 2620:0:105f:a:223:76ff:fe8d:3a3c.37109 > 2001:700:300:1880::2.80: S 24035192:24035192(0) win 5760 <mss 1440,sackOK,timestamp 1111075 0,[|tcp]>
11:49:37.211186 IP6 2620:0:105f:a:223:76ff:fe8d:3a3c.37109 > 2001:700:300:1880::2.80: S 24035192:24035192(0) win 5760 <mss 1440,sackOK,timestamp 1111675 0,[|tcp]>
11:49:48.216299 IP 74.125.57.33.58205 > 129.241.93.35.80: S 2632654863:2632654863(0) win 5840 <mss 1372,sackOK,timestamp 1112775 0,nop,wscale 1>
11:49:48.216324 IP 129.241.93.35.80 > 74.125.57.33.58205: S 3149921981:3149921981(0) ack 2632654864 win 5792 <mss 1460,sackOK,timestamp 62633484 1112775,nop,wscale 8>
(...)
and then the HTTP connection proceeds as usual.
I intend to push this fix upstream once we get it reviewed and committed locally.
2 files changed