Linode Community Forums
PostPosted: Sat Feb 08, 2014 1:40 pm 
Junior Member

Joined: Fri May 29, 2009 8:40 am
Posts: 41
Has anyone managed to create a high availability setup for IPv6 on Linode?

I've tried the basics of moving floating IPv4 addresses between Linodes - and that's easy enough. Just enable IP failover for the relevant IPs, reboot as needed, and use arping to force other hosts to update their ARP cache. It works nicely.
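For reference, that IPv4 takeover can be sketched roughly like this (the address, netmask and interface are placeholders - substitute your own):

```shell
# Hypothetical IPv4 failover action: claim the floating IP, then
# announce it so neighbours don't wait for stale ARP entries to expire.
FLOAT_IP="203.0.113.10"
IFACE="eth0"

takeover_v4() {
  ip addr add "$FLOAT_IP/24" dev "$IFACE"
  # -U: send unsolicited (gratuitous) ARP replies, forcing other hosts
  # on the segment to update their ARP caches immediately.
  arping -U -I "$IFACE" -c 3 "$FLOAT_IP"
}
```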

However, I've had no such luck doing the same with IPv6. It's possible to transfer IPv6 pool addresses to other servers, but there's no way to get the IPv6 NDP cache updated on other hosts. For other Linodes on the same network this isn't too bad - their neighbour cache doesn't last that long. For external connectivity, however, this results in 20-30 minutes of traffic going to the old Linode.

I've tried to use arpsend/ndsend to get the neighbours' caches updated:
Code:
root@devon:~# arpsend -U -i 2a01:7e00::2:9995 eth0

18:37:13.021656 IP6 fe80::f03c:91ff:fe6e:afd5 > ff02::1: ICMP6, neighbor advertisement, tgt is 2a01:7e00::2:9995, length 32
But sadly this traffic isn't being seen on other servers - probably for the same reason that ping6 ff02::1%eth0 doesn't work either.

Anyone with any experience on this?


PostPosted: Wed May 28, 2014 7:16 pm 
Junior Member

Joined: Tue Dec 27, 2005 12:33 am
Posts: 43
Location: USA
I've found a solution. If you ping the IPv6 default gateway from the pool address, it forces the router to refresh its NDP cache for the pool address, permitting connectivity from hosts that are off-subnet. This doesn't fix the NDP caches of other Linodes on the same subnet, but as you observed, their NDP caches time out much more quickly than the router's. Oddly, it takes about 5-10 pings to have any effect, but I've done extensive testing and this technique works every time, so I'm deploying it to production.

This is the command I'm using:
Code:
ping6 -c 1 -w 15 -I $MY_POOL_ADDRESS fe80::1%eth0 > /dev/null
This pings the default gateway from the pool address until it gets a response or 15 seconds elapse. If 15 seconds elapse without a response, there's probably some other problem with your connectivity.

You need a version of ping that supports the -I option. On Debian, this means you need the iputils-ping package rather than the inetutils-ping package.

Edit: I should mention that if you try to run ping6 immediately after adding the pool address to your interface, ping6 might fail to bind to the pool address because DAD (duplicate address detection) hasn't completed yet. So you'll need to either wait until DAD completes before pinging, or just disable DAD.
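A rough sketch of waiting for DAD to finish, assuming the address was just added to eth0 (the address and timings are placeholders):

```shell
# While an address is still marked "tentative", DAD is in progress.
# ADDR is a hypothetical pool address - use your own.
ADDR="2a01:7e00::2:9995"

wait_for_dad() {
  # Poll for up to ~5 seconds; return 0 once DAD completes.
  for _ in $(seq 1 50); do
    ip -6 addr show dev eth0 tentative 2>/dev/null | grep -q "$ADDR" || return 0
    sleep 0.1
  done
  return 1
}

# Alternatively, disable DAD on the interface entirely:
#   sysctl -w net.ipv6.conf.eth0.dad_transmits=0
```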


PostPosted: Sat May 31, 2014 9:36 am 
Junior Member

Joined: Fri May 29, 2009 8:40 am
Posts: 41
As I forgot to reply to this post - I did contact support regarding this problem, and they confirmed that the all-nodes multicast address (ff02::1) is filtered, which is why it's impossible to get IPv6 high availability working the right way. I believe it's on their todo list - but there's no ETA.

Thank you for this workaround! I can finally do some high availability without having to worry about IPv6 traffic disappearing for up to 30 minutes.

The only downside is that I'll have to flush the NDP cache of other Linodes - but at least that's a solvable problem.
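Flushing that stale entry on the other Linodes can be done with ip; a minimal sketch (the address is a placeholder, and it needs root on each peer):

```shell
# Run on each peer Linode after a failover event.
# MOVED_ADDR is a hypothetical pool address that just changed hosts.
MOVED_ADDR="2a01:7e00::2:9995"

flush_stale_neighbour() {
  # Drop the cached neighbour entry so the next packet triggers
  # fresh neighbour discovery towards the new host.
  ip -6 neigh flush dev eth0 to "$MOVED_ADDR"
}
```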


PostPosted: Thu Mar 05, 2015 1:26 am 
Junior Member

Joined: Tue Dec 27, 2005 12:33 am
Posts: 43
Location: USA
Unfortunately, my workaround doesn't work anymore. Although fe80::1 replies to the pings, Linode's routers continue to send traffic to the wrong host for about 30 minutes. -Alex-, have you found out anything new by chance?


PostPosted: Sat Mar 07, 2015 3:09 pm 
Junior Member

Joined: Fri May 29, 2009 8:40 am
Posts: 41
Sadly I'm living with IPv6 being a second-class citizen on Linode, and with IPv6 itself being a bit of a pain as well.

I've got both servers with the relevant IPv6 addresses as additional IPs on the local interface:
Code:
iface lo inet6 loopback
	up ip -6 addr add 2a01:7e00:etc:etc/128 dev lo preferred_lft 0
Now the high-availability monitor can add the same address to eth0 on one of the servers - or add the pool address, if you've got a subnet routed to a pool IP.

The one quirk of this is that while traffic is still being routed to the old server, the old server will keep responding to existing traffic until the router's cache expires and points to the new server. A bigger disadvantage is that if you're doing this with pool addresses alongside other servers of your own, the NDP caches of those servers will keep sending traffic to the old server for quite a while! It won't stop until traffic to that IP has stopped flowing long enough for the cache entry to time out (a few minutes), or until you flush the cache entry for that destination IP.

This allows me to perform scheduled maintenance by turning off keepalived on the server ~30 minutes in advance. Sadly it doesn't help for real high availability: traffic will eventually flow to the right server after unexpected downtime, but not immediately.

The only reason I'm tolerating this as a solution is that it should happen very rarely, IPv6 traffic is a small percentage of overall traffic, and Happy Eyeballs will hopefully favour IPv4 until the IPv6 address is reachable again.

I don't like it, and I hope that Linode will at some point take this issue seriously. IPv6 just feels like an afterthought in multiple ways.


PostPosted: Mon Jul 04, 2016 12:18 am 
Newbie

Joined: Thu Aug 20, 2015 12:30 am
Posts: 4
I've been fighting all day to get an address from a /116 pool to quickly (or relatively quickly) fail over between two Linodes. It seems I've run into many of the same issues that have already been documented here, namely unsolicited NAs not propagating and failovers taking 30-45 seconds within a datacenter and 20+ minutes externally. Linode Support told me that IPv6 multicast is not supported (same as in 2014). The maintainers of keepalived suggested I use the VMAC feature, but I wasn't able to get it to work and I'm not sure Linode Support understood what I was asking. Has anyone made any progress on this? I can't believe it's been two years since this thread and the situation has not improved whatsoever!


PostPosted: Tue Dec 06, 2016 5:51 am 
Junior Member

Joined: Fri May 29, 2009 8:40 am
Posts: 41
It's still a problem which hasn't been fixed by Linode.

IPv4 high availability - easy. IPv6 high availability - don't bother with Linode.


PostPosted: Wed Dec 07, 2016 1:44 pm 
Senior Member

Joined: Mon Aug 29, 2011 2:34 am
Posts: 224
Linode presently prohibits sending traffic to the link-local all-nodes address (ff02::1), which is why arpsend/ndsend does not work for IPv6. You can, however, send to the link-local all-routers address (ff02::2), as well as the subnet all-routers address (this is host 0 for the subnet; e.g., for Dallas, this would be 2600:3c00::). You would probably need a custom script (you can put something together in Python with scapy), but sending override neighbour advertisements to the all-routers address should solve the issue with external failover. You'd need to send unicast neighbour advertisements for anything in-datacenter, or deal with the 40-second delay.

I think that Linode should remove the restriction on the link-local all nodes address, as one can easily flood the broadcast domain by other methods which have to be allowed in order for the intended purpose of those protocols to work. Note, however, that even if they do, it won't solve the in-datacenter problem, because Linodes can only receive traffic on addresses assigned to them (the periodic router advertisements that allow SLAAC to work are the only exception to this, and it should stay that way).

The only reason you don't see these same problems with IPv4 is because it's highly unlikely that you'll have another Linode with an IP address in the same subnet as your failover IP address, so all in-datacenter consumers of your failover-protected service go through the routers to reach the failover IP, and sending the gratuitous ARP on a failover event does reach the routers just fine. If you were doing IP failover in the IPv4 private network, you'd have the same problem as with in-datacenter IPv6, and would have to use unicast unsolicited ARP replies in order for failover to work correctly in that case as well.


PostPosted: Wed Nov 29, 2017 2:04 pm 
Senior Member

Joined: Mon Sep 29, 2014 4:47 pm
Posts: 127
Website: https://Feliciano.Tech
Location: New York City, USA
Twitter: FelicianoTech
Any update on this, Linode?

_________________
U.S. Navy Sailor and Developer Evangelist at CircleCI (formerly Linode). Write the Docs NYC organizer. Mets fan for life. Building the Linodians community.

Follow me on Twitter @FelicianoTech.


PostPosted: Sat Dec 02, 2017 7:31 am 
Junior Member

Joined: Fri May 29, 2009 8:40 am
Posts: 41
Ah yes, this old one.

I ended up doing what dwfreed suggested - and created a Python/scapy script which sends NA packets to ff02::2 (all-routers) and unicast NA packets to all other servers. Seems to work reasonably well.

I'll try and get it up on GitHub and PyPI this weekend.

