Thank you for posting this. We've also experienced problems with KVM that may relate to yours. We've been in regular contact with Linode Support who are actively working on this.
In short, KVM nodes that are behaving correctly perform as well as or better than Xen, but we found that some KVM hosts perform extremely poorly with our platform.
We managed to isolate HHVM as the 'source' of the problem, with repro cases to 'prove' it. PHP is not affected, which probably explains the lack of KVM complaints. With the help of Linode devs, we ran perf top, which shows CPU time being consumed almost entirely by kernel-related tasks:
Code:
  7.91%  [kernel]                [k] trigger_load_balance
  7.15%  [kernel]                [k] native_write_msr_safe
  5.93%  [kernel]                [k] timerqueue_add
  5.91%  [kernel]                [k] run_posix_cpu_timers
  4.79%  [kernel]                [k] hrtimer_forward
  4.63%  libpthread-2.19.so      [.] __pthread_disable_asynccancel
  3.69%  libpthread-2.19.so      [.] __libc_send
  3.63%  [kernel]                [k] profile_tick
  3.43%  [kernel]                [k] task_cputime
  2.81%  [kernel]                [k] scheduler_tick
  2.73%  [kernel]                [k] hrtimer_interrupt
  2.60%  [kernel]                [k] fetch_task_cputime
  2.45%  libmemcached.so.10.0.0  [.] 0x0000000000017552
  2.41%  [kernel]                [k] perf_event_task_tick
  2.09%  [kernel]                [k] __run_hrtimer
  1.97%  hhvm                    [.] vio_write
  1.93%  [kernel]                [k] _raw_spin_lock
  1.54%  [kernel]                [k] tick_sched_timer
  1.48%  [kernel]                [k] _raw_spin_unlock
  1.40%  [kernel]                [k] apic_timer_interrupt
  1.37%  [kernel]                [k] x86_pmu_enable
  1.12%  [kernel]                [k] intel_pmu_enable_all
On a properly performing KVM node, perf top is dominated by the expected userspace tasks: HHVM, memcached, nginx, and so on.
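For anyone who wants to check their own node, the symptom is easy to spot. A rough sketch of what we ran (the 30-second sample duration is just an example; perf needs root, and the linux-tools package matching your kernel):

```shell
# Watch live symbol-level CPU usage. On a 'bad' host the top entries are
# kernel timer/scheduler symbols (marked [k]) rather than hhvm, nginx,
# memcached, etc. (marked [.]).
sudo perf top

# Or record a bounded system-wide sample that you can attach to a ticket:
sudo perf record -a -g -- sleep 30
sudo perf report --stdio --sort symbol | head -n 30
```
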
Stranger still, the problem is roughly 8 times more likely to occur in Newark and Dallas than in London or Atlanta. It's roughly 4 times more likely to occur in Fremont and Singapore, though we've not tested those datacentres as extensively.
We are talking about identical KVM nodes spun up from the same image. I cannot imagine why it would vary by location, which suggests a hardware or host-configuration aspect to this. It certainly appears to be host-related: for example, dallas1063 is always good and dallas1069 is always bad.
Unlike your experience, our KVM fleet continues to perform well after 30 days of uptime. However, we rely on automated node creation via the API, so we can't risk landing on 'bad' hosts. We've therefore switched back to Xen for new deployments, which performs impeccably as ever.
_________________
PlushForums - Beautiful, modern discussion forums.