Hi all,
Thanks for taking the time to read this post, and thanks in advance for any help you can offer. I opened a ticket with Linode Support, but they "politely" told me to come here for support. Without dwelling too much on it, I'll just say I was insulted and disappointed that they wouldn't take the time to assist a paying customer. That was not what I've come to expect from Linode.
So I'm hoping you guys can help pick up their slack. Again, any help you can provide is appreciated.
------
I don't recall exactly when this started, but if I had to guess I'd say sometime in the last month or so.
On reboot, my Atlanta Linode comes up with its time (presumably) synchronized to the host system's clock, and then ntpdate/ntpd fires and corrects it. The correction is a backwards time jump, which seems to affect Dovecot the most, but I've also had SSH connections blocked until the system settles.
I have a second Linode in another data center which, after checking, I found is off by 2 minutes. I compared the ntp.conf files between the two and found them to be identical. To prove to myself that the issue is not directly related to the NTP configuration, I booted the Atlanta node to a configuration profile that has init=/bin/bash and ran 'date'. This was an attempt to boot the Linode without networking support and before other processes had loaded, so I could see what the host clock was set to. The assumption on my part is that the VM and the host are synchronized at boot, although I have no proof either way.
To the point, I took the value given by running the date command and compared it against my local laptop (which also uses NTP), several phones that sync against cell networks, and my other Linode with a properly synchronized clock, and found the Atlanta node to be off by about 5 minutes.
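A comparison like this can also be scripted once networking is up (so not from the init=/bin/bash profile). The following is only a minimal Python sketch of an SNTP query that measures the offset and never sets the clock. The function names and the default server `pool.ntp.org` are assumptions chosen for illustration, not something taken from my setup.

```python
import socket
import struct
import time

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_EPOCH_DELTA = 2208988800


def _to_unix(seconds, fraction):
    """Convert a 64-bit NTP timestamp (two 32-bit words) to Unix seconds."""
    return seconds - NTP_EPOCH_DELTA + fraction / 2**32


def offset_from_reply(packet, t1, t4):
    """Return the estimated clock offset in seconds.

    t1 and t4 are the local send and receive times (Unix seconds).
    A negative result means the local clock is ahead of the server.
    """
    if len(packet) < 48:
        raise ValueError("short NTP reply")
    # Bytes 32-39: server receive timestamp, bytes 40-47: server transmit timestamp.
    rx_s, rx_f, tx_s, tx_f = struct.unpack("!4I", packet[32:48])
    t2 = _to_unix(rx_s, rx_f)
    t3 = _to_unix(tx_s, tx_f)
    return ((t2 - t1) + (t3 - t4)) / 2


def measure_offset(host="pool.ntp.org", timeout=5.0):
    """Query one server (hypothetical default host) and return the offset."""
    request = b"\x1b" + 47 * b"\0"  # LI=0, version 3, mode 3 (client)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        t1 = time.time()
        sock.sendto(request, (host, 123))
        packet, _ = sock.recvfrom(512)
        t4 = time.time()
    return offset_from_reply(packet, t1, t4)


if __name__ == "__main__":
    print(f"offset: {measure_offset():+.3f} sec")
```

The sign convention matches ntpdate: a clock that is roughly 5 minutes fast should report an offset of about -300 seconds.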
From everything I've read this means the time has to be corrected in "BIOS", which for the Xen VPS setup I assume Linode runs would mean the host system that my Atlanta Linode runs on. Linode Support tells me that they've confirmed the host system's clock is correct.
By this point I've read various forum postings, searched via Google and read the official Dovecot wiki page here:
http://wiki2.dovecot.org/TimeMovedBackwards
I've been running ntpd for years now, presumably without this issue (although I wasn't running many services that would have immediately complained), so installing/configuring ntpd has already been taken care of.
Here's an original log snippet before I made changes to ntp.conf:
Code:
May 3 20:21:12 atlanta ntpd[1226]: ntpd 4.2.6p3@1.2290-o Tue Jun 5 20:12:11 UTC 2012 (1)
May 3 20:21:12 atlanta ntpd[1227]: proto: precision = 0.972 usec
May 3 20:21:12 atlanta ntpd[1227]: ntp_io: estimated max descriptors: 1024, initial socket boundary: 16
May 3 20:21:12 atlanta ntpd[1227]: unable to bind to wildcard address 0.0.0.0 - another process may be running - EXITING
May 3 20:21:12 atlanta dovecot: master: Dovecot v2.0.19 starting up (core dumps disabled)
May 3 20:21:13 atlanta dovecot: lmtp(1271): Connect from local
May 3 20:21:13 atlanta dovecot: auth-worker: mysql(127.0.0.1): Connected to database mailserver_v1
May 3 20:15:59 atlanta ntpdate[671]: step time server 50.116.38.157 offset -314.012644 sec
May 3 20:15:59 atlanta ntpd[1302]: ntpd 4.2.6p3@1.2290-o Tue Jun 5 20:12:11 UTC 2012 (1)
May 3 20:15:59 atlanta ntpd[1303]: proto: precision = 0.974 usec
May 3 20:15:59 atlanta ntpd[1303]: ntp_io: estimated max descriptors: 1024, initial socket boundary: 16
May 3 20:15:59 atlanta ntpd[1303]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
May 3 20:15:59 atlanta ntpd[1303]: Listen and drop on 1 v6wildcard :: UDP 123
May 3 20:15:59 atlanta ntpd[1303]: Listen normally on 2 lo 127.0.0.1 UDP 123
May 3 20:15:59 atlanta ntpd[1303]: Listen normally on 3 eth0 74.207.228.106 UDP 123
May 3 20:15:59 atlanta ntpd[1303]: Listen normally on 4 eth0 fe80::fcfd:4aff:fecf:e46a UDP 123
May 3 20:15:59 atlanta ntpd[1303]: Listen normally on 5 lo ::1 UDP 123
May 3 20:15:59 atlanta ntpd[1303]: peers refreshed
May 3 20:15:59 atlanta ntpd[1303]: Listening on routing socket on fd #22 for interface updates
May 3 20:15:59 atlanta dovecot: log: Warning: Time moved backwards by 314 seconds.
May 3 20:15:59 atlanta dovecot: master: Warning: Time moved backwards by 314 seconds, waiting for 180 secs until new services are launched again.
May 3 20:15:59 atlanta dovecot: ssl-params: Warning: Time moved backwards by 313 seconds.
May 3 20:16:00 atlanta dovecot: anvil: Warning: Time moved backwards by 313 seconds.
May 3 20:16:00 atlanta dovecot: auth: Warning: Time moved backwards by 313 seconds.
May 3 20:16:00 atlanta dovecot: config: Warning: Time moved backwards by 313 seconds.
May 3 20:16:00 atlanta dovecot: lmtp(1271): Fatal: Time just moved backwards by 313 seconds. This might cause a lot of problems, so I'll just kill myself now. http://wiki2.dovecot.org/TimeMovedBackwards
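The backwards step is visible in the syslog timestamps themselves: they go from 20:21:13 to 20:15:59. A minimal Python sketch for spotting such steps in a log file follows. The function name is hypothetical, and because syslog lines carry no year, the `year` parameter is a placeholder that only has to be the same for all lines.

```python
from datetime import datetime


def find_backward_jumps(lines, year=2000):
    """Return (previous, current, seconds_back) for each backwards timestamp step.

    `year` is a placeholder: classic syslog timestamps omit the year, and any
    fixed value works as long as the log does not cross a year boundary.
    """
    jumps = []
    previous = None
    for line in lines:
        parts = line.split(None, 3)
        if len(parts) < 3:
            continue
        stamp = f"{year} {parts[0]} {parts[1]} {parts[2]}"
        try:
            current = datetime.strptime(stamp, "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # not a syslog-style line
        if previous is not None and current < previous:
            seconds_back = int((previous - current).total_seconds())
            jumps.append((previous, current, seconds_back))
        previous = current
    return jumps


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as log:
        for before, after, seconds in find_backward_jumps(log):
            print(f"{before:%b %d %H:%M:%S} -> {after:%H:%M:%S}: back {seconds} sec")
```

Run against the snippet above, this should flag a single step of 314 seconds, which agrees with the offset ntpdate reported.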
I attempted to work around the early exit and restart of ntpd by adding a line to ignore the wildcard interface, but that caused synchronization to fail. Reverting to the previous configuration restored synchronization, although the early exit and restart of ntpd came back with it.
root@atlanta:~# ntpq -p
Code:
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
+palpatine.steve 231.146.174.254  3 u    9   64  377   61.853    0.154   1.585
+samur.ulak.net. 192.36.144.23    2 u    5   64  377  145.905   -1.958   1.479
*time3.chpc.utah 132.163.4.103    2 u   67   64  376   56.198    1.018   1.111
-time01.muskegon 64.113.32.5      2 u   13   64  377   60.461    3.472   1.545
-europium.canoni 140.203.204.77   2 u    1   64  377   87.009   -4.037   0.870
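The offset column in this output is in milliseconds, so once ntpd is running the clock is within a few milliseconds of every peer. A minimal Python sketch for pulling that out of `ntpq -p` output follows. The function names are hypothetical, and as a simplification it only recognizes the tally characters `*`, `+`, `-` and `#`.

```python
def parse_ntpq_peers(text):
    """Parse `ntpq -p` output into a list of per-peer dicts."""
    peers = []
    for line in text.splitlines():
        fields = line.split()
        # Peer rows have exactly 10 columns; this skips the header and ruler.
        if len(fields) != 10 or fields[0] == "remote":
            continue
        name = fields[0]
        tally = " "
        # Simplification: only tally characters that cannot start a hostname.
        if name[0] in "*+-#":
            tally, name = name[0], name[1:]
        peers.append({
            "tally": tally,
            "remote": name,
            "refid": fields[1],
            "reach": int(fields[6], 8),     # reach is printed in octal
            "offset_ms": float(fields[8]),  # offset and jitter are in ms
            "jitter_ms": float(fields[9]),
        })
    return peers


def summarize(peers):
    """Return (selected_peer_or_None, largest_absolute_offset_ms_or_None)."""
    selected = next((p for p in peers if p["tally"] == "*"), None)
    worst = max((abs(p["offset_ms"]) for p in peers), default=None)
    return selected, worst


if __name__ == "__main__":
    import subprocess

    output = subprocess.run(["ntpq", "-p"], capture_output=True, text=True).stdout
    selected, worst = summarize(parse_ntpq_peers(output))
    print("selected peer:", selected["remote"] if selected else "none")
    print("largest |offset| (ms):", worst)
```

For the output above, the selected peer is time3.chpc.utah at 1.018 ms and the largest absolute offset is 4.037 ms, nowhere near the roughly 314 seconds seen at boot.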
root@atlanta:~# grep -Ev '#|^$' /etc/ntp.conf
Code:
driftfile /var/lib/ntp/ntp.drift
statistics loopstats peerstats clockstats
filegen loopstats file loopstats type day enable
filegen peerstats file peerstats type day enable
filegen clockstats file clockstats type day enable
server 0.ubuntu.pool.ntp.org
server 1.ubuntu.pool.ntp.org
server 2.ubuntu.pool.ntp.org
server 3.ubuntu.pool.ntp.org
server ntp.ubuntu.com
restrict -4 default nomodify nopeer noquery notrap
restrict -6 default nomodify nopeer noquery notrap
restrict 127.0.0.1
restrict ::1
I'm not sure what else to try. I found the /etc/init/hwclock-save upstart job and ran it, but it errored out. Light research suggests that the /dev/rtc0 device does not exist on Xen VMs. That's too bad, as I was hoping I could save the updated time to a local clock (however it is virtualized) so that the time would be current on the next boot.