Linode Forum
Linode Community Forums
Posted: Fri Apr 05, 2013 10:25 pm
Hi everyone,

Linux newbie here. I need help debugging random connection timeouts between my app server and my database server.

Servers:
Linode 768, Debian 6 (64bit) - App server (www1)
Ruby 2
Rails 3 (with Rainbows! as server)
Sidekiq (async background message processor)
pgbouncer

Linode 512, Debian 6 (64bit) - DB server (db1)
Postgres 9.2
Sphinx Search
Redis 2.6.11 (with AOF persistence)

The two servers talk to each other over their private IPs. Redis is my main Rails cache store.


Problem:
Sometimes my application server throws errors like these:
Redis::TimeoutError (Connection timed out)
ActionView::Template::Error (Connection timed out):

It happens randomly, whether there are <10 or >60 people active on the site.
The strange thing is that my Postgres connection has NEVER had this problem (timing out).

Other things to note:
When I was still using memcached instead of Redis, I got the same random connection timeouts to memcached.
Likewise, when I was still using MySQL, my database connection never timed out.


Things I've tried:
I've monitored my servers with New Relic. CPU, memory, I/O, and bandwidth all look OK. Average response time is an acceptable 133 ms.
I've upgraded to the latest gems, Ruby, Redis, etc.
I've set Redis timeout = 0 and tcp-keepalive = 60. In the Redis "info" output, the rejected_connections stat is 0.
I've opened a support ticket, and they suggested I run an mtr, which looks OK:
Code:
mtr --report db1
HOST: www1                        Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. db1                           0.0%    10    0.5   0.5   0.4   0.8   0.1

Code:
mtr --report www1
HOST: db1                         Loss%   Snt   Last   Avg  Best  Wrst StDev
  1. www1                          0.0%    10    0.5   0.6   0.4   1.0   0.2 


However, I can't run an mtr while a timeout is happening: the timeouts are so random that I usually only see them afterwards in the Rails log.
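
Since the timeouts can't be caught by hand, one option would be to leave a small probe running on www1 that repeatedly opens a TCP connection to db1 and logs only the slow or failed attempts. The following is a rough sketch, not something from the setup above: the names probe and monitor, the 50 ms "slow" threshold, and port 6379 (the Redis default) are all assumptions. It only times the TCP handshake, so it says nothing about how quickly Redis answers a command.

```ruby
require "socket"
require "time"

# Attempt one TCP connection and report how long it took.
# Returns [status, elapsed_ms]; status is :ok or the exception class name.
def probe(host, port, timeout = 1.0)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  status = :ok
  begin
    Socket.tcp(host, port, connect_timeout: timeout).close
  rescue StandardError => e
    status = e.class.name
  end
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  [status, (elapsed * 1000.0).round(2)]
end

# Probe `samples` times and log only attempts that failed or were slow.
# Returns the number of bad samples.
def monitor(host, port, samples:, interval: 1.0, slow_ms: 50.0, out: $stdout)
  bad = 0
  samples.times do
    status, ms = probe(host, port)
    if status != :ok || ms > slow_ms
      bad += 1
      out.puts "#{Time.now.utc.iso8601} #{host}:#{port} status=#{status} ms=#{ms}"
    end
    sleep interval
  end
  bad
end

# Example (assumed host and port): one probe per second for an hour.
# monitor("db1", 6379, samples: 3600)
```

Comparing the timestamps it logs against the errors in the Rails log might show whether the network path or Redis itself is stalling.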


I hope I haven't left out any details. Any ideas on where to start pinpointing the problem?


Posted: Sat Apr 06, 2013 4:37 am
I'm moving Redis to the app server (localhost) to see whether that stops the problem.

EDIT:
I've been monitoring for 2 days so far, and the problem seems to have magically gone away after restarting both servers (both are now Linode 1GB) for the free NextGen upgrade.

I also upgraded my Linux kernel to the latest (3.8.4 x64) and ran aptitude safe-upgrade on all installed packages.

So at this point I have no idea whether it was fixed by the increased memory, the new machine/infrastructure, or something else.
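
If the timeouts come back, periodic snapshots of a few Redis "info" fields, taken alongside the Rails log, might help separate a memory problem from a network one. Below is a minimal sketch that parses the "key:value" text that "info" prints. The helper names and the choice of watched fields are my own assumptions, not a recommended set.

```ruby
# Parse the text printed by Redis "info" into a Hash of field => value strings.
# Section headers (lines starting with "#") and blank lines are skipped.
def parse_info(text)
  text.each_line.with_object({}) do |line, fields|
    line = line.strip
    next if line.empty? || line.start_with?("#")
    key, value = line.split(":", 2)
    fields[key] = value unless value.nil?
  end
end

# Fields to keep in each snapshot (an assumed selection).
WATCHED = %w[connected_clients rejected_connections used_memory latest_fork_usec]

# Reduce a full "info" dump to just the watched fields.
def snapshot(text, watched = WATCHED)
  info = parse_info(text)
  watched.each_with_object({}) { |name, snap| snap[name] = info[name] }
end

# Example, assuming redis-cli is on the PATH:
# puts "#{Time.now.utc} #{snapshot(`redis-cli info`)}"
```

Running this from cron every minute or so and appending the output to a file would leave a record to check the next time an error shows up.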

