Linode Forum
Linode Community Forums
 Post subject: Random DNS problem?
Posted: Tue Dec 21, 2010 12:33 pm
Senior Member

Joined: Wed Mar 03, 2010 2:04 pm
Posts: 111
I'm looking for troubleshooting suggestions. I have a Perl script on another VPS that runs every night and uses Net::FTP to transfer a zip file to my Linode.

This has been working perfectly every night for over a year.

In the last couple of weeks, it has started to fail about half the time. The script reports "Net::FTP: Bad hostname 'www.MYLINODE.com'" (where MYLINODE.com is a domain with an A record pointing to my Linode; it works fine all day long, and I also have Apache, WordPress, and email working fine on this domain). I am using Linode's nameservers for my DNS (ns1.linode.com, ns2.linode.com, etc.) and the TTL for my "www" record is 1 hour. On the remote side I am using my other VPS provider's DNS servers.

It seems that in the middle of the night (the job ran at 1am PT; I recently tried switching to 3am PT), my other VPS sometimes can't resolve a DNS name that points to my Linode. I can't tell whether the remote DNS server is unresponsive, my Linode is down, or nsX.linode.com is just not responding at that time of night. The script only tries once, since this has never been an issue until now.

I could simply plug the Linode's static IP address into the script, but I kind of want to know why this is failing on principle. I'm also too old to stay up until 1am and do any "live" troubleshooting. Every time I run the script manually during the day, it works fine with no errors. I can't reproduce this from 7am to 11pm. I am running CSF/LFD firewall on both hosts, but that doesn't explain the random nature of this failure (and both IP addresses are whitelisted on each box anyway).

Any suggestions on where to start narrowing this down?


Posted: Tue Dec 21, 2010 2:50 pm
Senior Member

Joined: Fri Dec 07, 2007 1:37 am
Posts: 385
Location: NC, USA
It is tough to troubleshoot when you are using a third-party recursive server. Sometimes DNS issues like this will go away if you try more than once, since the response to the original request comes in and gets cached by the recursive server after you have already timed out.

To see whether your Linode DNS is causing the problems, you might run one of these before your script (they both bypass your recursive DNS server and talk directly to the authoritative nameservers):
Code:
 dig +trace www.MYLINODE.com
Code:
 dig +nssearch www.MYLINODE.com

(You might have to install dig)
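
One way to act on this, sketched under assumptions: wrap each check in a small shell function that appends timestamped output to a log, and call it from the cron job just before the transfer. The function name `log_cmd` and the log path in the usage note are made up for illustration; only the `dig` invocations come from the post above.

```shell
# Append a timestamped header, the command's output, and its exit status
# to a log file. Returns the command's own exit status.
# Usage: log_cmd LOG_FILE COMMAND [ARGS...]
log_cmd() {
    local log_file="$1" status
    shift
    {
        printf '=== %s: %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" "$*"
        "$@" 2>&1
        status=$?
        printf '=== exit status: %s\n' "$status"
    } >> "$log_file"
    return "$status"
}
```

For example, `log_cmd /tmp/dns-check.log dig +trace www.MYLINODE.com` followed by the nightly script leaves a record of what the authoritative servers said at the moment of the failure, which can be read the next morning.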


Posted: Tue Dec 21, 2010 5:19 pm
Senior Member

Joined: Wed Mar 03, 2010 2:04 pm
Posts: 111
Thanks Stever,

I will give that a try. I think you meant 'nosearch' rather than 'nssearch'. I haven't studied Net::FTP much to see how long it waits for a reply, etc. I also thought about just adding a quick sleep-and-try-again routine if the first connection fails. I'm guessing any of these might help.

It's more about "why did it start failing and then only some of the time". I can't stand things like that. :) I'd love to be able to point to one thing and say "here's what is happening and how to fix it".


Posted: Tue Dec 21, 2010 7:11 pm
Senior Member

Joined: Fri Dec 07, 2007 1:37 am
Posts: 385
Location: NC, USA
haus wrote:
I think you meant 'nosearch' rather than 'nssearch'.

Nope, meant it exactly as it was typed. 'nssearch' hits ALL the authoritative nameservers and tells you how long they took to respond.
man dig wrote:
+[no]nssearch
When this option is set, dig attempts to find the authoritative name servers for the zone containing the name being looked up and display the SOA record that each name server has for the zone.
Code:
 $ dig +nssearch linode.com
SOA ns1.linode.com. dns.linode.com. 2010122118 7200 3600 604800 86400 from server ns3.linode.com in 33 ms.
SOA ns1.linode.com. dns.linode.com. 2010122118 7200 3600 604800 86400 from server ns4.linode.com in 36 ms.
SOA ns1.linode.com. dns.linode.com. 2010122118 7200 3600 604800 86400 from server ns1.linode.com in 71 ms.
SOA ns1.linode.com. dns.linode.com. 2010122118 7200 3600 604800 86400 from server ns2.linode.com in 103 ms.
SOA ns1.linode.com. dns.linode.com. 2010122118 7200 3600 604800 86400 from server ns5.linode.com in 113 ms.
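
To pick out a slow or missing nameserver from output like the above, the timings can be pulled out with a short filter. This is a minimal sketch that assumes the line format shown above ("... from server NAME in N ms."); `nssearch_times` is a hypothetical helper name, not part of dig.

```shell
# Read `dig +nssearch` output on stdin and print "server milliseconds",
# slowest server first.
nssearch_times() {
    awk '/ from server / {
        for (i = 1; i <= NF; i++)
            if ($i == "server") print $(i + 1), $(i + 3)
    }' | sort -k2,2nr
}
```

It would be used as `dig +nssearch linode.com | nssearch_times`. A nameserver that is absent from the list did not answer at all, which is the case worth logging overnight.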


Posted: Tue Dec 21, 2010 7:56 pm
Senior Member

Joined: Wed Mar 03, 2010 2:04 pm
Posts: 111
Never mind, I didn't look hard enough. I see it now. Sorry!

When I do subdomain.MYLINODE.com I get nothing back. When I do MYLINODE.com I get results like the ones you posted.


Posted: Wed Dec 22, 2010 11:36 am
Senior Member

Joined: Tue May 26, 2009 3:29 pm
Posts: 1691
Location: Montreal, QC
I'm going to be obvious here and point out that neither mylinode.com nor subdomain.mylinode.com is an extant domain. As hoopycat would point out, it's difficult to help you when you're giving us fake information.


Posted: Wed Dec 22, 2010 12:02 pm
Senior Member

Joined: Wed Mar 03, 2010 2:04 pm
Posts: 111
I'm not willing to provide my real domain name to an open troubleshooting forum.

Obviously there's a chance this is a configuration issue specific to the domain in question, but since this is a new intermittent failure I'm guessing it relates more to a broader issue (something changed elsewhere beyond my immediate control), particularly as I haven't made any DNS changes in over 6 months for any of the VPSes or domains in question.

Anyway, I "solved" the issue by adding a quick loop in my script that tries the FTP connection up to 5 times with a short break in between. Last night it failed on the first try and then succeeded on the second attempt. So thanks again to Stever for the suggestions and helping me learn something new.

If I manage to sort out the issue for real someday I'll post the solution, but for now this will work.
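
The loop itself was added inside the Perl script and is not shown in the thread. The same retry-with-pause idea can be sketched as a shell wrapper around the cron command instead. This is an illustration only: `retry` is a hypothetical helper, and it assumes the wrapped command exits non-zero when the FTP connection fails.

```shell
# Run a command until it succeeds, at most MAX_TRIES times, sleeping
# DELAY_SECONDS between attempts.
# Usage: retry MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
retry() {
    local max_tries="$1" delay="$2" attempt=1
    shift 2
    while true; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_tries" ]; then
            echo "retry: giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "retry: attempt $attempt failed; sleeping ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}
```

A call such as `retry 5 10 perl transfer.pl` (the script name is a placeholder) matches the "up to 5 times with a short break" behavior described above. The messages on stderr also record how often the first attempt fails, which is useful evidence while the root cause is still unknown.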


Posted: Wed Dec 22, 2010 3:25 pm
Senior Member

Joined: Tue May 26, 2009 3:29 pm
Posts: 1691
Location: Montreal, QC
haus wrote:
I'm not willing to provide my real domain name to an open troubleshooting forum.

Obviously there's a chance this is a configuration issue specific to the domain in question, but since this is a new intermittent failure I'm guessing it relates more to a broader issue (something changed elsewhere beyond my immediate control), particularly as I haven't made any DNS changes in over 6 months for any of the VPSes or domains in question.


Yes, but it also means that nobody else can reproduce the problem on their own linode.


Posted: Wed Dec 22, 2010 4:02 pm
Senior Member

Joined: Wed Mar 03, 2010 2:04 pm
Posts: 111
Anyone can test this. It's a Perl script. Plug in values for $ftp_hostname, $ftp_port, and $ftp_passive, and run it from cron or the command line.

Code:
#!/usr/local/bin/perl

use strict;
use warnings;
use Net::FTP;

my $ftp_hostname = ''; # FTP host name
my $ftp_port = '21'; # typical value
my $ftp_passive = 0; # change to 1 for passive mode

# new() returns undef on failure and leaves the reason in $@
my $ftp = Net::FTP->new($ftp_hostname, Port => $ftp_port, Passive => $ftp_passive);

print "Content-type: text/html\n\n";
if (!$ftp) {
     print "FTP connection failed: $@\n";
} else {
     print "FTP connection successful.\n";
     $ftp->quit; # close the control connection cleanly
}
exit;


That's just a snippet pulled from my original code, which would result in a "bad hostname" error about half the time when run in the wee hours of the morning. Again just for clarity, this script is running on a different host, trying to connect via FTP to my linode.

The script only runs once per day, so I suspect this may relate to a DNS cache (which might explain why it works all day long when I try it at the command line; the lookup has already occurred so it is now cached for the day, even though the script may not have waited long enough for the query to finish).


Posted: Wed Dec 22, 2010 5:47 pm
Senior Member

Joined: Sat Feb 14, 2009 1:32 am
Posts: 123
Out of curiosity, if you are giving the server a public DNS name or public IP address, then does it really matter whether you are posting in an open forum? The server is already public. Not posting the DNS name is simply security through obscurity.


Posted: Wed Dec 22, 2010 5:51 pm
Senior Member

Joined: Wed Mar 03, 2010 2:04 pm
Posts: 111
Obscurity doesn't provide security but it does preclude identity.


Posted: Wed Dec 22, 2010 7:07 pm
Senior Member

Joined: Tue May 26, 2009 3:29 pm
Posts: 1691
Location: Montreal, QC
If it's on another box, it could be the DNS server used on that box is flaky. You may want to try changing the DNS server to something else (such as Google's Public DNS at 8.8.8.8 or 8.8.4.4) and seeing if the problem still occurs.
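
Before changing the box's resolver configuration, the candidates can be compared side by side from the command line. The following is a rough sketch that assumes dig is installed; `compare_resolvers` and the `DIG` override variable are hypothetical names introduced here, not existing tools.

```shell
# Ask each listed resolver for the A record of NAME and print one line per
# resolver: "resolver<TAB>answer", or NO ANSWER on timeout or empty reply.
# Set DIG to substitute another command for dig (used for testing).
# Usage: compare_resolvers NAME RESOLVER [RESOLVER...]
compare_resolvers() {
    local name="$1" resolver answer
    shift
    for resolver in "$@"; do
        # +time/+tries keep a dead resolver from stalling the loop;
        # lines starting with ';' are dig diagnostics, not answers.
        answer=$("${DIG:-dig}" +short +time=3 +tries=1 "@$resolver" "$name" A 2>/dev/null |
            awk '!/^;/ { printf "%s%s", sep, $0; sep = " " }')
        printf '%s\t%s\n' "$resolver" "${answer:-NO ANSWER}"
    done
}
```

Running `compare_resolvers www.MYLINODE.com 8.8.8.8 8.8.4.4` together with the address of the VPS provider's own resolver, at the hour the job fails, should show whether only the provider's resolver returns NO ANSWER.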


Posted: Wed Dec 22, 2010 7:14 pm
Senior Member

Joined: Wed Mar 03, 2010 2:04 pm
Posts: 111
Yes, that's a great idea. Thank you. I'm also going to learn how to do more detailed logging of DNS queries on that box so hopefully I can get more info than just "bad hostname".
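
A lightweight starting point for that logging, offered as a sketch rather than a finished tool: resolve the name through the system resolver at intervals overnight and record the result and how long it took. It assumes a Linux box where `getent hosts` is available; `log_lookup` and the `LOOKUP_CMD` override are hypothetical names.

```shell
# Resolve NAME through the system resolver and append one line to LOG_FILE:
# UTC timestamp, name, OK or FAIL, elapsed whole seconds, first answer line.
# Set LOOKUP_CMD to substitute another command for getent (used for testing).
# Usage: log_lookup NAME LOG_FILE
log_lookup() {
    local name="$1" log_file="$2" start end answer status first_line
    start=$(date +%s)
    if answer=$("${LOOKUP_CMD:-getent}" hosts "$name" 2>&1); then
        status=OK
    else
        status=FAIL
    fi
    end=$(date +%s)
    first_line=$(printf '%s\n' "$answer" | head -n 1)
    printf '%s %s %s %ss %s\n' "$(date -u '+%Y-%m-%dT%H:%M:%SZ')" \
        "$name" "$status" "$((end - start))" "${first_line:-<no answer>}" >> "$log_file"
}
```

Called from cron every few minutes during the small hours, for example `log_lookup www.MYLINODE.com /tmp/lookup.log`, this shows whether failures cluster at particular times and whether they coincide with multi-second lookups, which would point at the recursive server timing out rather than at the Linode.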

