| Linode Forum https://forum.linode.com/ |
|
| Random DNS problem? https://forum.linode.com/viewtopic.php?f=19&t=6439 |
| Author: | haus [ Tue Dec 21, 2010 12:33 pm ] |
| Post subject: | Random DNS problem? |
I'm looking for troubleshooting suggestions. I have a Perl script on another VPS that runs every night and uses Net::FTP to transfer a zip file to my Linode. This has been working perfectly every night for over a year. In the last couple of weeks, it has started to fail about half the time. The script reports "Net::FTP: Bad hostname 'www.MYLINODE.com'", where MYLINODE.com is a domain with an A record pointing to my Linode. The domain works fine all day long (I also have Apache, WordPress, and email working fine on it).
I am using Linode's nameservers for my DNS (ns1.linode.com, ns2.linode.com, etc.) and the TTL for my "www" record is 1 hour. On the remote side I am using my other VPS provider's DNS servers. It seems that in the middle of the night (1am PT; I recently tried switching to 3am PT), my other VPS sometimes can't resolve a DNS name that points to my Linode. I can't tell if the remote DNS server is unresponsive, if my Linode is down, or if nsX.linode.com is just not responding at that time of night. The script only tries once, since this has never been an issue until now.
I could simply plug the Linode's static IP address into the script, but I want to know why this is failing on principle. I'm also too old to stay up until 1am and do any "live" troubleshooting. Every time I run the script manually during the day, it works fine with no errors. I can't reproduce this from 7am to 11pm.
I am running the CSF/LFD firewall on both hosts, but that doesn't explain the random nature of this failure (and both IP addresses are whitelisted on each box anyway). Any suggestions on where to start narrowing this down? |
|
| Author: | Stever [ Tue Dec 21, 2010 2:50 pm ] |
| Post subject: | |
It is tough to troubleshoot when you are using a third-party recursive server. Sometimes DNS issues like this will go away if you try more than once, since the response to the original request comes in and gets cached by the recursive server after you have already timed out. To see whether your Linode DNS is causing the problems, you might run one of these before your script (they both bypass your recursive DNS server and talk directly to the authoritative nameservers):
Code:
dig +trace www.MYLINODE.com
Code:
dig +nssearch www.MYLINODE.com
(You might have to install dig) |
|
| Author: | haus [ Tue Dec 21, 2010 5:19 pm ] |
| Post subject: | |
Thanks Stever, I will give that a try. I think you meant 'nosearch' rather than 'nssearch'. I haven't studied Net::FTP much to see how long it waits for a reply, etc. I also thought about just adding a quick sleep-and-try-again routine if the first connection fails. I'm guessing any of these might help. For me it's more about "why did it start failing, and then only some of the time?" I can't stand things like that. |
|
| Author: | Stever [ Tue Dec 21, 2010 7:11 pm ] |
| Post subject: | |
haus wrote: I think you meant 'nosearch' rather than 'nssearch'.
Nope, meant it exactly as it was typed. 'nssearch' hits ALL the authoritative nameservers and tells you how long they took to respond.
man dig wrote: +[no]nssearch When this option is set, dig attempts to find the authoritative name servers for the zone containing the name being looked up and display the SOA record that each name server has for the zone.
Code:
$ dig +nssearch linode.com |
|
| Author: | haus [ Tue Dec 21, 2010 7:56 pm ] |
| Post subject: | |
Never mind, I didn't look hard enough. I see it now. Sorry! When I query subdomain.MYLINODE.com I get nothing back. When I query MYLINODE.com I get results like the ones you posted. |
|
| Author: | Guspaz [ Wed Dec 22, 2010 11:36 am ] |
| Post subject: | |
I'm going to be obvious here and point out that neither mylinode.com nor subdomain.mylinode.com is an extant domain. As hoopycat would point out, it's difficult to help you when you're giving us fake information. |
|
| Author: | haus [ Wed Dec 22, 2010 12:02 pm ] |
| Post subject: | |
I'm not willing to provide my real domain name to an open troubleshooting forum. Obviously there's a chance this is a configuration issue specific to the domain in question, but since this is a new intermittent failure I'm guessing it relates more to a broader issue (something changed elsewhere beyond my immediate control). Particularly as I haven't made any DNS changes in over 6 months for any of the VPSes or domains in question.
Anyway, I "solved" the issue by adding a quick loop in my script that tries the FTP connection up to 5 times with a short break in between. Last night it failed on the first try and then succeeded on the second attempt. So thanks again to Stever for the suggestions and for helping me learn something new. If I manage to sort out the issue for real someday I'll post the solution, but for now this will work. |
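The retry loop described above can be sketched roughly as follows. This is a Python illustration, not the poster's Perl code: the function name `connect_with_retry`, the 10-second default delay, and the injectable `connect` and `sleep` callables are all assumptions made for the example. Only the "up to 5 attempts with a short break" behavior comes from the post.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 5,            # the post mentions "up to 5 times"
    delay_seconds: float = 10.0,  # assumed value; the post only says "a short break"
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds or the attempts run out.

    Only OSError is retried. In Python, a failed hostname lookup
    (socket.gaierror) is a subclass of OSError, as are connection
    timeouts and refusals.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except OSError as exc:
            last_error = exc
            print(f"attempt {attempt}/{attempts} failed: {exc}")
            if attempt < attempts:
                sleep(delay_seconds)  # pause before the next try
    raise last_error
```

With Python's standard `ftplib`, the callable could be something like `lambda: ftplib.FTP("www.MYLINODE.com", timeout=30)`. Printing each failed attempt keeps a record of how often the first lookup fails, which a silent retry would hide.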
|
| Author: | Guspaz [ Wed Dec 22, 2010 3:25 pm ] |
| Post subject: | |
haus wrote: I'm not willing to provide my real domain name to an open troubleshooting forum.
Obviously there's a chance this is a configuration issue specific to the domain in question, but since this is a new intermittent failure I'm guessing it relates more to a broader issue (something changed elsewhere beyond my immediate control). Particularly as I haven't made any DNS changes in over 6 months for any of the VPSes or domains in question.
Yes, but it also means that nobody else can reproduce the problem on their own Linode. |
|
| Author: | haus [ Wed Dec 22, 2010 4:02 pm ] |
| Post subject: | |
Anyone can test this. It's a Perl script. Plug in values for $ftp_hostname, $ftp_port, and $ftp_passive, and run it from cron or the command line.
Code:
#!/usr/local/bin/perl
That's just a snippet pulled from my original code, which would result in a "bad hostname" error about half the time when run in the wee hours of the morning. Again, just for clarity, this script is running on a different host, trying to connect via FTP to my Linode. The script only runs once per day, so I suspect this may relate to a DNS cache (which might explain why it works all day long when I try it at the command line; the lookup has already occurred so it is now cached for the day, even though the script may not have waited long enough for the query to finish). |
|
| Author: | carmp3fan [ Wed Dec 22, 2010 5:47 pm ] |
| Post subject: | |
Out of curiosity, if you are giving the server a public DNS name or public IP address, then does it really matter whether you are posting in an open forum? The server is already public. Not posting the DNS name is simply security through obscurity. |
|
| Author: | haus [ Wed Dec 22, 2010 5:51 pm ] |
| Post subject: | |
Obscurity doesn't provide security but it does preclude identity. |
|
| Author: | Guspaz [ Wed Dec 22, 2010 7:07 pm ] |
| Post subject: | |
If it's on another box, it could be that the DNS server used on that box is flaky. You may want to try changing the DNS server to something else (such as Google's Public DNS at 8.8.8.8 or 8.8.4.4) and seeing if the problem still occurs. |
|
| Author: | haus [ Wed Dec 22, 2010 7:14 pm ] |
| Post subject: | |
Yes, that's a great idea. Thank you. I'm also going to learn how to do more detailed logging of DNS queries on that box so hopefully I can get more info than just "bad hostname". |
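One rough way to get more than "bad hostname" out of the nightly run is to time each lookup and record the exact resolver error. The sketch below is a Python illustration under assumptions: the function names `resolve_addresses` and `log_lookup` and the log line layout are invented for the example, and it goes through whatever resolver the box is configured to use rather than querying a specific nameserver.

```python
import socket
import time
from datetime import datetime, timezone


def resolve_addresses(hostname):
    """Resolve hostname via the system resolver; return its unique IP addresses."""
    infos = socket.getaddrinfo(hostname, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def log_lookup(hostname, resolve=resolve_addresses):
    """Return one log line: UTC timestamp, hostname, elapsed time, result or error."""
    started = time.monotonic()
    try:
        outcome = "OK " + ",".join(resolve(hostname))
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        outcome = f"FAIL {exc!r}"
    elapsed_ms = (time.monotonic() - started) * 1000
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return f"{stamp} {hostname} {elapsed_ms:.0f}ms {outcome}"
```

Appending the returned line to a file from cron shortly before the transfer would show whether the failures are slow timeouts or immediate errors, which may help tell an unresponsive recursive server apart from a missing answer.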
|
| All times are UTC-04:00 |