14.7. Problem Symptoms

Some problems, unfortunately, aren't as easy to identify as the ones we listed. You'll experience some misbehavior but won't be able to attribute it directly to its cause, often because any of a number of problems can cause the symptoms you see. For cases like this, we'll suggest some of the common causes of these symptoms and ways to isolate them.

14.7.1. Local Name Can't Be Looked Up

The first thing to do when a program such as ssh or ftp can't look up a local domain name is to use nslookup or dig to try to look up the same name. When we say "the same name," we mean literally the same name: don't add labels and a trailing dot if the user didn't type them. Don't query a different nameserver than the user did.

As often as not, the user mistyped the name or doesn't understand how the search list works and just needs direction. Occasionally, you'll turn up real host configuration errors:

  • Syntax errors in resolv.conf (problem 11 in the earlier section "Potential Problem List")

  • An unset local domain name (problem 12)

You can check for either of these using nslookup's set all command.
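To see why querying "literally the same name" matters, here is a rough Python sketch of how a resolver might expand a typed name using the search list. The function name candidate_names, the ndots threshold of 1, and the sample search list are assumptions for illustration; real resolvers differ in the details.

```python
def candidate_names(typed, search_list, ndots=1):
    """Return the fully qualified names a resolver might try, in order.

    Illustrative sketch only: the ndots default and the ordering are
    assumptions modeled on common resolver behavior.
    """
    # A trailing dot marks the name as fully qualified, so the
    # search list is not applied at all.
    if typed.endswith("."):
        return [typed]

    expanded = [f"{typed}.{domain}." for domain in search_list]
    as_is = typed + "."

    # Names with at least `ndots` dots are tried as typed first,
    # then with each search-list domain appended.
    if typed.count(".") >= ndots:
        return [as_is] + expanded

    # Names with fewer dots go through the search list first.
    return expanded + [as_is]


if __name__ == "__main__":
    search = ["movie.edu"]  # sample search list
    for name in ("wormhole", "wormhole.movie.edu", "wormhole.movie.edu."):
        print(name, "->", candidate_names(name, search))
```

Adding or removing a trailing dot changes the list of names tried, which is why your test query should match what the user typed character for character.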

If nslookup points to a problem with the nameserver rather than with the host configuration, check for the problems associated with the type of nameserver. If the nameserver is the primary for the zone, but it isn't responding with data you think it should:

  • Check that the zone datafile contains the data in question and that the nameserver has loaded it (problem 2). A database dump can tell you for sure whether the data was loaded.

  • Check the configuration file and the pertinent zone datafile for syntax errors (problem 5). Check the nameserver's syslog output for indications of those errors.

  • Ensure that the records have trailing dots, if they require them (problem 6).

If the nameserver is a slave server for the zone, you should first check whether its master has the correct data. If it does, and the slave doesn't:

  • Make sure you've incremented the serial number on the primary (problem 1).

  • Look for a problem on the slave in updating the zone (problem 3).

If the primary doesn't have the correct data, of course, diagnose the problem on the primary.

If the problem server is a caching-only nameserver:

  • Make sure it has its root hints (problem 7).

  • Check that your parent zone's delegation to your zone exists and is correct (problems 9 and 10). Remember that to a caching-only server, your zone looks like any other remote zone. Even though the host it runs on may be inside your zone, the caching-only nameserver must be able to locate an authoritative server for your zone from your parent zone's servers.

14.7.2. Remote Names Can't Be Looked Up

If your local lookups succeed but you can't look up domain names outside your local zones, there is a different set of problems to check:

  • First, did you just set up your nameservers? You might have omitted the root hints data (problem 7).

  • Can you ping the remote zone's nameservers? Maybe you can't reach the remote zone's servers because of connectivity loss (problem 8).

  • Is the remote zone new? Maybe its delegation hasn't yet appeared (problem 9). Or the delegation information for the remote zone may be wrong or out of date due to neglect (problem 10).

  • Does the domain name actually exist on the remote zone's servers (problem 2)? On all of them (problems 1 and 3)?

14.7.3. Wrong or Inconsistent Answer

If you get the wrong answer when looking up a local domain name, or an inconsistent answer depending on which nameserver you ask or when you ask, first check the synchronization between your nameservers:

  • Are they all holding the same serial number for the zone? Did you forget to increment the serial number on the primary after you made a change (problem 1)? If you did, the nameservers may all have the same serial number, but they will answer differently out of their authoritative data.

  • Did you roll the serial number back to 1 (problem 1 again)? Then the primary's serial number will appear much lower than the slaves' serial numbers.

  • Did you forget to reload the primary (problem 2)? Then the primary will return (via nslookup or dig, for example) a different serial number from the one in the zone datafile.

  • Are the slaves having trouble updating from their master(s) (problem 3)? If so, they should have syslogged appropriate error messages.

  • Is the nameserver's round-robin feature rotating the addresses of the domain name you're looking up?

If you get these results when looking up a domain name in a remote zone, you should check whether the remote zone's nameservers have lost synchronization. You can use tools such as nslookup and dig to determine whether the remote zone's administrator forgot to increment the serial number, for example. If the nameservers answer differently from their authoritative data but show the same serial number, the serial number probably wasn't incremented. If the primary's serial number is much lower than the slaves', the primary's serial number was probably accidentally reset. We usually assume a zone's primary nameserver is running on the host listed in the MNAME (first) field of the SOA record.
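The serial number reasoning above can be written down as a small decision function. This is only a sketch: diagnose_serials is a hypothetical helper, it expects serial numbers you have already collected with nslookup or dig, and it uses plain integer comparison, which ignores serial number wraparound.

```python
def diagnose_serials(primary_serial, slave_serials, answers_agree):
    """Suggest a likely cause from SOA serial numbers.

    primary_serial -- serial reported by the presumed primary (MNAME host)
    slave_serials  -- list of serials reported by the slaves
    answers_agree  -- True if all servers return the same data for the name

    Assumption: plain integer comparison; wraparound is not handled.
    """
    # A primary that is behind its slaves suggests the serial was reset.
    if any(primary_serial < s for s in slave_serials):
        return "primary serial lower than a slave's: serial probably reset"

    if all(s == primary_serial for s in slave_serials):
        # Same serial everywhere but different data: the zone was edited
        # without incrementing the serial number.
        if not answers_agree:
            return "same serial, different data: serial probably not incremented"
        return "serials and data agree: no synchronization problem seen"

    # At least one slave is behind the primary.
    return "slave behind primary: zone transfer pending or failing"


if __name__ == "__main__":
    print(diagnose_serials(1, [2005010300, 2005010300], False))
    print(diagnose_serials(100, [100, 100], False))
```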

You probably can't determine conclusively that the primary hasn't been reloaded, though. It's also difficult to pin down updating problems between remote nameservers. In cases like this, if you've determined that the remote nameservers are giving out incorrect data, contact the zone administrator and (gently) relay what you've found. This will help the administrator track down the problem on the remote end.

If you can determine that a parent nameserver (a remote zone's parent, your zone's parent, or even one in your zone) is giving out a bad answer, check whether this is coming from old delegation information. Sometimes this requires contacting both the administrator of the remote zone and the administrator of its parent to compare the delegation and the current, correct list of authoritative nameservers.

If you can't induce the administrator to fix the data or if you can't track down the administrator, you can always use the bogus server substatement to instruct your nameserver not to query that particular server.

14.7.4. Lookups Take a Long Time

Slow name resolution is usually due to one of two problems:

  • Connectivity loss (problem 8), which you can diagnose with nameserver debugging output and tools such as ping

  • Incorrect delegation information (problem 10) pointing to the wrong nameservers or the wrong IP addresses

Usually, going over the debugging output and sending a few pings will point to one or the other: either you can't reach the nameservers at all, or you can reach the hosts but the nameservers aren't responding.

Sometimes, though, the results are inconclusive. For example, the parent nameservers delegate to a set of nameservers that don't respond to pings or queries, but connectivity to the remote network seems all right (a traceroute, for example, will get you to the remote network's "doorstep": the last router between you and the host). Is the delegation information so badly out of date that the nameservers have long since moved to other addresses? Are the hosts simply down? Or is there really a remote network problem? Usually, finding out requires a call or a message to the administrator of the remote zone. (Remember, whois gives you phone numbers!)
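As a rough summary of this triage, the following Python sketch maps three observations to a likely cause. The function triage_slow_lookup and its three boolean inputs are hypothetical; you would fill them in by hand from your debugging output, ping, and traceroute results.

```python
def triage_slow_lookup(answers_queries, answers_pings, network_reachable):
    """Map observations about one remote nameserver to a likely cause.

    answers_queries   -- the nameserver answered a DNS query
    answers_pings     -- the host answered a ping
    network_reachable -- traceroute reached the remote network's last router
    """
    if answers_queries:
        return "this nameserver responds: look at the other servers"
    if answers_pings:
        # The host is up but nothing answers DNS queries there.
        return "host up, nameserver silent: suspect bad delegation (problem 10)"
    if not network_reachable:
        return "cannot reach remote network: connectivity loss (problem 8)"
    # The network looks fine, but the host answers nothing at all.
    return "inconclusive: contact the remote zone's administrator"


if __name__ == "__main__":
    print(triage_slow_lookup(False, True, True))
    print(triage_slow_lookup(False, False, True))
```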

14.7.5. rlogin and rsh to Host Fails Access Check

This is a problem you expect to see right after you set up your nameservers. Users unaware of the change from the host table to domain name service won't know to update their .rhosts files. (We covered what needs to be updated in Chapter 6.) Consequently, rlogin's or rsh's access check will fail and deny the user access.

Other causes of this problem are missing or incorrect in-addr.arpa delegation (problems 9 and 10) or forgetting to add a PTR record for the client host (problem 4). If you've recently upgraded to BIND 4.9 or newer and have PTR data for more than one in-addr.arpa zone in a single zone datafile, your nameserver may be ignoring the out-of-zone data. Any of these situations will result in the same behavior:

% rlogin wormhole
Password:

In other words, the user is prompted for a password despite having set up password-less access with .rhosts or hosts.equiv. If you were to look at the syslog file on the destination host (wormhole.movie.edu, in this case), you'd probably see something like this:

May  4 18:06:22 wormhole inetd[22514]: login/tcp: Connection
       from unknown (192.249.249.213)

You can tell which problem it is by stepping through the resolution process with your favorite query tool. First, query one of your in-addr.arpa zone's parent nameservers for NS records for your in-addr.arpa zone. If these are correct, query the nameservers listed for the PTR record corresponding to the IP address of the rlogin or rsh client. Make sure they all have the PTR record and that the record maps to the right domain name. If not all the nameservers have the record, check for a loss of synchronization between the primary and the slaves (problems 1 and 3).
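The PTR check can be sketched in Python as follows. ptr_query_name builds the in-addr.arpa name to query for a client address, and check_ptr_answers compares the answers you collected from each nameserver. Both helper names are hypothetical, and the client name client.movie.edu and the second server name in the demo are made up for illustration.

```python
import ipaddress


def ptr_query_name(address):
    """Return the fully qualified in-addr.arpa name for an IPv4 address."""
    return ipaddress.ip_address(address).reverse_pointer + "."


def check_ptr_answers(answers, expected_name):
    """Report nameservers whose PTR answer is missing or wrong.

    answers maps nameserver -> PTR target, or None if it had no record.
    Names are compared without regard to case or a trailing dot.
    """
    expected = expected_name.rstrip(".").lower()
    problems = {}
    for server, target in answers.items():
        if target is None:
            problems[server] = "no PTR record"
        elif target.rstrip(".").lower() != expected:
            problems[server] = f"maps to {target}"
    return problems


if __name__ == "__main__":
    print(ptr_query_name("192.249.249.213"))
    # Sample answers gathered by hand; the names here are hypothetical.
    collected = {
        "toystory.movie.edu": "client.movie.edu.",
        "zardoz.movie.edu": None,
    }
    print(check_ptr_answers(collected, "client.movie.edu"))
```

If only some servers appear in the result, that points to lost synchronization rather than a missing record on the primary.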

14.7.6. Access to Services Denied

Sometimes rlogin and rsh aren't the only services to go. Occasionally, you'll install BIND on your server and your diskless hosts won't boot, and hosts won't be able to mount disks from the server, either.

If this happens, make sure that the case of the domain names your nameservers return agrees with the case your previous name service returned. For example, if you are running NIS and your NIS host maps contain only lowercase names, you should make sure your nameservers also return lowercase domain names. Some programs are case-sensitive and won't recognize names in a different case in a datafile, such as /etc/bootparams or /etc/exports.
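One way to spot such mismatches is to compare the names from your previous name service with the names your nameservers now return. The Python sketch below assumes you have already gathered both lists; case_mismatches is a hypothetical helper, not part of any tool.

```python
def case_mismatches(old_names, dns_names):
    """Return (old, new) pairs that differ only in letter case."""
    # Index the old names by their lowercase form for lookup.
    old_by_lower = {name.lower(): name for name in old_names}
    mismatches = []
    for name in dns_names:
        old = old_by_lower.get(name.lower())
        # Same name ignoring case, but not byte-for-byte identical.
        if old is not None and old != name:
            mismatches.append((old, name))
    return mismatches


if __name__ == "__main__":
    nis_hosts = ["wormhole.movie.edu", "toystory.movie.edu"]
    dns_answers = ["Wormhole.Movie.EDU", "toystory.movie.edu"]
    print(case_mismatches(nis_hosts, dns_answers))
```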

14.7.7. Can't Get Rid of Old Data

Sometimes, after decommissioning a nameserver or changing a server's IP address, you'll find the old address record lingering around. An old record may show up in a nameserver's cache or in a zone datafile weeks or even months later. The record clearly should have timed out of any caches by now. So why's it still there? Well, there are a few reasons this happens. We'll describe the simpler cases first.

14.7.7.1. Old delegation information

The first (and simplest) case occurs if a parent zone doesn't keep up with its children or if the children don't inform the parent of changes to the authoritative nameservers for the zone. If the edu administrators have this old delegation information for movie.edu:

$ORIGIN movie.edu.
@    86400    IN    NS    toystory
     86400    IN    NS    wormhole
toystory      86400    IN    A    192.249.249.3
wormhole      86400    IN    A    192.249.249.254 ; wormhole's former
                                                  ; IP address

the edu nameservers will give out the bogus old address for wormhole.movie.edu.

This is easily corrected once it's isolated to the parent zone's nameservers: just contact the parent zone's administrator and ask to have the delegation information updated. If your parent zone is one of the gTLDs, you may be able to fix the problem by filling out a form on your registrar's web site to modify the information about the nameserver. If any of the child zone's nameservers have cached the bad data, kill them (to clear out their caches), delete any backup zone datafiles that contain the bad data, and restart them.
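To isolate stale delegation, you can compare the addresses the parent gives out with the addresses the child zone's own nameservers return. The following Python sketch does that comparison on data you have collected by hand. The helper stale_glue is hypothetical, and wormhole's current address in the demo (192.249.249.1) is an assumed value for illustration.

```python
def stale_glue(parent_glue, child_addresses):
    """Find parent-zone addresses that the child zone no longer lists.

    parent_glue     -- maps nameserver name -> address from the parent
    child_addresses -- maps nameserver name -> set of current addresses
    """
    stale = {}
    for host, addr in parent_glue.items():
        current = child_addresses.get(host)
        # Only flag hosts the child knows about whose address differs.
        if current is not None and addr not in current:
            stale[host] = (addr, sorted(current))
    return stale


if __name__ == "__main__":
    from_parent = {
        "toystory.movie.edu.": "192.249.249.3",
        "wormhole.movie.edu.": "192.249.249.254",  # former address
    }
    from_child = {
        "toystory.movie.edu.": {"192.249.249.3"},
        "wormhole.movie.edu.": {"192.249.249.1"},  # assumed current address
    }
    print(stale_glue(from_parent, from_child))
```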

14.7.7.2. Registration of a non-nameserver

This is a problem unique to the gTLD zones: com, net, and org. Sometimes, you'll find the gTLD nameservers giving out stale address information about a host in one of your zones, and not even a nameserver! But why would the gTLD nameservers have information about an arbitrary host in one of your zones?

Here's the answer: you can register hosts in the gTLD zones that aren't nameservers at all, such as your web server. For example, you can register an address for www.foo.com through a com registrar, and the com nameservers will give out that address. You shouldn't, though, because you'll lose a fair amount of control over the address. If you need to change the address, it could take a day or more to push the change through your registrar. If you run the foo.com primary nameserver, you can make the change almost instantly.

14.7.7.3. What have I got?

How do you determine which of these problems is plaguing you? Pay attention to which nameservers are distributing the old data and which zones the data relates to:

  • Is the nameserver a gTLD nameserver? Check for a stale, registered address.

  • Is the nameserver your parent nameserver but not a gTLD nameserver? Check the parent for old delegation information.

That's about all we can think to cover. It's certainly not a comprehensive list, but we hope it'll help you solve the more common problems you encounter with DNS and give you ideas about how to approach the rest. Boy, if we'd only had a troubleshooting guide when we started!
