Hello! I'm attempting to debug what I think is a DNS issue with a server, and I'm unsure how to proceed.
Main information
The server can `ping 8.8.8.8`, but cannot `ping google.com` (`Name or service not known`). My `/etc/resolv.conf` contains:
Code:
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
# 127.0.0.53 is the systemd-resolved stub resolver.
# run "systemd-resolve --status" to see details about the actual nameservers.
nameserver 8.8.8.8
nameserver 127.0.0.53
options edns0
These are the contents after the file is regenerated using `resolvconf -u`.
Previously when I had manually added `nameserver 8.8.8.8` to the top of `/etc/resolv.conf`, I was able to `ping google.com`, but other services (see long version) still seemed to be failing in some way. However, since I've attempted some other fixes, such as `sudo apt install --reinstall resolvconf network-manager libnss-resolve` and others, even the presence of `nameserver 8.8.8.8` in `/etc/resolv.conf` does not seem to allow `ping google.com` to work. I'm also now unsure of where `nameserver 8.8.8.8` is being added from during a `resolvconf -u`, as none of `/etc/systemd/resolved.conf`, `/etc/resolvconf/resolv.conf.d/head`, or `/etc/resolvconf/resolv.conf.d/base` seem to contain this entry, and `/etc/network/interfaces.d/` is empty.
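One way to track down where the `nameserver 8.8.8.8` line comes from is to search the likely configuration locations for the address. The sketch below is only an illustration: the helper name `find_dns_source` is made up here, and any directory in the usage note that is not already mentioned above is an assumption about a typical Ubuntu install, not something confirmed for this server.

```shell
# List files under the given paths that mention a nameserver address.
# Usage: find_dns_source ADDRESS PATH...
# (find_dns_source is a hypothetical helper name, not an existing tool.)
find_dns_source() {
    addr=$1
    shift
    # -r recurse into directories, -I skip binary files,
    # -l print only the names of matching files,
    # -F treat the address as a fixed string, so the dots are not wildcards.
    grep -rIlF -- "$addr" "$@" 2>/dev/null
}
```

For example, from a root shell: `find_dns_source 8.8.8.8 /etc/systemd /etc/resolvconf /run/resolvconf /etc/network /etc/NetworkManager /etc/netplan`. Running `readlink -f /etc/resolv.conf` alongside it shows whether the file is a symlink, which hints at which service is generating it.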
My `systemd-resolve --status` appears as:
Code:
Global
         DNS Servers: 8.8.8.8
                      8.8.4.4
          DNSSEC NTA: 10.in-addr.arpa
                      16.172.in-addr.arpa
                      168.192.in-addr.arpa
                      17.172.in-addr.arpa
                      18.172.in-addr.arpa
                      19.172.in-addr.arpa
                      20.172.in-addr.arpa
                      21.172.in-addr.arpa
                      22.172.in-addr.arpa
                      23.172.in-addr.arpa
                      24.172.in-addr.arpa
                      25.172.in-addr.arpa
                      26.172.in-addr.arpa
                      27.172.in-addr.arpa
                      28.172.in-addr.arpa
                      29.172.in-addr.arpa
                      30.172.in-addr.arpa
                      31.172.in-addr.arpa
                      corp
                      d.f.ip6.arpa
                      home
                      internal
                      intranet
                      lan
                      local
                      private
                      test

Link 11 (veth399f989)
      Current Scopes: none
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no

Link 9 (vethc76fcf1)
      Current Scopes: none
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no

Link 7 (vethbb5aff2)
      Current Scopes: none
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no

Link 5 (br-ad7981a8fd08)
      Current Scopes: none
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no

Link 4 (docker0)
      Current Scopes: none
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no

Link 3 (eno2)
      Current Scopes: DNS
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no
         DNS Servers: 134.74.128.7
                      134.74.192.2
          DNS Domain: ~.

Link 2 (eno1)
      Current Scopes: none
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no
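The status output is long, so it can help to boil it down to just the DNS servers per section. In the listing above, only eno2 has a DNS scope, and it carries the `~.` routing domain, which in systemd-resolved generally means lookups are sent to that link's servers (134.74.128.7 and 134.74.192.2) rather than to the Global ones. The following is a minimal sketch for extracting that view; `summarize_dns` is a hypothetical helper name, and the parsing assumes the "Key: value" layout shown above.

```shell
# Print "SECTION: SERVER" for every DNS server in `systemd-resolve --status`
# output read from stdin. (summarize_dns is a hypothetical helper name.)
summarize_dns() {
    awk '
        { sub(/^[ \t]+/, "") }                  # drop any indentation
        /^Global$/ { section = "Global"; key = ""; next }
        /^Link [0-9]+ \(/ {                     # e.g. "Link 3 (eno2)"
            section = $3
            gsub(/[()]/, "", section)           # "(eno2)" -> "eno2"
            key = ""
            next
        }
        /^[A-Za-z ]+: / {                       # a "Key: value" line
            key = $0;   sub(/: .*$/, "", key)
            value = $0; sub(/^[^:]*: /, "", value)
        }
        !/^[A-Za-z ]+: / { value = $0 }         # continuation of previous key
        key == "DNS Servers" && value != "" { print section ": " value }
    '
}
```

Usage would be `systemd-resolve --status | summarize_dns` (or `resolvectl status | summarize_dns` on newer systemd releases, assuming the layout is the same there).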
Additional information
I originally set up this server for my PhD advisor when I was a student 5 or so years ago. Primarily, they use this server to host a WordPress site and a MediaWiki site that I set up at the time. This has continued to work fine for the last 5 years.
The original sign of some issue was that, recently, for both the WordPress site and the MediaWiki site, any page updates began to fail. For example, the MediaWiki pages can still be viewed, but upon attempting to submit an edit to a page, the user receives a timeout. On the server, Nginx receives the POST, PHP seems to execute the appropriate script for the post, but then the page is left un-updated. I'm not finding any errors in any of the Nginx, PHP, or database logs. Given that there are the other DNS issues made obvious from the above pinging, I suspect that the server is sending requests to itself, but due to these DNS issues, these requests are never really sent or received.
The server is behind a university-controlled entry point, then a lab router. I no longer have physical access to the machine, but I can periodically have someone go in to physically access the machine when needed. As part of my attempts to fix it, at one point I had run a package update followed by a reboot. For one reason or another, the reboot did not complete, and the machine only shut down. Someone had to be sent in physically to turn the machine back on for me. So with my other fixes, I would hope to avoid solutions that require reboots, though I understand this may often be required.
I am far from an expert in either Unix-related topics or networking topics, so I apologize in advance for any obvious mistakes or troubleshooting steps that I haven't checked.
Any suggestions would be greatly appreciated. Thank you for your time.
I'm no networking expert either. /etc/resolv.conf is always my first thought when ping fails. Typically it's just fine, and the problem is there is no default route set up. Check yours with: ip route. If that's not it, then check your firewall for blockage. Any more than that and I would have to Google it.
#1 Ping is not a dependable test of network continuity if you are routing through devices that may block ECHO packets.
#2 a. Test your default nameserver using lookup utilities such as nslookup, host, or dig.
Code:
nslookup www.google.com
b. Then test lookup using the external nameserver you WANT to have work, using the IP address
Code:
nslookup www.google.com 8.8.8.8
Note the responding server names/addresses as the utility reports them in addition to the target information. It may be important.
#3 Whoever runs the lab network needs to be consulted. IF that fails you need to consult with whoever manages the U network. See if there is a required nameserver and if they are blocking lookups on port 53 to external nameservers.
Some requirements have changed, and some facilities now require encrypted nameserver lookups or restrict lookup traffic. If that is one of them, they may have hosed your operation without even being aware it existed.
IF they have a sysadmin or network admin that is familiar with the OS on that machine perhaps they will be willing to troubleshoot the network settings and document the fix.
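The lookup tests in #2 and the port 53 question in #3 can be combined into one pass that queries the same name against each candidate nameserver. This is a rough sketch, assuming `dig` is installed; `check_nameservers` is a hypothetical helper name, and a FAILED line only shows that no answer came back, not why.

```shell
# Query NAME against each SERVER and report which ones answer.
# Usage: check_nameservers NAME SERVER...
# (check_nameservers is a hypothetical helper name.)
check_nameservers() {
    name=$1
    shift
    for server in "$@"; do
        # +short prints only the answer records; +time and +tries keep an
        # unreachable server from stalling the loop. dig exits non-zero when
        # no server could be reached, so the exit status is checked as well
        # as the output.
        if answer=$(dig +short +time=2 +tries=1 "@$server" "$name" A 2>/dev/null) \
            && [ -n "$answer" ]; then
            printf '%s: OK (%s)\n' "$server" "$(printf '%s\n' "$answer" | head -n 1)"
        else
            printf '%s: FAILED\n' "$server"
        fi
    done
}
```

For this thread, something like `check_nameservers www.google.com 127.0.0.53 8.8.8.8 134.74.128.7 134.74.192.2` would cover the stub resolver, the external server, and the two servers listed for eno2. If 8.8.8.8 answers but 127.0.0.53 does not, the problem is more likely in the local resolver setup than in upstream filtering.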
Now we can talk about recovery. Without direct access to the machine, physical or network based, this will be tricky and I am not in a position to help. I suspect the updates failed in part or full because the update/repo servers got "lost" due to the name resolution issue. If you cannot check the logs to determine what failed and what worked, I am not sure how you will remote troubleshoot that part of the problem.
IS it an option to pull a backup of the data and rebuild the machine and reload the data? In a worst-case scenario you can still recover if that is an option and you have decent backups.
$ ip route
default via 134.74.113.100 dev eno2 proto static metric 100
134.74.113.0/24 dev eno2 proto kernel scope link src 134.74.113.131 metric 100
169.254.0.0/16 dev eno2 scope link metric 1000
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown
192.168.113.0/20 dev br-ad7981a8fd08 proto kernel scope link src 192.168.113.1
As far as the firewall, I'm not sure exactly which commands would tell me what I need to know, or what exactly I should be looking for, but both `iptables -L` and `ufw status verbose` suggest things are more or less allowed in all cases.
Thank you for those initial checks though!
==========================================
@wpeckham
`nslookup www.google.com` and `nslookup www.google.com 8.8.8.8` both give the same result of
I'm not exactly sure how to interpret these results. Clearly it's pointing to some address with the nameserver used being 8.8.8.8. And a different site has a different address (still using the nameserver 8.8.8.8). But I don't know if this means that a route to the external address is being made, or if this is just in a name table somewhere more local.
Unfortunately, I'm still probably the most knowledgeable person about the lab network (though, obviously, I'm not particularly knowledgeable). There are now other students who kind of fill the role of administrator temporarily. It's a small computer science research lab, so the students are computer science students, but none specialize in networking or system administration. They can be there physically while I work with them remotely, though. And if I can't find the solution working with them, I will certainly contact the university department network administrator to see what I can find out.
As far as recovery, I think I said things in a way that might have been confusing. As far as I know, nothing on the server is broken (except the network setup). By page updates, I meant that if a user of the MediaWiki attempted to update a page on the wiki, the submitted update would seem to hang. But as far as I know, all the software on the machine is still functioning properly. When I ran the package updates, I had manually edited `/etc/resolv.conf` to include `nameserver 8.8.8.8`, which at the time made connecting to the package repositories work, and the package updates seemed to run fine. But those updates, or one of my other attempted fixes, now prevent the server from reaching external names even with that entry present. I think that running `apt install --reinstall resolvconf network-manager libnss-resolve` both caused `/etc/resolv.conf` to be regenerated with `nameserver 8.8.8.8` and stopped that entry from giving access to external sites. That said, I'm not at all certain. I tried to revert any changes I made that didn't improve the situation, but the package updates are the one case where I didn't revert changes.
Thank you much for your time!
Last edited by jackwayneright; 01-02-2024 at 12:14 AM.
Earlier today, I had one of the students work with me from in the lab. From the Ubuntu desktop, using the GUI interface to the network, we found that the DNS for the connection was set to manual with some university internal addresses, though external to the lab network (e.g. `134.74.128.7, 134.74.192.2`). After changing this to `8.8.8.8 8.8.4.4` and rebooting, the server was able to access external domain names fine. I'm a bit confused as to where the GUI network settings live compared to the settings from `/etc/resolv.conf` and what not, and why the GUI one was taking precedence. I'd be interested to know if someone has the explanation for that part.
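A likely explanation, not verified on this server: the desktop network settings edit NetworkManager connection profiles (usually stored under `/etc/NetworkManager/system-connections/`), and NetworkManager passes each profile's DNS servers to systemd-resolved as per-link servers. Those are the entries that appeared under "Link 3 (eno2)" in the status output, and `/etc/resolv.conf` only points at the `127.0.0.53` stub, so edits there do not replace them. The sketch below lists the manually configured DNS servers of each profile; `show_profile_dns` is a hypothetical helper name, and it assumes an `nmcli` recent enough to support `-g`.

```shell
# Print the manually configured IPv4 DNS servers of every NetworkManager
# connection profile. (show_profile_dns is a hypothetical helper name.)
show_profile_dns() {
    # -t gives terse output, one profile name per line.
    nmcli -t -f NAME connection show | while IFS= read -r profile; do
        # -g prints just the value of the requested field; stdin is
        # redirected so the inner call cannot consume the list of names.
        dns=$(nmcli -g ipv4.dns connection show "$profile" </dev/null)
        [ -n "$dns" ] || dns='(none set)'
        printf '%s: %s\n' "$profile" "$dns"
    done
}
```

The same setting can be changed without the GUI or a reboot, for example `nmcli connection modify "Wired connection 1" ipv4.dns "8.8.8.8 8.8.4.4"` followed by `nmcli connection up "Wired connection 1"`, where the profile name is only a placeholder for whatever `show_profile_dns` reports. Re-activating a connection over SSH can briefly interrupt the session, so it is worth having someone on site the first time.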
However, submitting updates to our sites still seems to be failing. But now useful messages are showing up in the Nginx logs. Notably, HTTP POSTs to the sites now receive `499` or `408` codes, except when the POST is sent from the server itself or from a machine on the lab network. We also found that the server itself is not on the lab network, but connects to what is presumably the department network, and the lab router connects to the department network as well. Additionally, machines on the university wifi *do* encounter the issue with the POSTs. So the fact that machines on the lab network can successfully send POSTs, but machines on the university wifi cannot, suggests to me that there is a conflict between what the department router expects and what the server expects. So I'm now talking with the professor who owns the lab and with the department to get things sorted out.
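To see which clients get which result, the access log can be tallied by client address and status code for POST requests only. For context, `499` is Nginx's own code for a client closing the connection before the response was sent, and `408` is a request timeout. This is a minimal sketch that assumes Nginx's default "combined" log format; `count_post_status` is a hypothetical helper name, and the log path in the usage note is the usual Ubuntu default rather than something confirmed here.

```shell
# Count POST requests per client address and HTTP status code.
# Reads access-log lines in the default "combined" format on stdin.
# (count_post_status is a hypothetical helper name.)
count_post_status() {
    awk '
        # In the combined format: $1 is the client address, $6 is the
        # opening quote plus the method, and $9 is the status code.
        $6 == "\"POST" { count[$1 " " $9]++ }
        END { for (k in count) print count[k], k }
    ' | sort -k2,2 -k3,3
}
```

For example, `count_post_status < /var/log/nginx/access.log` prints one line per client and status, with the count first. Grouping the failing clients by subnet should show whether the split really follows the lab network versus the university wifi.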