Created attachment 2935
Patch

With an empty /etc/resolv.conf we still get a 10-second timeout when trying to resolve names. The problem is in the poll logic: checking the return value of poll() is not sufficient; the revents field must be examined too. Here is the patch that works for me:

---
diff -Nur uClibc-0.9.31/libc/inet/resolv.c uClibc-0.9.31-poll/libc/inet/resolv.c
--- uClibc-0.9.31/libc/inet/resolv.c	2010-04-02 19:34:27.000000000 +0400
+++ uClibc-0.9.31-poll/libc/inet/resolv.c	2011-02-08 17:38:28.000000000 +0300
@@ -1408,6 +1408,10 @@
 				 * to next nameserver */
 				goto try_next_server;
 			}
+			if (fds.revents & (POLLERR | POLLHUP | POLLNVAL)) {
+				DPRINTF("Bad event\n");
+				goto try_next_server;
+			}
 /*TODO: better timeout accounting?*/
 			reply_timeout -= 1000;
 #endif

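For readers who want to try the idea outside of resolv.c, the check added by the patch can be sketched as a small standalone helper. This is only an illustration, not uClibc code: the names reply_status, classify_poll_result and wait_for_reply are hypothetical, and the sketch assumes a single descriptor polled for POLLIN, as in the resolver loop.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/* Hypothetical result codes; these names are not taken from uClibc. */
enum reply_status {
    REPLY_READY,       /* data can be read from the socket */
    REPLY_TIMEOUT,     /* poll() timed out, nothing to read yet */
    REPLY_BAD_EVENT,   /* POLLERR, POLLHUP or POLLNVAL was reported */
    REPLY_POLL_FAILED  /* poll() itself returned -1 */
};

/* Classify a poll() result for a single descriptor.
 * nready is poll()'s return value, revents the field it filled in. */
static enum reply_status classify_poll_result(int nready, short revents)
{
    if (nready < 0)
        return REPLY_POLL_FAILED;
    if (nready == 0)
        return REPLY_TIMEOUT;
    /* Error bits are reported even though only POLLIN was requested,
     * so a positive return value alone does not mean "data arrived". */
    if (revents & (POLLERR | POLLHUP | POLLNVAL))
        return REPLY_BAD_EVENT;
    if (revents & POLLIN)
        return REPLY_READY;
    return REPLY_TIMEOUT;
}

/* Poll one descriptor for up to timeout_ms and classify the outcome.
 * The caller fills in pfd->fd; events is set to POLLIN here. */
static enum reply_status wait_for_reply(struct pollfd *pfd, int timeout_ms)
{
    assert(pfd != NULL);
    pfd->events = POLLIN;
    pfd->revents = 0;
    return classify_poll_result(poll(pfd, 1, timeout_ms), pfd->revents);
}
```

In a resolver loop, REPLY_BAD_EVENT would correspond to the new "goto try_next_server" branch, while REPLY_TIMEOUT would continue with the existing timeout accounting.
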
I can't reproduce the issue with an empty /etc/resolv.conf. Am I missing something? Do you have a daemon listening on 53/udp on the same host? Can you provide strace output? It works as designed for me (empty /etc/resolv.conf):

...
socket(PF_INET, SOCK_DGRAM, IPPROTO_UDP) = 3 <0.000062>
connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("0.0.0.0")}, 28) = 0 <0.000063>
send(3, "\0\2\1\0\0\1\0\0\0\0\0\0\6google\3com\0\0\34\0\1"..., 28, 0) = 28 <0.000329>
poll([{fd=3, events=POLLIN}], 1, 5000) = 1 ([{fd=3, revents=POLLERR}]) <0.000058>
recv(3, 0x48c008, 512, MSG_DONTWAIT) = -1 ECONNREFUSED (Connection refused) <0.000037>
close(3) = 0 <0.000068>
...

The only improvement I can see is getting rid of the superfluous recv() call.
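One possible way to avoid that extra recv() is to look at revents first and, if an error bit is set, fetch the pending socket error with getsockopt(SO_ERROR). The sketch below is an assumption about how this could be done, not what uClibc does; take_socket_error and reply_error_after_poll are hypothetical names, and the "-1 means skip this server" convention is invented for the example.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/socket.h>

/* Fetch and clear the pending error on a socket.
 * Returns the errno-style code (0 if none), or -1 if getsockopt() fails. */
static int take_socket_error(int fd)
{
    int err = 0;
    socklen_t len = sizeof(err);

    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    return err;
}

/* Inspect revents after poll() has returned a positive value.
 * Returns 0 if the reply can be read with recv(), otherwise a non-zero
 * code meaning "skip this server" (hypothetical convention). */
static int reply_error_after_poll(const struct pollfd *pfd)
{
    assert(pfd != NULL);

    if (pfd->revents & POLLNVAL)
        return -1;                    /* descriptor is not open */
    if (pfd->revents & (POLLERR | POLLHUP)) {
        int err = take_socket_error(pfd->fd);
        return err != 0 ? err : -1;   /* never report success here */
    }
    return 0;
}
```

For the trace above, where poll() reports POLLERR, this would presumably return ECONNREFUSED without calling recv() at all.
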
I am sorry too, but I do not remember the details. :( I only remember that after suffering long delays with an empty resolv.conf, and after running strace/ltrace, I looked into the uClibc and glibc code (glibc worked fine, with no delays at all). After some debugging and reading the man pages I found a solution, and this patch is the result. We had no daemon listening on 53/udp on localhost. Perhaps the network has to be down, or something like that? I am not sure. Currently we have no problem, but the patch is included in our project.
I'm afraid the only reason the uClibc maintainers have not included your patch in mainline is that you didn't provide a complete test case or debug trace. I hope that you, like me, want to get the patch into mainline. In that case we should provide the information I mentioned above. Is your project open source? If so, does it have a public code repository?
The project is currently neither ready nor open. I hope the maintainers know uClibc better than I do, so I have no claims at all; I just sent the patch that we are using. If it is useless for the project, that's OK. Maybe the problem appears when the lo interface is down. Try:

# ip link set lo down
# ping ya.ru.

But the main idea is that we NEED to check for POLLERR | POLLHUP | POLLNVAL anyway. If we get one of these, why not skip this server, as glibc does?

I agree that checking revents is good behavior, but glibc is huge because it has as many checks as possible. uClibc must be small! Moreover, glibc tries to use nonblocking sockets for name resolution. Unfortunately, I still can't reproduce the problem. Since we have a connectionless datagram (UDP) socket, poll() can set POLLERR | POLLHUP only in case of problems on the local host, IMHO. I will try to experiment with invalid interface routing, etc.