The problem occurs with DHCP servers that are not compliant with RFC 2131. When Busybox sends a renewal request, it sends it directly to the server that issued its current lease, i.e. unicast rather than broadcast. The DHCP server should reply with a unicast response addressed directly to the requesting Busybox client, not with a broadcast. Unfortunately, some DHCP servers are non-compliant in this respect and send the reply as a broadcast message. Busybox is not set up to receive broadcast messages in this state, so the reply is discarded.

If Busybox gets no response to its unicast requests, it switches back to broadcasting the request, as it did on startup. In this state it is able to receive broadcast responses from the DHCP server, and the IP address is re-allocated. The net effect is that, with non-compliant servers, Busybox loses its IP address, the link drops out, and within a few seconds it is back up again. This disrupts traffic on the system and can cause certain services to drop out (exit and/or restart).

In our testing, Dell (Linux) servers configured as DHCP servers and Linksys routers behaved compliantly, whereas Netopia (Motorola) routers and Draytek routers did not.

Should Busybox be modified to cope with non-compliant servers, or should another workaround be sought?
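The sequence described above can be sketched as a toy model. This is not udhcpc code: every name in it (`renew_lease`, `reply_is_received`, the enums) is hypothetical, and the rule for which replies are seen is simply the behaviour reported in this comment, not something taken from the Busybox source.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the reported behaviour; all names are hypothetical. */
enum request_mode { REQ_UNICAST, REQ_BROADCAST };
enum reply_mode   { REPLY_UNICAST, REPLY_BROADCAST };
enum outcome      { RENEWED_QUIETLY, RENEWED_AFTER_DROP };

/* Assumption from the report: a unicast reply is always seen, while a
 * broadcast reply is only seen after the client has itself broadcast. */
static bool reply_is_received(enum request_mode req, enum reply_mode reply)
{
    if (reply == REPLY_UNICAST)
        return true;
    return req == REQ_BROADCAST;
}

static enum outcome renew_lease(enum reply_mode server_reply)
{
    /* Step 1: unicast renewal request to the server that issued the lease. */
    if (reply_is_received(REQ_UNICAST, server_reply))
        return RENEWED_QUIETLY;

    /* Step 2: no usable answer, so fall back to broadcast as at startup.
     * Per the report, the address is lost briefly before it is re-allocated. */
    assert(reply_is_received(REQ_BROADCAST, server_reply));
    return RENEWED_AFTER_DROP;
}
```

In this model a server that answers with `REPLY_UNICAST` yields `RENEWED_QUIETLY`, while one that answers with `REPLY_BROADCAST` yields `RENEWED_AFTER_DROP`, which corresponds to the short link drop seen with the non-compliant routers.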
To be more specific on this: in networking/udhcp/dhcpc.c, there is a section of the code that looks like this:

    case REBINDING:
        /* Lease is *really* about to run out,
         * try to find DHCP server using broadcast */
        if (timeout > 0) {
            /* send a request packet */
            send_renew(xid, 0 /*INADDR_ANY*/, requested_ip); /* broadcast */
            timeout >>= 1;
            continue;
        }
        /* Timed out, enter init state */
        bb_info_msg("Lease lost, entering init state");
        udhcp_run_script(NULL, "deconfig");
        change_listen_mode(LISTEN_RAW);
        state = INIT_SELECTING;

The line "change_listen_mode(LISTEN_RAW);" should really be moved to just below "/* send a request packet */". This is because, if you send a request on broadcast, you would expect a reply on broadcast. Does this make sense? Have I missed anything?
We have done some tests on a non-compliant router and can confirm that putting "change_listen_mode(LISTEN_RAW);" below "/* send a request packet */" *does* fix the problem described. In fact, the person who tested this copied the line rather than moving it.
Created attachment 687
Fix

Thanks for the excellent analysis! We can move change_listen_mode(LISTEN_RAW) above the (timeout > 0) check, as in the attached patch. This way we ensure raw listening is on even if timeout == 0, and we do not add any code.
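For readers without the attachment, here is a minimal sketch of the ordering described in this comment. It is not the attached patch: `struct client`, `send_broadcast_request`, `rebinding_step` and the two-argument `change_listen_mode` are hypothetical stand-ins so the fragment compiles on its own, and the `LISTEN_NONE` and `LISTEN_KERNEL` values are assumptions added only to give the model a non-raw starting mode.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins; only LISTEN_RAW, REBINDING and INIT_SELECTING
 * appear in the code quoted in this thread. */
enum listen_mode { LISTEN_NONE, LISTEN_KERNEL, LISTEN_RAW };
enum state { REBINDING, INIT_SELECTING };

struct client {
    enum state state;
    enum listen_mode listen_mode;
    int timeout;
    bool request_sent;
    enum listen_mode mode_at_send; /* listen mode when the request went out */
};

/* Hypothetical stub: the real function takes only the new mode. */
static void change_listen_mode(struct client *c, enum listen_mode mode)
{
    c->listen_mode = mode;
}

/* Hypothetical stub standing in for the broadcast send_renew() call. */
static void send_broadcast_request(struct client *c)
{
    c->request_sent = true;
    c->mode_at_send = c->listen_mode;
}

/* One pass through the REBINDING branch with the mode switch moved
 * above the timeout check. */
static void rebinding_step(struct client *c)
{
    assert(c->state == REBINDING);

    /* Switch to raw listening first, so it holds on both paths below. */
    change_listen_mode(c, LISTEN_RAW);

    if (c->timeout > 0) {
        /* send a request packet (broadcast) */
        send_broadcast_request(c);
        c->timeout >>= 1;
        return;
    }

    /* Timed out, enter init state (the deconfig script call is omitted) */
    c->state = INIT_SELECTING;
}
```

With this ordering the broadcast request is always sent while the client is already in `LISTEN_RAW`, and the timed-out path still ends up in raw mode before `INIT_SELECTING`, which is why a single call placed above the check covers both cases.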
Fixed, will be in 1.16.x