Analysis: When Busybox wget is compiled with CONFIG_FEATURE_WGET_STATUSBAR and CONFIG_FEATURE_WGET_TIMEOUT enabled it uses poll() and non-blocking fread() to retrieve data. If the HTTP server uses HTTP/1.1 chunked encoding wget further more uses fgets() to retrieve the content of the next chunk. With this combination wget reads chunk length (and all remaining data) with fgets() and afterwards hangs forever in the poll() on an empty socket. Suggested solution: Make sure that all data is retrieved with fread() before using poll() to wait for further data. Workaround: Disable CONFIG_FEATURE_WGET_STATUSBAR and CONFIG_FEATURE_WGET_TIMEOUT
Please try this easy fix: while (1) { if (safe_poll(&polldata, 1, 1000) != 0) break; /* error, EOF, or data is available */ # if ENABLE_FEATURE_WGET_TIMEOUT if (second_cnt != 0 && --second_cnt == 0) { progress_meter(PROGRESS_END); bb_error_msg_and_die("download timed out"); } # endif /* Needed for "stalled" indicator */ progress_meter(PROGRESS_BUMP); ADD THIS ==============> break; } Basically, this makes us try reading once a second, even if poll says there is no data. I think this should work. Please let me know whether it indeed does.
Created attachment 4514 [details] Wireshark trace that demonstrates problem
Thanks I have tried the fix and it stops wget from stalling. However, it is very slow. Retrieving a small file, which should be completed within a fraction of a second sometimes takes several seconds. Basically, I think you need to arrange the code in such a way that you are sure that the fread() buffer is empty before you call poll(). I have attached a Wireshark trace that demonstrates the problem.
(In reply to comment #3) > I have tried the fix and it stops wget from stalling. However, it is very slow. > Retrieving a small file, which should be completed within a fraction of a > second sometimes takes several seconds. The delay caused by additional poll(1000) can be one second maximum. > Basically, I think you need to arrange the code in such a way that you are sure > that the fread() buffer is empty before you call poll(). How can I know that the buffer is not empty? IIRC stdio.h doesn't provide mechanisms for knowing that.
It is correct that the delay is one second. However, if you are unlucky like the example PCAP I attached you can get the delay for every chunk. I suggest a solution where you always call fread() before calling poll() and only call poll() if the fread() returns EAGAIN. Below patch should do this: ================= diff --git a/networking/wget.c b/networking/wget.c index 3416636..74a2a5f 100644 --- a/networking/wget.c +++ b/networking/wget.c @@ -485,64 +485,62 @@ static void NOINLINE retrieve_file_data(FILE *dfp) } } -#if ENABLE_FEATURE_WGET_STATUSBAR || ENABLE_FEATURE_WGET_TIMEOUT - if (safe_poll(&polldata, 1, 1000) == 0) { -# if ENABLE_FEATURE_WGET_TIMEOUT - if (second_cnt != 0 && --second_cnt == 0) { - progress_meter(PROGRESS_END); - bb_error_msg_and_die("download timed out"); - } -# endif - /* Needed for "stalled" indicator */ - progress_meter(PROGRESS_BUMP); - /* - * We used to loop back to poll here, - * but in chunked case, we can be here after - * fgets and it could buffer some data in dfp... - * which poll knows nothing about! - * Therefore let's try fread'ing anyway. - */ - } - - /* fread internally uses read loop, which in our case - * is usually exited when we get EAGAIN. - * In this case, libc sets error marker on the stream. - * Need to clear it before next fread to avoid possible - * rare false positive ferror below. Rare because usually - * fread gets more than zero bytes, and we don't fall - * into if (n <= 0) ... - */ - clearerr(dfp); - errno = 0; -#endif n = fread(G.wget_buf, 1, rdsz, dfp); /* man fread: * If error occurs, or EOF is reached, the return value * is a short item count (or zero). * fread does not distinguish between EOF and error. */ - if (n <= 0) { -#if ENABLE_FEATURE_WGET_STATUSBAR || ENABLE_FEATURE_WGET_TIMEOUT - if (errno == EAGAIN) /* poll lied, there is no data? */ - continue; /* yes */ -#endif - if (ferror(dfp)) - bb_perror_msg_and_die(bb_msg_read_error); - break; /* EOF, not error */ - } - - xwrite(G.output_fd, G.wget_buf, n); + if (n > 0) { + xwrite(G.output_fd, G.wget_buf, n); #if ENABLE_FEATURE_WGET_TIMEOUT - second_cnt = G.timeout_seconds; + second_cnt = G.timeout_seconds; #endif #if ENABLE_FEATURE_WGET_STATUSBAR - G.transferred += n; - progress_meter(PROGRESS_BUMP); + G.transferred += n; + progress_meter(PROGRESS_BUMP); #endif - if (G.got_clen) { - G.content_len -= n; - if (G.content_len == 0) - break; + if (G.got_clen) { + G.content_len -= n; + if (G.content_len == 0) + break; + } + } +#if ENABLE_FEATURE_WGET_STATUSBAR || ENABLE_FEATURE_WGET_TIMEOUT + else if (errno == EAGAIN) { + if (safe_poll(&polldata, 1, 1000) == 0) { +# if ENABLE_FEATURE_WGET_TIMEOUT + if (second_cnt != 0 && --second_cnt == 0) { + progress_meter(PROGRESS_END); + bb_error_msg_and_die("download timed out"); + } +# endif + /* Needed for "stalled" indicator */ + progress_meter(PROGRESS_BUMP); + /* + * We used to loop back to poll here, + * but in chunked case, we can be here after + * fgets and it could buffer some data in dfp... + * which poll knows nothing about! + * Therefore let's try fread'ing anyway. + */ + } + + /* fread internally uses read loop, which in our case + * is usually exited when we get EAGAIN. + * In this case, libc sets error marker on the stream. + * Need to clear it before next fread to avoid possible + * rare false positive ferror below. Rare because usually + * fread gets more than zero bytes, and we don't fall + * into if (n <= 0) ... + */ + clearerr(dfp); + errno = 0; + } +#endif + else { + if (ferror(dfp)) + bb_perror_msg_and_die(bb_msg_read_error); } } #if ENABLE_FEATURE_WGET_STATUSBAR || ENABLE_FEATURE_WGET_TIMEOUT -- =================
Please pull current git and try it. Commit b7812ce0f7ee230330e18de3ac447b700311ab84 should have it fixed.
Thanks It fixes the problem.