Bug 3625 - Wget does not %-decode authorization credentials
Status: RESOLVED FIXED
Alias: None
Product: Busybox
Classification: Unclassified
Component: Networking
Version: 1.18.x
Hardware: PC Linux
Importance: P5 minor
Target Milestone: ---
Assignee: unassigned
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-04-14 04:40 UTC by Kevin Locke
Modified: 2011-09-11 19:05 UTC

Description Kevin Locke 2011-04-14 04:40:18 UTC
BusyBox wget currently takes the string between the // and @ of the URL and base64-encodes it verbatim into the Authorization header.  This behavior differs from GNU wget and curl, which both %-decode the username and password before sending.  The observed behavior is as follows:

Running the following commands in series:
busybox wget http://test:my%20pass@example.com
wget --auth-no-challenge http://test:my%20pass@example.com
curl --basic http://test:my%20pass@example.com

Results in the following queries to the server:
GET / HTTP/1.1
Host: digitalenginesoftware.com:8088
User-Agent: Wget
Connection: close
Authorization: Basic dGVzdDpteSUyMHBhc3M=

GET / HTTP/1.0
User-Agent: Wget/1.12 (linux-gnu)
Accept: */*
Authorization: Basic dGVzdDpteSBwYXNz
Host: digitalenginesoftware.com:8088
Connection: Keep-Alive

GET / HTTP/1.1
Authorization: Basic dGVzdDpteSBwYXNz
User-Agent: curl/7.21.4 (i486-pc-linux-gnu) libcurl/7.21.4 OpenSSL/0.9.8o zlib/1.2.3.4 libidn/1.20 libssh2/1.2.6
Host: digitalenginesoftware.com:8088
Accept: */*

Note that "dGVzdDpteSBwYXNz" base64 decodes to "test:my pass" and "dGVzdDpteSUyMHBhc3M=" base64 decodes to "test:my%20pass".
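
The difference between the two header values can be reproduced with a short sketch. This is only an illustration in Python, not the BusyBox code; the helper name basic_auth_value and its decode flag are hypothetical and exist only to contrast the two behaviors described above.

```python
import base64
from urllib.parse import unquote


def basic_auth_value(userinfo: str, decode: bool) -> str:
    """Build a Basic Authorization value from the text between // and @.

    Hypothetical helper: decode=False mimics the reported behavior,
    decode=True mimics what GNU wget and curl were observed to send.
    """
    if decode:
        # Split on the first ":" before decoding, so an encoded ":" (%3A)
        # inside the user name or password is not mistaken for the separator.
        user, sep, password = userinfo.partition(":")
        userinfo = unquote(user) + sep + unquote(password)
    token = base64.b64encode(userinfo.encode("utf-8")).decode("ascii")
    return "Basic " + token


raw = basic_auth_value("test:my%20pass", decode=False)
decoded = basic_auth_value("test:my%20pass", decode=True)
print(raw)
print(decoded)
```

With the credentials from this report, raw matches the header sent by BusyBox wget and decoded matches the header sent by GNU wget and curl.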

From an appeal to standards, section 3.1 of RFC 1738 specifies that 'Within the user and password field, any ":", "@", or "/" must be encoded.'  So any valid URL passed to wget must have these characters %-encoded in the user and password fields, and wget has to %-decode them to recover the original credentials.

From an appeal to practicality, it is inconvenient for scripts to detect whether the utility being called is BusyBox wget before deciding whether to escape these fields.  Parsing URLs is also easier for scripts if reserved characters appear in URL strings only when they carry their reserved meanings.
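
As a rough illustration of the scripting side, the sketch below builds a URL whose password contains all three reserved characters and then recovers the credentials with simple string splitting. It is written in Python as an example only; build_url and the sample password are hypothetical, and the point is merely that the splitting is unambiguous once the credentials are %-encoded.

```python
from urllib.parse import quote, unquote


def build_url(user: str, password: str, host: str) -> str:
    """Hypothetical helper: embed %-encoded credentials in an http URL."""
    # safe="" makes quote() also escape "/", which it leaves alone by default.
    userinfo = quote(user, safe="") + ":" + quote(password, safe="")
    return "http://" + userinfo + "@" + host + "/"


url = build_url("test", "p@ss:w/rd", "example.com")

# Because ":", "@" and "/" are encoded inside the credentials, the first "@"
# after "//" and the first ":" before it are guaranteed to be delimiters.
userinfo = url.split("//", 1)[1].split("@", 1)[0]
user, _, password = userinfo.partition(":")
recovered = (unquote(user), unquote(password))
print(url)
print(recovered)
```

A client that sends the credentials without %-decoding them would transmit the encoded password instead of the original one, which is the mismatch this report describes.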

Thanks,
Kevin

P.S.  Apologies if this gets submitted multiple times.  I am receiving 500 errors during submit and my "My Bugs" list is still empty.
Comment 1 Denys Vlasenko 2011-09-11 19:05:34 UTC
Fixed in git:

commit dd1061b6a79b0161597799e825bfefc27993ace5
Author: Denys Vlasenko <vda.linux@googlemail.com>
Date:   Sun Sep 11 21:04:02 2011 +0200

    wget: URL-decode user:password before base64-encoding it into auth hdr.


Will go into 1.20.x