Bug 5804

Summary: unxz does not handle multiple xz streams on input
Product: Busybox
Reporter: Michael Tokarev <mjt+busybox>
Component: Standard Compliance
Assignee: unassigned
Status: RESOLVED FIXED
Severity: major
CC: busybox-cvs
Priority: P5    
Version: unspecified   
Target Milestone: ---   
Hardware: All   
OS: All   
Host: Target:
Build:

Description Michael Tokarev 2012-12-20 08:14:21 UTC
According to the xz file format specification (http://tukaani.org/xz/xz-file-format-1.0.4.txt), a compressed xz file is explicitly allowed to contain several complete xz streams, that is, several xz files concatenated together.  Busybox unxz stops decompressing as soon as it reaches the end of the first stream.

 $ echo -n he | xz > file.xz
 $ echo llo  | xz >> file.xz
 $ unxz < file.xz
 hello
 $ busybox unxz < file.xz
 he$

Setting severity to "major" because this is silent data loss: busybox unxz decompresses the first stream and exits with a successful exit status, while the remaining data is silently dropped.

At least one implementation of the xz format, pxz (parallel xz), produces multiple streams in its output file.  See http://bugs.debian.org/686502, which is where the above testcase comes from.

busybox zcat handles the equivalent case (concatenated gzip streams) correctly.
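
For illustration only, the same two behaviours can be reproduced outside busybox with Python's standard lzma module. This is a minimal sketch, not busybox code: the helper names first_stream_only and decompress_all_streams are hypothetical, and stream padding is handled by simply skipping NUL bytes between streams, which is looser than the spec (it requires padding to be a multiple of four bytes).

```python
import lzma


def first_stream_only(data: bytes) -> bytes:
    """Decode only the first xz stream, mimicking the reported behaviour."""
    dec = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
    # A single decompressor object stops at the end of the first stream;
    # whatever follows is left untouched in dec.unused_data.
    return dec.decompress(data)


def decompress_all_streams(data: bytes) -> bytes:
    """Decode every concatenated xz stream in data (hypothetical helper)."""
    chunks = []
    while data:
        dec = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
        chunks.append(dec.decompress(data))
        if not dec.eof:
            # Input ended before the stream footer was seen.
            raise lzma.LZMAError("truncated xz stream")
        # Assumption: treat any NUL bytes between streams as stream padding.
        data = dec.unused_data.lstrip(b"\x00")
    return b"".join(chunks)


if __name__ == "__main__":
    # Same input as the testcase: two independently compressed streams.
    blob = lzma.compress(b"he") + lzma.compress(b"llo\n")
    print(first_stream_only(blob))       # first stream only
    print(decompress_all_streams(blob))  # all streams
```

The point of the loop is that the end of one stream is not the end of the input: as long as bytes remain, a fresh decoder state is started and its output is appended. Python's one-shot lzma.decompress() already decodes concatenated streams in this way.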
Comment 1 Denys Vlasenko 2013-02-27 17:21:35 UTC
Fixed in git:

commit 380c8a0763462692eef8d00df4872a561ff7aa7b
Author: Lasse Collin <lasse.collin@tukaani.org>
Date:   Wed Feb 27 17:26:40 2013 +0100

    xz: support concatenated .xz streams

    function                                             old     new   delta
    xz_dec_reset                                           -      77     +77
    unpack_xz_stream                                    2402    2397      -5

    Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
    Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>