Class BZip2CompressorOutputStream
- All Implemented Interfaces:
Closeable,Flushable,AutoCloseable,BZip2Constants
Compression requires a large amount of memory, so call the close() method as soon as possible to make BZip2CompressorOutputStream release its allocated memory.
You can reduce the amount of allocated memory, and possibly increase compression speed, by choosing a smaller blocksize, although this may lower the compression ratio. To avoid unnecessary memory allocation, do not choose a blocksize larger than the size of the input.
The memory used for compression can be estimated with the following formula:
<code>400k + (9 * blocksize)</code>.
The memory required for decompression by BZip2CompressorInputStream can be estimated with:
<code>65k + (5 * blocksize)</code>.
Memory usage by blocksize

| Blocksize | Compression memory usage | Decompression memory usage |
|---|---|---|
| 100k | 1300k | 565k |
| 200k | 2200k | 1065k |
| 300k | 3100k | 1565k |
| 400k | 4000k | 2065k |
| 500k | 4900k | 2565k |
| 600k | 5800k | 3065k |
| 700k | 6700k | 3565k |
| 800k | 7600k | 4065k |
| 900k | 8500k | 4565k |
For decompression BZip2CompressorInputStream allocates less memory if the
bzipped input is smaller than one block.
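The two formulas and the table above can be reproduced with a few lines of arithmetic. The following is a minimal sketch only: the class and method names are hypothetical and are not part of Commons Compress, and all values are expressed in the same "k" units the table uses.

```java
// Hypothetical helper, not part of Commons Compress: evaluates the two
// documented estimates for a block size given in 100k units (1..9).
final class Bzip2MemoryEstimate {

    // Compression estimate in k: 400k + (9 * blocksize).
    static int compressionKilobytes(int blockSize100k) {
        checkRange(blockSize100k);
        return 400 + 9 * (blockSize100k * 100);
    }

    // Decompression estimate in k: 65k + (5 * blocksize).
    static int decompressionKilobytes(int blockSize100k) {
        checkRange(blockSize100k);
        return 65 + 5 * (blockSize100k * 100);
    }

    private static void checkRange(int blockSize100k) {
        if (blockSize100k < 1 || blockSize100k > 9) {
            throw new IllegalArgumentException("blockSize100k must be in 1..9");
        }
    }

    public static void main(String[] args) {
        // Prints one line per supported block size, mirroring the table rows.
        for (int b = 1; b <= 9; b++) {
            System.out.println(b * 100 + "k | "
                    + compressionKilobytes(b) + "k | "
                    + decompressionKilobytes(b) + "k");
        }
    }
}
```

These figures are estimates taken from the formulas; as noted, decompression may allocate less when the input is smaller than one block.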
Instances of this class are not threadsafe.
TODO: Update to BZip2 1.0.1
-
Nested Class Summary
Nested Classes

| Modifier and Type | Class | Description |
|---|---|---|
| (package private) static final class | BZip2CompressorOutputStream.Data | |

-
Field Summary
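Fields

| Modifier and Type | Field | Description |
|---|---|---|
| private final int | allowableBlockSize | |
| private int | blockCRC | |
| private final int | blockSize100k | Always: in the range 0 .. 9. |
| private BlockSort | blockSorter | |
| private int | bsBuff | |
| private int | bsLive | |
| private boolean | closed | |
| private int | combinedCRC | |
| private final CRC | crc | |
| private int | currentChar | |
| private BZip2CompressorOutputStream.Data | data | All memory intensive stuff. |
| private static final int | GREATER_ICOST | |
| private int | last | Index of the last char in the block, so the block size == last + 1. |
| private static final int | LESSER_ICOST | |
| static final int | MAX_BLOCKSIZE | The maximum supported blocksize == 9. |
| static final int | MIN_BLOCKSIZE | The minimum supported blocksize == 1. |
| private int | nInUse | |
| private int | nMTF | |
| private OutputStream | out | |
| private int | runLength | |

Fields inherited from interface org.apache.commons.compress.compressors.bzip2.BZip2Constants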
BASEBLOCKSIZE, G_SIZE, MAX_ALPHA_SIZE, MAX_CODE_LEN, MAX_SELECTORS, N_GROUPS, N_ITERS, NUM_OVERSHOOT_BYTES, RUNA, RUNB -
Constructor Summary
Constructors

| Constructor | Description |
|---|---|
| BZip2CompressorOutputStream(OutputStream out) | Constructs a new BZip2CompressorOutputStream with a blocksize of 900k. |
| BZip2CompressorOutputStream(OutputStream out, int blockSize) | Constructs a new BZip2CompressorOutputStream with specified blocksize. |

-
Method Summary
| Modifier and Type | Method | Description |
|---|---|---|
| private void | blockSort() | |
| private void | bsFinishedWithStream() | |
| private void | bsPutInt(int u) | |
| private void | bsPutUByte(int c) | |
| private void | bsW(int n, int v) | |
| static int | chooseBlockSize(long inputLength) | Chooses a blocksize based on the given length of the data to compress. |
| void | close() | |
| private void | endBlock() | |
| private void | endCompression() | |
| void | finish() | |
| void | flush() | |
| private void | generateMTFValues() | |
| final int | getBlockSize() | Returns the blocksize parameter specified at construction time. |
| private static void | hbAssignCodes(int[] code, byte[] length, int minLen, int maxLen, int alphaSize) | |
| private static void | hbMakeCodeLengths(byte[] len, int[] freq, BZip2CompressorOutputStream.Data dat, int alphaSize, int maxLen) | |
| private void | init() | Writes magic bytes like BZ on the first position of the stream and bytes indicating the file-format, which is huffmanized, followed by a digit indicating blockSize100k. |
| private void | initBlock() | |
| private void | moveToFrontCodeAndSend() | |
| private void | sendMTFValues() | |
| private void | sendMTFValues0(int nGroups, int alphaSize) | |
| private int | sendMTFValues1(int nGroups, int alphaSize) | |
| private void | sendMTFValues2(int nGroups, int nSelectors) | |
| private void | sendMTFValues3(int nGroups, int alphaSize) | |
| private void | sendMTFValues4() | |
| private void | sendMTFValues5(int nGroups, int nSelectors) | |
| private void | sendMTFValues6(int nGroups, int alphaSize) | |
| private void | sendMTFValues7() | |
| void | write(byte[] buf, int offs, int len) | |
| void | write(int b) | |
| private void | write0(int b) | Keeps track of the last bytes written and implicitly performs run-length encoding as the first step of the bzip2 algorithm. |
| private void | writeRun() | Writes the current byte to the buffer, run-length encoding it if it has been repeated at least four times (the first step RLEs sequences of four identical bytes). |

Methods inherited from class java.io.OutputStream
write
-
Field Details
-
MIN_BLOCKSIZE
public static final int MIN_BLOCKSIZE
The minimum supported blocksize == 1.
-
MAX_BLOCKSIZE
public static final int MAX_BLOCKSIZE
The maximum supported blocksize == 9.
-
GREATER_ICOST
private static final int GREATER_ICOST
-
LESSER_ICOST
private static final int LESSER_ICOST
-
last
private int last
Index of the last char in the block, so the block size == last + 1.
-
blockSize100k
private final int blockSize100k
Always: in the range 0 .. 9. The current block size is 100000 * this number.
-
bsBuff
private int bsBuff -
bsLive
private int bsLive -
crc
-
nInUse
private int nInUse -
nMTF
private int nMTF -
currentChar
private int currentChar -
runLength
private int runLength -
blockCRC
private int blockCRC -
combinedCRC
private int combinedCRC -
allowableBlockSize
private final int allowableBlockSize -
data
All memory intensive stuff. -
blockSorter
-
out
-
closed
private volatile boolean closed
-
-
Constructor Details
-
BZip2CompressorOutputStream
Constructs a new BZip2CompressorOutputStream with a blocksize of 900k.
- Parameters:
out - the destination stream.
- Throws:
IOException - if an I/O error occurs in the specified stream.
NullPointerException - if out == null.
-
BZip2CompressorOutputStream
Constructs a new BZip2CompressorOutputStream with the specified blocksize.
- Parameters:
out - the destination stream.
blockSize - the block size, in units of 100k.
- Throws:
IOException - if an I/O error occurs in the specified stream.
IllegalArgumentException - if (blockSize < 1) || (blockSize > 9).
NullPointerException - if out == null.
-
-
Method Details
-
chooseBlockSize
public static int chooseBlockSize(long inputLength)
Chooses a blocksize based on the given length of the data to compress.
- Parameters:
inputLength - The length of the data which will be compressed by BZip2CompressorOutputStream.
- Returns:
The blocksize, between MIN_BLOCKSIZE and MAX_BLOCKSIZE both inclusive. For a negative inputLength this method returns MAX_BLOCKSIZE always.
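To make the documented contract concrete, here is one possible heuristic that satisfies it. It is an illustrative assumption, not the library's implementation: it picks the smallest number of 100000-byte units that covers the input, clamps the result to the supported range, and returns the maximum for a negative length. The class name and the BLOCK_UNIT constant are hypothetical.

```java
// Illustrative sketch only; the real chooseBlockSize may use a different rule.
final class BlockSizeChoiceSketch {
    static final int MIN_BLOCKSIZE = 1;
    static final int MAX_BLOCKSIZE = 9;
    // Assumption: one block size unit corresponds to 100000 bytes of input.
    static final long BLOCK_UNIT = 100_000L;

    static int chooseBlockSize(long inputLength) {
        if (inputLength < 0) {
            // Unknown length: fall back to the largest block size.
            return MAX_BLOCKSIZE;
        }
        // Ceiling division written without risk of long overflow.
        long units = inputLength / BLOCK_UNIT
                + (inputLength % BLOCK_UNIT == 0 ? 0 : 1);
        // Clamp into [MIN_BLOCKSIZE, MAX_BLOCKSIZE].
        return (int) Math.max(MIN_BLOCKSIZE, Math.min(MAX_BLOCKSIZE, units));
    }

    public static void main(String[] args) {
        System.out.println(chooseBlockSize(250_000L));
        System.out.println(chooseBlockSize(-1L));
    }
}
```

Whatever rule is used, the point is the same as in the class description: a block size larger than the input only wastes memory.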
-
hbAssignCodes
private static void hbAssignCodes(int[] code, byte[] length, int minLen, int maxLen, int alphaSize) -
hbMakeCodeLengths
private static void hbMakeCodeLengths(byte[] len, int[] freq, BZip2CompressorOutputStream.Data dat, int alphaSize, int maxLen) -
blockSort
private void blockSort() -
bsFinishedWithStream
- Throws:
IOException
-
bsPutInt
- Throws:
IOException
-
bsPutUByte
- Throws:
IOException
-
bsW
- Throws:
IOException
-
close
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Overrides:
close in class OutputStream
- Throws:
IOException
-
endBlock
- Throws:
IOException
-
endCompression
- Throws:
IOException
-
finish
- Throws:
IOException
-
flush
- Specified by:
flush in interface Flushable
- Overrides:
flush in class OutputStream
- Throws:
IOException
-
generateMTFValues
private void generateMTFValues() -
getBlockSize
public final int getBlockSize()
Returns the blocksize parameter specified at construction time.
- Returns:
the blocksize parameter specified at construction time
-
init
Writes the magic bytes BZ at the start of the stream, followed by a byte indicating the file format (huffmanized) and a digit indicating blockSize100k.
- Throws:
IOException - if the magic bytes could not be written
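The header layout described for init() can be sketched as four ASCII bytes: 'B', 'Z', the format marker 'h' (huffmanized), and a digit for the block size. The class and method names below are hypothetical, and the range check follows the public constructor's 1..9 limit rather than anything stated for init() itself.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper that builds the four header bytes of a bzip2 stream.
final class Bzip2HeaderSketch {

    static byte[] streamHeader(int blockSize100k) {
        if (blockSize100k < 1 || blockSize100k > 9) {
            throw new IllegalArgumentException("blockSize100k must be in 1..9");
        }
        return new byte[] {
            'B', 'Z',                      // magic bytes
            'h',                           // file format: huffmanized
            (byte) ('0' + blockSize100k)   // block size as an ASCII digit
        };
    }

    public static void main(String[] args) {
        // Prints the header for the default 900k block size.
        System.out.println(new String(streamHeader(9), StandardCharsets.US_ASCII));
    }
}
```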
-
initBlock
private void initBlock() -
moveToFrontCodeAndSend
- Throws:
IOException
-
sendMTFValues
- Throws:
IOException
-
sendMTFValues0
private void sendMTFValues0(int nGroups, int alphaSize) -
sendMTFValues1
private int sendMTFValues1(int nGroups, int alphaSize) -
sendMTFValues2
private void sendMTFValues2(int nGroups, int nSelectors) -
sendMTFValues3
private void sendMTFValues3(int nGroups, int alphaSize) -
sendMTFValues4
- Throws:
IOException
-
sendMTFValues5
- Throws:
IOException
-
sendMTFValues6
- Throws:
IOException
-
sendMTFValues7
- Throws:
IOException
-
write
- Overrides:
write in class OutputStream
- Throws:
IOException
-
write
- Specified by:
write in class OutputStream
- Throws:
IOException
-
write0
Keeps track of the last bytes written and implicitly performs run-length encoding as the first step of the bzip2 algorithm.
- Throws:
IOException
-
writeRun
Writes the current byte to the buffer, run-length encoding it if it has been repeated at least four times (the first step RLEs sequences of four identical bytes).
Flushes the current block before writing data if it is full.
"Write to the buffer" means adding to data.buffer starting two steps "after" this.last (initially starting at index 1, not 0) and updating this.last to point to the last index written minus 1.
- Throws:
IOException
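The initial run-length step described for write0 and writeRun can be illustrated in isolation. The sketch below is a simplified stand-alone version, not the class's actual code: it works on a whole byte array rather than the block buffer, and it assumes the usual bzip2 convention that a run of 4 to 255 identical bytes is written as four copies of the byte followed by one count byte holding the run length minus 4. The class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;

// Simplified, stand-alone sketch of bzip2's first-stage run-length encoding.
final class InitialRunLengthSketch {

    static byte[] encode(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < input.length) {
            byte b = input[i];
            // Measure the run of identical bytes, capped at 255 (assumed limit).
            int run = 1;
            while (i + run < input.length && input[i + run] == b && run < 255) {
                run++;
            }
            // Runs shorter than four are copied through unchanged.
            int literals = Math.min(run, 4);
            for (int k = 0; k < literals; k++) {
                out.write(b);
            }
            // Runs of four or more get a count byte: run length minus 4.
            if (run >= 4) {
                out.write(run - 4);
            }
            i += run;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] encoded = encode(new byte[] {7, 7, 7, 7, 7, 7});
        System.out.println(java.util.Arrays.toString(encoded));
    }
}
```

Note that a run of exactly four bytes grows by one byte under this scheme, because the count byte (0) is always emitted once four identical bytes have been seen.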
-