LW-FQZip 2 is compared with other state-of-the-art FASTQ compression tools (Quip, DSRC2, CRAM, FQZcomp, LFQC, LEON, SCALCE, LW-FQZip 1, bzip2, and gzip) on 10 real-world FASTQ files downloaded from the Sequence Read Archive of the National Center for Biotechnology Information (NCBI). The experimental results show that LW-FQZip 2 achieves better compression ratios than the other methods at reasonable time and memory costs. Further discussion is available in our paper. Details of the implementation, data sets, and experimental studies are provided in the supplementary file.

| Read type | Dataset | Platform | Size (MB) | Compression ratio | Compressed size (MB) | Compression time (s) | Decompression time (s) |
|---|---|---|---|---|---|---|---|
| Long-read | SRR2916693 | 454 GS | 425 | 16.5% | 71 | 35 | 25 |
| Long-read | SRR2994368 | Illumina MiSeq | 4688 | 17.3% | 812 | 300 | 240 |
| Long-read | SRR3211986 | PacBio RS | 1759 | 33.3% | 585 | 203 | 400 |
| Long-read | ERR739513 | MinION | 871 | 35.2% | 307 | 122 | 170 |
| Long-read | SRR3190692 | Illumina MiSeq | 11379 | 12.7% | 1441 | 540 | 416 |
| Short-read | ERR385912 | Illumina HiSeq 2000 | 641 | 6.4% | 41 | 25 | 12 |
| Short-read | ERR386131 | Ion Torrent PGM | 1371 | 16.5% | 226 | 87 | 73 |
| Short-read | SRR034509 | Illumina Analyzer II | 5247 | 23.7% | 1241 | 301 | 275 |
| Short-read | ERR174310 | Illumina HiSeq 2000 | 105122 | 21.0% | 22061 | 14050 | 10428 |
| Short-read | ERR194147 | Illumina HiSeq 2000 | 202631 | 20.1% | 40812 | 26488 | 19737 |

| Read type | Dataset | Platform | Size (MB) | Compression ratio | Compressed size (MB) | Compression time (s) | Decompression time (s) |
|---|---|---|---|---|---|---|---|
| Long-read | SRR2916693 | 454 GS | 425 | 15.3% | 65 | 303 | 295 |
| Long-read | SRR2994368 | Illumina MiSeq | 4688 | 16.0% | 748 | 1260 | 1198 |
| Long-read | SRR3211986 | PacBio RS | 1759 | 32.3% | 568 | 759 | 725 |
| Long-read | ERR739513 | MinION | 871 | 34.8% | 303 | 333 | 320 |
| Long-read | SRR3190692 | Illumina MiSeq | 11379 | 11.7% | 1330 | 2520 | 2372 |
| Short-read | ERR385912 | Illumina HiSeq 2000 | 641 | 5.0% | 32 | 282 | 268 |
| Short-read | ERR386131 | Ion Torrent PGM | 1371 | 16.0% | 219 | 324 | 301 |
| Short-read | SRR034509 | Illumina Analyzer II | 5247 | 22.7% | 1193 | 1200 | 1080 |
| Short-read | ERR174310 | Illumina HiSeq 2000 | 105122 | 20.1% | 21152 | 42600 | 30000 |
| Short-read | ERR194147 | Illumina HiSeq 2000 | 202631 | 14.3% | 28915 | 71400 | 60540 |
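
The ratio column in the tables above appears to be the compressed size expressed as a percentage of the original size, so lower is better. The following is a minimal sketch of that arithmetic, using three rows copied from the first table. The function names `compression_ratio` and `throughput_mb_per_s` are illustrative and are not part of LW-FQZip 2, and the throughput figure is a derived convenience that the tables do not report.

```python
def compression_ratio(original_mb: float, compressed_mb: float) -> float:
    """Compressed size as a percentage of the original size (lower is better)."""
    if original_mb <= 0:
        raise ValueError("original size must be positive")
    return 100.0 * compressed_mb / original_mb


def throughput_mb_per_s(original_mb: float, seconds: float) -> float:
    """Input megabytes processed per second of compression time."""
    if seconds <= 0:
        raise ValueError("time must be positive")
    return original_mb / seconds


# (dataset, size in MB, compressed size in MB, compression time in s),
# copied from the first results table.
rows = [
    ("SRR2994368", 4688, 812, 300),
    ("ERR385912", 641, 41, 25),
    ("ERR194147", 202631, 40812, 26488),
]

for name, size_mb, compressed_mb, seconds in rows:
    ratio = compression_ratio(size_mb, compressed_mb)
    speed = throughput_mb_per_s(size_mb, seconds)
    print(f"{name}: ratio {ratio:.1f}%, {speed:.1f} MB/s")
```

Because the sizes in the tables are rounded to whole megabytes, a ratio recomputed this way can differ from the reported one by a few tenths of a percent on the smallest files.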