Introduction

LW-FQZip 2 is a lossless, reference-based compression method for FASTQ files. It improves on the lightweight reference-based compression tool LW-FQZip 1 (Y. Zhang et al., BMC Bioinformatics, 16:188, 2015) by introducing more efficient coding schemes and parallelism. LW-FQZip 2 achieves superior compression ratios at reasonable time and memory cost, which makes it a candidate tool for archiving real-world sequencing data and for other storage-sensitive applications. The average compression ratios (compressed size / original size) of LW-FQZip 2 on four groups of long-read data and one group of short-read data are listed below. The workflow of the method is shown in Fig. 1.

Dataset           Average compression ratio
Illumina Miseq    13.9%
454 GS            15.3%
Pacbio RS         32.3%
MinION            34.8%
Short-Read        15.6%
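
The ratios above are defined as compressed size divided by original size, so lower is better. As a minimal sketch of how such a figure could be computed for your own files, the Python helpers below apply that definition. The function names are hypothetical and not part of LW-FQZip 2, and the averaging shown (a plain mean of per-file ratios) is an assumption, since the text does not state how the averages were obtained.

```python
import os


def compression_ratio(compressed_size: int, original_size: int) -> float:
    """Return compressed size / original size (lower means better compression)."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return compressed_size / original_size


def file_compression_ratio(compressed_path: str, original_path: str) -> float:
    """Compute the ratio from the on-disk sizes of two files."""
    return compression_ratio(
        os.path.getsize(compressed_path),
        os.path.getsize(original_path),
    )


def average_ratio(ratios: list[float]) -> float:
    """Plain arithmetic mean of per-file ratios (assumed averaging method)."""
    if not ratios:
        raise ValueError("ratios must not be empty")
    return sum(ratios) / len(ratios)


def format_percent(ratio: float) -> str:
    """Format a ratio as a percentage with one decimal place."""
    return f"{ratio:.1%}"
```

For example, a 1000-byte file compressed to 139 bytes gives compression_ratio(139, 1000), which format_percent renders as 13.9%.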
Release notes
The General Framework (More implementation details are provided in the supplementary file)