Demultiplexing fastq files

usage: demultiplexing_index.py [-h] -f INPUT -b BARCODE

optional arguments:
  -h, --help            show this help message and exit
  -f INPUT, --input INPUT
                        input undetermined fastq.gz (default: None)
  -b BARCODE, --barcode BARCODE
                        barcode file in fasta format (default: None)

Input

Barcode file

Provide the barcode sequences as a FASTA file. Each record name is used as the name of the corresponding output FASTQ file.

>ABE8NG_posrep1
TAAGGCGA
>ABE8NG_posrep2
CGTACTAG
>ABE8NG_posrep3
AGGCAGAA
>ABE8NG_negrep1
TCCTGAGC
>ABE8NG_negrep2
GGACTCCT
>ABE8NG_negrep3
TAGGCATG
>ABE8NG_ALL
CTCTCTAC
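
To illustrate how a barcode file in this layout can be turned into a lookup table, here is a minimal sketch. It is not the code used by demultiplexing_index.py; the function name `read_barcodes` is hypothetical, and the sketch assumes each record has exactly one sequence line.

```python
def read_barcodes(lines):
    """Map each barcode sequence to its FASTA record name.

    Assumption: one sequence line per record (no wrapped sequences).
    """
    barcodes = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Use the first word after '>' as the sample name.
            fields = line[1:].split()
            name = fields[0] if fields else None
        elif name is not None:
            barcodes[line.upper()] = name
            name = None
    return barcodes


example = """>ABE8NG_posrep1
TAAGGCGA
>ABE8NG_posrep2
CGTACTAG
""".splitlines()

barcodes = read_barcodes(example)
```

Keying the table by sequence rather than by name makes the per-read lookup a single dictionary access.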

Undetermined fastq file

This program works only with single-end data. For sgRNA deep sequencing, this is usually the R1 read file.

Output

Demultiplexed FASTQ files are named after the records in the barcode file. Unmatched reads are written to unmatched.fastq.gz.
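
The splitting step can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: it assumes Python 3, that the index sequence is the text after the last colon in each read header (the usual Illumina layout), that only exact matches count, and that outputs are named `<name>.fastq.gz`. The function name `demultiplex` and the toy reads are hypothetical.

```python
import gzip
import os
import tempfile


def demultiplex(fastq_gz, barcodes, out_dir):
    """Split a gzipped single-end FASTQ by the index in each read header.

    Assumption: headers end with the index after the last colon,
    e.g. '@read1 1:N:0:TAAGGCGA'. Only exact matches are assigned.
    """
    handles = {}
    counts = {}
    try:
        with gzip.open(fastq_gz, "rt") as reads:
            while True:
                # A FASTQ record is four lines: header, sequence, '+', quality.
                record = [reads.readline() for _ in range(4)]
                if not record[0].strip():
                    break  # end of file
                index = record[0].strip().rsplit(":", 1)[-1]
                name = barcodes.get(index, "unmatched")
                if name not in handles:
                    path = os.path.join(out_dir, name + ".fastq.gz")
                    handles[name] = gzip.open(path, "wt")
                handles[name].writelines(record)
                counts[name] = counts.get(name, 0) + 1
    finally:
        for handle in handles.values():
            handle.close()
    return counts


# Toy input: two reads with known barcodes and one with an unknown index.
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "Undetermined_S0_R1_001.fastq.gz")
with gzip.open(src, "wt") as fh:
    fh.write("@r1 1:N:0:TAAGGCGA\nACGT\n+\nIIII\n")
    fh.write("@r2 1:N:0:CGTACTAG\nTTTT\n+\nIIII\n")
    fh.write("@r3 1:N:0:GGGGGGGG\nCCCC\n+\nIIII\n")

barcodes = {"TAAGGCGA": "ABE8NG_posrep1", "CGTACTAG": "ABE8NG_posrep2"}
counts = demultiplex(src, barcodes, tmp)
```

Output handles are opened lazily, so a barcode with no reads produces no file in this sketch; the real script may behave differently.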

Usage

hpcf_interactive

module load python/2.7.13

# run interactively
demultiplexing_index.py -f Undetermined_S0_R1_001.fastq.gz -b barcode.fa

# submit job to HPC (bsub options such as -n must come before the command)
bsub -P dx -q priority -n 2 -R "rusage[mem=8000]" demultiplexing_index.py -f Undetermined_S0_R1_001.fastq.gz -b barcode.fa

The code is available on GitHub.