Count indel integration pipeline (simplified version)¶
Summary¶
This pipeline is a simplified version of the full count indel integration pipeline. It is easier to use, but it can only process one gRNA per run.
Input¶
1. gRNA bed file¶
gRNA_bed_file is a tab-separated file with 6 columns: chr, start, end, name, value (can be anything), strand.
The coordinates for gRNA need to include the PAM sequence.
g34 has two occurrences in the genome, so its bed file looks like this:
chr11 5249956 5249975 HBG1 1 +
chr11 5254880 5254899 HBG2 2 +
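A malformed bed file (for example, spaces instead of tabs) is easy to catch before submitting. The following is a minimal sketch of such a check, assuming bash and awk are available. The function name check_grna_bed is hypothetical and not part of HemTools, and the rules it enforces (6 tab-separated columns, integer start smaller than end, strand of + or -) are only those implied by the column description above.

```shell
# check_grna_bed FILE: hypothetical helper, not part of HemTools.
# Reports each offending line on stderr and returns non-zero if FILE is not
# a 6-column, tab-separated bed file with start < end and a +/- strand.
check_grna_bed() {
    awk -F'\t' '
        NF != 6 {
            printf "line %d: expected 6 tab-separated columns, found %d\n", NR, NF
            bad = 1; next
        }
        $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ {
            printf "line %d: start and end must be non-negative integers\n", NR
            bad = 1; next
        }
        ($2 + 0) >= ($3 + 0) {
            printf "line %d: start must be smaller than end\n", NR
            bad = 1; next
        }
        $6 != "+" && $6 != "-" {
            printf "line %d: strand must be + or -\n", NR
            bad = 1; next
        }
        END { exit (bad + 0) }
    ' "$1" >&2
}
```

For example, `check_grna_bed gRNA.bed && echo "gRNA.bed looks OK"` prints the confirmation only when every line passes.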
Pre-defined gRNA bed files¶
The following pre-defined gRNA bed files are available:
/home/yli11/HemTools/share/misc/DNMT3A.bed
/home/yli11/HemTools/share/misc/g34.bed
/home/yli11/HemTools/share/misc/p53.bed
/home/yli11/HemTools/share/misc/TET2.bed
Output¶
When the job finishes, you will receive an email containing a QC report and a summary.csv file with indel frequencies and indel types.
Usage¶
Create a new working directory, put the fastq files in it (e.g., using ln -s), and run the following.
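The setup part alone (the steps themselves start at Step 0) could be sketched as a small bash function. Everything here is an assumption for illustration: the name link_fastqs is hypothetical and not part of HemTools, both directory arguments are placeholders you supply, and the sketch assumes the reads end in .fastq.gz.

```shell
# link_fastqs SRC_DIR WORK_DIR: hypothetical helper, not part of HemTools.
# Creates WORK_DIR if needed and symlinks every *.fastq.gz found directly
# in SRC_DIR into it (the .fastq.gz suffix is an assumption).
link_fastqs() {
    local src work f
    src=$(cd "$1" && pwd) || return 1   # absolute path, so the links do not dangle
    work=$2
    mkdir -p "$work" || return 1
    for f in "$src"/*.fastq.gz; do
        [ -e "$f" ] || continue         # skip the literal pattern when nothing matches
        ln -s "$f" "$work/"
    done
}
```

Symlinks keep the original files in place, so the working directory can be deleted afterwards without touching the raw data.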
Step 0: Log in to a compute node.
hpcf_interactive
Step 1: generate the input file fastq.tsv using --guess_input
module load python/2.7.13
export PATH=$PATH:"/home/yli11/HemTools/bin"
# cd to your working dir
run_lsf.py --guess_input
Step 2: submit job
run_lsf.py -p count_integration2 -f fastq.tsv --gRNA_bed gRNA.bed
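A missing or empty input file is cheaper to catch before submission than after the job fails. As a minimal sketch, assuming bash, a guard like the one below could run first. The function name preflight_check is hypothetical and not part of HemTools, and the default file names are simply the ones used in Step 2.

```shell
# preflight_check [FASTQ_TSV] [GRNA_BED]: hypothetical helper, not part of HemTools.
# Returns non-zero if either input file is missing or empty.
# Defaults to the file names used in Step 2.
preflight_check() {
    local f status=0
    for f in "${1:-fastq.tsv}" "${2:-gRNA.bed}"; do
        if [ ! -s "$f" ]; then
            echo "missing or empty input: $f" >&2
            status=1
        fi
    done
    return "$status"
}
```

It can be chained with the submission command from Step 2, so that the job is only submitted when both inputs are present: `preflight_check fastq.tsv gRNA.bed && run_lsf.py -p count_integration2 -f fastq.tsv --gRNA_bed gRNA.bed`.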