Footprint analysis for ATAC-seq data
usage: atac_seq_footprint.py [-h] [-j JID] -f INPUT [-t TREATMENT]
                             [-c CONTROL] [-g GENOME]

RGT_HINT atac-seq footprint with bias correction

optional arguments:
  -h, --help            show this help message and exit
  -j JID, --jid JID     enter a job ID, which is used to make a new directory.
                        Every output will be moved into this folder. (default:
                        atac_seq_footprint_yli11_2020-07-04)
  -f INPUT, --input INPUT
                        3-col tsv, bam,bed,output-prefix (default: None)
  -t TREATMENT, --treatment TREATMENT
                        default is the output-prefix in the first row.
                        treatment output-prefix for differential footprint
                        analysis, should match to names in the input file
                        (default: None)
  -c CONTROL, --control CONTROL
                        default is the second row. control output-prefix for
                        differential footprint analysis (default: None)

Genome Info:
  -g GENOME, --genome GENOME
                        genome version: hg19, hg38, mm9, mm10. By default,
                        specifying a genome version will automatically update
                        index file, black list, chrom size and
                        effectiveGenomeSize, unless a user explicitly sets
                        those options. (default: hg19)
Summary
This pipeline applies HINT-ATAC (v0.13) and outputs bias-corrected footprint BED files and cut-site bigWig (bw) files.
Additionally, if the -t and -c options are given, the program performs differential footprint analysis. Example: https://www.regulatory-genomics.org/hint/tutorial/.
By default, -t uses the name in the first row of the input file and -c uses the name in the second row.
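The default rule for -t and -c can be illustrated with a short Python sketch. This is not the pipeline's own code: the function name resolve_groups is hypothetical, and the check that both names appear in the input file is an assumption based on the help text ("should match to names in the input file").

```python
def resolve_groups(rows, treatment=None, control=None):
    """Pick treatment/control names, falling back to rows 1 and 2.

    rows: list of (bam, bed, output_prefix) tuples, in input-file order.
    """
    names = [row[2] for row in rows]  # third column is the output-prefix
    if len(names) < 2:
        # A differential comparison needs two samples.
        raise ValueError("need at least two rows to compare")
    treatment = names[0] if treatment is None else treatment
    control = names[1] if control is None else control
    # Assumption: both names must match an output-prefix in the input file.
    for label in (treatment, control):
        if label not in names:
            raise ValueError(
                "{!r} is not an output-prefix in the input file".format(label)
            )
    return treatment, control


rows = [
    ("Hudep1.markdup.bam", "Hudep1.markdup.rmchrM_peaks.narrowPeak", "H1"),
    ("Hudep2.markdup.bam", "Hudep2.markdup.rmchrM_peaks.narrowPeak", "H2"),
]
print(resolve_groups(rows))                                # ('H1', 'H2')
print(resolve_groups(rows, treatment="H2", control="H1"))  # ('H2', 'H1')
```

With the two-row example file, omitting both options makes H1 the treatment and H2 the control, so pass -t and -c explicitly whenever the row order does not match the intended comparison.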
Input
The input file is a TSV file with 3 columns: bam, bed, output-prefix (sample name).
Both relative and absolute paths are accepted.
Suppose you run this pipeline in the bam_files folder generated by HemTools atac_seq; the input file would then look like this:
Hudep1.markdup.bam ../peak_files/Hudep1.markdup.rmchrM_peaks.narrowPeak H1
Hudep2.markdup.bam ../peak_files/Hudep2.markdup.rmchrM_peaks.narrowPeak H2
We recommend creating a new working directory and copying the input data into it, so that the input file is easier to read:
Hudep1.markdup.bam Hudep1.markdup.rmchrM_peaks.narrowPeak H1
Hudep2.markdup.bam Hudep2.markdup.rmchrM_peaks.narrowPeak H2
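For more than a few samples, the input file can be generated rather than typed by hand. The following Python 3 sketch is one way to do it, assuming the file naming pattern shown in the example rows above (NAME.markdup.bam and NAME.markdup.rmchrM_peaks.narrowPeak) and a header-less, tab-separated layout. The helper names build_input_rows and rows_to_tsv are hypothetical and not part of HemTools.

```python
import csv
import io


def build_input_rows(samples):
    """Return (bam, bed, output-prefix) rows for (sample name, prefix) pairs."""
    rows = []
    for name, prefix in samples:
        # Assumed naming pattern, copied from the example rows in this section.
        bam = "{}.markdup.bam".format(name)
        bed = "{}.markdup.rmchrM_peaks.narrowPeak".format(name)
        rows.append((bam, bed, prefix))
    return rows


def rows_to_tsv(rows):
    """Serialize rows as tab-separated text, one sample per line, no header."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


tsv_text = rows_to_tsv(build_input_rows([("Hudep1", "H1"), ("Hudep2", "H2")]))
print(tsv_text, end="")
```

Saving tsv_text to a file such as input.list gives a file that can be passed to -f. The row order matters, because the first and second rows are the default treatment and control.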
Output
Bias-corrected bigWig files: look for *_bc.bw in the {{jid}} folder.
Called footprints: look for *.bed in the {{jid}} folder.
Differential motifs: results are in the Diff_footprints folder. The txt file contains the p-values, and the pdf shows a scatter plot of the p-values.
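To check quickly what a finished job produced, the file patterns listed above can be collected with a few lines of Python 3. This is a minimal sketch: the function name collect_outputs and the folder name my_footprint_job are hypothetical, and it assumes that Diff_footprints sits inside the job folder.

```python
from pathlib import Path


def collect_outputs(jid_dir):
    """Group pipeline outputs found under the job folder by type."""
    jid = Path(jid_dir)
    # Assumption: the differential results folder is inside the job folder.
    diff_dir = jid / "Diff_footprints"
    return {
        "bias_corrected_bigwig": sorted(jid.glob("*_bc.bw")),
        "footprints": sorted(jid.glob("*.bed")),
        "differential": sorted(diff_dir.glob("*")) if diff_dir.is_dir() else [],
    }


if __name__ == "__main__":
    # Hypothetical job folder name; replace it with the value passed to -j.
    for kind, paths in collect_outputs("my_footprint_job").items():
        print(kind, [p.name for p in paths])
```

An empty list for a category suggests that the corresponding step did not run or did not finish, for example when no differential analysis was performed.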
Usage
module load python/2.7.13
atac_seq_footprint.py -f input.list
OR
module load python/2.7.13
atac_seq_footprint.py -f input.list -t H2 -c H1 -g hg19
Reference
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1642-2
Including conservation may enhance the footprint plot:
https://slowkow.github.io/CENTIPEDE.tutorial/
https://link.springer.com/article/10.1186/s13059-020-1929-3
https://www.regulatory-genomics.org/motif-analysis/additional-motif-data/
Other new tools
https://github.com/loosolab/TOBIAS
https://github.com/Boyle-Lab/TRACE