Differential Methylated region analysis

usage: DMR.py [-h] [-j JID] -f METHYLKIT_TSV -d DESIGN_MATRIX [-g GENOME]

optional arguments:
  -h, --help            show this help message and exit
  -j JID, --jid JID     enter a job ID, which is used to make a new directory.
                        Every output will be moved into this folder. (default:
                        DMR_yli11_2025-08-29)
  -f METHYLKIT_TSV, --methylKit_tsv METHYLKIT_TSV
                        TSV file, 3 columns, methylKit file, sample name,
                        group name (default: None)
  -d DESIGN_MATRIX, --design_matrix DESIGN_MATRIX
                        TSV file, 3 columns, group ID, group ID, output_prefix
                        (default: None)

Genome Info:
  -g GENOME, --genome GENOME
                        genome version: hg19, hg38, mm9, mm10 (default: hg19)

Summary

DMR analysis using methylKit. https://www.bioconductor.org/packages/devel/bioc/vignettes/methylKit/inst/doc/methylKit.html

Input

1. input.tsv

Please copy (or softlink) methylKit files (e.g., .markdup.sorted_CpG.methylKit from the DNA methylation analysis pipeline) and make a typical input.tsv of 3 columns: File name, Sample name, Group name.

==> input.tsv <==
PB_AAVS_rep1.markdup.sorted_CpG.methylKit       PB_AAVS_rep1    Control
PB_AAVS_rep2.markdup.sorted_CpG.methylKit       PB_AAVS_rep2    Control
PB_g1617_rep1.markdup.sorted_CpG.methylKit      PB_g1617_rep1   BCL11Ako
PB_g1617_rep2.markdup.sorted_CpG.methylKit      PB_g1617_rep2   BCL11Ako
PB_ZBTB7Ag1_rep1.markdup.sorted_CpG.methylKit   PB_ZBTB7Ag1_rep1        ZBTB7Ako
PB_ZBTB7Ag1_rep2.markdup.sorted_CpG.methylKit   PB_ZBTB7Ag1_rep2        ZBTB7Ako

2. design matrix

This is similar to peak_call.tsv. The columns are group 1, group 2, output prefix.

==> design.tsv <==
ZBTB7Ako        Control ZBTB7Ako_vs_control
BCL11Ako        Control BCL11Ako_vs_control

Each line will be used as a comparison. For example, if you have three groups and possibly 3 comparisons (e.g., 1 vs 2, 1 vs 3, and 2 vs 3), then you can write down the comparisons in three lines. Again, this is same as the peak_call.tsv.

Usage

    hpcf_interactive

module load python/2.7.13

DMR.py -f input.tsv -d design.tsv -g hg19

Output

DMR results (.DMR_results.annot.csv) are stored in the $jid folder. Gene annotation is done using homer. The Gene Name column can be then used to merge DNA methylation results with gene expression results.

code @ github.