Tsai Lab Bioinformatics¶
Protein mutagenesis¶
Code: https://github.com/tsailabSJ/Cas9Variants/tree/master/Mammalian_system
1. Dictionary generation¶
See the PacBio_amplicon_sequencing folder. The code was originally written for Cas9 (Kasey) and was recently adapted to PE/RT (Kiera).
Kiera’s data is large, so I had to split the reads and run the pipeline on each chunk separately. Use split.sh to split the reads.
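The idea behind splitting can be sketched as below. This is not the content of split.sh, only a minimal illustration that assumes an uncompressed FASTQ with exactly 4 lines per record; the file names and chunk size are hypothetical, and the demo reads exist only so the sketch runs on its own.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical names and chunk size; adjust to the real data.
input=reads.fastq        # assumed: uncompressed FASTQ, 4 lines per record
reads_per_chunk=2        # tiny value so the demo output is easy to inspect
prefix=reads.chunk.

# Demo input (5 reads) so the sketch is self-contained; remove for real reads.
printf '@r%s\nACGT\n+\nIIII\n' 1 2 3 4 5 > "$input"

# A record spans 4 lines, so the line count per chunk must be a multiple of 4,
# otherwise a read would be cut in half across two chunks.
split -l $((reads_per_chunk * 4)) "$input" "$prefix"

# Chunks are named reads.chunk.aa, reads.chunk.ab, ... and can each be
# passed to the pipeline as an independent input.
ls "$prefix"*
```

Each chunk is then a valid FASTQ on its own, so the per-chunk outputs can be merged after the pipeline finishes.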
Kasey dict: /research_jude/rgs01_jude/groups/tsaigrp/projects/Genomics/common/projects/Cas9mutagenesis/pacbio_230301_293246_tsaigrp_Amplicon/pacbio_cas9mut_amp_yli11_2023-07-13
KJ001.read_stat.csv.all_dictionary.tsv
KJ002.read_stat.csv.all_dictionary.tsv
Kiera dict: /research_jude/rgs01_jude/groups/tsaigrp/projects/Genomics/common/projects/pacbio/825386_tsaigrp_Amplicon_Kiera/create_barcode_yli11_2024-05-22
2. Compare enrichment of mutagenesis assay¶
See the main_v2.lsf pipeline description. The code is generic and works for both Cas9 and PE/RT.
Usually Kasey runs this pipeline herself. Errors sometimes occur when the input format is wrong, e.g., upper-case vs. lower-case mismatches, spaces used instead of tabs, or sample names that do not match between fastq.tsv and design_matrix.
The pipeline generates an interactive heatmap. To generate it manually:
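A quick pre-flight check for these input problems could look like the sketch below. It is not part of the pipeline. It assumes both files are tab-delimited with the sample name in the first column, and the file name design_matrix.tsv and the demo contents are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail
export LC_ALL=C   # consistent byte-wise ordering for sort and comm

# Hypothetical demo inputs; replace with the real files.
# Assumed layout: tab-delimited, sample name in column 1.
printf 'KJ001\tKJ001_R1.fastq.gz\nkj002\tKJ002_R1.fastq.gz\n' > fastq.tsv
printf 'KJ001\tinput\nKJ002\tsorted\n' > design_matrix.tsv

# Collect the unique sample names from each file.
cut -f1 fastq.tsv | sort -u > fastq.names
cut -f1 design_matrix.tsv | sort -u > design.names

{
  echo "## lines without a tab"
  # A line with no tab was probably typed with spaces instead.
  awk '!/\t/ {print FILENAME ":" FNR}' fastq.tsv design_matrix.tsv

  echo "## names in only one file"
  # comm -3 drops names common to both files (exact, case-sensitive match).
  comm -3 fastq.names design.names | tr -d '\t'

  echo "## names differing only by case"
  # Distinct names that collapse to the same lower-case string.
  cat fastq.names design.names | sort -u | tr '[:upper:]' '[:lower:]' | sort | uniq -d
} > input_check.txt

cat input_check.txt
```

An empty section in input_check.txt means that check passed. In the demo, kj002 and KJ002 are reported both as unmatched names and as a case-only mismatch.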
src=/home/yli11/Tools/Cas9Variants/Mammalian_system
module load conda3/202011
source activate /home/yli11/.conda/envs/captureC
cd {{jid}}
comparison=${COL1}_vs_${COL2}
interactive_heatmap.py -f $comparison.AA_compare.csv --reformat_config liyc --header -o $comparison.AA_compare.interactive.html
PAM specificity assay¶
code: https://github.com/tsailabSJ/LentiviralPAMspecificity
1. Dictionary generation¶
2. Calculate enrichment¶
Pooled-GUIDE-seq¶
pooled gRNA design: /research_jude/rgs01_jude/groups/tsaigrp/projects/Genomics/common/projects/Azusa_pooled_GUIDE
code: guideseq_pool_gRNA_design_given_genes.py, guideseq_pool_gRNA_design.py
I think guideseq_pool_gRNA_design_given_genes.py is unfinished.
This is how I used to generate the library:
module load conda3/202011
source activate /home/yli11/.conda/envs/captureC
guideseq_pool_gRNA_design.py -h
guideseq_pool_gRNA_design.py -i tcell_exon.bed --sample 21083 -o guideseq_pool_run1.csv
GUIDE-seq, CHANGE-seq, CHANGE-seq BE¶
Lab members already know how to run these pipelines.
CHANGE-seq-BE: /research_jude/rgs01_jude/groups/tsaigrp/projects/Genomics/common/src/changeseq_py3
GUIDE-seq: /research_jude/rgs01_jude/groups/tsaigrp/projects/Genomics/common/src/changeseq_py3
CHANGE-seq: /research_jude/rgs01_jude/groups/tsaigrp/projects/Genomics/common/src/changeseq
PARADIGM¶
variant design code: /research_jude/rgs01_jude/groups/tsaigrp/projects/Genomics/common/projects/Paradigm/Yichao_code
pipeline code: https://github.com/tsailabSJ/PARADIGM_code
R code is used to calculate the different crisprscores.
cutadapt is used to extract the variable sequence between two given flanking sequences.
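The extraction step can be sketched with cutadapt's linked-adapter syntax (`LEFT...RIGHT`). This is an illustration, not the exact command from the PARADIGM pipeline. The function name, file names, and flank sequences are placeholders, and it relies on my understanding that a linked adapter given with `-g` requires both flanks to be found.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Keep only the sequence between two flanks.
# Arguments: input FASTQ, output FASTQ, 5' flank, 3' flank.
# Hypothetical helper; requires cutadapt on the PATH when called.
extract_variable_region() {
  local in_fastq=$1 out_fastq=$2 left_flank=$3 right_flank=$4
  # "LEFT...RIGHT" trims both flanks and keeps what lies between them;
  # --discard-untrimmed drops reads in which the flanks were not found.
  cutadapt -g "${left_flank}...${right_flank}" \
    --discard-untrimmed \
    -o "$out_fastq" \
    "$in_fastq"
}

# Example call with placeholder flanks (not the real PARADIGM sequences):
# extract_variable_region reads.fastq.gz variable.fastq.gz ACGTACGTAC TTGCATTGCA
```

The output FASTQ then contains only the variable region of each read, which can be counted or matched against the designed variants.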
Genetic variation¶
data and code: /research_jude/rgs01_jude/groups/tsaigrp/projects/Genomics/common/projects/GUIDEseq/02172023_GUIDEseq2_GV_Novaseq
Randomized CHANGE-seq¶
code: https://github.com/tsailabSJ/changeseq_randomized
/research_jude/rgs01_jude/groups/tsaigrp/projects/Genomics/common/projects/CHANGE_seq/liyc_Mixed_Base_analysis
Usually Ashely runs this pipeline herself.