- What are the different scenarios for using BPAC?
Table: Usage based on different scenarios
| Input: common | Input: motif PWM provided or not | Output |
| --- | --- | --- |
| Accessibility data (URL to the alignment BAM file, plus the number of reads or the BAM index file); paired-end or not; name/version of genome; regions of interest (restricted to open chromatin regions) | PWM provided | Binding probability |
| Same as above | PWM not provided (not recommended) | Binding probability; possible TF |
- What you need to prepare
- A DNase-seq or ATAC-seq alignment file in BAM format (it can be generated with Bowtie2), together with an indication of whether the data are paired-end
- The number of reads, either in a file named after the BAM file with the suffix ".reads" or via the BAM index file (samtools index bam_file_name). E.g.,
if the BAM file is example.bam, a file named example.bam.reads in the same directory as example.bam is recommended; the index file
example.bam.bai is also recommended in the same directory as example.bam
- Regions of interest in BED format with 3 or 4 columns, e.g.
- 3 column format:
chr1 10000 10015
chr1 20000 20015
chr2 10000 10015
- 4 column format, the 4th column is PWM score:
chr1 10000 10015 11.5
chr1 20000 20015 2.4
chr2 10000 10015 8.5
You also need to call peaks with MACS2 to identify open chromatin regions, and restrict your regions of interest to those regions.
- The genome name/version, which is used for generating features and retrieving sequences. Human and mouse are currently supported.
- The web interface supports at most 1000 regions. To analyse more regions, generate the feature files with the scripts included in the package.
In that case, you also need to download the genome file, the genome annotation with TSS locations, and the phastCons file from the UCSC Genome Browser website.
The feature file is tab-delimited:
chrom start end pwm PhastCons TSS readProfile readProfileUp readProfileDown cutProfile cutProfileUp cutProfileDown fp_read fp_cut label
chr1 714180 714199 16.265 0.00263 0 93.895 46.0 17.263 15.0 5.263 8.210 0.677 0.905 1
The main differences between the online and offline versions are that, in the online version, the feature file contains a header and the region-of-interest BED file has 3 or 4 columns.
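The tab-delimited feature file described above can be read with a short Python sketch like the one below. It is not part of the BPAC package. The column names are copied from the header shown above; the function name read_feature_rows, the choice to skip any line whose first field is "chrom", and the choice to treat start, end and label as integers are assumptions made for illustration.

```python
import csv
import io

# Column names copied from the feature-file header shown in the text.
FEATURE_COLUMNS = [
    "chrom", "start", "end", "pwm", "PhastCons", "TSS",
    "readProfile", "readProfileUp", "readProfileDown",
    "cutProfile", "cutProfileUp", "cutProfileDown",
    "fp_read", "fp_cut", "label",
]
INTEGER_COLUMNS = ("start", "end", "label")  # assumption: these hold integers


def read_feature_rows(handle):
    """Yield one dict per data row of a tab-delimited feature file.

    Assumption: any line whose first field is "chrom" is a header line
    and is skipped, so files with and without a header are both accepted.
    """
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields or fields[0] == "chrom":
            continue  # blank line or header line
        if len(fields) != len(FEATURE_COLUMNS):
            raise ValueError(
                f"expected {len(FEATURE_COLUMNS)} columns, got {len(fields)}"
            )
        row = dict(zip(FEATURE_COLUMNS, fields))
        # Convert every column except "chrom" to a number.
        for name in FEATURE_COLUMNS[1:]:
            convert = int if name in INTEGER_COLUMNS else float
            row[name] = convert(row[name])
        yield row


# The example row from the text, joined with tabs.
example_values = [
    "chr1", "714180", "714199", "16.265", "0.00263", "0", "93.895", "46.0",
    "17.263", "15.0", "5.263", "8.210", "0.677", "0.905", "1",
]
example_text = "\t".join(FEATURE_COLUMNS) + "\n" + "\t".join(example_values) + "\n"
rows = list(read_feature_rows(io.StringIO(example_text)))
```

Here rows holds a single dictionary for the chr1 example. Passing an open file handle instead of the io.StringIO object reads a feature file from disk in the same way.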
|
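Before submitting to the web interface, the inputs listed above can be sanity-checked with a sketch like the following. It is not part of BPAC. The helper names (companion_paths, existing_companions, parse_regions, fits_web_interface) are hypothetical, and the checks only reflect what the text states: the ".reads" and ".bai" file names, the 3- or 4-column region format, and the 1000-region limit. Splitting on any whitespace, so that both tabs and spaces are accepted, is an assumption.

```python
import os

MAX_WEB_REGIONS = 1000  # region limit stated for the web interface


def companion_paths(bam_path):
    """Return the read-count and index file names expected next to the BAM file."""
    return {"reads": bam_path + ".reads", "index": bam_path + ".bai"}


def existing_companions(bam_path):
    """Return the names ("reads", "index") of the companion files that exist."""
    return [
        name for name, path in companion_paths(bam_path).items()
        if os.path.exists(path)
    ]


def parse_regions(lines):
    """Parse 3- or 4-column region lines into (chrom, start, end, score) tuples.

    The score is None for 3-column lines. Fields are split on any
    whitespace (an assumption), and blank lines are ignored.
    """
    regions = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue
        if len(fields) not in (3, 4):
            raise ValueError(
                f"line {number}: expected 3 or 4 columns, got {len(fields)}"
            )
        start, end = int(fields[1]), int(fields[2])
        if start >= end:
            raise ValueError(f"line {number}: start must be smaller than end")
        score = float(fields[3]) if len(fields) == 4 else None
        regions.append((fields[0], start, end, score))
    return regions


def fits_web_interface(regions):
    """Return True if the number of regions is within the web interface limit."""
    return len(regions) <= MAX_WEB_REGIONS


regions = parse_regions(["chr1 10000 10015 11.5", "chr1 20000 20015 2.4"])
paths = companion_paths("example.bam")
```

If existing_companions returns an empty list, neither the read-count file nor the index file is present next to the BAM file. If fits_web_interface returns False, the offline scripts mentioned above are the route to take.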