BPAC: A universal model for prediction of transcription factor binding sites based on chromatin accessibility.

  1. What are the different scenarios for using BPAC?

    Table: Usage based on different scenarios

    Common input (required in both scenarios):
    • Accessibility data (URL to the alignment bam file, plus the number of reads or the bam index file)
    • Whether the data are paired-end
    • Name/version of the genome
    • Regions of interest (restricted to open chromatin regions)

    Motif PWM provided or not             Output
    PWM provided                          Binding probability
    PWM not provided (not recommended)    Binding probability, possible TF



  2. What do I need to prepare?
    • A DNase-seq or ATAC-seq alignment file in bam format, together with an indication of whether it is paired-end. The alignment can be generated using Bowtie2.
    • The number of reads, supplied either as a file named after the bam file with the postfix ".reads" or as the bam index file (created with samtools index bam_file_name). For example, if the bam file is example.bam, we recommend placing a file named example.bam.reads and the index file example.bam.bai in the same directory as example.bam.
    • Regions of interest in bed format with 3 or 4 columns, e.g.
      • 3 column format:
        chr1     10000     10015
        chr1     20000     20015
        chr2     10000     10015
      • 4 column format, the 4th column is PWM score:
        chr1     10000     10015     11.5
        chr1     20000     20015     2.4
        chr2     10000     10015     8.5
      You also need to call peaks with MACS2 to identify open chromatin regions, and restrict your regions of interest to those open chromatin regions.
    • The genome name/version, which is used for generating features and retrieving sequences. Human and mouse are currently supported.
    • The web interface supports at most 1000 regions. If you need to analyse more regions, you can generate feature files using the scripts within the package.
      In this case, you also need to download the genome file, the genome annotation with TSS locations, and the phastCons file from the UCSC genome browser website. The feature file is tab delimited:
      chrom  start  end  pwm  PhastCons  TSS  readProfile  readProfileUp  readProfileDown  cutProfile  cutProfileUp  cutProfileDown  fp_read  fp_cut  label
      chr1  714180  714199  16.265  0.00263  0  93.895  46.0  17.263  15.0  5.263  8.210  0.677  0.905  1
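
    The feature file layout shown above can be read with a few lines of Python. This is only a minimal sketch and is not part of the BPAC package: the function name read_features is hypothetical, the column names are copied from the example header, and treating every column after "end" as a number is an assumption based on the single example row. A header row is skipped when present, so the same reader works with or without one.

```python
import csv
import io

# Column names copied from the example header in this FAQ.
COLUMNS = [
    "chrom", "start", "end", "pwm", "PhastCons", "TSS",
    "readProfile", "readProfileUp", "readProfileDown",
    "cutProfile", "cutProfileUp", "cutProfileDown",
    "fp_read", "fp_cut", "label",
]


def read_features(handle):
    """Yield one dict per data row of a tab-delimited feature file.

    Hypothetical helper: a header row (first field "chrom") is skipped
    if present, and blank lines are ignored.
    """
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0] == "chrom":
            continue
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(row)}")
        record = dict(zip(COLUMNS, row))
        record["start"] = int(record["start"])
        record["end"] = int(record["end"])
        # Assumption: every remaining column is numeric, as in the example row.
        for name in COLUMNS[3:]:
            record[name] = float(record[name])
        yield record


# Example row from this FAQ, preceded by a header line.
example_row = [
    "chr1", "714180", "714199", "16.265", "0.00263", "0", "93.895", "46.0",
    "17.263", "15.0", "5.263", "8.210", "0.677", "0.905", "1",
]
sample = "\t".join(COLUMNS) + "\n" + "\t".join(example_row) + "\n"
rows = list(read_features(io.StringIO(sample)))
print(rows[0]["chrom"], rows[0]["start"], rows[0]["pwm"])
```

    For a real file, pass an open file handle (for example open(path, newline="")) instead of the in-memory sample.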

    The main differences between the online and offline versions are that, in the online version, the feature file contains a header and the region-of-interest bed file has 3 or 4 columns.
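
    Before uploading, it can help to check a region-of-interest file against the requirements listed in this FAQ. The sketch below is not part of BPAC. It parses 3- or 4-column bed lines, keeps only regions that lie inside a peak, and checks the 1000-region limit of the web interface. The names Region, parse_regions, restrict_to_peaks and fits_web_interface are hypothetical, the peak coordinates are made up for illustration, and reading "restrict to open chromatin" as "fully contained in a peak" is an assumption.

```python
from typing import NamedTuple, Optional

WEB_REGION_LIMIT = 1000  # limit stated in this FAQ for the web interface


class Region(NamedTuple):
    chrom: str
    start: int
    end: int
    pwm: Optional[float]  # None for the 3-column format


def parse_regions(lines):
    """Parse 3- or 4-column bed lines; raise ValueError on anything else."""
    regions = []
    for number, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        if len(fields) not in (3, 4):
            raise ValueError(
                f"line {number}: expected 3 or 4 columns, got {len(fields)}"
            )
        start, end = int(fields[1]), int(fields[2])
        if start >= end:
            raise ValueError(f"line {number}: start must be smaller than end")
        pwm = float(fields[3]) if len(fields) == 4 else None
        regions.append(Region(fields[0], start, end, pwm))
    return regions


def restrict_to_peaks(regions, peaks):
    """Keep regions fully contained in at least one peak (assumed meaning)."""
    return [
        r for r in regions
        if any(p.chrom == r.chrom and p.start <= r.start and r.end <= p.end
               for p in peaks)
    ]


def fits_web_interface(regions):
    """True if the number of regions is within the web interface limit."""
    return len(regions) <= WEB_REGION_LIMIT


# 4-column example from this FAQ and one made-up peak standing in for MACS2 output.
bed_lines = [
    "chr1\t10000\t10015\t11.5",
    "chr1\t20000\t20015\t2.4",
    "chr2\t10000\t10015\t8.5",
]
peaks = [Region("chr1", 9900, 10100, None)]
regions = parse_regions(bed_lines)
kept = restrict_to_peaks(regions, peaks)
print(len(regions), len(kept), fits_web_interface(kept))
```

    The linear scan over peaks is adequate for the up to 1000 regions accepted by the web interface. For larger offline analyses, a dedicated interval tool is likely a better fit.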



Maintained by Dr. Jiang Qian and Dr. Sheng Liu at the Qian's Bioinformatics Lab, Johns Hopkins School of Medicine