Version 1.0
TiGER (Tissue-specific Gene Expression and Regulation) is a database for generating comprehensive information about human tissue-specific gene regulation, including both expression and regulatory data.
Currently, the database contains tissue-specific expression profiles for ~20,000 UniGene genes, combinatorial regulation for 7,341 interacting transcription factor (TF) pairs, and 6,232 cis-regulatory modules (CRMs) for tissue-specific genes.
The data are organized as a relational database, which provides three different views: gene view, TF view, and tissue view. Each view has a query interface and associated underlying database tables/entities:
The gene view contains three major entities: (1) the “EST” entity, which stores enrichment values in 30 tissues for each gene; (2) the “CRM” entity, which stores the conservation profile, the density profile, and the energy profile used for CRM detection in the promoter region of each gene; and (3) the “GeneCode” entity, which stores the mapping between UniGene, RefSeq, and gene symbol.
The TF view contains one major entity called “TF-Partner”. This entity stores all factors that interact with a given TF, the tissue in which the interaction occurs and the significance (-log(p)) of the interaction.
The tissue view contains three major entities: (1) the “TSS-Genes” entity, which stores genes preferentially expressed in each of the 30 tissues; (2) the “TSS-TFs” entity, which stores interactions between TFs in each of the 30 tissues; and (3) the “TSS-CRMs” entity, which stores CRMs in the promoter regions of tissue-specific genes.
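As an illustration of the TF view described above, the sketch below models one “TF-Partner” record and a lookup of all partners of a given TF. It is only a sketch under stated assumptions: the class name `TFPartner`, its field names, the helper `index_by_tf`, and the placeholder factor names are hypothetical and do not reflect the actual TiGER table layout.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TFPartner:
    """One hypothetical TF-Partner record (field names are assumptions)."""
    tf: str           # the query transcription factor
    partner: str      # a factor that interacts with it
    tissue: str       # tissue in which the interaction occurs
    neg_log_p: float  # significance of the interaction, -log(p)


def index_by_tf(records):
    """Group records by TF so all partners of a TF can be looked up at once."""
    index = defaultdict(list)
    for rec in records:
        index[rec.tf].append(rec)
    return index


# Placeholder records, made up for illustration only.
records = [
    TFPartner("TF_A", "TF_C", "brain", 3.1),
    TFPartner("TF_A", "TF_B", "liver", 4.2),
    TFPartner("TF_B", "TF_C", "liver", 5.0),
]

by_tf = index_by_tf(records)
# List partners of TF_A, most significant interaction first.
partners_of_a = sorted(by_tf["TF_A"], key=lambda r: r.neg_log_p, reverse=True)
summary = [(r.partner, r.tissue) for r in partners_of_a]
print(summary)
```

Keying the index on the TF mirrors the way the TF view is queried: one factor in, all of its partners out.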
We have included P-values for the enrichment scores in the EST download file, where the first column is the tissue name, the second column is the enrichment score, and the third column is the -log10(P) value. We define a gene as tissue-specific if it satisfies two criteria: the enrichment score is greater than 5 and the P-value is smaller than 10^-3.5 (equivalently, -log10(P) is greater than 3.5).
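The two criteria above can be applied directly to the three columns of the EST download file. The following is a minimal sketch, assuming the file is tab-delimited with one tissue per line; the delimiter, the function name `tissue_specific_rows`, and the sample values are assumptions made for illustration and are not taken from TiGER.

```python
import csv
import io

ENRICHMENT_CUTOFF = 5.0    # enrichment score must be greater than 5
NEG_LOG10_P_CUTOFF = 3.5   # P < 10^-3.5 is the same as -log10(P) > 3.5


def tissue_specific_rows(handle, delimiter="\t"):
    """Yield (tissue, enrichment, neg_log10_p) for rows passing both cutoffs.

    Assumes three columns per line: tissue name, enrichment score, -log10(P).
    """
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) < 3:
            continue  # skip blank or malformed lines
        tissue = row[0]
        try:
            enrichment = float(row[1])
            neg_log10_p = float(row[2])
        except ValueError:
            continue  # skip a header or any non-numeric line
        if enrichment > ENRICHMENT_CUTOFF and neg_log10_p > NEG_LOG10_P_CUTOFF:
            yield tissue, enrichment, neg_log10_p


# Made-up values for illustration only; not real TiGER data.
sample = "liver\t12.4\t6.1\nbrain\t5.8\t2.0\nheart\t3.2\t4.4\n"
hits = list(tissue_specific_rows(io.StringIO(sample)))
print(hits)
```

In this made-up sample only the first row passes: the second fails the P-value criterion and the third fails the enrichment criterion. For a real file, pass an open file handle instead of the `io.StringIO` wrapper.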
We have included P-values for TF interactions in the summary table and the download file. Because few tissue-specific interactions are known, we evaluate the TF interaction results using known interactions as a positive control. More than 40% of the known interactions are recovered, an 84-fold enrichment over the expected number.
For CRM detection, using known regulatory regions as a positive control, we obtain a sensitivity of 12% and an enrichment of 10.
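The sensitivity and enrichment figures quoted above can be read as follows. This is a sketch of one common way to compute such measures, not necessarily the exact procedure used for TiGER: sensitivity is taken as the fraction of positive controls recovered, and enrichment as the observed number recovered divided by the number expected by chance. The chance model (predictions drawn at random from a fixed universe of candidates), the function names, and all counts are assumptions for illustration.

```python
def sensitivity(n_recovered, n_known):
    """Fraction of known positives that appear among the predictions."""
    return n_recovered / n_known


def fold_enrichment(n_recovered, n_predicted, n_known, n_universe):
    """Observed overlap divided by the overlap expected by chance.

    Assumed chance model: if n_predicted items were drawn at random from
    n_universe candidates, the expected overlap with n_known positives is
    n_predicted * n_known / n_universe.
    """
    expected = n_predicted * n_known / n_universe
    return n_recovered / expected


# Made-up counts for illustration only; not TiGER results.
n_known = 50          # positive-control items
n_recovered = 6       # positive controls found among the predictions
n_predicted = 1000    # total predictions
n_universe = 100000   # total candidates that could have been predicted

sens = sensitivity(n_recovered, n_known)
fold = fold_enrichment(n_recovered, n_predicted, n_known, n_universe)
print(f"sensitivity = {sens:.2f}, fold enrichment = {fold:.1f}")
```

A low sensitivity can coexist with a high enrichment when the predictions are few relative to the universe of candidates, which is why both numbers are reported together.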
We obtained DNA sequences and annotations (such as RefSeq) from the Human May 2004 (hg17) assembly in the UCSC Genome Browser. Conservation scores were also downloaded from the UCSC Genome Browser, from the multiple alignments of 8 vertebrate genomes with Human (hg17). The EST database was downloaded from the NCBI website in 2005.
As more experimental data on the nature of TF-DNA interactions accumulate, we plan to further develop our predictions of tissue-specific TF interactions. We also plan to extend our work on CRM detection by relating regulatory elements to temporal (e.g., developmental stage) and spatial (e.g., cell type) attributes. As new predictions on tissue-specific gene regulation accumulate, the TiGER database will need to be further expanded and modified. We will update the content of the database on a regular basis.
TiGER is constructed for free access and use. The downloadable data formats include standard .txt text files and .png images.