Our research has focused mainly on ocular informatics. Specifically, my group develops and applies bioinformatics approaches to study gene regulation and signaling networks, with particular but not exclusive attention to the mammalian retina. Understanding the molecular basis of tissue-specific gene regulation and signaling will contribute to better prevention, diagnosis, and treatment of retinal disease.

Tissue-specific gene regulation. To understand tissue-specific gene regulation, we developed a computational framework to predict the target genes of tissue-specific transcription factors (TFs). The framework uses a Bayesian approach to integrate gene expression, genomic sequences, and TF binding motifs. We applied this approach to retina-specific TFs and predicted their respective target genes [Qian et al, Nucleic Acids Research, 2005]. This study provided the first comprehensive analysis of a regulatory network controlled by retina-specific TFs.
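As a rough illustration of what Bayesian evidence integration can look like, the sketch below combines independent lines of evidence (for example expression, sequence, and motif support) into a posterior probability that a gene is a TF target. It is a naive-Bayes simplification that assumes the evidence sources are conditionally independent; it is not the published framework, and the function name, prior, and likelihood ratios are hypothetical.

```python
from math import exp, log


def posterior_target_probability(prior, likelihood_ratios):
    """Combine evidence in log-odds space (naive-Bayes assumption).

    prior: assumed prior probability that the gene is a target (0 < prior < 1).
    likelihood_ratios: one ratio P(evidence | target) / P(evidence | non-target)
        per evidence source; all values here are hypothetical.
    """
    log_odds = log(prior / (1.0 - prior))
    # Independent evidence sources add in log-odds space.
    log_odds += sum(log(lr) for lr in likelihood_ratios)
    # Convert the posterior log-odds back to a probability.
    return 1.0 / (1.0 + exp(-log_odds))


if __name__ == "__main__":
    # Hypothetical gene: weak prior, supportive expression and motif evidence.
    print(posterior_target_probability(0.1, [3.0, 3.0]))
```

Working in log-odds keeps the combination additive and numerically stable when many evidence sources are multiplied together.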

Building upon this computational study of target prediction for single retinal TFs, we next explored combinatorial gene regulation by multiple TFs. We proposed a novel approach for large-scale analysis of TF interactions. In this approach, we evaluated the relationships between TFs using the relative position and co-occurrence of their binding sites in gene promoters. We first tested the approach with data sets from Saccharomyces cerevisiae (yeast) [Yu et al, Nucleic Acids Research, 2006a]. We then applied the approach to 30 human tissues, including retina [Yu et al, Nucleic Acids Research, 2006b]. This study provided novel insights into tissue-specific gene regulation. First, the study confirmed that non-tissue-specific TFs can play important roles in regulating tissue-specific genes. Second, it demonstrated that individual TFs can contribute to tissue specificity in different tissues by interacting with distinct TF partners. Additionally, the study identified several tissue-specific TF clusters that may play important roles in tissue-specific gene regulation.
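One common way to score binding-site co-occurrence is a one-sided hypergeometric test on the number of promoters that contain sites for both TFs. The sketch below shows that idea only; it ignores relative site position, is not necessarily the statistic used in the cited studies, and the function and argument names are hypothetical.

```python
from math import comb


def cooccurrence_pvalue(promoters_a, promoters_b, n_promoters):
    """P(overlap >= observed) under a hypergeometric null.

    promoters_a, promoters_b: promoter identifiers containing a predicted
        binding site for TF A and TF B, respectively (hypothetical inputs).
    n_promoters: total number of promoters scanned.
    """
    a, b = set(promoters_a), set(promoters_b)
    observed = len(a & b)
    size_a, size_b = len(a), len(b)
    total = comb(n_promoters, size_b)
    # Sum the upper tail: overlaps at least as large as the one observed.
    tail = sum(
        comb(size_a, i) * comb(n_promoters - size_a, size_b - i)
        for i in range(observed, min(size_a, size_b) + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Two hypothetical TFs whose sites fall in the same 5 of 10 promoters.
    print(cooccurrence_pvalue(range(5), range(5), 10))
```

A small p-value indicates that the two TFs share promoters more often than expected by chance; in a large-scale screen over many TF pairs, these values would still need multiple-testing correction.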

To demonstrate the usefulness of the identified TF-TF interactions, we used them to predict cis-regulatory modules in the promoters of tissue-specific genes. This approach greatly increases the signal-to-noise ratio for target and binding site prediction [Yu et al, BMC Bioinformatics, 2007]. Building further on this approach, we constructed functional networks that can be used to identify novel retinal disease genes and predict the biological function of retinal genes [Hu et al, Bioinformatics, 2010], and generated a database, TiGER, to provide a user-friendly interface to our results [Liu et al, BMC Bioinformatics, 2008].

Crosstalk between TFs and microRNAs. To understand the role of microRNAs (miRNAs) in retinal development, my group collaborated with Dr. Don Zack to generate a global profile of miRNA expression during retinal development and to identify differences in miRNA expression between adult rod and cone photoreceptors. The study revealed dozens of miRNAs that show significant expression changes during retinal development. Defining these precise developmental expression patterns has provided insights into miRNA function [Hackler et al, IOVS, 2010].

To understand how TFs and miRNAs work together in regulating gene expression, we analyzed the interaction patterns between TFs and miRNAs in regulatory networks. This work helped to define deeper levels of regulation that were not previously appreciated. For example, we found that a regulated feedback loop, in which two TFs regulate each other and one miRNA regulates both of them, is the most significantly overrepresented network motif [Yu et al, Nucleic Acids Research, 2008]. Mathematical modeling shows that the miRNA in such motifs stabilizes the feedback loop against environmental perturbation, providing one mechanism by which miRNAs contribute to the robustness of developmental programs. Our results demonstrate that TFs and miRNAs interact extensively with each other and that the biological functions of miRNAs may be wired into the regulatory network topology.
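The regulated feedback loop described here can be stated precisely as a small graph pattern. The sketch below enumerates instances of that pattern in a directed regulatory network given as an edge list; node names and the function name are hypothetical, and assessing overrepresentation would additionally require comparing the count against randomized networks, which is not shown.

```python
def regulated_feedback_loops(edges, mirnas):
    """Find (miRNA, TF, TF) triples forming a regulated feedback loop.

    edges: iterable of (regulator, target) pairs in a directed network.
    mirnas: identifiers of the miRNA nodes; all other nodes are treated as TFs.
    """
    mirnas = set(mirnas)
    targets = {}
    for regulator, target in edges:
        targets.setdefault(regulator, set()).add(target)

    loops = []
    for mirna in mirnas:
        # Only TF targets of the miRNA can take part in the motif.
        tf_targets = sorted(targets.get(mirna, set()) - mirnas)
        for i, tf_a in enumerate(tf_targets):
            for tf_b in tf_targets[i + 1:]:
                # The two TFs must regulate each other.
                if tf_b in targets.get(tf_a, set()) and tf_a in targets.get(tf_b, set()):
                    loops.append((mirna, tf_a, tf_b))
    return sorted(loops)


if __name__ == "__main__":
    # Hypothetical toy network.
    toy_edges = [
        ("TF1", "TF2"), ("TF2", "TF1"),
        ("miR-A", "TF1"), ("miR-A", "TF2"),
        ("miR-B", "TF1"),
    ]
    print(regulated_feedback_loops(toy_edges, {"miR-A", "miR-B"}))
```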

We then applied this integrative analysis method to retinal development and obtained novel insights into the organization of regulatory networks in the retina. We found that the active networks are topologically different at early and late stages of retinal development. At early stages, the active sub-networks tend to be highly connected, while at late stages, the networks are organized into more modular structures. Interestingly, network motif usage at early and late stages is also distinct. For example, network motifs containing reciprocal feedback regulatory relationships between two regulators are overrepresented in early developmental stages [Hwang et al, PLoS One, 2012].

DNA methylation-dependent TF-DNA interactions. Using a protein microarray-based approach developed by my collaborator Dr. Heng Zhu in the Department of Pharmacology [Hu et al, Cell, 2009], we systematically surveyed the entire human TF family and found that numerous purified TFs show methylated CpG (mCpG)-dependent DNA-binding activity [Hu et al, eLife, 2013]. To elucidate the underlying mechanism, we focused on Kruppel-like factor 4 (KLF4) and decoupled its mCpG- and CpG-binding activities via site-directed mutagenesis. Furthermore, we found that KLF4 binds to specific methylated or unmethylated motifs in human embryonic stem cells in vivo. Although DNA methylation at promoter regions has generally been considered a potent epigenetic modification that inhibits TF recruitment, our study challenges this classical view and suggests that mCpG-dependent TF binding is a widespread phenomenon.

Our bioinformatics work on integrating DNA methylation with other genomic features has also revealed an unexpected role of DNA methylation in RNA splicing. The general view of alternative RNA splicing is that it occurs and is regulated at the post-transcriptional level [Wan et al, Nucleic Acids Research, 2011]. However, by integrating our work on methylation patterns and cell type-specific gene expression, we found that DNA methylation might also be involved in regulating alternative splicing [Wan et al, Nucleic Acids Research, 2013]. In addition, my group discovered that a significant portion of tissue-specific differentially methylated regions are positively correlated with gene expression. This finding challenges the classical dogma that DNA methylation suppresses gene expression [Wan et al, BMC Genomics, 2015]. We also explored the role of DNA methylation in transcriptional regulation [Hwang et al, BMC Genomics, accepted].

Recently, in collaboration with Shannath Merbs’ group, we investigated a possible epigenetic contribution to age-related macular degeneration (AMD) and identified several differentially methylated sites that are associated with the disease [Oliver et al, Epigenetics, 2015]. These results lay the foundation for further mechanistic studies of the role of DNA methylation in tissue-specific regulation and in disease-related mechanisms.

Construction of human signaling networks. To construct comprehensive signaling networks in humans, we developed a combined bioinformatics and protein microarray-based approach to experimentally identify substrates for 289 unique kinases, resulting in 3,656 high-quality kinase-substrate relationships (KSRs) [Newman et al, Molecular Systems Biology, 2013]. Comparison of the human and yeast phosphorylation networks revealed that, although most KSRs are not well conserved, there exists an evolutionarily conserved kinase-to-kinase backbone [Hu et al, BBA-Proteins and Proteomics, 2014]. Moreover, our identification of 300 new phosphorylation motifs revealed a complex landscape for KSR specificity, challenging the current view that kinases with similar catalytic domains recognize similar motifs. Using this dataset, we constructed a high-resolution phosphorylation network and predicted a number of previously unknown kinase functions. We also constructed an interactive database to make the signaling networks and associated analytic tools available to the community [Hu et al, Bioinformatics, 2014]. Recently, my group performed a systematic analysis of scaffold proteins, which act as a “molecular glue”, linking multiple components in a pathway together to facilitate signal transduction. In this study, we proposed two possible regulatory mechanisms by which the activity of scaffold proteins is coordinated with their associated pathways through phosphorylation. This study provides a useful framework for understanding the vital role of scaffold proteins in cellular signal transduction [Hu et al, PLoS Comput. Biol., 2015].