Genomic analysis of the hierarchical structure of regulatory networks

Haiyuan Yu* and Mark Gerstein*

Abstract: A fundamental question in biology is how the cell uses transcription factors (TFs) to coordinate the expression of thousands of genes in response to various stimuli. The relationships between TFs and their target genes can be modeled in terms of directed regulatory networks. These relationships, in turn, can be readily compared with commonplace "chain-of-command" structures in social networks, which have characteristic hierarchical layouts. Here, we develop algorithms for identifying generalized hierarchies (allowing for various loop structures) and use these approaches to illuminate extensive pyramid-shaped hierarchical structures existing in the regulatory networks of representative prokaryotes (Escherichia coli) and eukaryotes (Saccharomyces cerevisiae), with most TFs at the bottom levels and only a few master TFs on top. These masters are situated near the center of the protein-protein interaction network, a different type of network from the regulatory one, and they receive most of the input for the whole regulatory hierarchy through protein interactions. Moreover, they have maximal influence over other genes, in terms of affecting expression-level changes. Surprisingly, however, TFs at the bottom of the regulatory hierarchy are more essential to the viability of the cell. Finally, one might think master TFs achieve their wide influence through directly regulating many targets, but TFs with most direct targets are in the middle of the hierarchy. We find, in fact, that these midlevel TFs are "control bottlenecks" in the hierarchy, and this great degree of control for "middle managers" has parallels in efficient social structures in various corporate and governmental settings.

Figure: Governmental Hierarchy in Macao
Figure: Regulatory Hierarchy in S. cerevisiae
Figure: Regulatory Hierarchy in E. coli

Supplementary Data
 1. S. cerevisiae regulatory network
 2. E. coli regulatory network
 3. Both files are tab-delimited. Each row describes one TF: the first field is the TF, and the remaining fields are its target genes.
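
The file layout described above can be read with a few lines of Python. This is only a minimal sketch, not part of the original analysis: the function name parse_regulatory_network and the gene names in the sample are hypothetical, and it assumes the files contain no header row.

```python
import csv
import io


def parse_regulatory_network(handle):
    """Map each TF (first field of a row) to the list of its targets (remaining fields)."""
    network = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Drop empty fields, e.g. those produced by trailing tabs.
        fields = [field.strip() for field in row if field.strip()]
        if not fields:
            continue  # skip blank lines
        tf, targets = fields[0], fields[1:]
        # If a TF appears on more than one row, merge its target lists.
        network.setdefault(tf, []).extend(targets)
    return network


# Made-up names for illustration only; they are not taken from the actual files.
sample = "TF_A\tgene1\tgene2\tTF_B\nTF_B\tgene3\n"
network = parse_regulatory_network(io.StringIO(sample))
```

For a downloaded file, the same function can be given an open file handle in place of the in-memory sample, for example one created with open(path, newline=""), where path is wherever the file was saved.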

*To whom correspondence may be addressed. E-mail: mark.gerstein@yale.edu or haiyuan.yu@yale.edu

Last modified on Oct. 25th, 2006