Initiator element
The initiator element (Inr), sometimes referred to as the initiator motif, is a core promoter element that is similar in function to the Pribnow box (in prokaryotes) or the TATA box (in eukaryotes). The Inr is the simplest functional promoter that is able to direct transcription initiation without a functional TATA box. It has the consensus sequence YYA+1NWYY in humans.[a][1] Like the TATA box, the Inr element facilitates the binding of transcription factor II D (TFIID).[1] The Inr works by enhancing binding affinity and strengthening the promoter.
Overview
The initiator element (Inr) is the most common sequence found at the transcription start site (TSS) of eukaryotic genes. It was originally described as a 17 bp element in 1989,[2] but other analyses, both earlier and later, have produced consensus sequences 2 to 9 bp in length.[3]
The Inr in humans was first described in 1980 by Corden et al. as a broader TSS motif. It was first articulated and explained by two MIT biologists, Stephen T. Smale and David Baltimore, in 1989.[2] Their research showed that the Inr is able to initiate basal transcription in the absence of a TATA box. In the presence of a TATA box or other promoter elements, the Inr increases the efficiency of transcription by working alongside them to bind RNA polymerase II. A gene with both types of element will have higher promoter binding strength, easier activation and higher levels of transcription activity. TFIID, a component of the RNA polymerase II preinitiation complex, binds to both the TATA box and the Inr. Two TFIID subunits, TAF1 and TAF2, recognize the Inr sequence and bring the complex together.[4] The interaction between TFIID and the Inr is believed to be the most important one in initiating transcription, likely because the Inr sequence overlaps the start site.[5] The Inr element is also believed to interact with the activator Sp1 (specificity protein 1), a transcription factor that is then able to regulate the activation and initiation of transcription.[6]
Archaea show some sequence conservation at the TSS that determines promoter efficiency, which makes it a kind of initiator element. However, no homolog of TAF1 or TAF2 has been identified in archaea, so it is unknown how the archaeal Inr works.[7]
Location and sequence
The Inr element encompasses, simply, the 2 to 9 bp around the transcription start site (+1) that usually follow a consensus sequence. The exact range of bases it encompasses varies with the choice of consensus. The original human consensus of 1980 was YYCA+1YYYYY. Through mutational analysis by Lo and Smale, the "functional" consensus sequence of the Inr in humans was inferred to be YYA+1NWYY.[a] Human genome-wide CAGE data suggest a very simple consensus of YR+1. Vo Ngoc et al. characterized the Inr at focused core promoters (those with a single start site or a narrow cluster of start sites) and found BBCA+1BW.[3]
The consensus sequence in Drosophila is TCA+1KTY.[4]
The conserved consensus in archaea is YR+1. For Sulfolobus, the consensus for transcripts with a 5' UTR shorter than 4 nt is YR+1TG, while for the rest it is YR+1WMAAA. For the araS gene of Sulfolobus, the most functional sequence is G+1AGAMK.[7]
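The consensus sequences above are written in IUPAC nucleotide codes, so each can be expanded mechanically into a pattern and scanned against a DNA string. The following is a minimal sketch of that idea, not a published method: the function names and the test sequence are hypothetical, and the consensus is written without its +1 marker.

```python
import re

# IUPAC nucleotide codes used by the consensus sequences in this article.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "N": "ACGT",
}

def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a compiled regular expression."""
    return re.compile("".join("[" + IUPAC[code] + "]" for code in consensus))

def find_matches(consensus, sequence):
    """Return the 0-based start offset of every match, including overlapping ones."""
    pattern = consensus_to_regex(consensus)
    width = len(consensus)
    hits = []
    for start in range(len(sequence) - width + 1):
        # Test each window separately so that overlapping matches are not skipped.
        if pattern.fullmatch(sequence, start, start + width):
            hits.append(start)
    return hits

# Hypothetical sequence, made up for illustration only.
example = "GGTCAGTCTGG"
human_hits = find_matches("YYANWYY", example)   # human YYA+1NWYY
fly_hits = find_matches("TCAKTY", example)      # Drosophila TCA+1KTY
```

In both consensus strings the +1 adenine is the third character, so the predicted start site lies two bases after each reported offset. A match found this way only indicates sequence similarity; it says nothing about whether the site is actually used as a TSS.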
Evolutionary change
Studies have shown that promoters with a functional Inr are more likely to lack a TATA box or to possess a degenerate TATA sequence. This is because a gene with an active Inr is less dependent on a functional TATA box or additional promoter elements.[8] Although the Inr element varies between promoters, the sequence is highly conserved between humans and yeast.[8] An analysis of 7670 transcription start sites showed that roughly 40% had an exact match to the BBCA+1BW Inr sequence, while 16% contained only one mismatch.[9] TFIID and its subunits are very sensitive to the Inr sequence, and nucleotide changes have been shown to drastically change the binding affinity. The +1 and -3 positions have been identified as the most critical for transcription efficiency and Inr function.[8] Replacing the adenine at the +1 position with G or T changes transcription activity by 10%, and replacing the thymine at the +3 position changes transcription activity levels by 22%.[10]
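The exact-match and one-mismatch categories can be made concrete by counting, position by position, how many bases of a candidate site fall outside the set allowed by the consensus. The sketch below assumes 6-nt windows covering positions -3 to +3 (there is no position 0, so the TSS is at index 3); the function names and the example windows are hypothetical and are not data from the cited analysis.

```python
# IUPAC codes needed for the consensus sequences quoted in this article.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "N": "ACGT",
}

def count_mismatches(consensus, site):
    """Count positions where the base is not allowed by the consensus code."""
    if len(consensus) != len(site):
        raise ValueError("consensus and site must have the same length")
    return sum(1 for code, base in zip(consensus, site.upper())
               if base not in IUPAC[code])

def tally(consensus, sites):
    """Return the fractions of sites with zero and with exactly one mismatch."""
    counts = [count_mismatches(consensus, site) for site in sites]
    exact = sum(1 for c in counts if c == 0)
    one_off = sum(1 for c in counts if c == 1)
    return exact / len(sites), one_off / len(sites)

# Hypothetical windows spanning -3..+3, made up for illustration only.
windows = ["TCCAGT", "ACCAGT", "ACGAGT", "GTCATA"]
exact_fraction, one_off_fraction = tally("BBCABW", windows)
```

Here "TCCAGT" and "GTCATA" match BBCA+1BW exactly, "ACCAGT" has one mismatch (A at -3), and "ACGAGT" has two, so the tally is 0.5 exact and 0.25 with one mismatch.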
Significance
The Inr element of core promoters has been found to be more prevalent than the TATA box in eukaryotic promoter domains.[11] In a study of more than 1800 distinct human promoter sequences, 49% contained the Inr element while 21.8% contained the TATA box.[11] Of the sequences with a TATA box, 62% contained the Inr element as well. Although the Inr element is not fully understood, it has been recognized as the most frequently occurring sequence at the start site of genes in multiple species. Further research may improve understanding of the elements that regulate gene expression.
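The three percentages above are enough to work out how the promoters split into four classes, provided all three refer to the same set of promoter sequences (an assumption made here, not something the figures themselves state). The function and variable names in this sketch are hypothetical.

```python
def promoter_classes(p_inr, p_tata, p_inr_given_tata):
    """Split promoters into four classes from two marginals and one conditional."""
    # Fraction with both elements: P(TATA) * P(Inr | TATA).
    both = p_tata * p_inr_given_tata
    return {
        "both": both,
        "tata_only": p_tata - both,
        "inr_only": p_inr - both,
        # Inclusion-exclusion: subtract everything that has at least one element.
        "neither": 1.0 - (p_inr + p_tata - both),
    }

# Figures quoted in the text: 49% Inr, 21.8% TATA, 62% of TATA promoters with Inr.
classes = promoter_classes(0.49, 0.218, 0.62)
for name, fraction in classes.items():
    print(f"{name}: {fraction:.1%}")
```

Under that assumption, about 13.5% of promoters carry both elements, 8.3% carry only a TATA box, 35.5% carry only an Inr, and 42.7% carry neither. These are derived values for illustration, not figures reported by the study.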
Notes
- ^ a b In nucleic acid notation for DNA, Y (pYrimidine) stands for C/T (cytosine or thymine, which are both pyrimidines), R (puRine) stands for A/G, N (Nucleobase) is any of the four bases, W (Weak) stands for A/T (adenine or thymine, which both form only two hydrogen bonds), K stands for G/T (Keto), M stands for A/C (aMino), and B stands for C/G/T (any base except A). The +1 written after a base (a subscript in the original notation) marks the transcription start site.
References
- ^ a b Xi, Hualin; Yong Yu; Yutao Fu; Jonathan Foley; Anason Halees; Zhiping Weng (June 2007). "Analysis of overrepresented motifs in human core promoters reveals dual regulatory roles of YY1". Genome Research. 17 (6): 798–806. doi:10.1101/gr.5754707. PMC 1891339. PMID 17567998.
- ^ a b Smale, Stephen T.; Baltimore, David (1989-04-07). "The "initiator" as a transcription control element". Cell. 57 (1): 103–113. doi:10.1016/0092-8674(89)90176-1. ISSN 0092-8674. PMID 2467742. S2CID 33929615.
- ^ a b Vo Ngoc, L; Cassidy, CJ; Huang, CY; Duttke, SH; Kadonaga, JT (1 January 2017). "The human initiator is a distinct and abundant element that is precisely positioned in focused core promoters". Genes & Development. 31 (1): 6–11. doi:10.1101/gad.293837.116. PMC 5287114. PMID 28108474.
- ^ a b Lim, Chin Yan; Santoso, Buyung; Boulay, Thomas; Dong, Emily; Ohler, Uwe; Kadonaga, James T. (2004-07-01). "The MTE, a new core promoter element for transcription by RNA polymerase II". Genes & Development. 18 (13): 1606–1617. doi:10.1101/gad.1193404. ISSN 0890-9369. PMC 443522. PMID 15231738.
- ^ Kaufmann, J.; Smale, S. T. (1994-04-01). "Direct recognition of initiator elements by a component of the transcription factor IID complex". Genes & Development. 8 (7): 821–829. doi:10.1101/gad.8.7.821. ISSN 0890-9369. PMID 7926770.
- ^ O'Shea-Greenfield, A.; Smale, S. T. (1992-01-15). "Roles of TATA and initiator elements in determining the start site location and direction of RNA polymerase II transcription". The Journal of Biological Chemistry. 267 (2): 1391–1402. doi:10.1016/S0021-9258(18)48443-8. ISSN 0021-9258. PMID 1730658.
- ^ a b Ao, X; Li, Y; Wang, F; Feng, M; Lin, Y; Zhao, S; Liang, Y; Peng, N (November 2013). "The Sulfolobus initiator element is an important contributor to promoter strength". Journal of Bacteriology. 195 (22): 5216–22. doi:10.1128/JB.00768-13. PMID 24039266.
- ^ a b c Yang, Chuhu; Bolotin, Eugene; Jiang, Tao; Sladek, Frances M.; Martinez, Ernest (2007-03-01). "Prevalence of the Initiator over the TATA box in human and yeast genes and identification of DNA motifs enriched in human TATA-less core promoters". Gene. 389 (1): 52–65. doi:10.1016/j.gene.2006.09.029. ISSN 0378-1119. PMC 1955227. PMID 17123746.
- ^ Ngoc, Long Vo; Cassidy, California Jack; Huang, Cassidy Yunjing; Duttke, Sascha H. C.; Kadonaga, James T. (2017-01-20). "The human initiator is a distinct and abundant element that is precisely positioned in focused core promoters". Genes & Development. 31 (1): 6–11. doi:10.1101/gad.293837.116. ISSN 0890-9369. PMC 5287114. PMID 28108474.
- ^ Javahery, R; Khachi, A; Lo, K; Zenzie-Gregory, B; Smale, S T (1994-01-01). "DNA sequence requirements for transcriptional initiator activity in mammalian cells". Molecular and Cellular Biology. 14 (1): 116–127. doi:10.1128/mcb.14.1.116. ISSN 0270-7306. PMC 358362. PMID 8264580.
- ^ a b Gershenzon, Naum I.; Ioshikhes, Ilya P. (2005-04-15). "Synergy of human Pol II core promoter elements revealed by statistical sequence analysis". Bioinformatics. 21 (8): 1295–1300. doi:10.1093/bioinformatics/bti172. ISSN 1367-4803. PMID 15572469.