Transcription in Eukaryotes

by Jean-Pierre Herveg, Etienne De Plaen and a lot of friends at the Brussels Branch of the Ludwig Institute for Cancer Research (LICR) and the Christian de Duve* Institute for Cellular Pathology (ICP). April 2006

Université Catholique de Louvain, Avenue E. Mounier, 1200 Brussels (Belgium)

Questions

1. In prokaryotes, the sigma factor helps the RNA pol to recognize a promoter. How is this done in eukaryotes?
2. Describe a eukaryotic promoter.
3. What are the three main posttranscriptional modifications in eukaryotes?
4. What is a lariat?
5. How can the sequence of a pseudogene be recognized?

Transcription in Eukaryotes

In eukaryotes, DNA is contained within the nucleus, where it is transcribed into RNA. The RNA must then be exported through the nuclear pores into the cytosol.

Whereas in prokaryotes transcription is performed by a single RNA pol, in eukaryotes it is performed by three different RNA pols:

RNA pol I transcribes 5.8S, 18S, and 28S ribosomal RNA in the nucleolus.
RNA pol II transcribes mRNA and the small nuclear RNAs (snRNA).
RNA pol III transcribes 5S rRNA as well as all the tRNA species.

* S means Svedberg, the unit of sedimentation. Svedberg was a Swedish chemist who invented the analytical ultracentrifuge. S is a unit of time: 1 S = 10^-13 seconds. A 16S substance is one that sediments at 16S in this machine. 16S is also the sedimentation constant of the RNA of the small ribosomal subunit in prokaryotes, the small subunit ribosomal RNA (SSU rRNA). In eukaryotes this RNA is 18S. Nowadays, sedimentation rates are no longer measured to compare RNAs; their sequences are compared instead.

RNA pol I and III

RNA polymerase I (5.8S, 18S, and 28S ribosomal RNA)

RNA polymerase I transcribes only the genes for ribosomal RNA, from a single type of promoter.
The transcript includes the sequences of both large and small rRNAs, which are later released by cleavage and processing. There are many copies of the transcription unit, alternating with nontranscribed spacers and organized in a cluster.

RNA polymerase III (5S rRNA and tRNA species)

The promoters fall into two general classes that are recognized in different ways by different groups of factors. The promoters for 5S and tRNA genes are internal; they lie downstream of the startpoint. The promoters for snRNA (small nuclear RNA) genes lie upstream of the startpoint, in the more conventional manner of other promoters. In both cases, the individual elements that are necessary for promoter function consist exclusively of sequences recognized by transcription factors, which in turn direct the binding of RNA polymerase.

RNA pol II

General transcription factors (TFII) instead of the prokaryotic sigma factor

RNA Pol II does not contain a subunit similar to the prokaryotic sigma factor, which can recognize the promoter and unwind the DNA double helix. In eukaryotes, these two functions are carried out by a set of proteins called general transcription factors. RNA Pol II is associated with six general transcription factors, designated TFIIA, TFIIB, TFIID, TFIIE, TFIIF and TFIIH, where "TF" stands for "transcription factor" and "II" for RNA Pol II.

TATA-box binding protein and TAFs

TFIID consists of TBP (TATA-box binding protein) and TAFs (TBP-associated factors). The role of TBP is to bind the "TATA" core promoter. TAFs may assist TBP in this process. In human cells, the TAFs comprise 12 subunits. One of them, TAF250 (with a molecular weight of 250 kDa), has histone acetyltransferase activity, which can relieve the binding between DNA and histones in the nucleosome.

Pre-initiation complex (PIC)

The transcription factor which catalyzes DNA melting is TFIIH.
However, before TFIIH can unwind DNA, RNA Pol II and at least five general transcription factors (TFIIA is not absolutely necessary) have to form a pre-initiation complex (PIC).

Initiation

The promoter: Eukaryotic RNA pols lack the sigma factor found in the prokaryotic enzyme. Instead of the Pribnow box, a TATA box is found in many eukaryotic genes. It is located at approximately -25. Many promoters have a CAAT box and some a GC box, both at around -40 to -110 bases upstream. The location of these additional elements can vary, and they can be present on either strand.

The basal transcription machinery: TF means transcription factor, and II indicates that the factor works with RNA pol II. The TFII factors bind to the promoter region, guiding the polymerase to this site. Together they form the basal transcription machinery. The initial event in the process is the recognition of the TATA box by the TATA box-binding protein, a component of TFIID. This is followed by the sequential binding of other factors, including TFIIA, TFIIB, RNA polymerase II, and TFIIE.
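The promoter layout described above (a TATA box at approximately -25) can be illustrated with a short Python sketch. It is only an illustration: the function name, the simplified motif TATA(A/T)A(A/T), and the search window from -35 to -20 are assumptions made for this example, not part of the lecture, and real promoter prediction is considerably more involved.

```python
import re

# Simplified TATA-box motif, TATA(A/T)A(A/T); an assumption made for this sketch.
TATA_MOTIF = re.compile(r"(?=(TATA[AT]A[AT]))")

def find_tata_candidates(upstream, window=(-35, -20)):
    """Return (position, motif) pairs for TATA-like motifs near -25.

    `upstream` is the DNA sequence that ends immediately before the
    transcription start site (+1), so its last base is position -1.
    `window` is a hypothetical tolerance around "approximately -25".
    """
    seq = upstream.upper()
    n = len(seq)
    hits = []
    # The lookahead lets overlapping motifs be reported as well.
    for match in TATA_MOTIF.finditer(seq):
        position = match.start() - n  # position relative to the start site
        if window[0] <= position <= window[1]:
            hits.append((position, match.group(1)))
    return hits

# Toy sequence: the motif starts 30 bases before the start site.
toy_promoter = "G" * 70 + "TATAAAA" + "C" * 23
print(find_tata_candidates(toy_promoter))
```

The same approach could be extended to look for a CAAT box or GC box between -110 and -40, keeping in mind that those elements can lie on either strand.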
Enhancers and silencers: Enhancer sequences, which can be located several thousand bases upstream, downstream, or in the middle of the transcribed region, can also bind proteins which stimulate transcription. These are often tissue- and species-specific, which explains the regulation of genes in some tissues, and the host range of viruses which have usurped these sequences to stimulate transcription of their own genes.

-------------------- Question: Describe a eukaryotic promoter.

Acetylation: to separate DNA from the nucleosomes

In eukaryotes, the association between DNA and histones prevents access of the polymerase and general transcription factors to the promoter. Histone acetylation, catalyzed by histone acetyltransferases (HATs), can relieve the binding between DNA and histones. Although a subunit of TFIID (TAF250 in human) has HAT activity, participation of other HATs can make transcription more efficient. The following rules apply to most (but not all) cases:

Binding of activators to the enhancer element recruits HATs to relieve the association between histones and DNA, thereby enhancing transcription.
Binding of repressors to the silencer element recruits histone deacetylases (denoted by HDs or HDACs) to tighten the association between histones and DNA.

Methylation: to silence genes!

Experimental evidence has shown that in certain cells some genes are heavily methylated, and these genes are not expressed. In cells that carry non-methylated forms of the same genes, on the other hand, the genes are expressed. An example is seen in housekeeping genes (genes needed for basic functions in every cell): they are generally non-methylated and are continuously transcribed. Also, one of the X chromosomes in females is not expressed. A hypothesis for this phenomenon links it to the heavy methylation of the inactive X chromosome.
TAFs: TAFs may assist TBP in connecting the basal transcription machinery to enhancers or silencers.

-------------------- Question: In prokaryotes, the sigma factor helps the RNA pol to recognize a promoter. How is this done in eukaryotes?

Elongation

TFIIH can now use its helicase activity to unwind DNA. This requires energy released from ATP hydrolysis. The DNA melting starts from about -10 bp. Then, RNA Pol II uses nucleoside triphosphates (NTPs) to synthesize an RNA transcript. During RNA elongation, TFIIF remains attached to the RNA polymerase, but all of the other transcription factors have dissociated from the PIC (pre-initiation complex).

The carboxyl-terminal domain (CTD) of the largest subunit of RNA Pol II is critical for elongation. In the initiation phase, the CTD is unphosphorylated, but during elongation it has to be phosphorylated. This domain contains many proline, serine and threonine residues.

Termination

Eukaryotic protein genes contain a poly-A signal located downstream of the last exon. This signal is used to add a series of adenylate residues during RNA processing. Transcription often terminates 0.5 - 2 kb downstream of the poly-A signal, but the mechanism is unclear.

Posttranscriptional modifications

Capping

Modification of the 5'-ends of eukaryotic mRNAs is called capping. The cap consists of a methylated guanine nucleotide (7-methylguanosine) linked to the rest of the mRNA by a 5' to 5' triphosphate "bridge" (Cap Structure). Capping occurs very early during the synthesis of eukaryotic mRNAs, even before RNA polymerase II has finished making the mRNA molecule. Capped mRNAs are very efficiently translated by ribosomes to make proteins. In fact, some viruses, such as poliovirus, prevent capped cellular mRNAs from being translated into proteins.
This enables poliovirus to take over the protein-synthesizing machinery in the infected cell to make new viruses.

Polyadenylation

Modification of the 3'-ends of eukaryotic mRNAs is called polyadenylation (Polyadenylation Pathway). Polyadenylation is the addition of several hundred A nucleotides to the 3' ends of mRNAs.

Polyadenylation signal

All eukaryotic mRNAs destined to get a poly-A tail (note: most, but not all, eukaryotic mRNAs get such a tail) contain the sequence AAUAAA about 11-30 nucleotides upstream of the site where the tail is added. AAUAAA is recognized by an endonuclease that cuts the RNA, allowing the tail to be added by a specific enzyme: poly-A polymerase.

Splicing (trans or cis):

-------------------- Question: What are the three main posttranscriptional modifications in eukaryotes?

Splicing

Another major difference between prokaryotic and eukaryotic mRNA is the occurrence of splicing in eukaryotes. This process removes intervening sequences (introns) from the primary transcript and precisely assembles a set of exons, which form the transcript that is translated. The fragments which are removed are not random: they have consensus sequences at the 5' and 3' splice sites. Exons end with the sequence AG and begin with a G, and introns begin with GU and end with AG. About 20-50 bases upstream from the 3' splice site is an adenine residue known as the branch site.

Splicing occurs through the production of a lariat intermediate. The 2'-OH of the adenine (A) residue of the branch site attacks the phosphorus atom at the 5' splice site, cleaving the bond while generating a lariat loop, in which the branch residue has three phosphodiester bonds, at its 2', 3', and 5' positions. The 3'-OH at the end of the 5' exon then attacks the phosphorus atom at the 3' splice site, joining the two exons and releasing the lariat. Note that the number of phosphodiester bonds remains the same throughout this reaction; it proceeds through transesterification, not hydrolysis and ligation.
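The splice-site rules above (introns begin with GU and end with AG, with a branch-site A about 20-50 bases upstream of the 3' splice site) can be sketched in Python. This is a toy illustration under stated assumptions: the function names, the 0-based half-open intron coordinates, and the way the branch-site distance is counted are all choices made for this example, and the code models only the sequence bookkeeping, not the chemistry.

```python
def splice(pre_mrna, introns):
    """Join the exons of `pre_mrna` after removing the given introns.

    `introns` is a list of (start, end) index pairs, 0-based and
    half-open (a convention assumed for this sketch). Each intron
    must follow the GU...AG rule, otherwise ValueError is raised.
    """
    rna = pre_mrna.upper().replace("T", "U")  # accept DNA-style input too
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        if start < cursor or end > len(rna):
            raise ValueError("intron out of range or overlapping")
        intron = rna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} does not follow GU...AG")
        exons.append(rna[cursor:start])  # exon preceding this intron
        cursor = end
    exons.append(rna[cursor:])  # last exon
    return "".join(exons)

def branch_site_candidates(intron, min_dist=20, max_dist=50):
    """Indices of A residues lying 20-50 bases before the intron's 3' end.

    The distance is counted from the A to the end of the intron,
    which is an assumption of this sketch.
    """
    n = len(intron)
    return [i for i, base in enumerate(intron.upper())
            if base == "A" and min_dist <= n - i <= max_dist]

# Toy transcript: exon "AAAG", a 40-base intron, exon "GCCC".
toy_intron = "GU" + "C" * 10 + "A" + "C" * 25 + "AG"
toy_pre_mrna = "AAAG" + toy_intron + "GCCC"
print(splice(toy_pre_mrna, [(4, 44)]))
print(branch_site_candidates(toy_intron))
```

In the toy transcript the first exon ends with AG and the second begins with G, matching the exon consensus given above.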
These reactions are mediated by small nuclear ribonucleoprotein particles (snRNPs), which consist of snRNAs and specific proteins and assemble into spliceosomes. Spliceosomes recognize and align the splice sites, and prevent the intermediates from leaving the complex until the reaction is complete. The RNA molecules present in the spliceosomes base-pair with the primary transcript at the splice sites to assist in this reaction. The snRNPs referred to as U1 and U2 recognize and bind to the 5' splice site at the end of exon 1 and to the branch site, respectively. A complex of U4, U5, and U6 then joins to complete the spliceosome.

-------------------- Question: What is a lariat?

Base pairing appears to be involved in the recognition of these sites. The snRNA contained in U1 has a sequence which is complementary to the consensus at the beginning of an intron, and the snRNA of U2 has a sequence which is complementary to the branch site. The U2 and U6 snRNPs together probably form the catalytic site of the complex.

Alternative splicing

For those who are interested, there is also a self-splicing RNA: http://138.192.68.68/bio/Courses/biochem2/RNA/SelfSplicingRNA.html

An intron inside a vector

Mammalian vectors often have an intron before the polylinker. This intron increases transcription.

[Figure labels: capping; polyadenylation signal]

Pseudogenes

-------------------- Question: How can the sequence of a pseudogene be recognized?