| Gene | T2H3.9 |
| Putative Identification | GTP pyrophosphokinase |
| Position | 25585 - 30524, from within the CDS to the termination codon |
| Strand | - |
| EST hits | Z34769 |
| Database match | bacterial GTP pyrophosphokinases |
CDS: The table below lists the coordinates of the T2H3.9 exons and which exon prediction algorithms selected the 3' and 5' termini (GS = GenScan, Gr = GRAIL, M = MZEF, NPG = NetPlantGene - selects splice sites only, not exons). Splice sites delineated by identity to EST Z34769 are designated by EST and those suggested by similarity to EST N38487 are designated by est.
| Exon | Range | 3' | 5' |
|---|---|---|---|
| 1 | 30336 - 30524 | Gr, M, NPG | Gr, M, NPG |
| 2 | 29969 - 30151 | GS, Gr, NPG | Gr, M, NPG |
| 3 | 29818 - 29883 | GS, M, NPG | GS, M, NPG |
| 4 | 29357 - 29488 | GS, Gr, M, NPG | GS, Gr, M, NPG |
| 5 | 29143 - 29244 | GS, Gr, NPG | GS, Gr, NPG |
| 6 | 28990 - 29067 | GS, Gr, NPG | GS, Gr, NPG |
| 7 | 28827 - 28898 | GS, M, NPG | see note below |
| 8 | 28545 - 28652 | GS, Gr, M, NPG | GS, Gr, M |
| 9 | 28268 - 28365 | GS, Gr, M, NPG | GS, Gr, M |
| 10 | 28125 - 28185 | GS, M, NPG | GS, Gr, M, NPG |
| 11 | 27947 - 28012 | GS, NPG | GS, M |
| 12 | 27630 - 27731 | est, GS, Gr, M, NPG | est, GS, Gr, M |
| 13 | 27319 - 27477 | GS, Gr, NPG | est, GS, Gr, NPG |
| 14 | 27071 - 27202 | GS, Gr, M, NPG | GS, Gr, M, NPG |
| 15 | 26897 - 26983 | GS, Gr, M, NPG | GS, Gr, M, NPG |
| 16 | 26706 - 26777 | Gr, NPG | Gr, NPG |
| 17 | 26490 - 26576 | GS, Gr, M, NPG | GS, Gr, M, NPG |
| 18 | 25966 - 26413 | EST, GS, Gr, M, NPG | GS, Gr, M, NPG |
| 19 | 25818 - 25879 | EST, Gr, NPG | EST, GS, Gr, NPG |
| 20 | 25585 - 25731 | EST, Gr, M, NPG | |
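As a quick consistency check on the coordinates in the table, the exon lengths can be summed to confirm that the model covers a whole number of codons. The following is a minimal Python sketch; the names `EXONS`, `exon_length` and `total_coding_length` are illustrative and not part of the original annotation.

```python
# Exon ranges copied from the table above as (low, high) coordinates.
# The gene is on the minus strand, so exon 1 has the highest coordinates.
EXONS = [
    (30336, 30524), (29969, 30151), (29818, 29883), (29357, 29488),
    (29143, 29244), (28990, 29067), (28827, 28898), (28545, 28652),
    (28268, 28365), (28125, 28185), (27947, 28012), (27630, 27731),
    (27319, 27477), (27071, 27202), (26897, 26983), (26706, 26777),
    (26490, 26576), (25966, 26413), (25818, 25879), (25585, 25731),
]


def exon_length(low, high):
    """Length of an exon given inclusive coordinates."""
    return high - low + 1


def total_coding_length(exons):
    """Sum of all exon lengths in nucleotides."""
    return sum(exon_length(low, high) for low, high in exons)


total = total_coding_length(EXONS)
codons, remainder = divmod(total, 3)
print(f"{total} nt, {codons} codons, remainder {remainder}")
# prints "2451 nt, 817 codons, remainder 0"
```

A remainder of 0 only shows that the summed exons are a multiple of three; it says nothing about whether each individual splice site is correct.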
Note regarding exon 7: GenScan predicts the 5' end of exon 7 at position 28952, which would make intron 6 only 38 nt long. Although this splice site maintains the reading frame, an intron this short is too small to be regarded as more than conjectural. MZEF predicts the 5' end of exon 7 at 28908; the resulting intron 6 is 82 nt, but the reading frame is not preserved. Neither NetPlantGene nor GRAIL predicts a splice acceptor in this region. I have chosen an AG "consensus" splice acceptor at 28898 because it preserves the reading frame and does so in a way that maintains good similarity to other proteins. The 8 amino acids bridging this splice site are highlighted in blue in the multiple sequence alignment below. It should be noted, though, that this gene could be processed differently from the model presented here, or it could be a pseudogene.
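The frame argument in the note can be restated as arithmetic: with the 3' end of exon 7 fixed at 28827, moving the 5' end keeps the downstream frame only if the shift is a multiple of three. The sketch below assumes that simplification; the names `CANDIDATES`, `in_frame_with_model` and `coordinate_gap` are hypothetical, and the gap is computed as a plain coordinate difference because that reproduces the figures quoted in the note.

```python
# Coordinates taken from the exon table and the note on exon 7.
EXON6_LOW = 28990          # lowest coordinate of exon 6 (its 3' end on the minus strand)
MODEL_5PRIME = 28898       # 5' end of exon 7 chosen for the gene model

# Candidate 5' ends of exon 7 discussed in the note.
CANDIDATES = {"GenScan": 28952, "MZEF": 28908, "model": MODEL_5PRIME}


def in_frame_with_model(five_prime):
    """True if this 5' end shifts exon 7 by a multiple of 3 nt relative to the model."""
    return (five_prime - MODEL_5PRIME) % 3 == 0


def coordinate_gap(five_prime):
    """Difference between the exon 6 boundary and the candidate 5' end of exon 7.

    This is the intron 6 size as quoted in the note; counting only the bases
    strictly between the two exons would give one fewer.
    """
    return EXON6_LOW - five_prime


for name, position in CANDIDATES.items():
    print(name, coordinate_gap(position), in_frame_with_model(position))
# prints:
# GenScan 38 True
# MZEF 82 False
# model 92 True
```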
Alternate exons not used in building the gene model: GenScan predicts that genes T2H3.9 and T2H3.J are joined as a single gene. Exons near T2H3.J are described. GenScan predicts exons from 29969 to 30021, from 28827 to 28952 and from 27801 to 27857, and a terminal exon from 25773 to 25879. GRAIL predicts internal exons from 29571 to 29663, from 28088 to 28185, from 27801 to 27882 and from 25697 to 25731. MZEF predicts exon 2 from 30138 to 30151 and further exons from 28827 to 28909, from 27921 to 28102 and from 25663 to 25731. NetPlantGene predicts many splice sites in the region of T2H3.9. Notable splice sites include acceptors at 29681 (confidence score of 0.94), 29659 (0.97) and 28009 (0.90), and donors at 30138 (0.92) and 25355 (0.96).
Complete CDS of T2H3.9
TGTCTGTGGAATGTGTGAACATATGTAATCTAACGAAAGGAGATGGGAATGCAAGAAGCG ATTGCAGTGCTCTCTCCTGTGCTTGGAAAGCTCCAAGAGCGTTAACTGGGTTTCTAGCTA GCACTGCTCATCCACCTGTGTGTTCTGTGTATTCATGTGGCAGAAATGGAAGAAAGAGCA GAATGAAAGCTTGCGCCTGGCAGAGGTATGAATATGAAGTAGGCTTTTCTGAGGCTCCTT ACTTTGTAAATGTGAGAAATATCTTGAAGTCCAGATTATCTTGTGGTGGTCATAAAAGAT GGGAACTGTATTGCGTATCAGCTGAATCTTCTTCTGGTGCATCCAGTGATGTTACCGTCG AAACATTGTGGGAGGACCTTTTCCCATCAATATCTTATCTACCCCGTAAAGAATTAGAAT TTGTTCAAAAGGGCCTTAAGGAATTGGATTGGGAGTCTATTGTTGCTGGATTACTACATG ACACAGTCGAGGATACAAATTTCATTACTTTTGAAAAGATAGAAGAAGAGTTTGGTGCAA CTGTGCGTCACATCGTAGAAGGGGAGACCAAGGTGTCAAAACTGGGAAAGTTAAAGTGTA AAACGGAAAGTGAAACAATACAAGATGTAAAAGCAGATGATTTGCGGCAGATGTTTCTGG CGATGACAGACGAGGTCCGCGTCATTATTGTCAAACTAGCTGACCGGTTGCATAATATGC GAACTCTCTGCCACATGCCTCCCCATAAGCAGTCCAGCATTGCAGGGGAGACTTTGCAGG TCTTTGCTCCTTTAGCAAAATTATTGGGAATGTATTCAATAAAGTCTGAACTGGAAAATC TGTCTTTCATGTACGTAAGTGCTGAGGATTATGATAGAGTCACTAGCAGGATTGCTAACC TCTACAAAGAGCATGAAAAAGAACTCACTGAGGCAAACAGAATTTTGGTGAAAAAGATTG AAGATGATCAGTTTCTGGACCTTGTGACTGTGAATACTGATGTTCGATCTGTTTGCAAGG AAACTTACAGCATCTACAAAGCTGCTCTCAAATCGAAAGGATCAATTAATGATTACAACC AGATTGCTCAGCAGTTACGGATTGTTGTAAAGCCAAAACCATCTGTAGGGGTCGGGCCTT TGTGCAGTCCACAACAGGTAAAAGATTACATTGCAACCCCGAAGCCCAATGGATACCAGA GCCTCCATACTACTGTGATTCCATTCTTGTATGAGAGTATGTTTCGACTGGAGGTTCAGA TCAGAACCGAGGAGATGGACTTGATTGCTGAAAGGGGCATTGCTGTTTACTACAATGGCA AGTCTCTATCTACTGGATTAGTTGGAAACGCGGTTCCTTTAGGTAGAAATTCAAGGGGGA AGACGGGTTGCCTCAACAATGCAGATTTTGCACTCAGGGTTGGGTGGCTAAATGCAATAA GGGAATGGCAAGAGGAGTTTGTGGGTAACATGAGCTCTAGAGAATTTGTGGATACCATTA CGAGGGATCTTTTAGGTAGTCGTGTGTTTGTATTCACACCTAAAGGAGAGATAAAGAACC TCCCGAAAGGGGCCACCGTTGTTGACTACGCTTATCTGATTCACACCGAAATCGGAAACA AGATGGTAGCAGCAAAGGTCAATGGTAATCTTGTTTCCCCAACTCACGTTCTTGAGAATG CTGAGGTCGTGGAGATAGTCACCTACAACGCCCTCTCAAGTAAATCTGCTTTCCAAAGAC ATAAACAGTGGTTGCAACATGCCAAAACAAGGAGTGCAAGACACAAGATTATGAGGTTCC TAAGGGAGCAAGCTGCACAATGTGCTGCCGAAATTACCCAGGATCAAGTGAATGACTTTG TGGCGGACTCTGATAGTGATGTGGAAGATCTCACAGAAGATTCAAGAAAGAGCCTACAAT 
GGTGGGAGAAAATCCTCGTCAATGTTAAGCAATTCCAGTCACAAGACAAAAGTAGAGATA CAACACCCGCTCCTCAAAACGGAAGCGTTTGGGCCCCAAAGGTGAATGGAAAACACAACA AAGCCATAAAGAACTCGAGTTCTGATGAGCCAGAGTTCCTCCTACCTGGAGATGGAATTG CCAGGATTTTACCTGCTAATATCCCTGCTTATAAGGAAGTGTTGCCCGGCTTAGACAGTT GGCGAGACAGTAAAATTGCCACATGGCATCATCTCGAAGGTCAGTCCATCGAATGGTTAT GTGTAGTATCCATGGATCGCAAAGGCATAATCGCAGAGGTTACAACAGTCCTCGCAGCTG AAGGCATTGCATTATGTTCTTGCGTGGCCGAGATTGACAGAGGAAGAGGATTAGCAGTAA TGTTATTTCAAATAGAAGCAAACATTGAAAGTTTGGTAATATGTCCTGTTGATCTAAACT CTCTTTTAACTCTTGTTTTTGTTCTTTTTGGATCTTCGATCACTAAACACTAA
Protein translation of T2H3.9
SVECVNICNLTKGDGNARSDCSALSCAWKAPRALTGFLASTAHPPVCSVYSCGRNGRKSR MKACAWQRYEYEVGFSEAPYFVNVRNILKSRLSCGGHKRWELYCVSAESSSGASSDVTVE TLWEDLFPSISYLPRKELEFVQKGLKELDWESIVAGLLHDTVEDTNFITFEKIEEEFGAT VRHIVEGETKVSKLGKLKCKTESETIQDVKADDLRQMFLAMTDEVRVIIVKLADRLHNMR TLCHMPPHKQSSIAGETLQVFAPLAKLLGMYSIKSELENLSFMYVSAEDYDRVTSRIANL YKEHEKELTEANRILVKKIEDDQFLDLVTVNTDVRSVCKETYSIYKAALKSKGSINDYNQ IAQQLRIVVKPKPSVGVGPLCSPQQVKDYIATPKPNGYQSLHTTVIPFLYESMFRLEVQI RTEEMDLIAERGIAVYYNGKSLSTGLVGNAVPLGRNSRGKTGCLNNADFALRVGWLNAIR EWQEEFVGNMSSREFVDTITRDLLGSRVFVFTPKGEIKNLPKGATVVDYAYLIHTEIGNK MVAAKVNGNLVSPTHVLENAEVVEIVTYNALSSKSAFQRHKQWLQHAKTRSARHKIMRFL REQAAQCAAEITQDQVNDFVADSDSDVEDLTEDSRKSLQWWEKILVNVKQFQSQDKSRDT TPAPQNGSVWAPKVNGKHNKAIKNSSSDEPEFLLPGDGIARILPANIPAYKEVLPGLDSW RDSKIATWHHLEGQSIEWLCVVSMDRKGIIAEVTTVLAAEGIALCSCVAEIDRGRGLAVM LFQIEANIESLVICPVDLNSLLTLVFVLFGSSITKH*
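The relationship between the nucleotide listing and the translation can be spot-checked with a small translator that uses the standard genetic code. This is a sketch, not the tool used to produce the translation above. The `offset` parameter is an assumption: the translation shown (SVECVN...) is reproduced when reading starts two bases into the listing.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, offset=0):
    """Translate a DNA string, ignoring whitespace, starting at `offset`.

    Unknown or incomplete codons are reported as 'X' or dropped, respectively.
    """
    dna = "".join(dna.split()).upper()
    codons = (dna[i:i + 3] for i in range(offset, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


# First 20 nt of the listing, read from the third base onward.
print(translate("TGTCTGTGGAATGTGTGAAC", offset=2))   # prints "SVECVN"

# Last 24 nt of the listing, ending in the TAA termination codon.
print(translate("GGATCTTCGATCACTAAACACTAA"))         # prints "GSSITKH*"
```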
Multiple Sequence Alignment
T2H3.9 is aligned with bacterial GTP pyrophosphokinases (RelA gene products) from Bacillus subtilis, Staphylococcus aureus and Myxococcus xanthus. In B. subtilis this enzyme catalyzes the production of guanosine 3'-diphosphate 5'-triphosphate (EC 2.7.6.5) from ATP + GTP.
1 60
BsubRelA ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SaurRelA ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
MyxRelA ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
T2H3.9 SVECVNICNLTKGDGNARSDCSALSCAWKAPRALTGFLASTAHPPVCSVYSCGRNGRKSR
61 120
BsubRelA ~~~~~~~~~~~~~~~~~~~~~~~~~MANEQVLTAEQVIDKARSYLSDEHIAFVEKAYLYA
SaurRelA ~~~~~~~~~~~~~~~~~~MNGVYHIMNNEYPYSADEVLHKAKSYLSADEYEYVLKSYHIA
MyxRelA ~~~~~~~~~~~~MWGNVAPPLVQSYDCSSFDDSPERHPPTVSQYHPDPDLDIIKKAYVYS
T2H3.9 MKACAWQRYEYEVGFSEAPYFVNVRNILKSRLSCGGH.KRWELYCVSAESSSGASSDVTV
121 180
BsubRelA EDAHREQYRKSGEPYIIHPIQVAGILVDLEMDPSTIAGGFLHDVVEDTD.VTLDDLKEAF
SaurRelA YEAHKGQFRKNGLPYIMHPIQVAGILTEMRLDGPTIVAGFLHDVIEDTP.YTFEDVKEMF
MyxRelA AKVHQGQLRKSGEPYLVHPLEVAGILGELKLDEASIVTGLLHDTIEDTL.ATEEELTELF
T2H3.9 ETLWEDLF..PSISYLPRKELEFVQKGLKELDWESIVAGLLHDTVEDTNFITFEKIEEEF
181 240
BsubRelA SEEVAMLVDG...VTKLGKIKYK....SQEEQQAENHRKMFVAMAQDIRVILIKLADRLH
SaurRelA NEEVARIVDG...VTKLKKVKYR....SKEEQQAENHRKLFIAIAKDVRVILVKLADRLH
MyxRelA GSEVAHLVDG...VTKLSKFSASA.SLSQEEKQAENFRKMIIAMAQDIRVILVKLADRTH
T2H3.9 GATVRHIVEGETKVSKLGKLKCKTESETIQDVKADDLRQMFLAMTDEVRVIIVKLADRLH
241 300
BsubRelA NMRTLKHLPQEKQRRISNETLEIFAPLAHRLGISKIKWELEDTALRYLNPQQYYRIVNLM
SaurRelA NMRTLKAMPREKQIRISRETLEIYAPLAHRLGINTIKWELEDTALRYIDNVQYFRIVNLM
MyxRelA NMRTLDHMSEEKQARIAQETLDIYAPLANRLGISWIKTELEDLSFRYVKPQEFFALQAKL
T2H3.9 NMRTLCHMPPHKQSSIAGETLQVFAPLAKLLGMYSIKSELENLSFMYVSAEDYDRVTSRI
301 360
BsubRelA KKKRAERELYVDEVVNEVKKRVEE......VNIKADFSGRPKHIYSIYRKMVLQNKQFNE
SaurRelA KKKRSEREAYIETAIDRIRTEMDR......MNIEGDINGRPKHIYSIYRKMMKQKKQFDQ
MyxRelA NKRKKEREKYIEDTCDLIRSKLAE......RGLKGEVSGRFKHVYSIYKKIKSQGIDFDQ
T2H3.9 ANLYKEHEKELTEANRILVKKIEDDQFLDLVTVNTDVRSVCKETYSIYKAALKSKGSIND
361 420
BsubRelA IYDLL.AVRILVNSIKDCYAVLGIIHTCWKPMPGRFKDYIAMPKPNMYQSLHTTVIGPKA
SaurRelA IFDLL.AIRVIVNSINDCYAILGLVHTLWKPMPGRFKDYIAMPKQNLYQSLHTTVVGPNG
MyxRelA IHDII.AFRIIAPTAPSCYEALGLVHEMWKPVPGRFKDFIAIPKPNMYQSLHTTIIGPLS
T2H3.9 YNQIAQQLRIVVKPKPS....VGVGPLC...SPQQVKDYIATPKPNGYQSLHTTVIPFLY
421 480
BsubRelA D...PLEVQIRTFEMHEIAEYGVAAHWAYK.................EGKAANE.GATFE
SaurRelA D...PLEIQIRTFDMHEIAEHGVAAHWAYK.................EGKKVSEKDQTYQ
MyxRelA E...RVEVQIRTSEMHKIAEEGIAAHWKYK.................EGKAVISKD...D
T2H3.9 ESMFRLEVQIRTEEMDLIAERGIAVYYNGKSLSTGLVGNAVPLGRNSRGKTGCLNNADFA
481 540
BsubRelA KKLSWFREILEFQNE...STDAEEFMESLKIDLFSDMVYVFTPKGDVIELPSGSVPIDFS
SaurRelA NKLNWLKELAEADHT...SSDAQEFMETLKYDLQSDKVYAFTPASDVIELPYGAVPIDFA
MyxRelA EKFAWLRQLMEWQQD...LKDPKEFLETVKVDLFTDEVFVFTPKGDVRSLPRGATPVDFA
T2H3.9 LRVGWLNAIREWQEEFVGNMSSREFVDTITRDLLGSRVFVFTPKGEIKNLPKGATVVDYA
541 600
BsubRelA YRIHSEIGNKTIGAKVNGKMVTLDHKLRTGDIVEILT.....SKHSYGPSQDWVKLAQTS
SaurRelA YAIHSEVGNKMIGAKVNGKIVPIDYILQTGDIVEIRT.....SKHSYGPSRDWLKIVKSS
MyxRelA YAIHSDVGNRCVGAKVNGKIVPLRYKMKNGDTVEVLT.....SPQQH.PSKDWLTFVKTS
T2H3.9 YLIHTEIGNKMVAAKVNGNLVSPTHVLENAEVVEIVTYNALSSKSAFQRHKQWLQHAKTR
601 660
BsubRelA QAKHKIRQFFKKQRREENVEKGRELVEKEIKNLDFELKDVLTPENIQKVADKFNFSNEED
SaurRelA SAKGKIKSFFKKQDRSSNIEKGRMMVEVEIKEQGFRVEDILTEKNIQVVNEKYNFANEDD
MyxRelA RAQQRIRGFIKQQQREKSLQLGRELADRELKRFQLNFNRLLKSGEMKKAAVDLGFRVEDD
T2H3.9 SARHKIMRFLREQAAQCAAEITQDQVNDFVADSDSDVEDLTEDSRKSLQWWEKILVNVKQ
661 720
BsubRelA MYAAVGYNGITALQVANRLTEKE..........RKQRDQEEQEKIVQEVTGEPKP....Y
SaurRelA LFAAVGFGGVTSLQIVNKLTERQ...........RILDKQRALNEAQEVT.KSLP....I
MyxRelA MLVAIGYGKVTPQQLSHRLVPQE...KLNAAEAGGRADANPAATTSGGAGNSVLPGLSRV
T2H3.9 FQSQDKSRDTTPAPQNGSVWAPKVNGKHNKAIKNSSSDEPEFLLPGDGIARILPANIPAY
721 780
BsubRelA PQ......GRKREAGVRVKGIDNLLVRLSKCCNPVPGDDIVGFITKGRGVSVHREDCPNV
SaurRelA KD......NIITDSGVYVEGLENVLIKLSKCCNPIPGDDIVGYITKGHGIKVHRTDCPNI
MyxRelA TDLAKRLVGRSNRSGVQIGGVDDVLVRFGRCCNPVPGDPIAGFITRGRGVTVHTVGCEKA
T2H3.9 KEVLPGLDSWRDSKIATWHHLEGQSIEW.LCVVSMDRKGIIAEVTTVLAAEGIAL.CSCV
781 840
BsubRelA KTNEAQERLIPVEWEHESQVQKRKEYNVEIEILGYDRRGLLNEVLQAVNETKTNISSVSG
SaurRelA K.NET.ERLINVEWVKSKDATQK..YQVDLEVTAYDRNGLLNEVLQAVSSTAGNLIKVSG
MyxRelA LATD.PERRVDVSW....DVRGDFKRPVTLRVLTADRPGLLADITNTFSKKGVNISQANC
T2H3.9 AEIDRGRGLAVMLFQIEANIESLVICPVDLNSLLTLVFVLFGSSITKH*~~~~~~~~~~~
841 883
BsubRelA KSDRNKVATIHMAIFIQNINHLHKVVERIKQIRDIYSVRRVMN*
SaurRelA RSDIDKNAIINISVMVKNVNDVYRVVEKIKQLGDVYTVTRVWN*
MyxRelA RATGDDRAVNTFEVIISDLKQLTDLMRTIERLQGVYSVERI*~~
T2H3.9 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
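To put a number on the similarity visible in the alignment, identical positions can be counted between two aligned rows. The sketch below assumes that `.` and `~` mark gaps or padding, as they appear to in the rows above; `count_identities` is a hypothetical helper. The example segment is the end of the 181-240 block for BsubRelA and T2H3.9.

```python
GAP_CHARS = {".", "~"}  # assumed gap and padding symbols in the alignment rows


def count_identities(row_a, row_b):
    """Return (identical, compared) for two aligned rows of equal length.

    Columns where either row has a gap or padding character are skipped.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    identical = compared = 0
    for a, b in zip(row_a, row_b):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue
        compared += 1
        if a == b:
            identical += 1
    return identical, compared


# Last 13 columns of the 181-240 block: BsubRelA versus T2H3.9.
identical, compared = count_identities("IRVILIKLADRLH", "VRVIIVKLADRLH")
print(f"{identical}/{compared} identical")   # prints "10/13 identical"
```

Applied to full rows, the same function gives a rough percent identity per block; it does not weight conservative substitutions.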
written 14 Sep 98
updated 15 Sep 98
Larry Parnell