| Gene | T5J8.B |
|---|---|
| Putative Identification | hypothetical protein |
| Position | 44506 - 46441 |
| Strand | + |
| EST match | T44655, also very similar to rice EST D46359 |
| Database match | hypothetical protein from Synechocystis sp. |
CDS: The table below lists the coordinates of the T5J8.B exons and the exon prediction algorithms that selected each 5' and 3' terminus (GS = GenScan, Gr = GRAIL, M = MZEF, NPG = NetPlantGene, which predicts splice sites only, not exons). Splice sites delineated by EST T44566 are designated EST. For the first exon, whose 3' end at position 44658 was selected by GRAIL, GenScan instead predicts a splice donor at position 44550 and NetPlantGene predicts a splice donor at position 44649.
| Exon | Range | 5' | 3' |
|---|---|---|---|
| 1 | 44506 - 44658 | GS, Gr | Gr |
| 2 | 44737 - 44883 | GS, Gr, M, NPG | GS, Gr, M, NPG |
| 3 | 44989 - 45105 | GS, Gr, M, NPG | GS, M, NPG |
| 4 | 45281 - 45424 | GS, Gr, M, NPG | EST, GS, Gr, M, NPG |
| 5 | 45505 - 45591 | EST, GS, Gr, M, NPG | EST, GS, Gr, M, NPG |
| 6 | 45673 - 45788 | EST, GS, Gr, M, NPG | EST, GS, Gr, M, NPG |
| 7 | 45891 - 46072 | EST, GS, Gr, M, NPG | EST, GS, M, NPG |
| 8 | 46172 - 46229 | EST, GS, NPG | EST, GS, NPG |
| 9 | 46318 - 46441 | EST, GS, Gr, NPG | Gr |
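The exon coordinates above can be checked with a little arithmetic. The following is a minimal sketch, not part of the original annotation: it sums the exon lengths to get the spliced CDS length and shows how the alternative first-exon splice donors change that length. The function names and the dictionary of alternative donors are illustrative assumptions; the coordinates are copied from the table and the CDS paragraph.

```python
# Exon coordinates copied from the table above (1-based, inclusive).
EXONS = [
    (44506, 44658),
    (44737, 44883),
    (44989, 45105),
    (45281, 45424),
    (45505, 45591),
    (45673, 45788),
    (45891, 46072),
    (46172, 46229),
    (46318, 46441),
]

# Alternative splice donors for exon 1, as stated in the CDS paragraph.
ALTERNATIVE_DONORS = {"GenScan": 44550, "NetPlantGene": 44649}


def exon_length(start, end):
    """Length of an exon given 1-based inclusive coordinates."""
    return end - start + 1


def cds_length(exons):
    """Total spliced length of all exons."""
    return sum(exon_length(start, end) for start, end in exons)


def with_first_donor(exons, donor):
    """Copy of the exon list with exon 1 ending at a different donor site."""
    first_start = exons[0][0]
    return [(first_start, donor)] + list(exons[1:])


if __name__ == "__main__":
    total = cds_length(EXONS)
    print("annotated CDS:", total, "nt, remainder mod 3 =", total % 3)
    for name, donor in ALTERNATIVE_DONORS.items():
        alt = cds_length(with_first_donor(EXONS, donor))
        print(name, "first exon:", alt, "nt, remainder mod 3 =", alt % 3)
```

With the coordinates as listed, the annotated CDS is 1128 nt (376 codons including the stop), the GenScan variant is 1020 nt, and the NetPlantGene variant is 1119 nt. All three are multiples of three, so either alternative donor shortens the protein without shifting the reading frame.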
Complete CDS of T5J8.B
ATGTTTGCTCAGTTAGCACCATCGCCGTCTCCTTCTCCGGCGAACGTTAGCCACTTCGTT CACCGTCGCACCTTCCGTCAATGTACATACGCCGTCTCCGCTTCTCTTCCCACTGCAAAT TCCTCTCGTCCACCACCAATTCAGGCAAACATTATTGTTGGAGGAAAGGATTTGAATTTG GATGTTACACGGAAAGATGATTCAATTGGGTTCAAATTAGAGGCAAATGAAGATGAAATT GATTGGATGAATCTAGAATCTGATATTCGTCTATGGACTAGAGCATTACGTCCTGTTCAG TGGTATCCAGGGCACATAATGAAAACTGAAAAGGAACTTAGGGAACAGCTTAAGTTGATG GATGTTGTGATTGAAGTCCGTGATGCTAGGATTCCTTTATCGACCACTCATCCAAAGATG GATGCGTGGCTTGGGAATAGGAAACGGATATTGGTATTAAATAGAGAAGATATGATCTCG AACGATGATCGAAATGATTGGGCGAGATACTTTGCAAAGCAAGGAATAAAGGTCATTTTC ACTAATGGCAAACTTGGGATGGGAGCTATGAAGCTAGGTCGGTTAGCCAAAAGTTTAGCA GGTGACGTAAATGGGAAACGGCGAGAAAAAGGACTTCTCCCTAGATCAGTTAGAGCTGGA ATAATTGGATACCCTAATGTTGGGAAATCATCTCTGATCAATCGTCTATTGAAACGAAAA ATTTGCGCAGCAGCTCCAAGACCAGGTGTAACTAGAGAAATGAAATGGGTCAAGCTTGGG AAAGATCTTGATCTCTTAGATTCACCTGGAATGCTTCCTATGCGTATCGATGATCAAGCA GCTGCTATAAAGCTGGCAATTTGTGATGACATTGGAGAGAAAGCTTATGACTTCACTGAT GTTGCTGGAATCCTTGTGCAGATGTTAGCACGGATTCCAGAAGTAGGCGCAAAGGCTCTT TACAACCGATACAAGATCCAGCTAGAAGGCAACTGCGGGAAAAAATTTGTGAAGACGCTT GGTCTTAATTTGTTTGGTGGAGATAGTCATCAAGCTGCATTTAGAATACTAACGGATTTT CGCAAAGGCAAATTCGGGTATGTCTCGTTGGAGAGACCTCCACTGTGA
Protein translation
MFAQLAPSPSPSPANVSHFVHRRTFRQCTYAVSASLPTANSSRPPPIQANIIVGGKDLNL DVTRKDDSIGFKLEANEDEIDWMNLESDIRLWTRALRPVQWYPGHIMKTEKELREQLKLM DVVIEVRDARIPLSTTHPKMDAWLGNRKRILVLNREDMISNDDRNDWARYFAKQGIKVIF TNGKLGMGAMKLGRLAKSLAGDVNGKRREKGLLPRSVRAGIIGYPNVGKSSLINRLLKRK ICAAAPRPGVTREMKWVKLGKDLDLLDSPGMLPMRIDDQAAAIKLAICDDIGEKAYDFTD VAGILVQMLARIPEVGAKALYNRYKIQLEGNCGKKFVKTLGLNLFGGDSHQAAFRILTDF RKGKFGYVSLERPPL*
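As an illustration of how the protein translation relates to the CDS, here is a minimal sketch that translates a nucleotide sequence codon by codon. It assumes the standard genetic code, which the record does not state explicitly; the helper name `translate` and the two excerpt constants are hypothetical. The excerpts are the first 45 and last 18 nucleotides of the CDS shown above.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
# Assumption: the record does not name the code table; the standard one is used.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds):
    """Translate a CDS (spaces and newlines ignored) into one-letter amino acids."""
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# Excerpts copied from the complete CDS above.
FIRST_45_NT = "ATGTTTGCTCAGTTAGCACCATCGCCGTCTCCTTCTCCGGCGAAC"
LAST_18_NT = "GAGAGACCTCCACTGTGA"

if __name__ == "__main__":
    print(translate(FIRST_45_NT))  # start of the protein
    print(translate(LAST_18_NT))   # end of the protein, '*' marks the stop codon
```

The two excerpts translate to MFAQLAPSPSPSPAN and ERPPL*, which match the start and end of the protein translation. Passing the whole CDS string to `translate` works the same way, because the function strips the spaces between the 60-nucleotide groups.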
Alternate translations
Using the initial exon as predicted by GenScan gives a translation of:
MFAQLAPSPSPSPANIVGGKDLNLDVTRKDDSIGFKLEANEDEIDWMNLESDIRLWTRAL RPVQWYPGHIMKTEKELREQLKLMDVVIEVRDARIPLSTTHPKMDAWLGNRKRILVLNRE DMISNDDRNDWARYFAKQGIKVIFTNGKLGMGAMKLGRLAKSLAGDVNGKRREKGLLPRS VRAGIIGYPNVGKSSLINRLLKRKICAAAPRPGVTREMKWVKLGKDLDLLDSPGMLPMRI DDQAAAIKLAICDDIGEKAYDFTDVAGILVQMLARIPEVGAKALYNRYKIQLEGNCGKKF VKTLGLNLFGGDSHQAAFRILTDFRKGKFGYVSLERPPL*
Using the initial exon as predicted by NetPlantGene gives a translation of:
MFAQLAPSPSPSPANVSHFVHRRTFRQCTYAVSASLPTANSSRPPPIQIVGGKDLNLDVT RKDDSIGFKLEANEDEIDWMNLESDIRLWTRALRPVQWYPGHIMKTEKELREQLKLMDVV IEVRDARIPLSTTHPKMDAWLGNRKRILVLNREDMISNDDRNDWARYFAKQGIKVIFTNG KLGMGAMKLGRLAKSLAGDVNGKRREKGLLPRSVRAGIIGYPNVGKSSLINRLLKRKICA AAPRPGVTREMKWVKLGKDLDLLDSPGMLPMRIDDQAAAIKLAICDDIGEKAYDFTDVAG ILVQMLARIPEVGAKALYNRYKIQLEGNCGKKFVKTLGLNLFGGDSHQAAFRILTDFRKG KFGYVSLERPPL*
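The two alternate translations differ from the complete protein only in how much of the first exon they keep. The sketch below, offered as an illustration rather than as the method used to produce the translations, rebuilds the alternate N-termini from the first 60 residues of the complete protein. It relies on the fact that the annotated donor (44658) and both alternative donors (44550, 44649) fall on codon boundaries relative to the start at 44506; the function and constant names are hypothetical.

```python
# First 60 residues of the complete protein translation, copied from the record.
FULL_N_TERMINUS = "MFAQLAPSPSPSPANVSHFVHRRTFRQCTYAVSASLPTANSSRPPPIQANIIVGGKDLNL"

EXON1_START = 44506
ANNOTATED_DONOR = 44658       # last base of exon 1 in the exon table
GENSCAN_DONOR = 44550         # alternative donor predicted by GenScan
NETPLANTGENE_DONOR = 44649    # alternative donor predicted by NetPlantGene


def codons_in_first_exon(donor):
    """Number of whole codons in exon 1 if it ends at the given donor position."""
    length = donor - EXON1_START + 1
    if length % 3 != 0:
        raise ValueError("donor does not fall on a codon boundary")
    return length // 3


def alternate_n_terminus(full_protein, donor):
    """Keep the residues encoded up to the alternative donor, then resume at exon 2."""
    kept = codons_in_first_exon(donor)
    annotated = codons_in_first_exon(ANNOTATED_DONOR)
    return full_protein[:kept] + full_protein[annotated:]


if __name__ == "__main__":
    print(alternate_n_terminus(FULL_N_TERMINUS, GENSCAN_DONOR))
    print(alternate_n_terminus(FULL_N_TERMINUS, NETPLANTGENE_DONOR))
```

Under these assumptions the GenScan donor keeps 15 of the 51 first-exon codons and the NetPlantGene donor keeps 48, which reproduces the N-termini of the two alternate translations listed above (MFAQLAPSPSPSPANIVGGKDLNL and ...SSRPPPIQIVGGKDLNL).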
Multiple sequence analysis
The alignment below compares conserved hypothetical proteins from several species: Synechocystis sp. (Synecho.hypo), B. subtilis (Bsub.hypo), B. burgdorferi (BB0643), Schizosaccharomyces pombe (YspAC6F6.03c), and Arabidopsis thaliana. The two A. thaliana proteins are T5J8.B and T32G6.19 (labeled AtT32G6.19). In the alignment, "~" fills positions before the start or after the end of a sequence, "." marks an internal gap, and "*" marks the stop codon.
1 60
YspAC6F6.03c MGTYKKEKSRIGRYNAAGELVRAAEFQSSEVPKARIEGANEKKPGNLRVKGENFYRNAKD
61 120
YspAC6F6.03c VARVNMYRGGKAKQPDRRWFNNTRVIAQPTLTQFREAMGQKLNDPYQVLLRRNKLPMSLL
T5J8.B ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~MFAQLAPSPSPSP
121 180
YspAC6F6.03c QENTEIPKVRVLESEPFENTFG.PKSQRKRPKISFDSVAELAKESDEKQNAYEEKIEERI
T5J8.B ANVSHFVHRRTFRQCTYAVSASLPTANSSRPPPIQANIIVGGKDLNLDVTRKDDSIGFKL
181 240
YspAC6F6.03c LANPDESD......DVML...AARDAIFSKGQSKRIWNELYKVIDSSDVLIQVLDARDPV
T5J8.B EANEDEIDWMNLESDIRLWTRALRPVQWYPGHIMKTEKELREQLKLMDVVIEVRDARIPL
Synecho.hypo ~~~~~~~~~~~~~~~~~~~~~~MALIQWYPGHIAKAERQLREQLNKVDVVLEVIDARIPL
Bsub.hypo ~~~~~~~~~~~~~~~~~~~~~~~MTIQWFPGHMAKARREVTEKLKLIDIVYELVDARIPM
BB0643 ~~~~~~~~~~~~~~~~~~~~~MANKINWFPGHMKRALDLIKNNLQKANIVLEILDARAPF
AtT32G6.19 MVMMLKKTVKKGLIGGMSFAKDAGKINWFPGHMAAATRAIRNRLKLSDLVIEVRDARIPL
241 300
YspAC6F6.03c GTRCGTVERYLRNEASHKHMILVLNKVDLVPTSVAA.....AWVKILAKEYPTIAFHASI
T5J8.B STTHPKMDAWLGN....RKRILVLNREDMISNDDRN.....DWARYFAKQGIKVIFTNGK
Synecho.hypo ASQHPDVPLWVGE....KPKLIILNRVDMITPELQE.....QWLTWFRQQNQTVYYANAK
Bsub.hypo SSRNPMIEDILKN....KPRIMLLNKADKADAAVTQ.....QWKEHFENQGIRSLSINSV
BB0643 SSKNPLTEKITKN....QAKIILLHKSDVAQINEII.....KWKKYFENLG.NTVIISNI
AtT32G6.19 SSANEDLQSQMSA....KRRIIALNKKDLANPNVLNVIRFFKWTRHFESSKQDCIAINAH
301 360
YspAC6F6.03c NNSFGKGSLIQILRQFASLHSDKKQ........ISVGLIGFPNAGKSSIINTL.......
T5J8.B LGMGAMK.LGRLAKSLAGDVNGKRREKGLLPRSVRAGIIGYPNVGKSSLINRL.......
Synecho.hypo QGTGVKA.ISKAAQQAGKAVNERRQRRGMQPRPVRAVVMGFPNVGKSALINRL.......
Bsub.hypo NGQGL.NQIVPASKEILQEKFDRMRAKGVKPRAIRALIIGIPNVGKSTLINRL.......
BB0643 YKKGMRKQIIDIIKKLAIVK....KIKNYKEK.IKVLIIGVPNVGKSSIINLL.......
AtT32G6.19 SRSSVMKLLDLVELKLKEVI........AREPTLLVMVVGVPNVGKSALINSIHQIAAAR
361 420
YspAC6F6.03c ....RKKKVCNVAPIPGETKVWQYVALMK..RIFLIDCPGIVPPSSNDSDAELLLKGV..
T5J8.B ....LKRKICAAAPRPGVTREMKWVKLGK..DLDLLDSPGMLPMRIDDQAAAIKLAICDD
Synecho.hypo ....LGRKVVESARRAGVTRQLRWIRISD..SLELLDSPGVIPLKLENQANALKLAICED
Bsub.hypo ....AKKNIAKTGDRPGITTSQQWVKVGK..ELELLDTPGILWPKFEDELVGLRLAVTGA
BB0643 ....SGKKSAKVANKPGYTKNIQIVKINE..EINLFDMPGILWHNLVDQSIAKKLAILDM
AtT32G6.19 FPVQERLKRATVGPLPGVTQDIAGFKIAHRPSIYVLDSPGVLVPSIPDIETGLKLALSGS
421 480
YspAC6F6.03c ..VRVENVSNPEAYIPTVL.SRCKVKHLERTYEISGWNDSTEFLAKLAKKGGRLL....K
T5J8.B IGEKAYDFTDVAGILVQML.ARIPEVGAKALYNRYKIQLEGNCGKKFVKTLGLNL....F
Synecho.hypo IGEAAYENQAVAIAMVDLL.LDL.SLG.EALTQRYQVPPEAMNGDLYLIALADAR....H
Bsub.hypo IKDSIINLQDVAVFGLRFL....EEHYPERLKERYGLDEIPEDIAELFDAIGEKRGCLMS
BB0643 IKNEIVDNTDLALYLLEIM....DQNNKNILLKKYEI..YHKNSLDILQNFAKARKLIGK
AtT32G6.19 VKDSVVGEERIAQYFLAILNIRGTPLHWKYLVEGINEGPHADCIDKPSYNLKDLRHQRTK
481 540
YspAC6F6.03c GG..EPDEASVAKMVLNDFMRGKIPWFIGPKGLSSSNDEINSSQKVATQQTEGSDQDGEE
T5J8.B GG..DSHQA..AFRILTDFRKGKFGYVSLERPPL*~~~~~~~~~~~~~~~~~~~~~~~~~
Synecho.hypo NG..DRERA..AVQLLNDFRKGLLGPLPLELPPEV*~~~~~~~~~~~~~~~~~~~~~~~~
Bsub.hypo GGLINYDKT..TEVIIRDIRTEKFGRLSFEQPTM*~~~~~~~~~~~~~~~~~~~~~~~~~
BB0643 KNELNLEKA..SKILIKEFREGKFGKIILDKNYNAF*~~~~~~~~~~~~~~~~~~~~~~~
AtT32G6.19 QPDSSALHY..VGDMISEVQRSLYITLSEFDGDTEDENDLECLIEQQFEVLQKALKIPHK
541 586
YspAC6F6.03c AEEEWHGISDDGKADESESTKPVAEGSASESTDESAVDDNKNRS*~
T5J8.B ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Synecho.hypo ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Bsub.hypo ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
BB0643 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
AtT32G6.19 ASEARLMVSKKFLTLFRTGRLGPFILDDVPETETDHPNSKRVVVL*
written 16 Feb 98
Larry Parnell