| Gene | F6P23.4 |
| Putative Identification | predicted protein |
| Position | 13502 to 19375, from initial methionine to termination codon |
| Strand | - |
| EST match | T43411 |
| Database match | F6P23.3, weakly similar to SpAC2C4.17 |
CDS: The table below lists the coordinates for each exon and which gene prediction programs selected either the 3' or the 5' terminus of a particular exon (GS = GenScan, Gr = GRAIL, M = MZEF). Exon splice sites defined by EST T43411 are designated as EST.
| Exon | Range | 3' | 5' |
| 1 | 19166 to 19375 | EST,GS | GS,Gr |
| 2 | 18871 to 19076 | EST,M | EST,GS,Gr,M |
| 3 | 18700 to 18793 | Gr,M | EST,M |
| 4 | 17161 to 18607 | GS | GS,Gr,M |
| 5 | 14745 to 16584 | GS,Gr,M | GS,M |
| 6 | 14542 to 14665 | GS,M | GS,M |
| 7 | 14150 to 14443 | GS,Gr | GS,M |
| 8 | 13867 to 14069 | GS,Gr,M | GS,Gr |
| 9 | 13502 to 13793 | GS | GS |
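The exon table can be sanity-checked with a few lines of arithmetic. The following is a minimal sketch, assuming the coordinates in the Range column are inclusive at both ends and that the exons are listed in transcript order on the minus strand (so the coordinates descend). The names `EXONS`, `exon_lengths`, and `intron_lengths` are illustrative only and are not part of the annotation.

```python
# Exon ranges copied from the table above, in transcript order (minus strand).
# Assumption: both coordinates of each range are inclusive.
EXONS = [
    (19166, 19375),
    (18871, 19076),
    (18700, 18793),
    (17161, 18607),
    (14745, 16584),
    (14542, 14665),
    (14150, 14443),
    (13867, 14069),
    (13502, 13793),
]


def exon_lengths(exons):
    """Return the length in nucleotides of each exon."""
    return [end - start + 1 for start, end in exons]


def intron_lengths(exons):
    """Return the gap between each pair of consecutive exons.

    Because the gene is on the minus strand, the next exon in the list
    lies at lower coordinates than the previous one.
    """
    return [
        prev_start - next_end - 1
        for (prev_start, _), (_, next_end) in zip(exons, exons[1:])
    ]


lengths = exon_lengths(EXONS)
cds_length = sum(lengths)

print("exon lengths:  ", lengths)
print("intron lengths:", intron_lengths(EXONS))
print("CDS length:    ", cds_length, "nt; remainder mod 3 =", cds_length % 3)
```

Under these assumptions the exons sum to 4710 nt, a multiple of three, which is what a complete coding sequence running from the initial methionine to the termination codon should give.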
Complete CDS of F6P23.4
ATGGCTACCACCACCGTCGTATGCGGCGGAGGAGGAGATGTTGAGGGTTGCGATGGCGAG AGAAGCCTGGATCTGTTACCGGCGGCTCTCTTGGAGACTATTATGACGAAACTCGATGTT GCGTCTCTCTGCTCGTTGGCTTCTACTTGCAAAACACTTAAGAGCTGTGTCACTCGTGTT CTCACCTTCACTCCCAACTTCCACATCTTCAATGTTTCATTGTCAATGGAAACTGTTAGA CCATTGTTGTTTCCGAATCAACAACTTTCTAGTCTCAAGCTTGATTGTGGAAGGCTTGGT AATTCAGCTATTGATATATTGGTACGTCCTTCTTTGAGGGAGATTTCTCTTCACAATTGT CGGGATTTCAGCGGAGATCTCATCTCCGAAATTGGTAGGAAATGCAAGGATTTGAGGCTC CTATGTTTGGGTTCAGTAGCAGAGAAAGTTGGTCGAAGTATTAGCCGTTGTGCTTTGGAG GATTTACTTAATGGTTGCTCCCATTTAGAGGTCTTAGCACTCATGTTTGACCTCTCATTG TATCTGCGTCCTGGTGATGGTCGCATTTTTGGATTGGTATCTGATAGATTGACTCATCTG GAGCTAGGACACATCACGTCAAGGATGATGACTCAGCTACTCACATCAACTGAAATCTCG GGGCAAGACAGCAACCGAGTTACTACATCTACTGTGCTACAAAATGTCCAGCGGTTGCGC CTCTCAGTGGATTGTATTACTGATGCTGTGGTTAAGGCGATATCGAAATCTCTTCCATCT TTGATAGATTTGGACATAAGAGACGCGCCTCTTGAGGATCCAAGACAAGTATCAGACCTC ACGGATTTTGGGCTTCATGAGATCAATCAGAATGGTAAGCTGAAGCACCTGTCTCTAATC CGTAGCCAGGAGTTTCACCCTACTTACTTCCGGCGAGTTAGTGATCAAGGAATGCTTTTT CTTGCTGACAAATGCTTGGGAATGGAAACCATTTGTCTTGGTGGCTTTTGTCGGGTAACA GATGCTGGTTTCAAGACTATTCTTCACTCTTGCGCTAGCCTCTCCAAGTTTAGTATATAT CATGGACCTAAATTGACGGATCTTGTCTTCCATGACATCTTAGCAACTACTCTTTCGCTT AGTCATGTTAGCTTAAGAAGGTGCCATCTTCTTACTGACCACGCAATTCAAAAGCTGGCA TCAAGTCTCAAGCTGGAGAATCTTGACCTCAGAGGGTGCAGAAACCTGAGGGATGAAACC TTGACAGCTGTATCTCATCTGCCAAAACTGAAGGTTTTACTTCTTGATGGCGCTGATATT AGCGACACAGGGCTTTCGTACCTAAAAGAGGGAGTTTTGGACTCCCTAGTTTCTTTATCT GTCAGAGGGTGCAGAAATTTAACCGATAAGTTCATGTCTACCTTATTTGACGGATCCTCA AAACTGGCGTTGCGGGAACTCGACTTGTCTAATCTTCCTAATCTAACGGATGCTGCCATT TTTGCTCTAGCAAAATCTGGAGCCCCAATCACAAAACTGCAACTAAGAGAATGTAGACTC ATAGGTGATGCCTCAGTCATGGCTCTTGCATCCACCCGGGTTTACGAAGACGAATGTCCT GGAAGTAGCCTGTGCTTGTTGGATCTTTACGACTGTGGTGGCATCACGCAACTCTCATTC AAATGGCTAAAGAAACCGTTTTTCCCGAGGCTTAAATGGTTAGGGATCACAGGAAGTGTA AACAGAGATATCGTAGATGCTTTGGCAAGGAGACGACCACATTTGCAGGTCTCTTGCCGT GGAGAAGAGCTTGGAAATGATGGGGAAGACGACTGGGACTCTGCAGACATACACCAACAC ATAGAAGCACAAGAAGATGAGCTTGAACAGTGGATTCTTGGAGACGAAGGTGATGTTGAA 
ATGGAGGATGCAGAAGACGAGAGTGAAGAAGACGCAAATTTAATAATAACAATTAATTTT CAAATGGATTTCAGAAATTCCTTCAAATCTCATAGCTCATACAAACAAATTCGAAGCCCC GGAGATCAAAGCGAGCCAAGCCCTGAACATCTGCCAATTCTCCACGATCACCATCCTGAT CATTCCGGCATGGTGGTCGATGACCAGAAACCCGACAGCACTCGTTCTAGCCTCGACGAT GGTCGTAACGCGCCTGTAGAACGTGATGCCAGTTACAAATTCTGGCAAGACAACACTACT GGAACGTCAACGGATCACACGGCGGTGAGAACAAGTGATAAAGATCCAATCGCTATTAGT CGAAAAGGCGATAGATTAAGCGGTAGTTTCGATTTCGTGCACGGGAAGCTACCGGTTGAT GAATCTCCAACGAAAATGGTCGCTGGAGAGCCTGTGAATCGGCAATGGAGAGGTAGAAAT AATGAAGAGATCACACTTGATGTTGATCAAGAAAATGATGATGTAAGTCACCAAACAATG CCTACGCCTACATCAACGGCTCGAACCTCGTTTGATGCGTCTAGAGAAATGCGAGTCTCT TTTAACGTTCGTAGAGCGGGTGGTGCGTTTGTCGCTGGTTCGGTTCCTTCTTCGTCTTCC CATTCGTCTTCTTCTTCGTCCGCGACAATGCGGACGAATCAGGATCAGCCGCAACTGCAG GAGGAAGAGGTTGTGAGATGTACTTCGAACATGTCGTTTCAGAGGAAATCAGAGCTTATA TCTAGAGTGAAGACAAGGTCTAGGCTTCAAGATCCGCCACGTGAGGAAGAAACGCCTTAC TCGGGTTGGAGATCGGGTCAGTTGAAATCCGGGTTACTTGCGGATATTGATGAAGAGGAT GATCCATTAGCGGAAGAAGATGTACCAGATGAGTATAAAAGAGGGAAGCTTGACGCCATT ACATTGCTCCAGTGGCTTAGCTTAGTCGCTATTATAGCTGCATTAGCGTGTAGTTTGTCG ATTCAGTCTTGGAAGAAAGTTAGAGTTTGGAATCTTCATCTATGGAAATGGGAAGTGTTC TTGTTGGTTCTTATATGTGGGAGGTTGGTTTCGGGTTGGGGAATCCGAATCGTTGTCTTC TTCATCGAGAGAAACTTCCTTTTGCGGAAGCGGGTTCTTTACTTCGTTTACGGTGTGCGT AGAGCTGTTCAGAACTGTCTATGGCTAGGCCTAGTCCTTCTCGCTTGGCACTTCTTGTTC GACAAGAAAGTACAGAGGGAGACAAGGAGCAGGTTTCTTCCTTACGTAACCAAGATCTTG GTGTGTTTCTTGCTAAGTACTATCTTGTGGTTGATCAAGACACTAGTGGTTAAAGTTCTG GCCTCTTCGTTCCACGTCAGTACTTACTTTGATCGGATTCAAGAAGCGCTCTTTAACCAG TACGTGATCGAGACGTTATCGGGTCCTCCCATGATTGAAATGAGCAGGATTGAGGAAGAG GAAGAGCGGGCGCAAGACGAGATATTCAAGATGCAGAACGCAGGTGCTAATTTACCACCA GATCTTTGCGCGGCTGCGTTTCCACCGGGAAAAAGCGGGAGAGTAATGAATCCGAAACTC TCCCCGATAATCCCGAAATCAACAACTGATAATGGAATTAGCATGGAACATCTTCACAGG ATGAATCATAAGAACATCTCTGCTTGGAACATGAAGAGACTAATGAAGATTGTGAGAAAT GTTTCTCTGACCACACTGGACGAACAGATGCTAGAAAGCACATACGAAGATGAATCCACC CGCCAGATACGAAGCGAAAAAGAGGCTAAAGCAGCTGCAAGGAAGATTTTCAAGAACGTT GAGCAACGTGGCGCAAAGTACATTTACCTAGAGGATTTGATGAGATTTTTGCGGGAAGAC 
GAGGCGATGAAGACCATGGGCCTCTTCGAAGGTGCACCGGAGAATAAAAGGATCAGCAAA TCAGCCTTAAAGAACTGGCTGGTCAATGCTTTCAGAGAGCGACGAGCTCTTGCCCTGACA CTCAACGACACCAAGACAGCAGTGAACAAGCTCCATCACATGATTAATATTGTCACTGCC ATAGTCATTGTTGTCATTTGGCTTGTCCTTCTTGAAATTGCTTCCTCCAAGGTACTTCTC TTTGTAAGTTCACAAGTTGTACTCTTGGCCTTCATCTTTGGGAACACTGTCAAGACCGTC TTCGAGTCCATCATCTTCTTATTCATCGTACACCCTTACGATGTTGGTGATCGGTGTGAG ATTGACAGTGTACAGTTGGTGGTGGAGGAGATGAACATACTCACTACCGTGTTCTTACGA TACGACAATCTGAAGATTATGTATCCGAATAGTCTTTTGTGGCAGAAATCAATCAACAAT TACTACCGCAGTCCGGATATGGGAGATGCAATCGAGTTCTGTGTCCACATTACTACTCCT CTTGAAAAAATCTCCGTGATCAAACAGAGAATATCGAACTACATCGACAACAAGCCGGAG TATTGGTACCCACAAGCCAAAATCATTGTAAAAGATTTGGAAGATTTGCACATAGTAAGA CTAGCAATATGGCCATGTCACAGGATTAATCACCAAGACATGGCTGAAAGATGGACAAGA AGAGCCGTCTTAGTCGAAGAAGTGATTAAGATACTCCTCGAGCTCGACATTCAACACCGG TTTTATCCGCTCGATATCAATGTCCGAACAATGCCTACCGTTGTCTCTAGCAGAGTTCCA CCTGGCTGGTCACAAAACCAACCTGCCTGA
Protein translation:
MATTTVVCGGGGDVEGCDGERSLDLLPAALLETIMTKLDVASLCSLASTCKTLKSCVTRV LTFTPNFHIFNVSLSMETVRPLLFPNQQLSSLKLDCGRLGNSAIDILVRPSLREISLHNC RDFSGDLISEIGRKCKDLRLLCLGSVAEKVGRSISRCALEDLLNGCSHLEVLALMFDLSL YLRPGDGRIFGLVSDRLTHLELGHITSRMMTQLLTSTEISGQDSNRVTTSTVLQNVQRLR LSVDCITDAVVKAISKSLPSLIDLDIRDAPLEDPRQVSDLTDFGLHEINQNGKLKHLSLI RSQEFHPTYFRRVSDQGMLFLADKCLGMETICLGGFCRVTDAGFKTILHSCASLSKFSIY HGPKLTDLVFHDILATTLSLSHVSLRRCHLLTDHAIQKLASSLKLENLDLRGCRNLRDET LTAVSHLPKLKVLLLDGADISDTGLSYLKEGVLDSLVSLSVRGCRNLTDKFMSTLFDGSS KLALRELDLSNLPNLTDAAIFALAKSGAPITKLQLRECRLIGDASVMALASTRVYEDECP GSSLCLLDLYDCGGITQLSFKWLKKPFFPRLKWLGITGSVNRDIVDALARRRPHLQVSCR GEELGNDGEDDWDSADIHQHIEAQEDELEQWILGDEGDVEMEDAEDESEEDANLIITINF QMDFRNSFKSHSSYKQIRSPGDQSEPSPEHLPILHDHHPDHSGMVVDDQKPDSTRSSLDD GRNAPVERDASYKFWQDNTTGTSTDHTAVRTSDKDPIAISRKGDRLSGSFDFVHGKLPVD ESPTKMVAGEPVNRQWRGRNNEEITLDVDQENDDVSHQTMPTPTSTARTSFDASREMRVS FNVRRAGGAFVAGSVPSSSSHSSSSSSATMRTNQDQPQLQEEEVVRCTSNMSFQRKSELI SRVKTRSRLQDPPREEETPYSGWRSGQLKSGLLADIDEEDDPLAEEDVPDEYKRGKLDAI TLLQWLSLVAIIAALACSLSIQSWKKVRVWNLHLWKWEVFLLVLICGRLVSGWGIRIVVF FIERNFLLRKRVLYFVYGVRRAVQNCLWLGLVLLAWHFLFDKKVQRETRSRFLPYVTKIL VCFLLSTILWLIKTLVVKVLASSFHVSTYFDRIQEALFNQYVIETLSGPPMIEMSRIEEE EERAQDEIFKMQNAGANLPPDLCAAAFPPGKSGRVMNPKLSPIIPKSTTDNGISMEHLHR MNHKNISAWNMKRLMKIVRNVSLTTLDEQMLESTYEDESTRQIRSEKEAKAAARKIFKNV EQRGAKYIYLEDLMRFLREDEAMKTMGLFEGAPENKRISKSALKNWLVNAFRERRALALT LNDTKTAVNKLHHMINIVTAIVIVVIWLVLLEIASSKVLLFVSSQVVLLAFIFGNTVKTV FESIIFLFIVHPYDVGDRCEIDSVQLVVEEMNILTTVFLRYDNLKIMYPNSLLWQKSINN YYRSPDMGDAIEFCVHITTPLEKISVIKQRISNYIDNKPEYWYPQAKIIVKDLEDLHIVR LAIWPCHRINHQDMAERWTRRAVLVEEVIKILLELDIQHRFYPLDINVRTMPTVVSSRVP PGWSQNQPA*
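The relationship between the CDS and the protein translation can be spot-checked by translating a stretch of the nucleotide sequence. The sketch below assumes the standard nuclear genetic code; the helper name `translate` and the two test fragments (the first 60 nt and the last 30 nt of the CDS listed above) are chosen here for illustration and are not part of the original annotation.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds):
    """Translate a coding sequence; whitespace is ignored, '*' marks a stop."""
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First 60 nt and last 30 nt of the CDS shown above.
first_block = "ATGGCTACCACCACCGTCGTATGCGGCGGAGGAGGAGATGTTGAGGGTTGCGATGGCGAG"
last_block = "CCTGGCTGGTCACAAAACCAACCTGCCTGA"

print(translate(first_block))
print(translate(last_block))
```

The two printed fragments can be compared by eye with the start and end of the protein translation above; passing the full CDS to `translate` works the same way, since whitespace and line breaks are stripped before translation.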
Alignment of F6P23.3 and F6P23.4 shows extensive identity between the two proteins. Protein SpAC2C4.17, identified from the S. pombe sequencing project, is also included in this alignment.
1 60
F6P23.4 MATTTVVCGGGGDVEGCDGERSLDLLPAALLETIMTKLDVASLCSLASTCKTLKSCVTRV
61 120
F6P23.4 LTFTPNFHIFNVSLSMETVRPLLFPNQQLSSLKLDCGRLGNSAIDILVRPSLREISLHNC
121 180
F6P23.4 RDFSGDLISEIGRKCKDLRLLCLGSVAEKVGRSISRCALEDLLNGCSHLEVLALMFDLSL
181 240
F6P23.4 YLRPGDGRIFGLVSDRLTHLELGHITSRMMTQLLTSTEISGQDSNRVTTSTVLQNVQRLR
241 300
F6P23.4 LSVDCITDAVVKAISKSLPSLIDLDIRDAPLEDPRQVSDLTDFGLHEINQNGKLKHLSLI
301 360
F6P23.4 RSQEFHPTYFRRVSDQGMLFLADKCLGMETICLGGFCRVTDAGFKTILHSCASLSKFSIY
361 420
F6P23.4 HGPKLTDLVFHDILATTLSLSHVSLRRCHLLTDHAIQKLASSLKLENLDLRGCRNLRDET
421 480
F6P23.4 LTAVSHLPKLKVLLLDGADISDTGLSYLKEGVLDSLVSLSVRGCRNLTDKFMSTLFDGSS
481 540
F6P23.4 KLALRELDLSNLPNLTDAAIFALAKSGAPITKLQLRECRLIGDASVMALASTRVYEDECP
541 600
F6P23.4 GSSLCLLDLYDCGGITQLSFKWLKKPFFPRLKWLGITGSVNRDIVDALARRRPHLQVSCR
601 660
F6P23.4 GEELGNDGEDDWDSADIHQHIEAQEDELEQWILGDEGDVEMEDAEDESEEDANLIITINF
661 720
F6P23.4 QMDFRNSFKSHSSYKQIRSPGDQSEPSPEHLPILHDHHPDHSGMVVDDQKPDSTRSSLDD
721 780
F6P23.4 GRNAPVERDASYKFWQDNTTGTSTDHTAVRTSDKDPIAISRKGDRLSGSFDFVHGKLPVD
781 840
F6P23.4 ESPTKMVAGEPVNRQWRGRNNEEITLDVDQENDDVSHQTMPTPTSTARTSFDASREMRVS
841 900
F6P23.4 FNVRRAGGAFVAGSVPSSSSHSSSSSSATMRTNQDQPQLQEEEVVRCTSNMSFQRKSELI
F6P23.3 ~~~~~~~~~~MSGSVRSCTS.STSFSSATMRLNLEQQLEDEGEVVVRCSSV...RKTELV
SpAC2C4.17 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
901 960
F6P23.4 SRVKTRSRLQDPPREEETPYSGW..RSGQLKSGLLA....D..IDEEDDPLAEEDVPDEY
F6P23.3 SRAKARSRLIDPPQEEEQQYSSWIGTSDQLRSGLLGRHSDD..IEEEDDSSAEEDVPVEY
SpAC2C4.17 ~~~~MNEHRREPHRRSGYQDDSAFTNTEKLVDELDHNVEPEQLLEKNRTDFKLMYVIVKF
961 1020
F6P23.4 KRGKLDAITLLQWLSL...VAIIAALACSLSIQSWKKVRVWNLHLWKWEVFLLVLICGRL
F6P23.3 RKLKMDAITLLQWMSL...IALVVALVLSLGLHTWRNATLWSLHLWKWEVVLLVLICGRL
SpAC2C4.17 YRWFNNLSFITRWITIWFPLAGALVIPLAVGVSPYPNAKLGGVRIFWIFVWLEVAWGGFW
1021 1080
F6P23.4 VSGWGIRIVVFFIERNFLLRKRVLYFVYGVRRAVQNCLWLGLVLLAWHFLFDKKVQRETR
F6P23.3 VSGCGIRIIVFFIERNFLLRKRVLYFVYGVKTAVQNCLWLGLVLLAWHFLFDKKVEKETQ
SpAC2C4.17 VSRVIARLLPYILYPLMGILPFTMYKYTVILTALEMPLAIFFCSIVCVCTFSPIMIGKGN
1081 1140
F6P23.4 SRFLPYVTKILVCFLLSTILWLIKTLVVKVLASSFHVSTYFDRIQEALFNQYV..IETLS
F6P23.3 SDVLLLMSKILVCFLLSTVLWLIKTLVVKVLASSFHVSTYFDRIQEALFHHYL..IETLS
SpAC2C4.17 FTSTTVTTTTSATATPTASASSNAVESVFVTKTAASVPSWIKVITKILGAAVVTSIVLLL
1141 1200
F6P23.4 GPPMIEMSRIEEEEERAQDEIFKMQNAGANLPPDLCAAA.FPPGKSGRVMNPKLSPIIPK
F6P23.3 GPPMLELSRIEEEEDRTQDEIYKMQKGGADLSPELCSAA.FPQEKSGSTMNMKFSPIIPK
SpAC2C4.17 EKIFLHFIGFHYHEVQYQYRITDNKRNTAVLAKLLTAALDAPYHDSPRVRRQDYLLGLID
1201 1260
F6P23.4 STTDNGISMEHLHRMNHKNISAWNMKRLMKIVRNVSLTT....LDEQMLESTYEDESTRQ
F6P23.3 TGSDNGITMDDLHKMNQKNVSAWNMKRLMKIVRNVSLST....LDEQALQNTCEDESTRQ
SpAC2C4.17 TRSMSESKGSGNGKLRKVKKISKNAKRIFSKTRNAISTAFTDMLGKHAKDLTPEQEFILE
1261 1320
F6P23.4 .IRSEKEAKAAARKIFKNVEQRGAKYIYLEDLMRFLREDEAMKTMGLFEGAPENKRISKS
F6P23.3 .IRSEKEAKAAARKIFKNVAQPGTKHIYLEDLMRFLRVDEAMKTMCLFEGALVTKKITKS
SpAC2C4.17 TIRSKKKCLALARKIWYSLVPEGEDCFQKEDLIGLIPDDEINDIFHILDND.YSRTVTLD
1321 1380
F6P23.4 ALKNWLVNAFRERRALALTLNDTKTAVNKLHHMINIVTAIVIVVIWLVLLEIASSKVLLF
F6P23.3 ALKNWLVNAFRERRALALTLNDTKTAVNKLHHMISFLTAIVIIVIWLILLEIATSKYLLF
SpAC2C4.17 EMEQFTREISIEFRSISSSLRDVDLALGKLDRVGLGVVGIIAVLTFISFLDTSFATILAA
1381 1440
F6P23.4 VSSQVVLLAFIFGNTVKTVFESIIFLFIVHPYDVGDRCEIDSVQLVVEEMNILTTVFLRY
F6P23.3 LTSQVVLLAFMFGNSLKTVFESIIFLFIIHPYDVGDRLLIDTVEMVVEEMNILTTVFLRA
SpAC2C4.17 FGTTLLSLSFVFSTSAQELMSSIIFLFSKHPFDISDVVIVNNIKYEVVSLSLLFTVFRTM
1441 1500
F6P23.4 DNLKIMYPNSLLWQKSINNYYRSPDMGDAIEFCVHITTPLEKISVIKQRISNYIDNKPEY
F6P23.3 DNLKIVYPNILLWQKAIHNYNRSPDMGDEVTCCVHITTPPEKIAAIKQRISSYIDSKPEY
SpAC2C4.17 GGSTVQAPNSLLNTLFIENLRRSQPQSETITIVSPFATDFKQLERLRDLLLTFVKENERD
1501 1560
F6P23.4 WYPQAKIIVKDLEDLHIVRLAIWPCHRINHQDMAERWTRRAVLVEEVIKILLELDIQHRF
F6P23.3 WYPKADVIVKDVEDLNIVRIAIWLCHKINHQNMGERFTRRALLIEEVIKILLELDIQYRF
SpAC2C4.17 FRPIIDLNVSDFSTLDSLKFTVTYYYKSNWQNVSLQCVRRNKFMCALKNAIATTNLPAVA
1561 1620
F6P23.4 YPLDINVRTMPTVVSSRVP.PGWSQNQPA*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
F6P23.3 HPLDINVKTMPTVVSSRVP.PAWSQNPDLR*~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SpAC2C4.17 DPVRGSPDYPFVIEQYNLERPEYSKTASRPQFSDISSTASSNSLSNKPGFAHSESRNYHT
1621 1680
SpAC2C4.17 HDEDNSSDDNHKREDRGHLPAQYLRQSVATWQIPNLISAIEAYDSQNESSQENATYTVVE
1681 1740
SpAC2C4.17 SNGNANGDNTATNSQGATDNGQTTTNTTQNNVDNTQATTDNTQANTDNMQVAIDYSQNMD
1741
SpAC2C4.17 GQIQY*
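The "extensive identity" between F6P23.3 and F6P23.4 can be put into numbers for any block of the alignment. The following is a rough sketch, assuming that `.` and `~` both denote gap or padding positions and that such columns should be left out of the comparison; the function name `identity` is hypothetical, and the two rows are copied from the 1021 to 1080 block above.

```python
# Assumption: '.', '~' and '-' all mark gaps or padding in the alignment.
GAP_CHARS = set(".~-")


def identity(row_a, row_b):
    """Return (matches, compared) for two equally long alignment rows.

    Columns where either row has a gap character are skipped.
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(row_a, row_b):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue
        compared += 1
        if a == b:
            matches += 1
    return matches, compared


# Rows from the 1021-1080 block of the alignment above.
f6p23_4 = "VSGWGIRIVVFFIERNFLLRKRVLYFVYGVRRAVQNCLWLGLVLLAWHFLFDKKVQRETR"
f6p23_3 = "VSGCGIRIIVFFIERNFLLRKRVLYFVYGVKTAVQNCLWLGLVLLAWHFLFDKKVEKETQ"

matches, compared = identity(f6p23_4, f6p23_3)
print(f"{matches}/{compared} identical positions "
      f"({100.0 * matches / compared:.1f}%)")
```

Concatenating the rows of several blocks before calling `identity` gives a figure for a longer region. A per-block figure like this one is only a local measure and says nothing about the N-terminal part of F6P23.4 that has no counterpart in F6P23.3.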
written 22 Aug 97
updated 18 Oct 97
Larry Parnell