Gene: T10M13.12
Putative Identification: predicted protein of unknown function, putative transposon
Position: 61456 to 66456, from the initial methionine to the termination codon
Strand: +
EST match: similar to rice EST D40083
Database match: Z. mays transposable element Mu7 X15872

In the region around position 66000, T10M13.12 shows significant similarity both to rice EST D40083 (64% identity) and to a portion of the Z. mays rcy:Mu7 Cy transposable element system (X15872). T10M13.12 therefore either contains a transposon or encodes a remnant of an earlier transposition event.

CDS:  The table below lists the exon coordinates for T10M13.12 and the exon-prediction programs that selected each 5' and 3' terminus (GS = GenScan, Gr = GRAIL, M = MZEF, NPG = NetPlantGene, which selects splice sites only, not exons). Splice sites suggested by aligning T10M13 to rice EST D40083 are marked est.

Exon  Range          5'                   3'
1     61456 - 61542  GS, Gr               GS, Gr, NPG
2     61622 - 61711  GS, Gr, M, NPG       GS, Gr, M, NPG
3     61790 - 61894  GS, Gr, NPG          GS, NPG
4     61969 - 62047  GS, Gr, M, NPG       M, NPG
5     62325 - 62394  GS, Gr, M, NPG       GS, Gr, M, NPG
6     62480 - 62524  GS, Gr, M, NPG       GS, Gr, M, NPG
7     62731 - 65613  GS, Gr, M, NPG       est, GS, Gr, NPG
8     65763 - 65979  est, GS, Gr, M, NPG  GS, Gr, M, NPG
9     66151 - 66456  est, GS, Gr, NPG     Gr
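
As a quick consistency check on the exon table, the exon lengths can be summed to confirm that the gene model stays in frame. The following is a minimal sketch, not part of the original annotation: the names `EXONS`, `exon_lengths` and `intron_lengths` are illustrative, and the coordinates are assumed to be 1-based and inclusive.

```python
# Exon coordinates copied from the table (assumed 1-based, inclusive).
EXONS = [
    (61456, 61542),
    (61622, 61711),
    (61790, 61894),
    (61969, 62047),
    (62325, 62394),
    (62480, 62524),
    (62731, 65613),
    (65763, 65979),
    (66151, 66456),
]


def exon_lengths(exons):
    """Return the length in nucleotides of each exon."""
    return [end - start + 1 for start, end in exons]


def intron_lengths(exons):
    """Return the length of each intron lying between consecutive exons."""
    return [nxt[0] - cur[1] - 1 for cur, nxt in zip(exons, exons[1:])]


lengths = exon_lengths(EXONS)
cds_length = sum(lengths)

print("exon lengths:  ", lengths)
print("intron lengths:", intron_lengths(EXONS))
print("CDS length:    ", cds_length)      # 3882
print("remainder mod 3:", cds_length % 3)  # 0
```

Under these assumptions the nine exons sum to 3882 nt, a multiple of three, which corresponds to 1294 codons including the termination codon.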

The following alternate exons were not used in building the gene model. GenScan predicts an internal exon from 61969 to 62074. GenScan also concatenates the gene models for T10M13.12 and T10M13.13 and so fails to predict a terminal exon; it instead predicts an internal exon from 66151 to 66443. GRAIL predicts internal exons from 61790 to 61930, from 61969 to 62070 and from 62152 to 62236. MZEF predicts exon 7 from 62731 to 64130. NetPlantGene selects many putative splice sites in this region. Of note are a splice acceptor at 62152 (confidence score = 0.96) and a splice donor at 62236 (0.96).

Complete CDS of T10M13.12

ATGCAATCGGATTCGGGTTTGCCTCCCAAGACGTATTCGGGTGTCAAATTCGCTCTCGTT
GGATTCAATCCCATCCATGGAAACTCGTTACGGTCGAAGCTAGTGAGTGGTGGTGGTGTT
GATGTTGGCCAATTCACTCAGTCATGTACTCACCTCATCGTAGATAAGCTTCTCTATGAT
GATCCGATTTGCGTTGCTGCTCGAAACAGCGGGAAGGTAGTTGTCACCGGGTCATGGGTT
GATCACAGCTTCGACATTGGAATGCTTGACAATGCAAATTCGATATTGTATAGGCCTCTT
AGAGATTTGAATGGGATTCCAGGGTCCAAAGCCTTAGTTGTGTGCTTGACTGGCTACCAA
GGAGAAAAGTATGAGCTAGCCAAGCGGATTAAGAGGATTAAACTTGTGAACCACCGTTGG
TTAGAGGACTGCTTAAAGAATTGGAAGCTCCTACCTGAGGTTGATTACGAGATAAGTGGC
TATGAGTTGGACATAATGGAGGCTTCAGCTAGGGATTCTGAGGACGAAGCAGAAGATGCC
TCTGTCAAGCCTGCAAATACCAGTCCTCTTGGTCTTAGGGTTGGTGCCGTGCCTGCAGTT
GAAATCTCTAAGCCGGGAGGAAAAGATTTCCCTCTTGAGGAAGGGTCATCATTATGTAAT
ACGTCCAAAGATAATTGGTTAACTCCTAAAAGGACGGACAGACCTTTTGAAGCAATGGTC
TCTACTGATCTAGGTGTTGCTCAGCAGCATAATTACGTGTCCCCCATTAGGGTTGCAAAC
AAGACTCCTGAGCAAGGGATGAGCAAAATGGAGACTGATGGCTCGACGTCTATTAACAGG
AGTATCAGAAGGCATTCTTCTCTAGCCACTTATTCAAGGAAAACACTTCAGAGATCGCCA
GAGACTGATACTTTGGGAAAAGAGTCAAGTGGCCAAAACCGTTCCTTGAGAATGGATGAC
AAGGGCCTAAAAGCTTCGTCTGCCTTTAATACCTCTGCATCAAAATCTGGTTCTTCCATG
GAAAGAACGTCACTCTTTCGAGATCTTGGCAAGATTGATATGTTGCATGGCGAGGAGTTC
CCTCCGATGATGCCTCAGGCAAAATTTACAGATGGATCTGTCAGTAGGAAAGATTCACTG
AGAGTACACCACAACAGTGAGGCAAGTATTCCACCACCGTCTAGTTTGTTATTGCAGGAA
CTAAGACCAAGTTCGCCTAACGACAACCTTAGGCCTGTGATGAGCATTAGTGACCCAACT
GAAAGTGAGGAAGCTGGCCATAAATCACCCACGAGTGAGTTAAACACTAAACTGTTGAGC
TCTAATGTGGTACCCATGGTCGATGCTCTTTCAACTGCGGAGAATATCATTTCAAATTGT
GCGTGGGATGAAATACCGGAGAAATCATTGACTGAGAGAATGACAGAAAATGTCTTATTG
CAGGAACAAAGATCAGGCTCACCTAAGCAAAACCTTAGTGTTGTGCCAAACCTCAGGGAA
GCTGCACATGAGTTGGATCTGAGTGATTCAGCAGCTAGGTTGTTCAATTCAGGTGTTGTT
CCCATGGAAGCTGATATCAGAACTCCAGAAAATTCTACTATGAAGGGTGCATTGGATGAA
GTACCTGAAAGATCTGTAACTGACCCTGTGATGAGGAGATCTAGCACCTCTCCTGGATCG
GGTTTAATCAGAATGAAAGACAAGCAAGAAACAGAGCTGACCACGAAGAAAACAGCTCCA
AAAAAGAGCCTAGGCACCAGAGGCAGGAAGAAGAACCCCATTAACCAAAAGGGATCAATA
TACTTGAGCGAACCTTCCCCAACGGACGAGCGCAATGTTTGTCTAAACAAAGGAAAAGTT
TCAGCGCCAGTAACAGGTAATAGCAATCAAAAAGAGATATCAAGCCCTGTCCTAAATACT
GAGGTTGTACAAGACATGGCAAAACATATTGACACAGAGACTGAAGCCCTCCAGGGAATT
GACTCTGTAGATAATAAATCTTTAGCCCCAGAAGAGAAAGACCATCTTGTGTTGGATCTG
ATGGTGAACCAAGATAAGCTGCAGGCTAAGACCCCAGAGGCAGCTGATGCAGAGGTGGAA
ATTACGGTGCTAGAACGGGAGCTTAATGATGTTCCAACTGAAGATCCAAGTGATGGTGCA
TTACAATCCGAGGTTGATAAGAATACAAGTAAACGCAAAAGGGAGGCTGGTGTAGGTAAA
AATAGCCTTCAAAGAGGGAAGAAAGGAAGTTCTTTTACAGCCAAAGTAGGAAAATCCAGA
GTCAAGAAGACCAAAATATCTAGAAAAGAAAATGATATCAAAGCAAATGGTACTCTGATG
AAAGATGGAGGGGATAACTCTGCGGATGGGAAGGAGAACTTAGCATTGGAACATGAAAAT
GGGAAGGTCAGTTCTGGTGGAGACCAAAGCCTTGTTGCGGGGGAAACATTAACAAGAAAG
GAAGCTGCCACTAAAGATCCAAGCTATGCTGCAGCGCAATTAGAGGTTGATACAAAGAAA
GGTAAACGCAGAAAGCAGGCCACTGTAGAAGAAAATAGGCTTCAAACACCTAGTGTCAAA
AAGGCGAAAGTTTCTAAAAAAGAAGATGGCGCCAAAGCAAACAATACTGTGAAGAAAGAT
ATATGGATTCACTCTGCAGAAGTGAAGGAGAATGTAGCAGTAGATGAAAATTGTGGAGAT
GTCAGTTCTGATGGAGCTCAAAGCCTGGTTGTGGAGAAATCTTTAGCTAAAAAGGAGGCT
GCAGCTAAGGATCCAAGTAATGCTGCAATGCAATTAGAGTTTGATGATAATAAATGTAAA
CACGGAAAGGAGGGTATTGTAGAAAGAAGTAGCCTTCAAAGTGGAAAGAAAGGAAGTTCT
TCTAGAGTTGAAGTAGGGAAATCAAGTGTCAAGAAGACTAAAAAATCTGAAAAAGGAAGT
GGCACCGAAGCAACCGACACTGTGATGAAAGATGTAGGGGATAATTCTGCAAAAGAGAAG
GAGAACATTGCAGTGGATAATGAATCTAGAAAGGTGGGATCTGGTGGAGACCAAAGCCCG
GTAGCAAGAAAGAAAGTTGCAAAGTCAGCTAAAACAGGTACAAAGGCGGAGAAAGAGTCT
AAGCAGCTCAGGGTTAATCCTTTGGCTAGTAGAAAAGTCTTCCAGGACCAAGAACATGAG
CCGAAATTTTTTATTGTCAGTGGTCCTAGGTCCCAGAGAAACGAATACCAGCAGATCATT
AGGCGTTTAAAAGGAAAATGTTGCCGGGATTCTCATCAGTGGTCTTATCAAGCAACACAT
TTCATTGCTCCTGAAATCCGTAGGACCGAAAAGTTTTTCGCTGCTGCTGCATCTGGAAGT
TGGATTCTGAAGACTGACTATGTGGCTGATTCAAAGGAAGCTGGGAAACTATTACAAGAG
GAGCCTTATGAATGGCACAGTTCTGGTCTTAGTGCTGATGGTGCGATAAACCTCGAGTCC
CCAAAGAAATGGCGGCTCGTCAGGGAGAAAACAGGACACGGTGCTTTATATGGACTGCGC
ATTGTTGTATACGGTGACTGCACCATCCCTTGTTTGGATACACTAAAGCGAGCTGTGAAA
GCTGGGGATGGTACGATACTTGCAACGGCGCCTCCTTACACGCGTTTCTTGAATCAAAAC
ACGGATTTCGCGTTGATAAGCCCCGGGATGCCGCGGGATGACGTCTGGATCCAAGAGTTT
ATACGCCACGAAATCCCGTGTGTCCTCTCCGATTACCTGGTGGAGTACGTTTGTAAACCC
GGATACGCACTTGACAAGCATGTGCTCTACAACACGAACTCATGGGCAGAAAAGTCGTTT
AACAAGATGCAGCTTAGAGCAGATTTGTGTGTGTACCATTAA

 

Protein translation

MQSDSGLPPKTYSGVKFALVGFNPIHGNSLRSKLVSGGGVDVGQFTQSCTHLIVDKLLYD
DPICVAARNSGKVVVTGSWVDHSFDIGMLDNANSILYRPLRDLNGIPGSKALVVCLTGYQ
GEKYELAKRIKRIKLVNHRWLEDCLKNWKLLPEVDYEISGYELDIMEASARDSEDEAEDA
SVKPANTSPLGLRVGAVPAVEISKPGGKDFPLEEGSSLCNTSKDNWLTPKRTDRPFEAMV
STDLGVAQQHNYVSPIRVANKTPEQGMSKMETDGSTSINRSIRRHSSLATYSRKTLQRSP
ETDTLGKESSGQNRSLRMDDKGLKASSAFNTSASKSGSSMERTSLFRDLGKIDMLHGEEF
PPMMPQAKFTDGSVSRKDSLRVHHNSEASIPPPSSLLLQELRPSSPNDNLRPVMSISDPT
ESEEAGHKSPTSELNTKLLSSNVVPMVDALSTAENIISNCAWDEIPEKSLTERMTENVLL
QEQRSGSPKQNLSVVPNLREAAHELDLSDSAARLFNSGVVPMEADIRTPENSTMKGALDE
VPERSVTDPVMRRSSTSPGSGLIRMKDKQETELTTKKTAPKKSLGTRGRKKNPINQKGSI
YLSEPSPTDERNVCLNKGKVSAPVTGNSNQKEISSPVLNTEVVQDMAKHIDTETEALQGI
DSVDNKSLAPEEKDHLVLDLMVNQDKLQAKTPEAADAEVEITVLERELNDVPTEDPSDGA
LQSEVDKNTSKRKREAGVGKNSLQRGKKGSSFTAKVGKSRVKKTKISRKENDIKANGTLM
KDGGDNSADGKENLALEHENGKVSSGGDQSLVAGETLTRKEAATKDPSYAAAQLEVDTKK
GKRRKQATVEENRLQTPSVKKAKVSKKEDGAKANNTVKKDIWIHSAEVKENVAVDENCGD
VSSDGAQSLVVEKSLAKKEAAAKDPSNAAMQLEFDDNKCKHGKEGIVERSSLQSGKKGSS
SRVEVGKSSVKKTKKSEKGSGTEATDTVMKDVGDNSAKEKENIAVDNESRKVGSGGDQSP
VARKKVAKSAKTGTKAEKESKQLRVNPLASRKVFQDQEHEPKFFIVSGPRSQRNEYQQII
RRLKGKCCRDSHQWSYQATHFIAPEIRRTEKFFAAAASGSWILKTDYVADSKEAGKLLQE
EPYEWHSSGLSADGAINLESPKKWRLVREKTGHGALYGLRIVVYGDCTIPCLDTLKRAVK
AGDGTILATAPPYTRFLNQNTDFALISPGMPRDDVWIQEFIRHEIPCVLSDYLVEYVCKP
GYALDKHVLYNTNSWAEKSFNKMQLRADLCVYH*
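
The translation can be spot-checked against the CDS with the standard genetic code. The sketch below is illustrative only: `CODON_TABLE` and `translate` are hypothetical names, and the use of the standard nuclear code is an assumption. It translates the first 60 nucleotides and the last 12 nucleotides of the CDS listed above.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate an in-frame DNA sequence; '*' marks a stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First line (60 nt) and final 12 nt of the CDS given above.
cds_start = "ATGCAATCGGATTCGGGTTTGCCTCCCAAGACGTATTCGGGTGTCAAATTCGCTCTCGTT"
cds_end = "GTGTACCATTAA"

print(translate(cds_start))  # MQSDSGLPPKTYSGVKFALV
print(translate(cds_end))    # VYH*
```

Both fragments agree with the start and end of the protein translation listed above.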

 

Analysis of repeated elements in protein T10M13.12. Dotplot analysis of T10M13.12 revealed repeated peptide sequences. These are shown below.

                  .         .         .  
     441 SNVVPMVDALSTAENIISNCAWDEIPEKSLTE 472
         | ||||   : | ||     | ||:||:|.|:
     517 SGVVPMEADIRTPENSTMKGALDEVPERSVTD 548
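
The identity between the two aligned segments can be counted directly. This is a minimal sketch with illustrative names (`SEG_A`, `SEG_B`, `count_identities`); it counts exact matches only, i.e. the positions marked `|`, and ignores the similarity classes marked `:` and `.`.

```python
# Aligned segments copied from the alignment (no gaps in either row).
SEG_A = "SNVVPMVDALSTAENIISNCAWDEIPEKSLTE"  # residues 441-472
SEG_B = "SGVVPMEADIRTPENSTMKGALDEVPERSVTD"  # residues 517-548


def count_identities(a, b):
    """Count columns in which two equal-length aligned rows are identical."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    return sum(x == y for x, y in zip(a, b))


identical = count_identities(SEG_A, SEG_B)
percent = 100.0 * identical / len(SEG_A)
print(f"{identical} of {len(SEG_A)} positions identical ({percent:.1f}%)")
# 15 of 32 positions identical (46.9%)
```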

There are three copies of a ~110-amino-acid repeat in T10M13.12, designated the "B" repeat. Copy 1 spans residues 713-823, copy 2 residues 824-921 and copy 3 residues 922-1032.

    1                                                   50
B1  TEDPSDGALQ SEVDKNTSKR KREAGVGKNS LQRGKKGSSF TAKVGKSRVK
B2  TKDPSYAAAQ LEVDTKKGKR RKQATVEENR LQT....... ......PSVK
B3  AKDPSNAAMQ LEFDDNKCKH GKEGIVERSS LQSGKKGSSS RVEVGKSSVK

    51                                                 100
B1  KTKISRKEND IKANGTLMKD GGDNSADGKE NLALEHENGK VSSGGDQSLV
B2  KAKVSKKEDG AKANNTVKKD IWIHSAEVKE NVAVDENCGD VSSDGAQSLV
B3  KTKKSEKGSG TEATDTVMKD VGDNSAKEKE NIAVDNESRK VGSGGDQSPV

    101      111
B1  AGETLTRKEA A
B2  VEKSLAKKEA A
B3  ARKKVAKSAK T
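
The alignment of the three B copies can be checked against the stated residue ranges and used to compute pairwise identities. The sketch below is an illustration under stated assumptions: the names `B_REPEATS`, `RANGES`, `ungapped_length` and `pairwise_identity` are hypothetical, `.` is taken to be the gap character, and identity is computed only over columns where neither copy has a gap.

```python
from itertools import combinations

# Aligned rows copied from the alignment, with spacing removed.
B_REPEATS = {
    "B1": "TEDPSDGALQSEVDKNTSKRKREAGVGKNSLQRGKKGSSFTAKVGKSRVK"
          "KTKISRKENDIKANGTLMKDGGDNSADGKENLALEHENGKVSSGGDQSLV"
          "AGETLTRKEAA",
    "B2": "TKDPSYAAAQLEVDTKKGKRRKQATVEENRLQT" + "." * 13 + "PSVK"
          "KAKVSKKEDGAKANNTVKKDIWIHSAEVKENVAVDENCGDVSSDGAQSLV"
          "VEKSLAKKEAA",
    "B3": "AKDPSNAAMQLEFDDNKCKHGKEGIVERSSLQSGKKGSSSRVEVGKSSVK"
          "KTKKSEKGSGTEATDTVMKDVGDNSAKEKENIAVDNESRKVGSGGDQSPV"
          "ARKKVAKSAKT",
}

# Residue ranges stated in the text (1-based, inclusive).
RANGES = {"B1": (713, 823), "B2": (824, 921), "B3": (922, 1032)}


def ungapped_length(row, gap="."):
    """Number of residues in an aligned row, ignoring gap characters."""
    return len(row) - row.count(gap)


def pairwise_identity(a, b, gap="."):
    """Fraction of identical residues over columns with no gap in either row."""
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    return sum(x == y for x, y in pairs) / len(pairs)


for name, row in B_REPEATS.items():
    start, end = RANGES[name]
    print(name, "columns:", len(row),
          "residues:", ungapped_length(row),
          "expected:", end - start + 1)

for (n1, r1), (n2, r2) in combinations(B_REPEATS.items(), 2):
    print(f"{n1} vs {n2}: {100 * pairwise_identity(r1, r2):.1f}% identity")
```

Each aligned row spans 111 columns; copy 2 contains 13 gap positions, which accounts for its shorter range of 98 residues.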

 


written 30 Jul 97
updated 29 Dec 97
updated 4 Aug 98
Larry Parnell