| Gene | T10M13.12 |
| Putative Identification | predicted protein of unknown function, putative transposon |
| Position | 61456 to 66456, from the initial methionine to the termination codon |
| Strand | + |
| EST match | similar to rice EST D40083 |
| Database match | Z. mays transposable element Mu7 X15872 |
T10M13.12, in the region around position 66000, exhibits significant similarity to both rice EST D40083 (64% identity) and a portion of the Z. mays rcy:Mu7 Cy transposable element system (X15872). T10M13.12 either contains a transposon or encodes a remnant of an earlier transposition event.
CDS: The table below lists the exon coordinates for T10M13.12 and indicates which exon-prediction programs selected the 5' and 3' termini (GS = GenScan, Gr = GRAIL, M = MZEF, NPG = NetPlantGene, which selects splice sites only, not exons). Splice sites suggested by aligning T10M13 to rice EST D40083 are designated by est.
| Exon | Range | 5' | 3' |
|---|---|---|---|
| 1 | 61456 - 61542 | GS, Gr | GS, Gr, NPG |
| 2 | 61622 - 61711 | GS, Gr, M, NPG | GS, Gr, M, NPG |
| 3 | 61790 - 61894 | GS, Gr, NPG | GS, NPG |
| 4 | 61969 - 62047 | GS, Gr, M, NPG | M, NPG |
| 5 | 62325 - 62394 | GS, Gr, M, NPG | GS, Gr, M, NPG |
| 6 | 62480 - 62524 | GS, Gr, M, NPG | GS, Gr, M, NPG |
| 7 | 62731 - 65613 | GS, Gr, M, NPG | est, GS, Gr, NPG |
| 8 | 65763 - 65979 | est, GS, Gr, M, NPG | GS, Gr, M, NPG |
| 9 | 66151 - 66456 | est, GS, Gr, NPG | Gr |
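As a quick consistency check on the gene model, the exon coordinates in the table can be summed to confirm that the spliced CDS is a whole number of codons. The sketch below is illustrative only; the names `EXONS` and `exon_length` are introduced here and are not part of the original annotation. It assumes the coordinates are 1-based and inclusive, as is usual for this kind of table.

```python
# Exon coordinates copied from the table above (assumed 1-based, inclusive).
EXONS = [
    (61456, 61542),
    (61622, 61711),
    (61790, 61894),
    (61969, 62047),
    (62325, 62394),
    (62480, 62524),
    (62731, 65613),
    (65763, 65979),
    (66151, 66456),
]


def exon_length(start, end):
    """Length of an exon given inclusive start and end coordinates."""
    return end - start + 1


lengths = [exon_length(start, end) for start, end in EXONS]
cds_length = sum(lengths)

# A complete CDS (start codon through stop codon) should divide evenly by 3.
codons, remainder = divmod(cds_length, 3)

print("exon lengths:", lengths)
print("CDS length:", cds_length, "nt =", codons, "codons, remainder", remainder)
```

With these coordinates the exons sum to 3882 nt, that is 1294 codons including the termination codon, which agrees with the 1293-residue translation given further down.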
Alternate exons not used in building the gene model: GenScan predicts an internal exon from 61969 to 62074. Because GenScan concatenates the gene models for T10M13.12 and T10M13.13, it fails to predict a terminal exon and instead predicts an internal exon from 66151 to 66443. GRAIL predicts internal exons from 61790 to 61930, from 61969 to 62070, and from 62152 to 62236. MZEF predicts exon 7 from 62731 to 64130. NetPlantGene selects many putative splice sites in this region; of note are a splice acceptor at 62152 (confidence score = 0.96) and a splice donor at 62236 (0.96).
Complete CDS of T10M13.12
ATGCAATCGGATTCGGGTTTGCCTCCCAAGACGTATTCGGGTGTCAAATTCGCTCTCGTT GGATTCAATCCCATCCATGGAAACTCGTTACGGTCGAAGCTAGTGAGTGGTGGTGGTGTT GATGTTGGCCAATTCACTCAGTCATGTACTCACCTCATCGTAGATAAGCTTCTCTATGAT GATCCGATTTGCGTTGCTGCTCGAAACAGCGGGAAGGTAGTTGTCACCGGGTCATGGGTT GATCACAGCTTCGACATTGGAATGCTTGACAATGCAAATTCGATATTGTATAGGCCTCTT AGAGATTTGAATGGGATTCCAGGGTCCAAAGCCTTAGTTGTGTGCTTGACTGGCTACCAA GGAGAAAAGTATGAGCTAGCCAAGCGGATTAAGAGGATTAAACTTGTGAACCACCGTTGG TTAGAGGACTGCTTAAAGAATTGGAAGCTCCTACCTGAGGTTGATTACGAGATAAGTGGC TATGAGTTGGACATAATGGAGGCTTCAGCTAGGGATTCTGAGGACGAAGCAGAAGATGCC TCTGTCAAGCCTGCAAATACCAGTCCTCTTGGTCTTAGGGTTGGTGCCGTGCCTGCAGTT GAAATCTCTAAGCCGGGAGGAAAAGATTTCCCTCTTGAGGAAGGGTCATCATTATGTAAT ACGTCCAAAGATAATTGGTTAACTCCTAAAAGGACGGACAGACCTTTTGAAGCAATGGTC TCTACTGATCTAGGTGTTGCTCAGCAGCATAATTACGTGTCCCCCATTAGGGTTGCAAAC AAGACTCCTGAGCAAGGGATGAGCAAAATGGAGACTGATGGCTCGACGTCTATTAACAGG AGTATCAGAAGGCATTCTTCTCTAGCCACTTATTCAAGGAAAACACTTCAGAGATCGCCA GAGACTGATACTTTGGGAAAAGAGTCAAGTGGCCAAAACCGTTCCTTGAGAATGGATGAC AAGGGCCTAAAAGCTTCGTCTGCCTTTAATACCTCTGCATCAAAATCTGGTTCTTCCATG GAAAGAACGTCACTCTTTCGAGATCTTGGCAAGATTGATATGTTGCATGGCGAGGAGTTC CCTCCGATGATGCCTCAGGCAAAATTTACAGATGGATCTGTCAGTAGGAAAGATTCACTG AGAGTACACCACAACAGTGAGGCAAGTATTCCACCACCGTCTAGTTTGTTATTGCAGGAA CTAAGACCAAGTTCGCCTAACGACAACCTTAGGCCTGTGATGAGCATTAGTGACCCAACT GAAAGTGAGGAAGCTGGCCATAAATCACCCACGAGTGAGTTAAACACTAAACTGTTGAGC TCTAATGTGGTACCCATGGTCGATGCTCTTTCAACTGCGGAGAATATCATTTCAAATTGT GCGTGGGATGAAATACCGGAGAAATCATTGACTGAGAGAATGACAGAAAATGTCTTATTG CAGGAACAAAGATCAGGCTCACCTAAGCAAAACCTTAGTGTTGTGCCAAACCTCAGGGAA GCTGCACATGAGTTGGATCTGAGTGATTCAGCAGCTAGGTTGTTCAATTCAGGTGTTGTT CCCATGGAAGCTGATATCAGAACTCCAGAAAATTCTACTATGAAGGGTGCATTGGATGAA GTACCTGAAAGATCTGTAACTGACCCTGTGATGAGGAGATCTAGCACCTCTCCTGGATCG GGTTTAATCAGAATGAAAGACAAGCAAGAAACAGAGCTGACCACGAAGAAAACAGCTCCA AAAAAGAGCCTAGGCACCAGAGGCAGGAAGAAGAACCCCATTAACCAAAAGGGATCAATA TACTTGAGCGAACCTTCCCCAACGGACGAGCGCAATGTTTGTCTAAACAAAGGAAAAGTT TCAGCGCCAGTAACAGGTAATAGCAATCAAAAAGAGATATCAAGCCCTGTCCTAAATACT 
GAGGTTGTACAAGACATGGCAAAACATATTGACACAGAGACTGAAGCCCTCCAGGGAATT GACTCTGTAGATAATAAATCTTTAGCCCCAGAAGAGAAAGACCATCTTGTGTTGGATCTG ATGGTGAACCAAGATAAGCTGCAGGCTAAGACCCCAGAGGCAGCTGATGCAGAGGTGGAA ATTACGGTGCTAGAACGGGAGCTTAATGATGTTCCAACTGAAGATCCAAGTGATGGTGCA TTACAATCCGAGGTTGATAAGAATACAAGTAAACGCAAAAGGGAGGCTGGTGTAGGTAAA AATAGCCTTCAAAGAGGGAAGAAAGGAAGTTCTTTTACAGCCAAAGTAGGAAAATCCAGA GTCAAGAAGACCAAAATATCTAGAAAAGAAAATGATATCAAAGCAAATGGTACTCTGATG AAAGATGGAGGGGATAACTCTGCGGATGGGAAGGAGAACTTAGCATTGGAACATGAAAAT GGGAAGGTCAGTTCTGGTGGAGACCAAAGCCTTGTTGCGGGGGAAACATTAACAAGAAAG GAAGCTGCCACTAAAGATCCAAGCTATGCTGCAGCGCAATTAGAGGTTGATACAAAGAAA GGTAAACGCAGAAAGCAGGCCACTGTAGAAGAAAATAGGCTTCAAACACCTAGTGTCAAA AAGGCGAAAGTTTCTAAAAAAGAAGATGGCGCCAAAGCAAACAATACTGTGAAGAAAGAT ATATGGATTCACTCTGCAGAAGTGAAGGAGAATGTAGCAGTAGATGAAAATTGTGGAGAT GTCAGTTCTGATGGAGCTCAAAGCCTGGTTGTGGAGAAATCTTTAGCTAAAAAGGAGGCT GCAGCTAAGGATCCAAGTAATGCTGCAATGCAATTAGAGTTTGATGATAATAAATGTAAA CACGGAAAGGAGGGTATTGTAGAAAGAAGTAGCCTTCAAAGTGGAAAGAAAGGAAGTTCT TCTAGAGTTGAAGTAGGGAAATCAAGTGTCAAGAAGACTAAAAAATCTGAAAAAGGAAGT GGCACCGAAGCAACCGACACTGTGATGAAAGATGTAGGGGATAATTCTGCAAAAGAGAAG GAGAACATTGCAGTGGATAATGAATCTAGAAAGGTGGGATCTGGTGGAGACCAAAGCCCG GTAGCAAGAAAGAAAGTTGCAAAGTCAGCTAAAACAGGTACAAAGGCGGAGAAAGAGTCT AAGCAGCTCAGGGTTAATCCTTTGGCTAGTAGAAAAGTCTTCCAGGACCAAGAACATGAG CCGAAATTTTTTATTGTCAGTGGTCCTAGGTCCCAGAGAAACGAATACCAGCAGATCATT AGGCGTTTAAAAGGAAAATGTTGCCGGGATTCTCATCAGTGGTCTTATCAAGCAACACAT TTCATTGCTCCTGAAATCCGTAGGACCGAAAAGTTTTTCGCTGCTGCTGCATCTGGAAGT TGGATTCTGAAGACTGACTATGTGGCTGATTCAAAGGAAGCTGGGAAACTATTACAAGAG GAGCCTTATGAATGGCACAGTTCTGGTCTTAGTGCTGATGGTGCGATAAACCTCGAGTCC CCAAAGAAATGGCGGCTCGTCAGGGAGAAAACAGGACACGGTGCTTTATATGGACTGCGC ATTGTTGTATACGGTGACTGCACCATCCCTTGTTTGGATACACTAAAGCGAGCTGTGAAA GCTGGGGATGGTACGATACTTGCAACGGCGCCTCCTTACACGCGTTTCTTGAATCAAAAC ACGGATTTCGCGTTGATAAGCCCCGGGATGCCGCGGGATGACGTCTGGATCCAAGAGTTT ATACGCCACGAAATCCCGTGTGTCCTCTCCGATTACCTGGTGGAGTACGTTTGTAAACCC GGATACGCACTTGACAAGCATGTGCTCTACAACACGAACTCATGGGCAGAAAAGTCGTTT AACAAGATGCAGCTTAGAGCAGATTTGTGTGTGTACCATTAA
Protein translation
MQSDSGLPPKTYSGVKFALVGFNPIHGNSLRSKLVSGGGVDVGQFTQSCTHLIVDKLLYD DPICVAARNSGKVVVTGSWVDHSFDIGMLDNANSILYRPLRDLNGIPGSKALVVCLTGYQ GEKYELAKRIKRIKLVNHRWLEDCLKNWKLLPEVDYEISGYELDIMEASARDSEDEAEDA SVKPANTSPLGLRVGAVPAVEISKPGGKDFPLEEGSSLCNTSKDNWLTPKRTDRPFEAMV STDLGVAQQHNYVSPIRVANKTPEQGMSKMETDGSTSINRSIRRHSSLATYSRKTLQRSP ETDTLGKESSGQNRSLRMDDKGLKASSAFNTSASKSGSSMERTSLFRDLGKIDMLHGEEF PPMMPQAKFTDGSVSRKDSLRVHHNSEASIPPPSSLLLQELRPSSPNDNLRPVMSISDPT ESEEAGHKSPTSELNTKLLSSNVVPMVDALSTAENIISNCAWDEIPEKSLTERMTENVLL QEQRSGSPKQNLSVVPNLREAAHELDLSDSAARLFNSGVVPMEADIRTPENSTMKGALDE VPERSVTDPVMRRSSTSPGSGLIRMKDKQETELTTKKTAPKKSLGTRGRKKNPINQKGSI YLSEPSPTDERNVCLNKGKVSAPVTGNSNQKEISSPVLNTEVVQDMAKHIDTETEALQGI DSVDNKSLAPEEKDHLVLDLMVNQDKLQAKTPEAADAEVEITVLERELNDVPTEDPSDGA LQSEVDKNTSKRKREAGVGKNSLQRGKKGSSFTAKVGKSRVKKTKISRKENDIKANGTLM KDGGDNSADGKENLALEHENGKVSSGGDQSLVAGETLTRKEAATKDPSYAAAQLEVDTKK GKRRKQATVEENRLQTPSVKKAKVSKKEDGAKANNTVKKDIWIHSAEVKENVAVDENCGD VSSDGAQSLVVEKSLAKKEAAAKDPSNAAMQLEFDDNKCKHGKEGIVERSSLQSGKKGSS SRVEVGKSSVKKTKKSEKGSGTEATDTVMKDVGDNSAKEKENIAVDNESRKVGSGGDQSP VARKKVAKSAKTGTKAEKESKQLRVNPLASRKVFQDQEHEPKFFIVSGPRSQRNEYQQII RRLKGKCCRDSHQWSYQATHFIAPEIRRTEKFFAAAASGSWILKTDYVADSKEAGKLLQE EPYEWHSSGLSADGAINLESPKKWRLVREKTGHGALYGLRIVVYGDCTIPCLDTLKRAVK AGDGTILATAPPYTRFLNQNTDFALISPGMPRDDVWIQEFIRHEIPCVLSDYLVEYVCKP GYALDKHVLYNTNSWAEKSFNKMQLRADLCVYH*
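The relationship between the CDS and the protein translation above can be reproduced with a few lines of code. The following is a minimal sketch that assumes the standard genetic code; the function name `translate` and the two test fragments (the first 60 nt and the last 15 nt of the CDS as listed) are chosen here for illustration.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a CDS in frame 1; whitespace is ignored, '*' marks a stop."""
    seq = "".join(cds.split()).upper()
    return "".join(
        CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - len(seq) % 3, 3)
    )


# First 60 nt and last 15 nt of the CDS as listed above.
cds_start = "ATGCAATCGGATTCGGGTTTGCCTCCCAAGACGTATTCGGGTGTCAAATTCGCTCTCGTT"
cds_end = "TGTGTGTACCATTAA"

print(translate(cds_start))
print(translate(cds_end))
```

Passing the full CDS (spaces and line breaks included) to `translate` in the same way should give the complete protein, ending in `*` for the termination codon.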
Analysis of repeated elements in protein T10M13.12. Dotplot analysis of T10M13.12 revealed repeated peptide sequences. These are shown below.
. . .
441 SNVVPMVDALSTAENIISNCAWDEIPEKSLTE 472
    | ||||   : | ||     | ||:||:|.|:
517 SGVVPMEADIRTPENSTMKGALDEVPERSVTD 548
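The degree of conservation in the short repeat above can be expressed as percent identity. This is a small sketch, not the dotplot method used for the original analysis; the variable names are introduced here, and only exact matches (the `|` positions) are counted, not the similar residues marked `:` or `.`.

```python
# The two aligned peptide segments from T10M13.12 (residues 441-472 and 517-548).
copy_a = "SNVVPMVDALSTAENIISNCAWDEIPEKSLTE"
copy_b = "SGVVPMEADIRTPENSTMKGALDEVPERSVTD"

# The alignment is ungapped, so both segments must have the same length.
assert len(copy_a) == len(copy_b)

# Count columns where the residues are identical.
identities = sum(1 for a, b in zip(copy_a, copy_b) if a == b)
percent_identity = 100.0 * identities / len(copy_a)

print(f"{identities} of {len(copy_a)} residues identical ({percent_identity:.1f}%)")
```

For these two segments, 15 of the 32 aligned residues are identical.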
T10M13.12 contains three copies of a repeat of approximately 110 amino acids, designated the "B" repeat. Copy 1 spans residues 713-823, copy 2 spans residues 824-921, and copy 3 spans residues 922-1032.
1 50
B1 TEDPSDGALQ SEVDKNTSKR KREAGVGKNS LQRGKKGSSF TAKVGKSRVK
B2 TKDPSYAAAQ LEVDTKKGKR RKQATVEENR LQT....... ......PSVK
B3 AKDPSNAAMQ LEFDDNKCKH GKEGIVERSS LQSGKKGSSS RVEVGKSSVK
51 100
B1 KTKISRKEND IKANGTLMKD GGDNSADGKE NLALEHENGK VSSGGDQSLV
B2 KAKVSKKEDG AKANNTVKKD IWIHSAEVKE NVAVDENCGD VSSDGAQSLV
B3 KTKKSEKGSG TEATDTVMKD VGDNSAKEKE NIAVDNESRK VGSGGDQSPV
101 111
B1 AGETLTRKEA A
B2 VEKSLAKKEA A
B3 ARKKVAKSAK T
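The residue ranges given for the B repeat can be cross-checked against the alignment: copy 2 is shorter than the other two, and the difference should equal the number of gap characters in its aligned row. The sketch below is illustrative; `REPEATS` and `B2_ALIGNED` are names introduced here, and the B2 row is retyped from the alignment above in blocks of ten columns.

```python
# Residue ranges of the three B-repeat copies (1-based, inclusive).
REPEATS = {"B1": (713, 823), "B2": (824, 921), "B3": (922, 1032)}

# Aligned row for copy B2, in blocks of ten columns; '.' marks a gap.
B2_ALIGNED = "".join([
    "TKDPSYAAAQ", "LEVDTKKGKR", "RKQATVEENR", "LQT.......", "......PSVK",
    "KAKVSKKEDG", "AKANNTVKKD", "IWIHSAEVKE", "NVAVDENCGD", "VSSDGAQSLV",
    "VEKSLAKKEA", "A",
])

# Number of residues covered by each copy.
span = {name: end - start + 1 for name, (start, end) in REPEATS.items()}

gaps = B2_ALIGNED.count(".")
ungapped = len(B2_ALIGNED) - gaps

print("residues per copy:", span)
print("B2 alignment columns:", len(B2_ALIGNED), "gaps:", gaps, "residues:", ungapped)
```

Copies 1 and 3 each cover 111 residues, while copy 2 covers 98; the 13 gap positions in the B2 row account for the difference.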
written 30 Jul 97
updated 29 Dec 97
updated 4 Aug 98
Larry Parnell