Gene                     T10M13.3
Putative Identification  curlyleaf-like 1
Position                 11958 to 18008, from the initial methionine to the termination codon
Strand                   +
EST hits                 none
Database match           A. thaliana curlyleaf 1 (Y10580) and F26B6.3

T10M13.3 is a putative curlyleaf-like 1 gene. Its predicted protein shows strong similarity to curlyleaf 1 of A. thaliana and enhancer of zeste of D. melanogaster, as well as to other polycomb group proteins. For background on plant homeotic genes, see the article by Goodrich J, et al. (1997, Nature 385:44-51).

CDS:  The table below lists the coordinates of the T10M13.3 exons and the exon prediction algorithms that selected each 5' and 3' terminus (GS = GenScan, Gr = GRAIL, M = MZEF, NPG = NetPlantGene, which predicts splice sites only, not exons).

Exon     Range            5'               3'
exon 1   11958 - 12100    GS               GS
exon 2   12857 - 12960    GS               GS
exon 3   13258 - 13364    GS, Gr, M        GS, Gr, M, NPG
exon 4   13465 - 13730*   GS, Gr, NPG      NPG
exon 5   13871 - 14029    GS, Gr, M, NPG   GS, Gr, M, NPG
exon 6   14283 - 14367    GS, Gr, NPG      GS, Gr, NPG
exon 7   14471 - 14614    GS, Gr, M, NPG   GS, M, NPG
exon 8   14703 - 14750    GS, NPG          GS
exon 9   15150 - 15754    GS, Gr, M, NPG   GS, Gr, M, NPG
exon 10  15844 - 15991    GS, Gr           GS, Gr, NPG
exon 11  16313 - 16533    GS, Gr, M, NPG   Gr, M, NPG
exon 12  16605 - 16736    GS, Gr, M, NPG   NPG
exon 13  16830 - 16917    GS, Gr, M, NPG   GS, Gr, M, NPG
exon 14  17008 - 17055    GS, Gr, M, NPG   GS, Gr, M, NPG
exon 15  17135 - 17263    GS, Gr, M, NPG   GS, Gr, M, NPG
exon 16  17343 - 17420    GS, Gr, M, NPG   GS, Gr, M, NPG
exon 17  17817 - 18008    GS, Gr, NPG      Gr

*Note:  The splice donor at the 3' end of exon 4 (13730) is non-consensus. NetPlantGene predicts it with a confidence score of 0.96, and the splice point of the putative mRNA is confirmed by similarity to curlyleaf 1.
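The exon coordinates above can be checked for internal consistency: the exon lengths should sum to a multiple of three if the model describes an intact reading frame from the initial methionine to the termination codon. The following is a minimal sketch of that check; the names EXONS, exon_length and cds_length are illustrative and not part of the original annotation.

```python
# Exon coordinates copied from the table above (1-based, inclusive).
EXONS = [
    (11958, 12100), (12857, 12960), (13258, 13364), (13465, 13730),
    (13871, 14029), (14283, 14367), (14471, 14614), (14703, 14750),
    (15150, 15754), (15844, 15991), (16313, 16533), (16605, 16736),
    (16830, 16917), (17008, 17055), (17135, 17263), (17343, 17420),
    (17817, 18008),
]


def exon_length(start, end):
    """Length in nucleotides of an exon given inclusive coordinates."""
    return end - start + 1


def cds_length(exons):
    """Total coding length: the sum of all exon lengths."""
    return sum(exon_length(start, end) for start, end in exons)


total = cds_length(EXONS)
# Prints the total length, the remainder modulo 3, and the codon count.
print(total, total % 3, total // 3)  # 2697 0 899
```

The 899 codons correspond to the 898 residues of the protein translation plus the termination codon.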

Alternate exons not used in building the gene model:   GenScan selects an exon from 13465 to 13601 and one from 17817 to 17989. GRAIL fails to predict an initial exon, but selects an exon from 12030 to 12187 as the first exon of T10M13.3. Other exons selected by GRAIL are from 13465 to 13760, from 14471 to 14688, and from 14802 to 14895. MZEF also predicts an exon from 14802 to 14895. NetPlantGene predicts several other splice donor and splice acceptor sites, including a splice acceptor at 14296 (confidence score = 0.95) and splice donors at 14906 (1.00), 15769 (0.99), and 16736 (0.87).
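To see how each alternate exon relates to the gene model, the alternate coordinates can be compared with the model exons. The sketch below reports which model exon an alternate prediction overlaps and whether it shares that exon's 5' or 3' end. The names MODEL_EXONS, ALTERNATES and compare_to_model are illustrative assumptions, and the comparison is not the procedure used to build the model.

```python
# Exons of the gene model, copied from the table (1-based, inclusive).
MODEL_EXONS = [
    (11958, 12100), (12857, 12960), (13258, 13364), (13465, 13730),
    (13871, 14029), (14283, 14367), (14471, 14614), (14703, 14750),
    (15150, 15754), (15844, 15991), (16313, 16533), (16605, 16736),
    (16830, 16917), (17008, 17055), (17135, 17263), (17343, 17420),
    (17817, 18008),
]

# Alternate exon predictions listed in the text: (program, start, end).
ALTERNATES = [
    ("GenScan", 13465, 13601), ("GenScan", 17817, 17989),
    ("GRAIL", 12030, 12187), ("GRAIL", 13465, 13760),
    ("GRAIL", 14471, 14688), ("GRAIL", 14802, 14895),
    ("MZEF", 14802, 14895),
]


def compare_to_model(start, end, model=MODEL_EXONS):
    """Return (exon_number, shared_ends) for the first model exon that
    overlaps start..end, or (None, []) if no model exon overlaps it."""
    for number, (m_start, m_end) in enumerate(model, start=1):
        if start <= m_end and m_start <= end:  # the intervals overlap
            shared = []
            if start == m_start:
                shared.append("5'")
            if end == m_end:
                shared.append("3'")
            return number, shared
    return None, []


for program, start, end in ALTERNATES:
    print(program, start, end, compare_to_model(start, end))
```

With these coordinates, the alternates at 13465 and 14471 share a 5' end with model exons 4 and 7 but extend to a different 3' end, while the 14802 to 14895 prediction falls between model exons 8 and 9.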

Complete CDS of T10M13.3

ATGTTCAGGAAAGGACTGTCATACATTCAGGATGGATTTACAGAGATTGCAATGGATGTT
TCCTTAGTTCAGGGCGTGCTCAACTTCAATCGTCACAATTATCTCTTCAAGCAGAGGCTC
TCGGATTTCTTCATGCCCTTCAATCCACACTCCGTTCGTCCTTCTCACAAGTCTGATTGC
GGAAAAAGCAGAGAGAGAGAGAAAGTTCGAGCGGAAGAGAAGCGGAAAGCTCGAGGAGTC
ATCAATGATGATGATGATGATGGTGAAGAAGAAGAAGATAGACTCGAGGGTTTGGAAAAC
AGATTAAGTGAGCTTAAAAGGAAAATTCAAGGAGAAAGAGTTAGGTCTATTAAAGAGAAA
TTTGAGGCTAATAGAAAGAAAGTGGATGCTCATGTTTCTCCCTTTTCATCTGCTGCATCG
AGCCGAGCTACCGCAGAGGATAATGGAAATAGCAATATGCTTTCTTCGAGAATGAGAATG
CCACTCTGCAAGTTAAATGGTTTTTCTCATGGTGTGGGAGATAGAGACTATGTTCCTACT
AAGGATGTTATATCAGCAAGTGTCAAGCTTCCTATTGCTGAGAGAATACCGCCATACACT
ACCTGGATATTTTTGGACAGAAATCAAAGAATGGCTGAAGATCAGTCTGTGGTTGGTCGA
AGACAAATCTACTATGAACAACATGGTGGTGAGACGCTAATATGCAGCGATAGTGAGGAA
GAACCAGAACCTGAGGAGGAAAAACGTGAATTTTCCGAGGGTGAAGATTCCATTATATGG
TTAATTGGGCAGGAGTATGGCATGGGTGAGGAAGTGCAGGATGCCCTTTGCCAGTTGCTA
AGCGTAGATGCTTCTGATATCCTGGAAAGATACAATGAGCTCAAGTTGAAGGATAAGCAG
AATACCGAGGAATTTTCTAATTCCGGATTCAAGCTGGGAATATCTCTGGAAAAGGGCCTT
GGTGCAGCTCTAGATTCTTTTGATAATCTTTTCTGCCGCCGTTGCTTGGTATTTGACTGT
CGTCTGCATGGATGTTCTCAGCCTTTGATTAGTGCTCTCAAGGCGGTCAGAGAAGTACCA
GAAACATGCAGTAATTTTGCATCTAAAGCAGAAGAGAAAGCTTCAGAAGAGGAATGCAGC
AAGGCTGTCTCCTCTGATGTTCCCCATGCTGCTGCTAGTGGTGTCAGTCTGCAAGTTGAG
AAGACTGATATTGGTATCAAGAATGTAGATTCATCCTCTGGTGTAGAACAAGAGCATGGA
ATTAGAGGAAAGCGTGAGGTCCCAATTCTAAAAGACTCCAATGATCTGCCTAATTTATCG
AACAAGAAACAGAAGACCGCAGCCTCAGATACAAAAATGTCATTTGTTAATTCTGTCCCT
AGCTTAGATCAGGCATTGGATAGCACAAAGGGTGATCAAGGTGGAACAACTGACAATAAA
GTAAACAGAGACTCAGAAGCTGATGCAAAAGAAGTAGGTGAGCCTATTCCAGACAATTCG
GTCCATGATGGTGGTTCCTCAATTTGTCAGCCACACCATGGTAGTGGAAACGGAGCAATA
ATCATTGCAGAAATGTCTGAGACAAGTCGACCATCTACAGAGTGGAATCCTATCGAGAAG
GATCTTTACTTGAAGGGAGTCGAAATCTTTGGAAGAAACAGCTGTCTTATTGCAAGAAAC
CTGCTTTCTGGCTTGAAGACATGCCTAGATGTGTCCAATTACATGCGTGAAAACGAAGTT
TCAGTTTTTCGAAGATCTAGTACCCCAAATTTGCTGTTGGATGATGGCAGGACTGACCCA
GGGAATGATAATGATGAGGTGCCTCCAAGGACAAGATTGTTCCGTAGAAAAGGCAAAACC
CGGAAGCTAAAATACTCTACAAAGTCTGCTGGTCATCCGTCTGTCTGGAAAAGAATAGCT
GGTGGCAAAAACCAGTCCTGTAAACAATACACGCCGTGTGGATGCCTGTCAATGTGCGGA
AAGGATTGCCCTTGTCTAACTAATGAAACTTGCTGCGAGAAATATTGCGGGTGCTCAAAA
AGCTGTAAAAATCGTTTCCGAGGATGTCATTGTGCAAAGAGTCAATGCAGAAGTAGGCAG
TGTCCCTGCTTTGCTGCTGGCAGAGAATGTGATCCAGATGTTTGCAGAAATTGCTGGGTT
AGTTGTGGAGATGGTTCTCTCGGTGAAGCACCAAGACGCGGAGAAGGGCAATGCGGAAAC
ATGAGACTTCTCCTGAGGCAACAACAGAGGATCCTATTGGGAAAGTCTGATGTTGCTGGA
TGGGGTGCTTTTCTAAAGAACTCGGTCAGCAAAAATGAATACCTTGGAGAATACACCGGT
GAATTGATCTCACACCATGAGGCGGATAAGCGTGGGAAAATATATGACCGGGCAAATTCG
TCCTTCCTCTTTGACTTGAATGATCAGTACGTCCTCGATGCTCAACGCAAAGGTGACAAG
CTGAAATTTGCCAATCACTCAGCTAAACCCAATTGCTACGCTAAGGTGATGTTTGTAGCA
GGAGATCACAGGGTCGGGATTTTTGCAAACGAACGAATAGAAGCTAGCGAAGAGCTTTTC
TATGACTATAGATATGGACCAGACCAAGCACCAGTGTGGGCTCGCAAACCTGAAGGCTCC
AAGAAAGATGATTCAGCCATTACTCATCGTAGAGCCAGAAAGCACCAATCTCATTGA

Protein translation of T10M13.3

MFRKGLSYIQDGFTEIAMDVSLVQGVLNFNRHNYLFKQRLSDFFMPFNPHSVRPSHKSDC
GKSREREKVRAEEKRKARGVINDDDDDGEEEEDRLEGLENRLSELKRKIQGERVRSIKEK
FEANRKKVDAHVSPFSSAASSRATAEDNGNSNMLSSRMRMPLCKLNGFSHGVGDRDYVPT
KDVISASVKLPIAERIPPYTTWIFLDRNQRMAEDQSVVGRRQIYYEQHGGETLICSDSEE
EPEPEEEKREFSEGEDSIIWLIGQEYGMGEEVQDALCQLLSVDASDILERYNELKLKDKQ
NTEEFSNSGFKLGISLEKGLGAALDSFDNLFCRRCLVFDCRLHGCSQPLISALKAVREVP
ETCSNFASKAEEKASEEECSKAVSSDVPHAAASGVSLQVEKTDIGIKNVDSSSGVEQEHG
IRGKREVPILKDSNDLPNLSNKKQKTAASDTKMSFVNSVPSLDQALDSTKGDQGGTTDNK
VNRDSEADAKEVGEPIPDNSVHDGGSSICQPHHGSGNGAIIIAEMSETSRPSTEWNPIEK
DLYLKGVEIFGRNSCLIARNLLSGLKTCLDVSNYMRENEVSVFRRSSTPNLLLDDGRTDP
GNDNDEVPPRTRLFRRKGKTRKLKYSTKSAGHPSVWKRIAGGKNQSCKQYTPCGCLSMCG
KDCPCLTNETCCEKYCGCSKSCKNRFRGCHCAKSQCRSRQCPCFAAGRECDPDVCRNCWV
SCGDGSLGEAPRRGEGQCGNMRLLLRQQQRILLGKSDVAGWGAFLKNSVSKNEYLGEYTG
ELISHHEADKRGKIYDRANSSFLFDLNDQYVLDAQRKGDKLKFANHSAKPNCYAKVMFVA
GDHRVGIFANERIEASEELFYDYRYGPDQAPVWARKPEGSKKDDSAITHRRARKHQSH*
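
The protein translation can be reproduced from the CDS with the standard genetic code. The following is a minimal sketch, assuming the standard code applies to this nuclear gene; translate and CODON_TABLE are illustrative names. Only the first and last lines of the CDS are shown here to keep the example short.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds):
    """Translate an in-frame nucleotide sequence; '*' marks a stop codon."""
    cds = "".join(cds.split()).upper()  # drop line breaks and spaces
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First and last lines of the CDS listed above; both begin in frame.
first_line = "ATGTTCAGGAAAGGACTGTCATACATTCAGGATGGATTTACAGAGATTGCAATGGATGTT"
last_line = "AAGAAAGATGATTCAGCCATTACTCATCGTAGAGCCAGAAAGCACCAATCTCATTGA"

print(translate(first_line))  # MFRKGLSYIQDGFTEIAMDV
print(translate(last_line))   # KKDDSAITHRRARKHQSH*
```

Passing the complete CDS to translate in the same way should yield the full protein shown above, ending in the termination codon.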

Protein motifs:  A WAP-disulfide core signature spans residues 659 to 672. This signature is part of a larger domain that directs binding to DNA. The sequence is highlighted in the alignment.

Sequence:	659  CGKDCPCLTNETCC  672
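
The residue numbers of the motif can be confirmed by searching for it in the protein translation. The sketch below uses only residues 601 to 720 (two lines of the translation) and a literal string search; it does not evaluate the signature pattern itself. The names RESIDUES_601_TO_720, MOTIF and locate are illustrative.

```python
# Residues 601 to 720 of the T10M13.3 protein translation.
RESIDUES_601_TO_720 = (
    "GNDNDEVPPRTRLFRRKGKTRKLKYSTKSAGHPSVWKRIAGGKNQSCKQYTPCGCLSMCG"
    "KDCPCLTNETCCEKYCGCSKSCKNRFRGCHCAKSQCRSRQCPCFAAGRECDPDVCRNCWV"
)
MOTIF = "CGKDCPCLTNETCC"


def locate(motif, segment, first_residue):
    """Return the 1-based (start, end) residue numbers of motif within
    segment, where segment begins at residue first_residue."""
    index = segment.find(motif)
    if index < 0:
        return None
    start = first_residue + index
    return start, start + len(motif) - 1


print(locate(MOTIF, RESIDUES_601_TO_720, 601))  # (659, 672)
```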

Multiple sequence alignment of A. thaliana polycomb-type proteins

T10M13.3 is aligned to curlyleaf 1 protein and F26B6.3. These latter two sequences are nearly identical; the six differences between them are highlighted.

           1                                                         60
  F26B6.3  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
Atcurlylf  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 
 T10M13.3  MFRKGLSYIQDGFTEIAMDVSLVQGVLNFNRHNYLFKQRLSDFFMPFNPHSVRPSHKSDC 

           61                                                       120
  F26B6.3  ~~~~~~MASEASPSSSATRSEPPKDSPAEERGPASKEVSEVIESLKKKLAADRCISIKKR 
Atcurlylf  ~~~~~~MASEASPSSSATRSEPPKDSPAEERGPASKEVSEVIESLKKKLAADRCISIKKR 
 T10M13.3  GKSREREKVRAEEKRKARGVINDDDDDGEEEEDRLEGLENRLSELKRKIQGERVRSIKEK 

           121                                                      180
  F26B6.3  IDENKKNLFAITQSFMRSSMERGGSCKDGSDLLVKRQRDSPGMKSGIDESNNNRYVEDGP 
Atcurlylf  IDENKKNLFAITQSFMRSSMERGGSCKDGSDLLVKRQRDSPGMKSGIDESNNNRYVEDGP 
 T10M13.3  FEANRKKVDAHVSPFSSAASSRATAEDNGNSNMLSSRMRMPLCKL.......NGF..... 

           181                                                      240
  F26B6.3  ASSGMVQGSSVPVK.ISLRPIKMPDIKRLSPYTTWVFLDRNQRMTEDQSVVGRRRIYYDQ 
Atcurlylf  ASSGMVQGSSVPVK.ISLRPIKMPDIKRLSPYTTWVFLDRNQRMTEDQSVVGRRRIYYDQ 
 T10M13.3  .SHGVGDRDYVPTKDVISASVKLPIAERIPPYTTWIFLDRNQRMAEDQSVVGRRQIYYEQ 

           241                                                      300
  F26B6.3  TGGEALICSDSEEEAIDDEEEKRDFLEPEDYIIRMTLEQLGLSDSVLAELASFLSRSTSE 
Atcurlylf  TGGEALICSDSEEEAIDDEEEKRDFLEPEDYIIRMTLEQLGLSDSVLAELANFLSRSTSE 
 T10M13.3  HGGETLICSDSEEEP.EPEEEKREFSEGEDSIIWLIGQEYGMGEEVQDALCQLLSVDASD 

           301                                                      360
  F26B6.3  IKARHGVL.MKEKEVSESGDNQA..ESSLLNKDMEGALDSFDNLFCRRCLVFDCRLHGCS 
Atcurlylf  IKARHGVL.MKEKEVSESGDNQA..ESSLLNKDMEGALDSFDNLFCRRCLVFDCRLHGCS 
 T10M13.3  ILERYNELKLKDKQNTEEFSNSGFKLGISLEKGLGAALDSFDNLFCRRCLVFDCRLHGCS 

           361                                                      420
  F26B6.3  QDLIFPAEKPAPWCPPVDENLTCGANCYKTLLKSGRFPGYGTIEGKTGTSSDGAGTKTTP 
Atcurlylf  QDLIFPAEKPAPWCPPVDENLTCGANCYKTLLKSGRFPGYGPIEGKTGTSSDGAGTKTTP 
 T10M13.3  QPLI.SALKAVREVPE.....TCSNFASKAEEKASE......EECSKAVSSD........ 

           421                                                      480
  F26B6.3  TKFSSKLNGRKPKTFPSESASSNEKCALETSDSENGLQQDTNSDKVSSSPKVKGSGRRVG 
Atcurlylf  TKFSSKLNGRKPKTFPSESASSNEKCALETSDSENGLQQDTNSDKVSSSPKVKGSGRRVG 
 T10M13.3  ..............VPHAAASG...VSLQVEKTDIGIKNVDSSSGVEQEHGIRGK.REVP 

           481                                                      540
  F26B6.3  RKRNKNRVAERVPRKTQKRQKKTEASDSDSIASGSCSPSDAKHKDNEDATSSSQKHVKSG 
Atcurlylf  RKRNNNRVAERVPRKTQKRQKKTEASDSDSIASGSCSPSDAKHKDNEDATSSSQKHVKSG 
 T10M13.3  ILKDSN....DLPNLSNKKQKTAASDTKMSFVNSVPSLDQALDSTKGDQGGTTDNKVNRD 

           541                                                      600
  F26B6.3  NSGKSRKNGTPAEVSNNSVKDDVPVCQSNEVASELDAPGSDESLRKEEFMGETVSRGRLA 
Atcurlylf  NSGKSRKNGTPAEVSNNSVKDDVPVCQSNEVASELDAPGSDESLRKEEFMGETVSRGRLA 
 T10M13.3  SEADAKEVGEP..IPDNSVHDG.....GSSICQPHHGSGNGAIIIAE..MSET...SRPS 

           601                                                      660
  F26B6.3  TNKLWRPLEKSLFDKGVEIFGMNSCLIARNLLSGFKSCWEVFQYMTCSENKASFFGGDGL 
Atcurlylf  TNKLWRPLEKSLFDKGVEIFGMNSCLIARNLLSGFKSCWEVFQYMTCSENKASFFGGDGL 
 T10M13.3  TE..WNPIEKDLYLKGVEIFGRNSCLIARNLLSGLKTCLDVSNYM..RENEVSVFRRSST 

           661                                                      720
  F26B6.3  NPDGSSKFDINGNMVNNQVRRRSRFLRRRGKVRRLKYTWKSAAYHSIRKRITEKKDQPCR 
Atcurlylf  NPDGSSKFDINGNMVNNQVRRRSRFLRRRGKVRRLKYTWKSAAYHSIRKRITEKKDQPCR 
 T10M13.3  PNLLLDDGRTDPGNDNDEVPPRTRLFRRKGKTRKLKYSTKSAGHPSVWKRIAGGKNQSCK 

           721                                                      780
  F26B6.3  QFNPCNCKIACGKECPCLLNGTCCEKYCGCPKSCKNRFRGCHCAKSQCRSRQCPCFAADR 
Atcurlylf  QFNPCNCQIACGKECPCLLNGTCYEKYCGCPKSCKNRFRGCHCAKSQCRSRQCPCFAADR 
 T10M13.3  QYTPCGCLSMCGKDCPCLTNETCCEKYCGCSKSCKNRFRGCHCAKSQCRSRQCPCFAAGR 

           781                                                      840
  F26B6.3  ECDPDVCRNCWVIGGDGSLGVPSQRGDNYECRNMKLLLKQQQRVLLGISDVSGWGAFLKN 
Atcurlylf  ECDPDVCRNCWVIGGDGSLGVPSQRGDNYECRNMKLLLKQQQRVLLGISDISGWGAFLKN 
 T10M13.3  ECDPDVCRNCWVSCGDGSLGEAPRRGEG.QCGNMRLLLRQQQRILLGKSDVAGWGAFLKN 

           841                                                      900
  F26B6.3  SVSKHEYLGEYTGELISHKEADKRGKIYDRENCSFLFNLNDQFVLDAYRKGDKLKFANHS 
Atcurlylf  SVSKHEYLGEYTGELISHKEADKRGKIYDRENCSFLFNLNDQFVLDAYRKGDKLKFANHS 
 T10M13.3  SVSKNEYLGEYTGELISHHEADKRGKIYDRANSSFLFDLNDQYVLDAQRKGDKLKFANHS 

           901                                                      960
  F26B6.3  PEPNCYAKVIMVAGDHRVGIFAKERILAGEELFYDYRYEPDRAPAWAKKPEAPGSKKDEN 
Atcurlylf  PEPNCYAKVIMVAGDHRVGIFAKERILAGEELFYDYRYEPDRAPAWAKKPEAPGSKKDEN 
 T10M13.3  AKPNCYAKVMFVAGDHRVGIFANERIEASEELFYDYRYGPDQAPVWARKPE..GSKKDDS 

           961        974
  F26B6.3  VTPSVGRPKKLA*~
Atcurlylf  VTPSVGRPKKLA*~
 T10M13.3  AITHRRARKHQSH*
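
The differences between the F26B6.3 and curlyleaf 1 rows can also be found by comparing the aligned rows column by column. The following is a minimal sketch for a single block of the alignment; aligned_differences is an illustrative name, and the column numbers refer to alignment columns rather than residue numbers of either protein.

```python
def aligned_differences(row_a, row_b, first_column=1):
    """Return (column, char_a, char_b) for every alignment column at
    which two equally long aligned rows differ."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    return [
        (first_column + i, a, b)
        for i, (a, b) in enumerate(zip(row_a, row_b))
        if a != b
    ]


# Alignment block covering columns 241 to 300.
f26b6_3 = "TGGEALICSDSEEEAIDDEEEKRDFLEPEDYIIRMTLEQLGLSDSVLAELASFLSRSTSE"
atcurlylf = "TGGEALICSDSEEEAIDDEEEKRDFLEPEDYIIRMTLEQLGLSDSVLAELANFLSRSTSE"

print(aligned_differences(f26b6_3, atcurlylf, first_column=241))
# [(292, 'S', 'N')]
```

Running the same comparison on each block in turn lists every column at which the two sequences differ.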


written 29 Jul 97
updated 31 Jul 98
Larry Parnell