Gene: T17A5.8
Putative identification: aspartyl protease
Position: 29936 to 32717, from the initial methionine to the termination codon
Strand: -
EST matches: T44996 and H76843
Database matches: barley nucellin, putative aspartic protease, tobacco chloroplast nucleoid DNA-binding protein

 

CDS: The table below lists the coordinates of the T17A5.8 exons and which exon prediction programs or EST matches support each 3' and 5' terminus (GF = Genefinder, GS = GenScan, Gr = Grail, M = MZEF, EST = EST match). All four programs also chose an exon from 31012 to 30877, but the reading frames and splice sites determined by the two ESTs show that this exon cannot be incorporated into the gene model.

Exon  Range             3'               5'
1     32413 to 32717*   GF,GS,Gr         GS
2     32125 to 32260    EST,GF,GS,M      GF,GS,Gr,M
3     31487 to 31722    EST,GF,GS,Gr,M   EST,GF,GS,Gr
4     31119 to 31343    EST,GF,GS,Gr,M   EST,GF,GS,Gr,M
5     30546 to 30645    EST,Gr,M         GF,GS,Gr,M
6     30405 to 30462    EST,GF,GS,Gr,M   EST,GS,Gr,M
7     30235 to 30306    EST,GF,GS        EST,GF,Gr
8     29936 to 30132    GF,GS            EST,GS,GF

* Genefinder predicts a start site at position 32522, which would shorten the resulting peptide by 65 amino acids [MVWYS..KYYRV removed].
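
The exon coordinates and the footnote can be cross-checked with simple arithmetic. The following is a minimal Python sketch, not part of the original annotation: the function names are illustrative, the ranges are treated as inclusive, and the alternative start is assumed to lie in exon 1 in the same reading frame.

```python
# Exon coordinates copied from the T17A5.8 table (minus strand, so exon 1
# has the highest coordinates). Ranges are treated as inclusive.
EXONS = [
    (32413, 32717),
    (32125, 32260),
    (31487, 31722),
    (31119, 31343),
    (30546, 30645),
    (30405, 30462),
    (30235, 30306),
    (29936, 30132),
]

# First 70 residues of the predicted protein, copied from the protein
# sequence section of this record.
N_TERM = (
    "MVWYSSCRILFLGLLILLASSWVLDRCEGFGEFGFEFHHRFSDQVVGVLPGDGLPNRDSS"
    "KYYRVMAHRD"
)

ANNOTATED_START = 32717   # first base of the initial methionine codon
GENEFINDER_START = 32522  # alternative start from the table footnote


def exon_lengths(exons):
    """Length in bases of each inclusive (low, high) exon range."""
    return [high - low + 1 for low, high in exons]


def codons_skipped(annotated_start, alternative_start):
    """Whole codons lost when translation begins at alternative_start.

    Assumes both positions lie in the same exon on the minus strand, so
    the distance is a plain coordinate difference.
    """
    distance = annotated_start - alternative_start
    if distance < 0 or distance % 3 != 0:
        raise ValueError("alternative start is not in frame downstream")
    return distance // 3


lengths = exon_lengths(EXONS)
total = sum(lengths)
skipped = codons_skipped(ANNOTATED_START, GENEFINDER_START)
removed = N_TERM[:skipped]

print("exon lengths:", lengths)
print("CDS length:", total, "bases =", total // 3, "codons including stop")
print("codons skipped by the Genefinder start:", skipped)
print("removed peptide:", removed[:5] + ".." + removed[-5:])
print("new first residue:", N_TERM[skipped])
```

Under these assumptions the eight exons sum to 1329 bases (443 codons including the stop), and the Genefinder start removes 65 codons, so the shortened peptide would begin at the methionine that follows KYYRV.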

Complete CDS of T17A5.8

ATGGTTTGGTACTCAAGCTGTAGAATTTTGTTTCTGGGTCTGCTTATTTTGTTGGCTTCG
AGCTGGGTTTTGGATAGATGCGAGGGATTTGGTGAATTTGGGTTTGAATTTCATCATCGT
TTCTCCGATCAGGTTGTTGGGGTTTTGCCTGGAGATGGTTTACCTAATCGAGATTCTTCT
AAGTATTATAGAGTGATGGCTCATCGTGATCGGTTAATTAGAGGTCGTCGACTTGCAAAT
GAAGATCAATCACTCGTTACTTTCTCTGACGGCAACGAAACTGTTCGTGTTGATGCCCTA
GGATTTTTGCATTACGCTAATGTGACTGTTGGGACGCCGTCTGATTGGTTTATGGTTGCT
TTAGATACTGGAAGTGACTTGTTTTGGTTGCCCTGTGACTGCACCAATTGTGTTCGTGAA
TTGAAAGCACCTGGTGGCTCGAGTTTGGACCTTAATATTTATAGCCCTAATGCTTCATCG
ACAAGTACTAAAGTTCCTTGTAATAGCACATTATGTACAAGAGGTGATCGATGCGCTTCC
CCTGAAAGTGATTGCCCATACCAGATCCGGTATCTTTCTAATGGTACCTCTTCTACTGGA
GTCTTGGTGGAGGATGTACTTCACTTAGTTTCAAATGACAAAAGTTCCAAAGCTATTCCC
GCTCGTGTTACTTTTGGATGTGGTCAAGTTCAGACCGGTGTATTCCATGATGGTGCAGCT
CCAAATGGTCTTTTCGGGCTTGGCTTAGAAGACATATCGGTGCCTAGTGTACTAGCAAAA
GAAGGAATTGCAGCAAACTCATTCTCAATGTGTTTTGGGAACGATGGAGCTGGTAGGATC
AGTTTTGGAGATAAAGGTAGCGTAGACCAACGGGAAACACCATTGAACATAAGACAACCA
CACCCAAACAAAGACAGCTTCCAGTATCCAGCTGTGAATCTGACAATGAAAGGCGGGAGC
TCGTATCCCGTTTATCACCCGTTAGTAGTAATCCCCATGAAGGACACAGATGTCTACTGT
TTAGCCATTATGAAGATAGAAGACATTAGCATCATTGGACAGAACTTCATGACTGGCTAT
CGCGTTGTCTTTGATCGTGAGAAACTGATTCTGGGGTGGAAAGAATCTGATTGTTACACC
GGTGAGACATCGGCTCGGACGCTTCCATCGAACCGTTCCTCCTCCTCGGCTAGACCGCCA
GCTTCTTCGTTTGACCCAGAGGCGACAAACATACCATCTCAAAGACCAAACACGTCGACG
ACTTCTGCTGCTTATTCTCTCTCTATCTCACTTTCATTGTTCTTCTTCTCAATTTTGGCC
ATCCTTTAA
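
As a quick consistency check between the CDS and the protein sequence, the CDS can be translated with the standard genetic code. The sketch below is illustrative only: the `translate` helper is not from the original record, the standard (nuclear) code is assumed, and only the first and last lines of the CDS are used as input.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# First and last lines of the T17A5.8 CDS as listed above.
first_line = "ATGGTTTGGTACTCAAGCTGTAGAATTTTGTTTCTGGGTCTGCTTATTTTGTTGGCTTCG"
last_line = "ATCCTTTAA"

print(translate(first_line))
print(translate(last_line))
```

Both CDS lines are 60 and 9 bases long, so each is in frame on its own; passing the full 1329-base CDS to the same function would be the complete check.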

 

Protein sequence:

MVWYSSCRILFLGLLILLASSWVLDRCEGFGEFGFEFHHRFSDQVVGVLPGDGLPNRDSS
KYYRVMAHRDRLIRGRRLANEDQSLVTFSDGNETVRVDALGFLHYANVTVGTPSDWFMVA
LDTGSDLFWLPCDCTNCVRELKAPGGSSLDLNIYSPNASSTSTKVPCNSTLCTRGDRCAS
PESDCPYQIRYLSNGTSSTGVLVEDVLHLVSNDKSSKAIPARVTFGCGQVQTGVFHDGAA
PNGLFGLGLEDISVPSVLAKEGIAANSFSMCFGNDGAGRISFGDKGSVDQRETPLNIRQP
HPNKDSFQYPAVNLTMKGGSSYPVYHPLVVIPMKDTDVYCLAIMKIEDISIIGQNFMTGY
RVVFDREKLILGWKESDCYTGETSARTLPSNRSSSSARPPASSFDPEATNIPSQRPNTST
TSAAYSLSISLSLFFFSILAIL*

 

Protein motifs:

A signature sequence for the active site of eukaryotic and viral aspartyl proteases is found at residues 119 to 130 [VALDTGSDLFWL]. The aspartic acid at position 122 (the D of the DTG triplet, highlighted in the original) is the active residue.
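
The residue numbering of the signature can be confirmed by locating the motif in the protein sequence. This is a minimal sketch using a literal string search rather than the database signature pattern; the `locate` helper is illustrative, and only the first 180 residues of the protein are included.

```python
# First 180 residues of the T17A5.8 protein, copied from the protein
# sequence section of this record.
PROTEIN_N_TERM = (
    "MVWYSSCRILFLGLLILLASSWVLDRCEGFGEFGFEFHHRFSDQVVGVLPGDGLPNRDSS"
    "KYYRVMAHRDRLIRGRRLANEDQSLVTFSDGNETVRVDALGFLHYANVTVGTPSDWFMVA"
    "LDTGSDLFWLPCDCTNCVRELKAPGGSSLDLNIYSPNASSTSTKVPCNSTLCTRGDRCAS"
)
MOTIF = "VALDTGSDLFWL"


def locate(sequence, motif):
    """Return the 1-based, inclusive (start, end) of the first motif match."""
    index = sequence.find(motif)
    if index < 0:
        raise ValueError("motif not found")
    return index + 1, index + len(motif)


start, end = locate(PROTEIN_N_TERM, MOTIF)
# The active residue is taken to be the first aspartic acid in the motif,
# i.e. the D of the D-T-G triplet.
active_asp = start + MOTIF.index("D")

print("motif spans residues", start, "to", end)
print("active-site aspartic acid at residue", active_asp)
```

A literal search is enough here because the motif text is given in the record; scanning other proteins for the same signature would need the full database pattern instead.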

 

Alignment of T17A5.8 to barley nucellin, tobacco chloroplast nucleoid DNA-binding protein, and Arabidopsis protein F21M12.13. Several different signature sequences are highlighted: Active site of the aspartyl proteases in red, active site of serine lipase in green, zinc-finger motif in blue (as reported by the authors).


                1                                                         60
   AtF21M12.13  ~~~~~MASSSLHFFF....FLTLLLPFTFTTA....TRDT..................C. 
  TobaccoCND41  MEHSLMSTGSYFLLFSSSAFLLILLSFSVEKSHALETRETIESHFHTLQLSSLLPSSSCN 
     AtT17A5.8  ~~~~~~~~~~~MVWYSSCRILFLGLLILLASSWVLDRCEGFGEFGFEFHHRFSDQVVGVL 
BarleyNucellin  ~~~~~~~~~MAAMWSPIIGLLLLLLPLGPSSAIKFP........................ 

                61                                                       120
   AtF21M12.13  .ATAAPDGSDDLSIIPINAKCSPFAPTHVSASVIDTVL..HMASSDS..HRLTYLS.SLV 
  TobaccoCND41  PATKGKRRGASLEVVNRQGPCTLLNQKGAKAPTLTEILAHDQARVDSIQARITDQSYDLF 
     AtT17A5.8  PGDGLPNRDS....................................SKYYRVMAHRDRLI 
BarleyNucellin  ............................................................ 

                121                                                      180
   AtF21M12.13  AGKPKPTS............VPVASGNQLHIGNYVVRAKLGTPPQLMFMVLDTSNDAVWL
  TobaccoCND41  KKKDKKSSNKKKSVKDSKANLPAQSGLPLGTGNYIVNVGLGTPKKDLSLIFDTGSDLTWT
     AtT17A5.8  RG..RRLANEDQSLVTFSDGNETVRVDALGFLHY.ANVTVGTPSDWFMVALDTGSDLFWL
BarleyNucellin  .......................LEGNVYPVGHFYATLNIGEPAKPYFLDVDTGSNLTWL

                181                                                      240
   AtF21M12.13  P..CSGC..SGCSNASTSFNTNSSSTYSTVSCSTAQCTQARGLTCPSSSPQPSV...CSF 
  TobaccoCND41  Q..CQPCVKSCYAQQQPIFDPSTSKTYSNISCTSAACSSLKSATGNSPGCSSSN...CVY 
     AtT17A5.8  PCDCTNCVRELKAPGGSSLDLNIYSPNASSTSTKVPCNSTLCTRGDRCASPESD...CPY 
BarleyNucellin  ECHHPVHGCKGCHPRPPHPYYTPADGNLKVVCGSPLCVAVRRDVPGIPECSRNDPHRCHY 

                241                                                      300
   AtF21M12.13  NQSY.GGDSSFSASLVQDTLTLAP.D.....VIPNFSFGCINSASGNS...LPP.QGLMG 
  TobaccoCND41  GIQY.GDSSFTIGFFAKDKLTLTQND.....VFDGFMFGCGQNNKGLF...GKT.AGLIG 
     AtT17A5.8  QIRYLSNGTSSTGVLVEDVLHLVSNDKSSKAIPARVTFGCGQVQTGVFHDGAAP.NGLFG 
BarleyNucellin  EIQYVTGKSE..GDLATDIISVNGRDKK......RIAFGCGYKQEEPADSPPSPVDGILG 

                301                                                      360
   AtF21M12.13  LGRGPMSLVSQT.TSLYSGVFSYCLPSFRS.....FYFSGSLKLGLLGQPKSIRYTPLLR 
  TobaccoCND41  LGRDPLSIVQQT.AQKFGKYFSYCLPTSRGSNGHLTFGNGNGVKASKAVKNGITFTP.FA 
     AtT17A5.8  LGLEDISVPSVL.AKEGIAANSFSMCFGNDGAGRISFGDKGSVDQRETPLNIRQPHPNKD 
BarleyNucellin  LGMGKAGFAAQLKGHKMIKENVIGHCLSSKGKGVLYVGDFN......PPTRGVTWAPMRE 

                361                                                      420
   AtF21M12.13  NPRRPSLYYVNLTGVSVGSVQVPVDPVYLTFDANSGAGTIIDSGTVITRFAQPVYEAIRD 
  TobaccoCND41  SSQGTAYYFIDVLGISVGGKALSISPMLF.....QNAGTIIDSGTVITRLPSTAYGSLKS 
     AtT17A5.8  SFQYPAVNLTMKGGSSYPVYHPLVVIPMKDTDVYCLAIMKIEDISIIGQNFMTGYRVVFD 
BarleyNucellin  SLFYYSPGLAEVFIDKQPIRGNPTFEAVFDSGSTYTHVPAQIYNEIVSKVRGTLSESSLE 

                421                                                      480
   AtF21M12.13  EFRKQVN.VSSFSTLGAFDTCFSADNEN..VAPKITLHMT.SLDLKLPMENTLIHSSAGT 
  TobaccoCND41  AFKQFMSKYPTAPALSLLDTCYDLSNYTSISIPKISFNFNGNANVELDPNGILITNGASQ 
     AtT17A5.8  REKLILGWKES........DCYTGETSARTLPSNRSSSSARPPASSFDPEATNIPSQRPN 
BarleyNucellin  EVKGRALPL.CWKGKKPFGSVNDVKNQFKALSLKIT.HARGTNNLDIPPQNYLFVKEDGE 

                481                                                        542
   AtF21M12.13  LTCLSMAGIRQNANAVLN..VIANLQQQNLRILFDVPNSRIGIAPEPCN*~~~~~~~~~~~~~
  TobaccoCND41  V.CLAFAG..NGDDDSIG..IFGNIQQQTLEVVYDVAGGQLGFGYKGCS*~~~~~~~~~~~~~
     AtT17A5.8  TSTTSAA.......YSLS..I.......SLSLFF........FSILAIL*~~~~~~~~~~~~~
BarleyNucellin  TCLAILDASLDPVLKELNFILIGAVTMQDLFVIYDNEKKQLGWVRAQCDRVQELESVIDSRL*

 


written 30 Oct 97
Larry Parnell