| Gene | T17A5.8 |
| Putative Identification | aspartyl protease |
| Position | 29936 to 32717; on the minus strand the initial methionine is at 32717 and the termination codon ends at 29936 |
| Strand | - |
| EST match | T44996 and H76843 |
| Database matches | barley nucellin, putative aspartic protease, tobacco chloroplast nucleoid DNA-binding protein |
CDS: The table below lists the coordinates of the T17A5.8 exons and the evidence supporting each 3' and 5' terminus: an EST match (EST) or an exon prediction program (GF = Genefinder, GS = GenScan, Gr = Grail, M = MZEF). All four programs also predicted an exon from 31012 to 30877, but the reading frames and splice sites determined by the two ESTs rule out incorporating this exon into the gene model.
| Exon | Range | 3' | 5' |
|---|---|---|---|
| 1 | 32413 to 32717* | GF,GS,Gr | GS |
| 2 | 32125 to 32260 | EST,GF,GS,M | GF,GS,Gr,M |
| 3 | 31487 to 31722 | EST,GF,GS,Gr,M | EST,GF,GS,Gr |
| 4 | 31119 to 31343 | EST,GF,GS,Gr,M | EST,GF,GS,Gr,M |
| 5 | 30546 to 30645 | EST,Gr,M | GF,GS,Gr,M |
| 6 | 30405 to 30462 | EST,GF,GS,Gr,M | EST,GS,Gr,M |
| 7 | 30235 to 30306 | EST,GF,GS | EST,GF,Gr |
| 8 | 29936 to 30132 | GF,GS | EST,GS,GF |
\* Genefinder predicts a start site at position 32522, which would shorten the resulting peptide by 65 amino acids (residues MVWYS..KYYRV removed).
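The coordinates in the exon table can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of the original annotation: the names `EXONS` and `exon_length` are illustrative, and the coordinates are copied from the table above. It sums the exon lengths, checks that the total is a multiple of three, and compares the Genefinder alternative start with the start used in the gene model.

```python
# Exon coordinates copied from the table above (minus strand, so exon 1
# has the highest coordinates). Names here are illustrative only.
EXONS = [
    (32413, 32717),
    (32125, 32260),
    (31487, 31722),
    (31119, 31343),
    (30546, 30645),
    (30405, 30462),
    (30235, 30306),
    (29936, 30132),
]


def exon_length(start, end):
    """Length of an exon given inclusive genomic coordinates."""
    return end - start + 1


lengths = [exon_length(s, e) for s, e in EXONS]
cds_length = sum(lengths)

# A complete CDS (start codon through stop codon) must be a multiple of 3.
codons, remainder = divmod(cds_length, 3)

# Genefinder's alternative start (32522) versus the model's start (32717).
skipped_codons, skipped_remainder = divmod(32717 - 32522, 3)

print(lengths)                            # per-exon lengths in nucleotides
print(cds_length, codons, remainder)      # 1329 443 0
print(skipped_codons, skipped_remainder)  # 65 0
```

The total of 1329 nucleotides corresponds to 443 codons, that is, 442 residues plus the stop codon, and the 195-nucleotide difference between the two start sites is exactly the 65 codons mentioned in the footnote.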
Complete CDS of T17A5.8
ATGGTTTGGTACTCAAGCTGTAGAATTTTGTTTCTGGGTCTGCTTATTTTGTTGGCTTCG AGCTGGGTTTTGGATAGATGCGAGGGATTTGGTGAATTTGGGTTTGAATTTCATCATCGT TTCTCCGATCAGGTTGTTGGGGTTTTGCCTGGAGATGGTTTACCTAATCGAGATTCTTCT AAGTATTATAGAGTGATGGCTCATCGTGATCGGTTAATTAGAGGTCGTCGACTTGCAAAT GAAGATCAATCACTCGTTACTTTCTCTGACGGCAACGAAACTGTTCGTGTTGATGCCCTA GGATTTTTGCATTACGCTAATGTGACTGTTGGGACGCCGTCTGATTGGTTTATGGTTGCT TTAGATACTGGAAGTGACTTGTTTTGGTTGCCCTGTGACTGCACCAATTGTGTTCGTGAA TTGAAAGCACCTGGTGGCTCGAGTTTGGACCTTAATATTTATAGCCCTAATGCTTCATCG ACAAGTACTAAAGTTCCTTGTAATAGCACATTATGTACAAGAGGTGATCGATGCGCTTCC CCTGAAAGTGATTGCCCATACCAGATCCGGTATCTTTCTAATGGTACCTCTTCTACTGGA GTCTTGGTGGAGGATGTACTTCACTTAGTTTCAAATGACAAAAGTTCCAAAGCTATTCCC GCTCGTGTTACTTTTGGATGTGGTCAAGTTCAGACCGGTGTATTCCATGATGGTGCAGCT CCAAATGGTCTTTTCGGGCTTGGCTTAGAAGACATATCGGTGCCTAGTGTACTAGCAAAA GAAGGAATTGCAGCAAACTCATTCTCAATGTGTTTTGGGAACGATGGAGCTGGTAGGATC AGTTTTGGAGATAAAGGTAGCGTAGACCAACGGGAAACACCATTGAACATAAGACAACCA CACCCAAACAAAGACAGCTTCCAGTATCCAGCTGTGAATCTGACAATGAAAGGCGGGAGC TCGTATCCCGTTTATCACCCGTTAGTAGTAATCCCCATGAAGGACACAGATGTCTACTGT TTAGCCATTATGAAGATAGAAGACATTAGCATCATTGGACAGAACTTCATGACTGGCTAT CGCGTTGTCTTTGATCGTGAGAAACTGATTCTGGGGTGGAAAGAATCTGATTGTTACACC GGTGAGACATCGGCTCGGACGCTTCCATCGAACCGTTCCTCCTCCTCGGCTAGACCGCCA GCTTCTTCGTTTGACCCAGAGGCGACAAACATACCATCTCAAAGACCAAACACGTCGACG ACTTCTGCTGCTTATTCTCTCTCTATCTCACTTTCATTGTTCTTCTTCTCAATTTTGGCC ATCCTTTAA
Protein sequence:
MVWYSSCRILFLGLLILLASSWVLDRCEGFGEFGFEFHHRFSDQVVGVLPGDGLPNRDSS KYYRVMAHRDRLIRGRRLANEDQSLVTFSDGNETVRVDALGFLHYANVTVGTPSDWFMVA LDTGSDLFWLPCDCTNCVRELKAPGGSSLDLNIYSPNASSTSTKVPCNSTLCTRGDRCAS PESDCPYQIRYLSNGTSSTGVLVEDVLHLVSNDKSSKAIPARVTFGCGQVQTGVFHDGAA PNGLFGLGLEDISVPSVLAKEGIAANSFSMCFGNDGAGRISFGDKGSVDQRETPLNIRQP HPNKDSFQYPAVNLTMKGGSSYPVYHPLVVIPMKDTDVYCLAIMKIEDISIIGQNFMTGY RVVFDREKLILGWKESDCYTGETSARTLPSNRSSSSARPPASSFDPEATNIPSQRPNTST TSAAYSLSISLSLFFFSILAIL*
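The protein listing follows from the CDS by translation with the standard genetic code. The sketch below is one way to reproduce that step; the function name `translate` and the table construction are assumptions made for illustration, not the tool used for the original annotation. Only short fragments of the CDS are translated here to keep the example compact.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a CDS in frame 1; whitespace is ignored, '*' marks a stop."""
    seq = "".join(cds.split()).upper()
    usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# First 30 and last 12 nucleotides of the CDS listed above.
print(translate("ATGGTTTGGTACTCAAGCTGTAGAATTTTG"))  # MVWYSSCRIL
print(translate("GCC ATCCTTTAA"))                   # AIL*
```

Passing the complete CDS string to `translate` in the same way should give the full protein sequence ending in `*`.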
Protein motifs:
A signature sequence for the active site of eukaryotic and viral aspartyl proteases is found from residues 119 to 130 [VALDTGSDLFWL]. The aspartic acid in the DTG triplet (Asp122) is the active residue.
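The residue numbering of the signature can be checked against the protein listing. The sketch below is illustrative only: it searches for the literal signature string rather than a general motif pattern, and the names `PROTEIN_N_TERM` and `locate` are assumptions introduced for this example.

```python
# First 180 residues of the T17A5.8 protein, copied from the listing above.
PROTEIN_N_TERM = (
    "MVWYSSCRILFLGLLILLASSWVLDRCEGFGEFGFEFHHRFSDQVVGVLPGDGLPNRDSS"
    "KYYRVMAHRDRLIRGRRLANEDQSLVTFSDGNETVRVDALGFLHYANVTVGTPSDWFMVA"
    "LDTGSDLFWLPCDCTNCVRELKAPGGSSLDLNIYSPNASSTSTKVPCNSTLCTRGDRCAS"
)
SIGNATURE = "VALDTGSDLFWL"


def locate(sequence, motif):
    """Return the 1-based inclusive (start, end) of the first match, or None."""
    index = sequence.find(motif)
    if index == -1:
        return None
    return index + 1, index + len(motif)


start, end = locate(PROTEIN_N_TERM, SIGNATURE)

# 1-based position of the aspartate that opens the D-T-G triplet.
active_asp = start + SIGNATURE.index("DTG")

print(start, end, active_asp)  # 119 130 122
```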
Alignment of T17A5.8 with barley nucellin, tobacco chloroplast nucleoid DNA-binding protein, and Arabidopsis protein F21M12.13. Several signature sequences are highlighted: the active site of the aspartyl proteases in red, the active site of serine lipase in green, and a zinc-finger motif in blue (as reported by the authors).
1 60
AtF21M12.13 ~~~~~MASSSLHFFF....FLTLLLPFTFTTA....TRDT..................C.
TobaccoCND41 MEHSLMSTGSYFLLFSSSAFLLILLSFSVEKSHALETRETIESHFHTLQLSSLLPSSSCN
AtT17A5.8 ~~~~~~~~~~~MVWYSSCRILFLGLLILLASSWVLDRCEGFGEFGFEFHHRFSDQVVGVL
BarleyNucellin ~~~~~~~~~MAAMWSPIIGLLLLLLPLGPSSAIKFP........................
61 120
AtF21M12.13 .ATAAPDGSDDLSIIPINAKCSPFAPTHVSASVIDTVL..HMASSDS..HRLTYLS.SLV
TobaccoCND41 PATKGKRRGASLEVVNRQGPCTLLNQKGAKAPTLTEILAHDQARVDSIQARITDQSYDLF
AtT17A5.8 PGDGLPNRDS....................................SKYYRVMAHRDRLI
BarleyNucellin ............................................................
121 180
AtF21M12.13 AGKPKPTS............VPVASGNQLHIGNYVVRAKLGTPPQLMFMVLDTSNDAVWL
TobaccoCND41 KKKDKKSSNKKKSVKDSKANLPAQSGLPLGTGNYIVNVGLGTPKKDLSLIFDTGSDLTWT
AtT17A5.8 RG..RRLANEDQSLVTFSDGNETVRVDALGFLHY.ANVTVGTPSDWFMVALDTGSDLFWL
BarleyNucellin .......................LEGNVYPVGHFYATLNIGEPAKPYFLDVDTGSNLTWL
181 240
AtF21M12.13 P..CSGC..SGCSNASTSFNTNSSSTYSTVSCSTAQCTQARGLTCPSSSPQPSV...CSF
TobaccoCND41 Q..CQPCVKSCYAQQQPIFDPSTSKTYSNISCTSAACSSLKSATGNSPGCSSSN...CVY
AtT17A5.8 PCDCTNCVRELKAPGGSSLDLNIYSPNASSTSTKVPCNSTLCTRGDRCASPESD...CPY
BarleyNucellin ECHHPVHGCKGCHPRPPHPYYTPADGNLKVVCGSPLCVAVRRDVPGIPECSRNDPHRCHY
241 300
AtF21M12.13 NQSY.GGDSSFSASLVQDTLTLAP.D.....VIPNFSFGCINSASGNS...LPP.QGLMG
TobaccoCND41 GIQY.GDSSFTIGFFAKDKLTLTQND.....VFDGFMFGCGQNNKGLF...GKT.AGLIG
AtT17A5.8 QIRYLSNGTSSTGVLVEDVLHLVSNDKSSKAIPARVTFGCGQVQTGVFHDGAAP.NGLFG
BarleyNucellin EIQYVTGKSE..GDLATDIISVNGRDKK......RIAFGCGYKQEEPADSPPSPVDGILG
301 360
AtF21M12.13 LGRGPMSLVSQT.TSLYSGVFSYCLPSFRS.....FYFSGSLKLGLLGQPKSIRYTPLLR
TobaccoCND41 LGRDPLSIVQQT.AQKFGKYFSYCLPTSRGSNGHLTFGNGNGVKASKAVKNGITFTP.FA
AtT17A5.8 LGLEDISVPSVL.AKEGIAANSFSMCFGNDGAGRISFGDKGSVDQRETPLNIRQPHPNKD
BarleyNucellin LGMGKAGFAAQLKGHKMIKENVIGHCLSSKGKGVLYVGDFN......PPTRGVTWAPMRE
361 420
AtF21M12.13 NPRRPSLYYVNLTGVSVGSVQVPVDPVYLTFDANSGAGTIIDSGTVITRFAQPVYEAIRD
TobaccoCND41 SSQGTAYYFIDVLGISVGGKALSISPMLF.....QNAGTIIDSGTVITRLPSTAYGSLKS
AtT17A5.8 SFQYPAVNLTMKGGSSYPVYHPLVVIPMKDTDVYCLAIMKIEDISIIGQNFMTGYRVVFD
BarleyNucellin SLFYYSPGLAEVFIDKQPIRGNPTFEAVFDSGSTYTHVPAQIYNEIVSKVRGTLSESSLE
421 480
AtF21M12.13 EFRKQVN.VSSFSTLGAFDTCFSADNEN..VAPKITLHMT.SLDLKLPMENTLIHSSAGT
TobaccoCND41 AFKQFMSKYPTAPALSLLDTCYDLSNYTSISIPKISFNFNGNANVELDPNGILITNGASQ
AtT17A5.8 REKLILGWKES........DCYTGETSARTLPSNRSSSSARPPASSFDPEATNIPSQRPN
BarleyNucellin EVKGRALPL.CWKGKKPFGSVNDVKNQFKALSLKIT.HARGTNNLDIPPQNYLFVKEDGE
481 542
AtF21M12.13 LTCLSMAGIRQNANAVLN..VIANLQQQNLRILFDVPNSRIGIAPEPCN*~~~~~~~~~~~~~
TobaccoCND41 V.CLAFAG..NGDDDSIG..IFGNIQQQTLEVVYDVAGGQLGFGYKGCS*~~~~~~~~~~~~~
AtT17A5.8 TSTTSAA.......YSLS..I.......SLSLFF........FSILAIL*~~~~~~~~~~~~~
BarleyNucellin TCLAILDASLDPVLKELNFILIGAVTMQDLFVIYDNEKKQLGWVRAQCDRVQELESVIDSRL*
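To compare an aligned row with the unaligned protein listing, the gap symbols have to be removed first. The sketch below assumes that `.` marks internal gaps and `~` marks padding before or after a sequence, which is how the alignment above appears to use them; the names `ALIGNED_ROWS` and `degap` are illustrative.

```python
# AtT17A5.8 rows copied from the first two alignment blocks above.
ALIGNED_ROWS = [
    "~~~~~~~~~~~MVWYSSCRILFLGLLILLASSWVLDRCEGFGEFGFEFHHRFSDQVVGVL",
    "PGDGLPNRDS....................................SKYYRVMAHRDRLI",
]

# Assumed gap symbols: '.' for internal gaps, '~' for end padding.
GAP_CHARS = ".~"


def degap(rows, gap_chars=GAP_CHARS):
    """Join alignment rows for one sequence and strip the gap symbols."""
    joined = "".join(rows)
    return "".join(ch for ch in joined if ch not in gap_chars)


recovered = degap(ALIGNED_ROWS)
print(len(recovered))  # 73
print(recovered)
```

The 73 recovered residues match the start of the protein sequence listed earlier, which is a quick way to confirm that an alignment row has been copied without loss.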
written 30 Oct 97
Larry Parnell