Protein PVX_114260 (PlasmoDB link) - New domain Collagen (Pfam link)

Sequence and domain location (colored): residues 2047...2106

   1 MVNREEEESAKSDEEGTKSDGECTKPDEECTKSDGECTKSDEECTKSDEECTKSDEECTK   60
  61 SDEECTKSDEECTKSDEECTKSDEECTKSDEEFTKTDEEFAKPDEEIAKPDEENGKKKCK  120
 121 GRTKNYTGEFLAQPNDSQEKDCYEEKGTHEGYSKRSASSDMSADGRFETSAIEHWGGKKG  180
 181 SLAYEGGEDEQEELSGADREGNRKANGHIPCVEIKVQPKGENTSSPVRGCGSWGHFHPGG  240
 241 SCAGEGAVEGAVEGAGNEAWNEVTAAGGTGQMGHQGQATPATQATLVAPKRSNRQGVEGD  300
 301 EKSSANFTCTREHVDLGNDQQSRESGRMRNRAKEKEWYCKIGTNPTVSTIRAGAICGNGS  360
 361 NNASCKVAGDFRACKEGTEKRKEEMSGGGEEQGEIMREGGNERRGEARSGARSEAISEVR  420
 421 SGAISEVRSGAISEVRSGAISEARSEARSEARSEGRSEVLSGARSGTLGKPPSTDQSVGA  480
 481 ISDLGHASSKYTQSAFGKNSDAPNDGYHLANGEISQNDIILKKKKITQEGALRSTQPSAP  540
 541 IGTPQSAEGNSNDHSPVNRRANYKIGKNVLLKGAATSERGSGDSRGSSENGGNGEKGGNG  600
 601 EKGDSTLLPGDLEEHAPIFGEQNCARGTLMGSDEKCISARVSGKPDEKSDAKSGAKLGAN  660
 661 SGARAHAEAHAQLGGASDGSAAEVADEVSAEVSDEVSDEVSSEVSDGASDEVYGEASGEA  720
 721 SDGASDGAFDGAFDGASGEVPDAAPDAHLDEQSNPQPDQSRRRKGTNFSEKEKKKKKKHF  780
 781 LNIGRGSNTRGRRTNLSIEHHHHYQSCFHPKGNFATPSTAKYGSPRDEKVSPHFGTPTQK  840
 841 KDNVNKNGQDEPEEEKLLVQHSEKRKTAKKNCKMDSSDLIELVMRSHGSSLHGSQLHSSR  900
 901 SHGSQLHSSSLHISSLHGSPSHGSQLHSSRSHGSQSHGSQSPTVMSGEESEANEEVCAVV  960
 961 NHPRSRGSYVPKQERKPAHAEGQSGAEKEPPSGRLFPEGAFPHEEAKHGGNKFVSSKTNE 1020
1021 LSNGKDNISEGVHNVEKALFEVPNLRSSAHGEKESVMPMRLTSQVGTPQQVSRVTRMSSL 1080
1081 KRAQENKGRDFAKFAKSNDEHSVVGRIPKGGVSPKGRNPPEKASSEGGSGDDHYDRHDGG 1140
1141 VPNGTAVKGGFPSSCVRSSEPRWKEGSAVNGHTCDGNIKSERSDDYEYQSGGRFPGEGAT 1200
1201 PLEKEGGKEHPKFYNGICEPSSESHRINNNCLETAASRREDYYNQGVRNEYGLRENRQSS 1260
1261 SSQLTELSRPIGPYQLRQPNRPSGRSDAPPVDVKSNCARGQEPRAATNGRNYLNKSEIKI 1320
1321 EDFNLHNDMQRRSMNRESSTLDDELHQWKMEGGGGYAHGEGMNLQGGSNDDKKGKNMFVV 1380
1381 NGGDTIKEVEGTWKNDKWVDEWGEGNRRRAEEDKYSRNADEGVGDYENYHTADEAAVIAA 1440
1441 DEAVDEEEASEMYEIKRQIEYIMQNDDIDFSRIIVKPSKNYVKINLFIDRYIRGYDELQN 1500
1501 SDMMFCFGESSEEDVIKRKKSSVKEGGADGDGSQNDGVKRVKKKNGSWPYKKKYKPQKNR 1560
1561 RINYDAIDTMWQPHFHPHNKEFRVRYRYKGGMRLKTISCKHFGYLPSKKISILFLFRWLL 1620
1621 CGKYIAEKTKRSRLCITDINDYNLPELLSKRRNRNNDYMTEDEWKALEEKNTKEFYEHIH 1680
1681 KINDFLMSNYKDDSFVNQLKVIISSCDKQFKKEEVLNILNQCLRDKLNSERRREALMRGD 1740
1741 GGEGGEVGRVSGEVGGDGGDGGDGGEVGRVSGEVGRVGGEVDASLGSSLGSSLSGKISST 1800
1801 VGGTVGGTLDGKISSTHSGRRDQSREPRQGPLPLSGGPPLNSKKRKLPFNPGADPCSKES 1860
1861 MDGGSIHHSSSGSSKGGSQGGSQESFHGSSQGSIHGSSHGTIHGGNHGGSHSSSSSVGDS 1920
1921 MGKAIFASNKRMKDSSFGRANGGSEGAKGDKAQVGKKDENDAMWRGGYNRQNGAHAEKGK 1980
1981 GDSPYHMGVNNIGMNYLPFQCAEPELSDEAGGRSDRARNRNPEVCNDRMVIVDHGVHVGH 2040
2041 VGSLCDGCDGVRGRGESLGANGGVAQDGGSGQMGSGQMGSGQMRSGQMRSDHIRGELIES 2100
2101 GRARRGSSGGEPPRSHPPGGNHLASYPSYADLEKLSNTEEIREYYNSLIELKKSLYVRSP 2160
2161 QGDMDDEVSYSNINDHFMELLNKESRRSSSSSSSRQKRNSNDSLDGEDTTEYDKFLFAIY 2220
2221 YANRENAAAVGSARGGSAGSAAGGAVAAAAGGGAAGGMSPLGEGTATPGKKALPPPTYDE 2280
2281 RNTPAERMQKSDFSMDQRNQEMLVAKSSGSSLQHESNFQNGYPNLSDMNSSKKISNVSTC 2340
2341 PSLGVSSYCVGSHTQGEKKSVLSCSVSNERNKIGKDVTSEEKSDCVKEQLKRKSLNMVEA 2400
2401 RELFNSCAEKYSQLIFSERNVFSRHARGDGQEEGHRLAGSPSGAACGSAGGATCGSASGS 2460
2461 AEGTSNGSANGTASDKPRKGCSHGCADDNEPSHPPAREKIALENEKCVRCTDGRAEQRAH 2520
2521 QKNQREFLFFPSQSGNSGKDASSRSKNEPTSEGEQAEREAKLEADPLDSLKNGNAIGGEN 2580
2581 MKDTIDILVKSKYLRKPQYDYSLSSNVECSHERGRSNSRCLYCACMAGLGYNRSVEGALK 2640
2641 GGDPQDGYLPRGEEGVERHHLHPCEGEGSNPRSSSCHYSSKCGICSSCSSCSNCRNCNSC 2700
2701 SNFINCSNCSYCSFYSKLLQALRDNKRAERGNTYGMNKECECAGRHMRRHFGNSEGTSFT 2760
2761 RDDCSEEMFTPQASAGNNMPPVELCNRIFNYMMQKVGSEWVDRGSAKDMSIRSEQRLGEM 2820
2821 EDGAKQRAMADAMWNGRTSWRSKREMMNAFLQELFSSFPKDDQDVHGGTTHYSAMQCKDS 2880
2881 LPEYAEGMWGGGAQDEENERGEFADYTGESIPGYHHRNCGRPYEGMRNFPHSVASKDEFE 2940
2941 RRPPRLNVPYVQRNLQRQKESNQFARKSPTNGMLPLGEKQTDFIMNKDNFELITNFINDV 3000
3001 KVKGYSACEEVKDHRTFHDRGNSITDHFLPQRGGNQGGSQGSSRSSQHSSGQVAPKKKEH 3060
3061 NGPDGEGDNSEESEKAGDGEGEGDGEGESEGESGGESQNGASTVYRRKTLNEKRFKEKSR 3120
3121 GNLNGMLLADSINNNA 3136
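
The numbered layout above (start coordinate, residues, end coordinate on each row) lends itself to extracting a region by its 1-based coordinates. The following is a minimal Python sketch, not part of the original report. The helper names `parse_numbered_rows` and `extract_region` are hypothetical, and only the two rows covering residues 2041...2160 are copied in to keep the example short.

```python
import re

# Two rows copied from the listing above (they cover residues 2041...2160).
ROWS = """\
2041 VGSLCDGCDGVRGRGESLGANGGVAQDGGSGQMGSGQMGSGQMRSGQMRSDHIRGELIES 2100
2101 GRARRGSSGGEPPRSHPPGGNHLASYPSYADLEKLSNTEEIREYYNSLIELKKSLYVRSP 2160
"""

# One row = start coordinate, residue letters, end coordinate.
ROW_RE = re.compile(r"^\s*(\d+)\s+([A-Za-z]+)\s+(\d+)\s*$")


def parse_numbered_rows(text):
    """Return a dict mapping each 1-based position to its residue letter."""
    residues = {}
    for line in text.splitlines():
        match = ROW_RE.match(line)
        if match is None:
            continue  # skip blank or non-sequence lines
        start, seq, end = int(match.group(1)), match.group(2), int(match.group(3))
        # Sanity check: the printed end coordinate must agree with the row length.
        if start + len(seq) - 1 != end:
            raise ValueError(f"row starting at {start} does not end at {end}")
        for offset, aa in enumerate(seq):
            residues[start + offset] = aa
    return residues


def extract_region(residues, first, last):
    """Return the residues from position first to last, both inclusive."""
    return "".join(residues[pos] for pos in range(first, last + 1))


domain = extract_region(parse_numbered_rows(ROWS), 2047, 2106)
print(len(domain), domain)
```

Keying residues by position rather than concatenating rows means a partial listing, such as the two rows here, still yields correct coordinates.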

Alignment of the domain consensus (first line) with the sequence (colored line).

Each consensus position reports the amino acid with the highest probability; capital letters mark highly conserved residues (i.e. with probability > 50%).

Occurrence 2047...2106
   1 GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGppGppGapGapGpp   59
2047 GCDGVRGRGESLGANGGVAQDGGSGQMGSGQMGSGQMRSGQMRSDHIRGELIESGRARRG 2106
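
As a rough illustration of how the two alignment lines can be read together, the Python sketch below pairs them column by column and lists the sequence positions that agree with the consensus. It is not part of the original report. It assumes the alignment is ungapped exactly as printed, so that column i of the consensus pairs with residue 2047 + i, and the function name `compare_to_consensus` is hypothetical. Upper case in the consensus is taken to mark highly conserved positions, per the convention stated above.

```python
# Both lines are copied verbatim from the occurrence shown above.
CONSENSUS = "GppGppGppGppGppGppGppGpaGapGppGppGepGpPGppGppGppGppGapGapGpp"
SEGMENT = "GCDGVRGRGESLGANGGVAQDGGSGQMGSGQMGSGQMRSGQMRSDHIRGELIESGRARRG"
SEGMENT_START = 2047  # 1-based coordinate of the first residue in SEGMENT


def compare_to_consensus(consensus, segment, start):
    """Pair the two lines column by column (assumes no gaps).

    Returns (matches, n_conserved, conserved_matches) where
    - matches: sequence positions whose residue equals the consensus
      letter, ignoring the case of the consensus;
    - n_conserved: number of upper-case (highly conserved) consensus columns;
    - conserved_matches: subset of matches falling on upper-case columns.
    """
    if len(consensus) != len(segment):
        raise ValueError("consensus and segment must have the same length")
    matches = []
    conserved_matches = []
    n_conserved = 0
    for i, (cons, res) in enumerate(zip(consensus, segment)):
        if cons.isupper():
            n_conserved += 1
        if cons.upper() == res:
            matches.append(start + i)
            if cons.isupper():
                conserved_matches.append(start + i)
    return matches, n_conserved, conserved_matches


matches, n_conserved, conserved_matches = compare_to_consensus(
    CONSENSUS, SEGMENT, SEGMENT_START
)
print(f"{len(matches)} of {len(SEGMENT)} positions match the consensus")
print(f"{len(conserved_matches)} of {n_conserved} highly conserved columns match")
print("matching positions:", matches)
```

Reporting matches as sequence coordinates rather than column indices keeps the output directly comparable with the numbered listing.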