Recipe 3.1 - Digesting a Protein


You want to digest a Protein in silico.


Digestion is performed by a ProteinDigester object.

Use ProteinDigester.Builder to build an immutable instance of ProteinDigester.

Building ProteinDigester

The Builder needs only one mandatory parameter: a Protease or a CleavageSiteMatcher.

The Protease enum provides a selection of 27 proteases, defined from published descriptions of their cleavage sites.

ProteinDigester digester = new ProteinDigester.Builder(Protease.TRYPSIN).build();

It's also possible to create a custom ProteinDigester by defining a custom CleavageSiteMatcher:

// the cleavage site is written as the pattern: Pn ... P4 P3 P2 P1 | P1' P2' P3' ... Pm'
// where '|' represents the cleaved bond between the amino acids at positions P1 and P1'
CleavageSiteMatcher csm = new CleavageSiteMatcher("N|G");

ProteinDigester hydroxylamine = new ProteinDigester.Builder(csm).build();
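To make the "P1 | P1'" notation concrete, here is a minimal, self-contained sketch of how such a pattern could be applied to a sequence. It is not the library's CleavageSiteMatcher: the class CleavagePatternSketch and the method findCleavageSites are hypothetical names, and the sketch assumes the pattern contains only literal one-letter residue codes around a single '|'.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, NOT part of the library: interprets a "left|right"
// pattern whose residues are all literal one-letter codes.
class CleavagePatternSketch {

    // Returns every index i such that the bond between residue i-1 and residue i is cleaved.
    static List<Integer> findCleavageSites(String pattern, String sequence) {
        int bar = pattern.indexOf('|');
        if (bar < 0) {
            throw new IllegalArgumentException("pattern needs a '|' marking the cleaved bond");
        }
        String left = pattern.substring(0, bar);    // Pn ... P1
        String right = pattern.substring(bar + 1);  // P1' ... Pm'

        List<Integer> sites = new ArrayList<>();
        // Only internal bonds are considered, so i runs from 1 to length - 1.
        for (int i = 1; i < sequence.length(); i++) {
            // startsWith returns false for a negative offset, so short prefixes are safe.
            boolean leftMatches = sequence.startsWith(left, i - left.length());
            boolean rightMatches = sequence.startsWith(right, i);
            if (leftMatches && rightMatches) {
                sites.add(i);
            }
        }
        return sites;
    }

    public static void main(String[] args) {
        // "N|G" cleaves between N and G: A N|G K N|G P
        System.out.println(findCleavageSites("N|G", "ANGKNGP")); // prints [2, 5]
    }
}
```

Returning bond indices rather than fragments keeps the matching step separate from the step that cuts the sequence.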

Building with more optional parameters: semi digestion, with/without missed-cleavages…

ProteinDigester digester = new ProteinDigester.Builder(Protease.TRYPSIN)
    // apply a fixed modification to every threonine (T)
    .addFixedMod(AminoAcid.T, Modification.parseModification("H3PO4"))
    .build();


Once the digester is instantiated, call digest(protein) on a Protein object; it returns the digestion products:

Protein prot = ProteinFactory.newUniprotProtein("Q13231");

// standard mode (no missed-cleavage)
List<Peptide> digests = digester.digest(prot);

Alternatively, call digest(protein, container) to store the digestion products in a container that you provide:

// provide a container to store digests
List<Peptide> digests = new ArrayList<Peptide>();

digester.digest(prot, digests);
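The following sketch illustrates what a digestion with and without missed cleavages produces, using plain strings instead of Protein and Peptide objects. It is an illustration under stated assumptions, not the ProteinDigester implementation: TrypsinDigestSketch and its methods are hypothetical, and the trypsin rule is simplified to "cleave after K or R unless the next residue is P".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, NOT the library's ProteinDigester.
class TrypsinDigestSketch {

    // Simplified trypsin rule (assumption): cleave after K or R unless followed by P.
    static List<Integer> cleavageSites(String seq) {
        List<Integer> sites = new ArrayList<>();
        for (int i = 1; i < seq.length(); i++) {
            char p1 = seq.charAt(i - 1);
            if ((p1 == 'K' || p1 == 'R') && seq.charAt(i) != 'P') {
                sites.add(i);
            }
        }
        return sites;
    }

    // Returns every peptide that spans at most maxMissed uncleaved sites.
    static List<String> digest(String seq, int maxMissed) {
        List<String> peptides = new ArrayList<>();
        if (seq.isEmpty()) {
            return peptides;
        }
        // Fragment boundaries: sequence start, each cleavage site, sequence end.
        List<Integer> bounds = new ArrayList<>();
        bounds.add(0);
        bounds.addAll(cleavageSites(seq));
        bounds.add(seq.length());

        for (int start = 0; start < bounds.size() - 1; start++) {
            int lastEnd = Math.min(start + 1 + maxMissed, bounds.size() - 1);
            for (int end = start + 1; end <= lastEnd; end++) {
                peptides.add(seq.substring(bounds.get(start), bounds.get(end)));
            }
        }
        return peptides;
    }

    public static void main(String[] args) {
        // No missed cleavage: prints [R, ESALYTNIK, ALASK, R]
        System.out.println(digest("RESALYTNIKALASKR", 0));
        // Up to one missed cleavage: 7 peptides, including RESALYTNIK and ALASKR
        System.out.println(digest("RESALYTNIKALASKR", 1));
    }
}
```

Each allowed missed cleavage adds the peptides that span one more uncleaved site, so the result grows quickly with the limit.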

With variable modifications

// standard mode (no missed-cleavage)
List<Peptide> digests = digester.digest(prot);

Assert.assertEquals(44, digests.size());

// this factory generates the set of modified peptides for the given variable modifications
ModifiedPeptideFactory factory = new ModifiedPeptideFactory.Builder()
    .addTarget(EnumSet.of(AminoAcid.T, AminoAcid.Y, AminoAcid.S), ModAttachment.SIDE_CHAIN, Modification.parseModification("PO3H-1"))
    .build();

List<Peptide> varModProteinDigests = factory.createVarModPeptides(digests.get(1));
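As a rough picture of what expanding variable modifications involves, the sketch below enumerates every combination of modified and unmodified target residues in one peptide. It does not reproduce ModifiedPeptideFactory: VarModSketch, enumerate, and the "(ph)" tag are hypothetical, and including the unmodified form in the result is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, NOT the library's ModifiedPeptideFactory.
class VarModSketch {

    // Lists every form of the peptide in which each target residue is either
    // left unmodified or followed by the given tag.
    static List<String> enumerate(String peptide, String targets, String tag) {
        // Positions of the residues that may carry the modification.
        List<Integer> sites = new ArrayList<>();
        for (int i = 0; i < peptide.length(); i++) {
            if (targets.indexOf(peptide.charAt(i)) >= 0) {
                sites.add(i);
            }
        }
        if (sites.size() > 20) {
            throw new IllegalArgumentException("too many sites to enumerate exhaustively");
        }

        List<String> forms = new ArrayList<>();
        // Bit k of mask decides whether the k-th site is modified.
        for (int mask = 0; mask < (1 << sites.size()); mask++) {
            StringBuilder sb = new StringBuilder();
            int siteIdx = 0;
            for (int i = 0; i < peptide.length(); i++) {
                sb.append(peptide.charAt(i));
                if (siteIdx < sites.size() && sites.get(siteIdx) == i) {
                    if ((mask & (1 << siteIdx)) != 0) {
                        sb.append(tag);
                    }
                    siteIdx++;
                }
            }
            forms.add(sb.toString());
        }
        return forms;
    }

    public static void main(String[] args) {
        // ESALYTNIK has three candidate sites (S, Y, T), hence 2^3 = 8 forms.
        List<String> forms = enumerate("ESALYTNIK", "STY", "(ph)");
        System.out.println(forms.size()); // prints 8
        System.out.println(forms.get(7)); // prints ES(ph)ALY(ph)T(ph)NIK
    }
}
```

The number of forms doubles with every candidate site, which is why the sketch caps the number of sites it will enumerate.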

Counting cleavage sites

You can count the cleavage sites of any SymbolSequence:

// count cleavage sites in a Protein
int count = digester.countCleavageSites(ProteinFactory.newUniprotProtein("Q13231"));

// count cleavage sites in a Peptide with Protease
count = Protease.TRYPSIN.countCleavageSites(Peptide.parse("RESALYTNIKALASKR"));
Assert.assertEquals(3, count);
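The expected count of 3 can be checked by hand with a small regular-expression sketch. This is not how Protease.TRYPSIN is implemented: CleavageCountSketch is a hypothetical class, and it assumes the simplified trypsin rule "after K or R, not before P", with the C-terminal residue never counted as a site.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch, NOT the library's Protease implementation.
class CleavageCountSketch {

    // Zero-width match at every bond that follows K or R and precedes a residue other than P.
    // The lookahead needs a following residue, so the end of the sequence never matches.
    private static final Pattern TRYPSIN_BOND = Pattern.compile("(?<=[KR])(?=[^P])");

    static int countCleavageSites(String sequence) {
        Matcher m = TRYPSIN_BOND.matcher(sequence);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // R|ESALYTNIK|ALASK|R : the final R ends the sequence, so it is not a site
        System.out.println(countCleavageSites("RESALYTNIKALASKR")); // prints 3
    }
}
```

With no missed cleavages, a sequence with n cleavage sites yields n + 1 peptides, so this peptide would be cut into four fragments.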


See also

See Recipe 1.8 to generate peptides with variable modifications.