Recipe 2.2 - Customizing the MGF reader

Problem

You want to customize MgfReader or MgfReaderGeneric to parse new tags and/or return a custom spectrum object.

Solution

MgfReaderGeneric is designed for customization: extend the class and override the methods that handle and parse specific parts of the MGF document.

Parsing TITLE tag

If you need information from the TITLE tag, the whole TITLE value is available from the metadata comments. You can also provide your own implementation of TitleParser and add it to the MgfReader, which delegates the parsing of the TITLE tag to its TitleParsers.

Default implementations of TitleParser are found in package org.expasy.mzjava.proteomics.io.ms.reader.mgf and are automatically loaded by ServiceProvider.

A TitleParser instance only needs to be added to the MgfReader. Here is an example using our implementation class RegexScanNumTitleParser, which extracts the data you need with a regular expression:


String entry =
        "BEGIN IONS\n" +
        "TITLE=scan 1\n" +
        "PEPMASS=822.000\t946967.268\n" +
        "CHARGE=2+\n" +
        "198.053 38 0\n" +
        "199.141 34 0\n" +
        "711.524 29 0\n" +
        "715.513 48 0\n" +
        "738.915 1.86E2 0\n" +
        "739.374 81 1\n" +
        "743.954 37 0\n" +
        "744.386 88 0\n" +
        "7.51116E2 90 0\n" +
        "768.535 462 0\n" +
        "771.675 169 1\n" +
        "772.725 70 0\n" +
        "END IONS";

MgfReader reader = new MgfReader(new StringReader(entry), URI.create("file:/one_entry.mgf"), PeakList.Precision.DOUBLE);

// extracts the scan number from the TITLE tag and adds it to the metadata of the parsed spectrum
// (the pattern has to match the TITLE value, here "scan 1")
TitleParser titleParser = new RegexScanNumTitleParser(Pattern.compile("scan\\s+(\\d+).*"));

// just add it to the MgfReader instance
reader.addTitleParser(titleParser);
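
A pattern is easy to check against a sample TITLE value before it is handed to the parser. The sketch below uses only java.util.regex; it assumes that the parser applies the pattern to the whole title and reads the first capture group, which is an assumption about RegexScanNumTitleParser and not something this recipe confirms. The class and method names are hypothetical.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ScanNumberPatternCheck {

    // Returns the first capture group as an int, or empty when the
    // pattern does not match the whole title.
    static OptionalInt extractScanNumber(Pattern pattern, String title) {
        Matcher matcher = pattern.matcher(title);
        if (!matcher.matches()) {
            return OptionalInt.empty();
        }
        return OptionalInt.of(Integer.parseInt(matcher.group(1)));
    }

    public static void main(String[] args) {
        // Two hypothetical title layouts and the patterns written for them.
        Pattern shortForm = Pattern.compile("scan\\s+(\\d+).*");
        Pattern longForm = Pattern.compile("scan number=\\s+(\\d+).*");

        // Matches: the title has the layout the pattern expects.
        System.out.println(extractScanNumber(shortForm, "scan 1"));
        // No match: the title lacks the "number=" part.
        System.out.println(extractScanNumber(longForm, "scan 1"));
        // Matches: trailing text is absorbed by ".*".
        System.out.println(extractScanNumber(longForm, "scan number= 12 rt=3.4"));
    }
}
```

A pattern that does not match the titles in your file leaves the scan number unset, so a quick check of this kind is worth doing for each title layout you expect.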

Here is how to make a custom TitleParser


TitleParser titleParser = new TitleParser() {

    @Override
    public boolean parseTitle(String title, MsnSpectrum spectrum) {

        // process the information from TITLE and store it in the spectrum

        return true;
    }
};

// add it to the MgfReader instance
reader.addTitleParser(titleParser);
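
The body of parseTitle() is left open above. As a minimal sketch of what it might do, the helper below splits a title made of comma-separated key=value pairs into a map; the title layout ("File=run1.raw, scan=12, rt=34.5") and the class name TitleFields are hypothetical and chosen only for illustration. Inside parseTitle() the resulting values would then be copied into the spectrum.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class TitleFields {

    // Splits a title such as "File=run1.raw, scan=12" into key/value pairs.
    // Tokens without a key before '=' are ignored.
    static Map<String, String> parse(String title) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String rawToken : title.split(",")) {
            String token = rawToken.trim();
            int eq = token.indexOf('=');
            if (eq > 0) {
                fields.put(token.substring(0, eq).trim(), token.substring(eq + 1).trim());
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, String> fields = parse("File=run1.raw, scan=12, rt=34.5");
        System.out.println(fields.get("File"));
        System.out.println(fields.get("scan"));
        System.out.println(fields.get("rt"));
    }
}
```

An empty map is a natural signal that the title did not have the expected layout; a custom parser could use it to decide what to return.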

Parsing custom tags

MgfReader also provides default implementations for parsing the different parts of an MGF entry (CHARGE, PEPMASS, SCANS, RTINSECONDS and other unknown tags).

These methods are meant to be overridden by subclasses to customize the parsing of any part of the entry.

To customize the parsing of exotic tags, override the method parseUnknownTag():


// here is the content of an entry
String entry =
        "BEGIN IONS\n" +
        "TITLE=scan 1\n" +
        "PEPMASS=822.000\t946967.268\n" +
        "CHARGE=2+\n" +
        "MYTAG=my content\n" +
        "198.053 38 0\n" +
        "199.141 34 0\n" +
        "711.524 29 0\n" +
        "715.513 48 0\n" +
        "738.915 1.86E2 0\n" +
        "739.374 81 1\n" +
        "743.954 37 0\n" +
        "744.386 88 0\n" +
        "7.51116E2 90 0\n" +
        "768.535 462 0\n" +
        "771.675 169 1\n" +
        "772.725 70 0\n" +
        "END IONS";

// create an anonymous subclass of MgfReader
MgfReader reader = new MgfReader(new StringReader(entry), URI.create("file:/tmp/one_entry.mgf"), PeakList.Precision.DOUBLE) {

    // override parseUnknownTag() to handle the custom tag
    @Override
    protected boolean parseUnknownTag(String tag, String value, MsnSpectrum spectrum) {

        // handles lines such as MYTAG=my content
        if (tag.startsWith("MYTAG")) return parseMyTag(value, spectrum);
        else return super.parseUnknownTag(tag, value, spectrum);
    }

    private boolean parseMyTag(String tagValue, MsnSpectrum spectrum) {

        // parse the tag value here

        return true;
    }
};
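
The override above follows a handle-or-delegate pattern: deal with the tags you know and pass everything else to the parent implementation. The sketch below isolates that pattern with stand-in classes so that it runs without MzJava; BaseTagHandler, MyTagHandler and the map used as a target are hypothetical and only mirror the shape of parseUnknownTag().

```java
import java.util.HashMap;
import java.util.Map;

final class UnknownTagDemo {

    // Stand-in for the default behaviour: no custom tag is recognized.
    static class BaseTagHandler {
        protected boolean parseUnknownTag(String tag, String value, Map<String, String> target) {
            return false;
        }
    }

    // Handles MYTAG itself and delegates every other tag to the parent.
    static class MyTagHandler extends BaseTagHandler {
        @Override
        protected boolean parseUnknownTag(String tag, String value, Map<String, String> target) {
            if (tag.startsWith("MYTAG")) {
                target.put("MYTAG", value);
                return true;
            }
            return super.parseUnknownTag(tag, value, target);
        }
    }

    public static void main(String[] args) {
        Map<String, String> target = new HashMap<>();
        BaseTagHandler handler = new MyTagHandler();

        boolean handled = handler.parseUnknownTag("MYTAG", "my content", target);
        boolean other = handler.parseUnknownTag("INSTRUMENT", "x", target);

        System.out.println(handled + " " + other + " " + target.get("MYTAG"));
    }
}
```

Calling the parent for unrecognized tags keeps whatever default handling the base class provides, so several levels of subclasses can each add their own tags.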

When you need to store information that MsnSpectrum has no field for, define your own MsnSpectrum subclass and create a reader that extends MgfReaderGeneric. You then have to provide an implementation of newSpectrum().


String mgf =
        "BEGIN IONS\n" +
        "TITLE=scan 1\n" +
        "PEPMASS=822.000\t946967.268\n" +
        "CHARGE=2+\n" +
        "MYTAG=my content\n" +
        "198.053 38 0\n" +
        "199.141 34 0\n" +
        "711.524 29 0\n" +
        "715.513 48 0\n" +
        "738.915 1.86E2 0\n" +
        "739.374 81 1\n" +
        "743.954 37 0\n" +
        "744.386 88 0\n" +
        "7.51116E2 90 0\n" +
        "768.535 462 0\n" +
        "771.675 169 1\n" +
        "772.725 70 0\n" +
        "END IONS";

// create an anonymous reader that returns CustomMsnSpectrum objects
AbstractMgfReader<PeakAnnotation, CustomMsnSpectrum> reader = new AbstractMgfReader<PeakAnnotation, CustomMsnSpectrum>(
    new StringReader(mgf), new URIBuilder("www.expasy.ch", "test").build(), PeakList.Precision.DOUBLE, new PeakProcessorChain<>()) {

        @Override
        protected CustomMsnSpectrum newSpectrum(AbstractMsReader.ParseContext context, PeakList.Precision precision) {

            return new CustomMsnSpectrum(precision);
        }

        @Override
        protected boolean parseUnknownTag(String tag, String value, CustomMsnSpectrum spectrum) {

            if (tag.startsWith("MYTAG")) return parseMyTag(value, spectrum);
            else return super.parseUnknownTag(tag, value, spectrum);
        }

        private boolean parseMyTag(String tagValue, CustomMsnSpectrum spectrum) {

            spectrum.setMyTagValue(tagValue);
            return true;
        }

        @Override
        protected boolean parseTitleTag(String value, CustomMsnSpectrum spectrum) {

            return false;
        }

        @Override
        protected void setRetentionTimes(CustomMsnSpectrum spectrum, RetentionTimeList retentionTimeList) {

            spectrum.addRetentionTimes(retentionTimeList);
        }

        @Override
        protected void setScanNumbers(CustomMsnSpectrum spectrum, ScanNumberList scanNumbers) {

            spectrum.addScanNumbers(scanNumbers);
        }
    };

// next() returns your custom spectrum
CustomMsnSpectrum spectrum = reader.next();

// your custom spectrum object returned by your parser
public static class CustomMsnSpectrum extends MsnSpectrum {

    private String myTagValue;

    public CustomMsnSpectrum(Precision precision) {

        super(precision);
    }

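    public void setMyTagValue(String myTagValue) {

        this.myTagValue = myTagValue;
    }

    // lets callers read back the value stored by the reader
    public String getMyTagValue() {

        return myTagValue;
    }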
}
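
The reason a generic reader asks for newSpectrum() can be shown without MzJava: a class parameterized on its spectrum type cannot instantiate that type itself, so it calls a factory method that the subclass supplies. The sketch below uses stand-in classes (BasicRecord, CustomRecord, GenericEntryReader, all hypothetical) and a deliberately simplified line parser; it is not the library's parsing logic.

```java
import java.util.ArrayList;
import java.util.List;

final class FactoryMethodDemo {

    // Stand-in for a plain spectrum: keeps the raw peak lines only.
    static class BasicRecord {
        final List<String> peaks = new ArrayList<>();
    }

    // Stand-in for a custom spectrum with one extra field.
    static class CustomRecord extends BasicRecord {
        String myTagValue;
    }

    // Stand-in for a generic reader: it cannot write "new S()",
    // so the subclass provides the instance through newRecord().
    abstract static class GenericEntryReader<S extends BasicRecord> {

        protected abstract S newRecord();

        protected boolean parseUnknownTag(String tag, String value, S record) {
            return false;
        }

        S read(String entry) {
            S record = newRecord();
            for (String line : entry.split("\n")) {
                int eq = line.indexOf('=');
                if (eq > 0) {
                    parseUnknownTag(line.substring(0, eq), line.substring(eq + 1), record);
                } else if (!line.isEmpty() && Character.isDigit(line.charAt(0))) {
                    record.peaks.add(line);
                }
            }
            return record;
        }
    }

    // Builds an anonymous reader typed on CustomRecord and reads one entry.
    static CustomRecord readCustom(String entry) {
        GenericEntryReader<CustomRecord> reader = new GenericEntryReader<CustomRecord>() {
            @Override
            protected CustomRecord newRecord() {
                return new CustomRecord();
            }

            @Override
            protected boolean parseUnknownTag(String tag, String value, CustomRecord record) {
                if (tag.equals("MYTAG")) {
                    record.myTagValue = value;
                    return true;
                }
                return super.parseUnknownTag(tag, value, record);
            }
        };
        return reader.read(entry);
    }

    public static void main(String[] args) {
        CustomRecord record = readCustom("BEGIN IONS\nMYTAG=my content\n198.053 38 0\nEND IONS");
        System.out.println(record.myTagValue + " / " + record.peaks.size() + " peak line(s)");
    }
}
```

Because the type parameter is fixed to the custom class, the tag handler receives that class directly and can set the extra field without a cast.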

Discussion

See also

Recipe 2.1