Recipe 2.1 - Reading Mass Spectrometry data
Problem
You want to read MS data iteratively from a character stream.
Solution
Use one of our IterativeReader implementations (the AbstractMsReaders) to go through each parsed Spectrum in turn, “foreach” style.
This recipe uses the MGF and mzXML readers as examples.
Building MS Readers
Our readers can be associated with a File:
String mgfFilename = "mgf_test.mgf";
MgfReader reader = new MgfReader(new File(mgfFilename), PeakList.Precision.DOUBLE);
or with a java.io.Reader:
Reader sr = new StringReader("BEGIN IONS\n" +
"TITLE=B06-1151_p.00478.00478.2\n" +
"PEPMASS=476.23272705078125\n" +
"CHARGE=2\n" +
"135.032\t1.376\n" +
"146.042\t5.997\n" +
"158.052\t25.335\n" +
"175.047\t202.932\n" +
"186.047\t27.338\n" +
"203.140\t60.595\n" +
"213.230\t5.732\n" +
"221.182\t1.083\n" +
"231.047\t1159.671...");
MgfReader reader = new MgfReader(sr, new URI("http://somewhere.ch/mymsdata.mgf"), PeakList.Precision.DOUBLE);
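To make concrete what a reader has to do with such a character stream, here is a minimal stand-alone sketch that parses MGF blocks from any java.io.Reader using only the JDK. It is not MzJava's MgfReader: the class MgfSketch, the holder SimpleSpectrum and the method readAll are hypothetical names chosen for illustration, and only the TITLE and PEPMASS headers are interpreted.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Stand-alone illustration only; all names here are hypothetical, not MzJava API.
final class MgfSketch {

    /** Holds the title, precursor m/z and {m/z, intensity} pairs of one spectrum. */
    static final class SimpleSpectrum {
        String title = "";
        double precursorMz = Double.NaN;
        final List<double[]> peaks = new ArrayList<>();
    }

    /** Reads every complete BEGIN IONS ... END IONS block from the character stream. */
    static List<SimpleSpectrum> readAll(Reader in) {
        List<SimpleSpectrum> spectra = new ArrayList<>();
        SimpleSpectrum current = null;
        try {
            BufferedReader br = new BufferedReader(in);
            String line;
            while ((line = br.readLine()) != null) {
                line = line.trim();
                if (line.equals("BEGIN IONS")) {
                    current = new SimpleSpectrum();
                    continue;
                }
                // Ignore blank lines and anything outside a spectrum block.
                if (current == null || line.isEmpty()) {
                    continue;
                }
                if (line.equals("END IONS")) {
                    spectra.add(current);
                    current = null;
                } else if (line.startsWith("TITLE=")) {
                    current.title = line.substring("TITLE=".length());
                } else if (line.startsWith("PEPMASS=")) {
                    // PEPMASS may carry an optional intensity after the m/z value.
                    String[] parts = line.substring("PEPMASS=".length()).trim().split("\\s+");
                    current.precursorMz = Double.parseDouble(parts[0]);
                } else if (Character.isDigit(line.charAt(0))) {
                    // Peak line: m/z and intensity separated by whitespace.
                    String[] cols = line.split("\\s+");
                    if (cols.length >= 2) {
                        current.peaks.add(new double[] {
                            Double.parseDouble(cols[0]), Double.parseDouble(cols[1])});
                    }
                }
                // Other KEY=VALUE headers (such as CHARGE) are skipped in this sketch.
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return spectra;
    }
}
```

Calling MgfSketch.readAll(new StringReader(...)) on a complete block returns one SimpleSpectrum per BEGIN IONS ... END IONS pair; a block with no END IONS line is dropped.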
We also provide static factory methods that create the appropriate reader for a given file type:
IterativeReader reader =
MsReaderFactory.newMsnSpectraReader(new File(mgfFilename), PeakList.Precision.DOUBLE);
Some MS formats carry metadata that can be inconsistent with the scanned spectra. For example, the mzXML format provides attributes such as “peaksCount” or “totIonCurrent” that can conflict with the values extracted from the decoded spectrum. MzxmlReader lets you control these consistency checks:
String mzxmlFilename = "mzxml_test.mzXML";
// default constructor strictly checks for inconsistencies
MzxmlReader reader = new MzxmlReader(new File(mzxmlFilename), PeakList.Precision.DOUBLE);
// we can then add/remove ConsistencyCheck
reader.removeConsistencyChecks(EnumSet.of(MzxmlReader.ConsistencyCheck.TOTAL_ION_CURRENT));
MzxmlReader also provides static factory methods:
MzxmlReader tolerantReader = MzxmlReader.newTolerantReader(new File(mzxmlFilename), PeakList.Precision.DOUBLE);
MzxmlReader strictReader = MzxmlReader.newStrictReader(new File(mzxmlFilename), PeakList.Precision.DOUBLE);
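The idea behind switchable consistency checks can be sketched with a plain EnumSet. The code below is an illustration under stated assumptions, not MzxmlReader's implementation: the class ConsistencySketch, the enum constant PEAKS_COUNT, the method failedChecks and the relative tolerance of 1e-3 are all hypothetical; only the name TOTAL_ION_CURRENT appears in the recipe above.

```java
import java.util.EnumSet;

// Hypothetical sketch of EnumSet-controlled consistency checks; not MzJava code.
final class ConsistencySketch {

    enum Check { PEAKS_COUNT, TOTAL_ION_CURRENT }

    /**
     * Compares the declared attributes with the values computed from the decoded
     * intensities and returns the enabled checks that failed.
     */
    static EnumSet<Check> failedChecks(int declaredPeaksCount, double declaredTic,
                                       double[] intensities, EnumSet<Check> enabled) {
        EnumSet<Check> failed = EnumSet.noneOf(Check.class);
        if (enabled.contains(Check.PEAKS_COUNT) && declaredPeaksCount != intensities.length) {
            failed.add(Check.PEAKS_COUNT);
        }
        if (enabled.contains(Check.TOTAL_ION_CURRENT)) {
            double tic = 0.0;
            for (double intensity : intensities) {
                tic += intensity;
            }
            // Assumed tolerance: absorbs rounding in the declared attribute.
            double tolerance = 1e-3 * Math.max(1.0, Math.abs(declaredTic));
            if (Math.abs(tic - declaredTic) > tolerance) {
                failed.add(Check.TOTAL_ION_CURRENT);
            }
        }
        return failed;
    }
}
```

A strict setup would pass EnumSet.allOf(Check.class); removing a constant from the set, as removeConsistencyChecks does in the recipe, simply means the corresponding comparison is never made.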
Defining peak processing to apply while reading
// this filter will not retain peaks with an intensity of 0
MzxmlReader reader = MzxmlReader.newStrictReader(new File(mzxmlFilename), PeakList.Precision.FLOAT, new PeakProcessorChain<>()
.add(new ThresholdFilter<>(0, ThresholdFilter.Inequality.GREATER)));
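What such a filter does to a peak list can be shown with a few lines of plain Java. This is a hypothetical stand-in, not the MzJava ThresholdFilter: the class ThresholdSketch and the method keepGreaterThan are invented for illustration, and peaks are modeled as {m/z, intensity} arrays.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a threshold filter; not the MzJava ThresholdFilter class.
final class ThresholdSketch {

    /** Keeps the {m/z, intensity} pairs whose intensity is strictly greater than the threshold. */
    static List<double[]> keepGreaterThan(List<double[]> peaks, double threshold) {
        List<double[]> kept = new ArrayList<>();
        for (double[] peak : peaks) {
            if (peak[1] > threshold) {
                kept.add(peak);
            }
        }
        return kept;
    }
}
```

With a threshold of 0 and a strict "greater than" comparison, zero-intensity peaks are dropped and every other peak passes through unchanged.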
Reading MsnSpectrum
Reading is done iteratively until there are no more spectra to get. Here is a short example with MgfReader:
MgfReader reader = new MgfReader(new File(mgfFilename), PeakList.Precision.FLOAT);
// hasNext() returns true if there are more spectra to read
while (reader.hasNext()) {
    // next() returns the next spectrum or throws an IOException if something went wrong
    MsnSpectrum spectrum = reader.next();
    // do some stuff with your spectrum
    Assert.assertTrue(spectrum.size() > 0);
}
reader.close();
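The hasNext()/next()/close() protocol used above can be sketched with a small look-ahead reader built on the JDK alone. This is an illustration of the pattern, not how MzJava implements it: the class LazyBlockReader is hypothetical, it returns each BEGIN IONS ... END IONS block as raw lines rather than as an MsnSpectrum, and it reports I/O failures as UncheckedIOException.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch of the hasNext()/next()/close() protocol; not MzJava code.
final class LazyBlockReader implements Iterator<List<String>>, AutoCloseable {

    private final BufferedReader in;
    private List<String> lookahead; // next block, or null if not yet read
    private boolean done;           // true once the end of the stream was reached

    LazyBlockReader(Reader reader) {
        this.in = new BufferedReader(reader);
    }

    /** Reads ahead by at most one block, so nothing is parsed before it is needed. */
    @Override
    public boolean hasNext() {
        if (lookahead == null && !done) {
            lookahead = readBlock();
        }
        return lookahead != null;
    }

    @Override
    public List<String> next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        List<String> block = lookahead;
        lookahead = null;
        return block;
    }

    /** Returns the lines of the next complete block, or null at the end of the stream. */
    private List<String> readBlock() {
        try {
            List<String> block = null;
            String line;
            while ((line = in.readLine()) != null) {
                line = line.trim();
                if (line.equals("BEGIN IONS")) {
                    block = new ArrayList<>();
                } else if (line.equals("END IONS") && block != null) {
                    return block;
                } else if (block != null && !line.isEmpty()) {
                    block.add(line);
                }
            }
            done = true;
            return null; // an unterminated trailing block is dropped
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() {
        try {
            in.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because this sketch implements AutoCloseable, it can be used in a try-with-resources statement, which closes the underlying stream even when next() fails part-way through a file.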
Discussion
See also
For more advanced use of MgfReader, see Recipe 2.2.