Wiley Registry of Tandem Mass Spectral Data, MS for ID

Mass spectrometric-based identification of small molecules

Gas chromatography (GC) hyphenated to electron impact ionization mass spectrometry (EI-MS) represents the "gold standard" for general unknown screening. Over the last decades, very large libraries of standardized spectra have been created for GC-MS techniques, enabling simultaneous screening for thousands of compounds. Despite its proven record of success, GC-MS has difficulty detecting polar, thermally labile, and high-mass molecules.

Hence, complementary ionization techniques have been developed, namely electrospray ionization (ESI) and atmospheric pressure chemical ionization (APCI). Both are soft ionization techniques: usually, only intact molecular ions (for example, protonated molecules) are formed. Accurate mass measurements then allow the elemental formula of a molecule to be determined. Currently, time-of-flight (TOF)-MS represents the most cost-effective technique for performing accurate mass analysis on a routine basis. Because MS alone cannot differentiate substances of identical mass (isomers share the same elemental formula), the molecular formula by itself is insufficient for unequivocal identification.

Collision-induced dissociation (CID) can be used to obtain structural information on analytes. Diagnostic fragment ions are selectively generated in the collision cell of an instrument dedicated to tandem MS (MS/MS). For ESI and APCI, the energetic characteristics of ion production and activation are much less well defined than for EI. The ions need to cross a high-pressure region, where their internal energy can be modified, before they enter the mass analyzer system. Consequently, CID spectra may differ strongly with the applied experimental conditions (pressure, acceleration voltages, nature of the solution and the gas phases), which makes them difficult to compare. Thus, transferable tandem mass spectral libraries have not been established yet.
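As a rough illustration of why an accurate mass alone is not sufficient, the following Python sketch computes the monoisotopic mass of an elemental formula and the deviation of a measured value in ppm. The helper names, the example compounds, and the measured value are assumptions chosen for illustration; they are not taken from the library software. Morphine and hydromorphone share the formula C17H19NO3, so accurate mass cannot tell them apart.

```python
# Monoisotopic masses (u) of the most abundant isotopes of C, H, N, O.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}


def monoisotopic_mass(formula):
    """Sum the isotope masses for a formula given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def ppm_error(measured, theoretical):
    """Relative deviation of a measured mass from the theoretical mass, in ppm."""
    return (measured - theoretical) / theoretical * 1e6


# Two isomers with the same elemental formula C17H19NO3.
morphine = {"C": 17, "H": 19, "N": 1, "O": 3}
hydromorphone = {"C": 17, "H": 19, "N": 1, "O": 3}

# Hypothetical measured neutral monoisotopic mass, for illustration only.
measured_mass = 285.1370

for name, formula in (("morphine", morphine), ("hydromorphone", hydromorphone)):
    theoretical = monoisotopic_mass(formula)
    print(name, round(theoretical, 4), round(ppm_error(measured_mass, theoretical), 2), "ppm")
```

Both candidates give the same theoretical mass and the same ppm deviation, which is why fragment information from MS/MS is needed to distinguish them.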

Our strategy for developing a tandem mass spectral library

To build up the spectral library, 1000 substances were used as reference samples. At the current stage of development, the database mainly consists of therapeutic drugs and illicit substances. All investigated compounds are of forensic or toxicological interest, as they can cause severe or even fatal intoxications. Tandem mass spectra were acquired on a QqTOF instrument. To increase the tolerance of the library towards the applied collision energy (CE), product ion spectra of reference compounds were acquired at ten different CE values between 5 eV and 50 eV. As expected, the applied CE affected the number of detected fragments as well as the measured relative signal intensities. Spectra showing low, medium, and high levels of fragmentation were observed.

Because of saturation effects, and to avoid false positive matches of the precursor ion with product ions associated with other compounds, all signals within a 4.0 u window around the m/z of the precursor ion were deleted from the reference spectra. To increase specificity further, all signals in a reference spectrum that could not positively contribute to the precursor identification were eliminated. Only signals with a relative intensity above 0.01% that were observed at least twice within the collection of substance-specific product ion spectra were considered suitable for identification. The remaining species were deleted from the reference spectra. Finally, artefacts that arose from improper centroiding and had passed the preceding filtering steps were erased.
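The filtering steps described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the procedure actually used to build the library: the 4.0 u window is taken to be centred on the precursor (2.0 u on either side), relative intensities are computed against the base peak remaining after the precursor window is removed, and two signals from different spectra count as the same fragment if their m/z values differ by at most a hypothetical `mz_tol`. All function and parameter names are illustrative.

```python
def remove_precursor_window(spectrum, precursor_mz, window=4.0):
    """Drop all peaks inside the window centred on the precursor m/z (assumed +/- window/2)."""
    half = window / 2.0
    return [(mz, inten) for mz, inten in spectrum if abs(mz - precursor_mz) > half]


def to_relative(spectrum):
    """Convert intensities to percent of the base peak of this spectrum."""
    if not spectrum:
        return []
    base = max(inten for _, inten in spectrum)
    return [(mz, 100.0 * inten / base) for mz, inten in spectrum]


def filter_reference_spectra(spectra, precursor_mz, min_rel=0.01,
                             min_occurrences=2, mz_tol=0.01):
    """Clean the set of product ion spectra recorded for one compound.

    spectra: list of spectra (one per collision energy), each a list of
             (m/z, intensity) tuples.
    """
    cleaned = []
    for spectrum in spectra:
        peaks = remove_precursor_window(spectrum, precursor_mz)
        # Keep only peaks above the relative intensity threshold (in percent).
        peaks = [(mz, rel) for mz, rel in to_relative(peaks) if rel > min_rel]
        cleaned.append(peaks)

    def occurrences(mz):
        # Number of spectra of this compound that contain a peak near mz.
        return sum(any(abs(mz - other) <= mz_tol for other, _ in peaks)
                   for peaks in cleaned)

    # Keep only fragments seen in at least min_occurrences spectra.
    return [[(mz, rel) for mz, rel in peaks if occurrences(mz) >= min_occurrences]
            for peaks in cleaned]


# Toy input: two collision energies for a compound with precursor m/z 286.1.
example = [
    [(100.0, 50), (150.0, 100), (286.1, 1000)],
    [(100.0, 80), (200.0, 30), (286.1, 500)],
]
filtered = filter_reference_spectra(example, precursor_mz=286.1)
print(filtered)
```

In this toy input only the fragment at m/z 100.0 appears in both spectra, so it is the only signal retained. The final artefact-removal step for improper centroiding is not covered, since the text does not describe how it was done.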

The library search algorithm

Depending on the applied experimental conditions, the number of fragment ions and/or the corresponding signal intensities can vary between MS/MS spectra of the same compound. Common library search algorithms were developed and optimized for the comparison of highly reproducible EI spectra. Thus, they often perform poorly when the identity of a compound has to be established by comparing MS/MS spectra.

[Figure: principle of the library search for small molecules]

We have developed a dedicated procedure for identifying an unknown by finding similarity and/or identity between its fragment ion spectrum and a collection of fragment ion mass spectra stored in a library. The measured product ion mass spectrum of the unknown compound is the input for the library search. The spectrum is compared with every mass spectrum stored in the library, and in each case the similarity is determined. The estimation of similarity starts with the identification of ions that are present in both spectra being compared. These are called 'matching fragments' (mf). For a match, the difference between the m/z values must be smaller than a user-defined tolerance (0.1 amu). Next, the 'reference spectrum-specific match probability' (mp) is calculated. The mp value increases with increasing correlation between the two spectra. Because the library contains MS/MS spectra collected at several different collision energies for each reference compound, several mp values are obtained per compound. These compound-specific mp values are averaged to yield the 'average match probability' (amp). To facilitate comparison, amp is converted into the 'relative average match probability' (ramp), so that individual ramp values range between 0 and 100.

The substance with the highest ramp is considered to represent the unknown compound if its ramp exceeds 50.0. Next, the monoisotopic mass of the best matching compound is checked against the monoisotopic mass of the precursor ion. If the two masses do not agree, identity is excluded, and only some structural similarity between the unknown and the best matching reference compound can be assumed. If the 'top hit' passes this final check, the correct compound should have been identified with high probability.
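The overall flow of the search (mf, mp, amp, ramp, threshold, mass check) can be sketched in Python as shown below. The text does not give the formulas for mp or for the conversion of amp into ramp, so both are placeholders here and should be read as assumptions: mp is taken as the product of the matched intensity fractions of the two spectra, and ramp as each compound's amp divided by the sum of all amp values, scaled to 100. The library layout, the `precursor_mz` field, and the `mass_tol` parameter are likewise hypothetical.

```python
from statistics import mean


def matching_fragments(unknown, reference, tol=0.1):
    """Pair each unknown peak with at most one unused reference peak within tol."""
    pairs, used = [], set()
    for mz_u, int_u in unknown:
        for idx, (mz_r, int_r) in enumerate(reference):
            if idx not in used and abs(mz_u - mz_r) < tol:
                pairs.append(((mz_u, int_u), (mz_r, int_r)))
                used.add(idx)
                break
    return pairs


def match_probability(unknown, reference, tol=0.1):
    """Placeholder mp: product of matched intensity fractions (an assumption)."""
    mf = matching_fragments(unknown, reference, tol)
    if not mf:
        return 0.0
    frac_unknown = sum(u[1] for u, _ in mf) / sum(i for _, i in unknown)
    frac_reference = sum(r[1] for _, r in mf) / sum(i for _, i in reference)
    return frac_unknown * frac_reference


def library_search(unknown, precursor_mz, library, tol=0.1,
                   mass_tol=0.01, threshold=50.0):
    """Return (name, ramp, mass_ok) for the top hit, or None if no hit passes."""
    # amp: mean of the mp values over all collision-energy spectra of a compound.
    amp = {name: mean(match_probability(unknown, s, tol) for s in entry["spectra"])
           for name, entry in library.items()}
    total = sum(amp.values())
    if total == 0:
        return None
    # ramp: assumed normalisation so that values fall between 0 and 100.
    ramp = {name: 100.0 * value / total for name, value in amp.items()}
    best = max(ramp, key=ramp.get)
    if ramp[best] <= threshold:
        return None
    # Final check: does the precursor mass agree with the top hit?
    mass_ok = abs(library[best]["precursor_mz"] - precursor_mz) <= mass_tol
    return best, ramp[best], mass_ok


# Toy library with two hypothetical compounds, two collision energies each.
toy_library = {
    "A": {"precursor_mz": 286.144,
          "spectra": [[(100.0, 10), (150.0, 5)], [(100.0, 8), (120.0, 2)]]},
    "B": {"precursor_mz": 300.0,
          "spectra": [[(90.0, 10)], [(95.0, 3)]]},
}
hit = library_search([(100.02, 20), (150.05, 10)], 286.15, toy_library)
print(hit)
```

If `mass_ok` is false for the top hit, the result would be read as an indication of structural similarity only, in line with the final check described above.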


Performance of the library search approach

Our library is the most extensively tested tandem mass spectral library available. Studies performed include cross-validation with other tandem mass spectral libraries, library searches with data extracted from the literature, and several multicenter studies covering different types of instruments, including QqQ, IT, LIT, QqLIT, QqTOF, LIT-Orbitrap and LIT-FTICR. In these studies, our library was found to be more sensitive, specific, robust and transferable than competing tandem mass spectral libraries. Typically, sensitivity exceeds 95%.



The current version of the Wiley Registry of Tandem Mass Spectral Data (WRTMD) contains >10,000 spectra of ~1,200 compounds, mainly pharmaceutical compounds, illicit drugs and their metabolites. Therefore, the most important fields of application of the library are forensic toxicology, environmental analysis, and clinical toxicology.


Funding: Österreichische Forschungsförderungsgesellschaft (Austrian Research Promotion Agency): dnatox, coupling of liquid chromatography with mass spectrometry as a tool for toxin and DNA analysis, KIRAS PL 2 project 813786, 2008-2009.

