Evaluation of GC/MS Analyses
3 Evaluation of GC/MS Analyses
3.1 Display of Chromatograms
Chromatograms obtained by GC/MS are plots of the signal intensity against the retention time, as with classical GC detectors. Nevertheless, there are considerable differences between the two types of chromatogram arising from the fact that data from GC/MS analyses are in three dimensions. Figure 3.1 shows a section of the chromatogram of the total ion current in the analysis of volatile halogenated hydrocarbons. The retention time axis also shows the number of continually registered mass spectra (scan no.). The mass axis is drawn above the time axis at an angle. The elution of each individual substance can be detected by evaluating the mass spectra using a ‘maximising masses peak finder’ program and can be shown by a marker. Each substance-specific ion shows a local maximum at these positions, which can be determined by the peak finder. The mass spectra of all the analytes detected are shown in a three-dimensional representation for the purposes of screening. For further evaluation the spectra can be examined individually.
Fig. 3.1 Three-dimensional data field of a GC/MS analysis showing retention time, intensity and mass axis
Total Ion Current Chromatograms
The intensity axis in GC/MS analysis is shown as a total ion current (TIC) or as a calculated ion chromatogram (reconstructed ion chromatogram, RIC). The intensity scale may be given in absolute values, but a percentage scale is more frequently used. Both terms describe the mode of representation characteristic of the recording technique. At constant scan rates the mass spectrometer plots spectra over the pre-selected mass range and thus gives a three-dimensional data field arising from the retention time, mass scale and intensity. A signal parameter equivalent to FID detection is not directly available. (Magnetic sector mass spectrometers were equipped with a total ion current detector directly at the ion source until the end of the 1970s!) A total signal intensity comparable to the FID signal at a particular point in the scan can, however, be calculated from the sum of the intensities of all the ions at this point. All the ion intensities of a mass spectrum are added together by the data system and stored as a total intensity value (total ion current) together with the spectrum. The total ion current chromatogram thus constructed is therefore dependent on the scan range used for data acquisition. When making comparisons it is essential to take the data acquisition conditions into consideration.
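The summation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the full scan data are taken to be a NumPy array with one row per scan and one column per mass, and the function name `total_ion_current` and the intensity values are hypothetical. It shows how the same peak sits on a different base line when the scan range starts at a different mass.

```python
import numpy as np

def total_ion_current(intensities, mz, mz_min=None, mz_max=None):
    """Sum the ion intensities of every scan, optionally within a mass range."""
    intensities = np.asarray(intensities, dtype=float)
    mz = np.asarray(mz, dtype=float)
    lo = mz.min() if mz_min is None else mz_min
    hi = mz.max() if mz_max is None else mz_max
    in_range = (mz >= lo) & (mz <= hi)
    # One total intensity value per scan (row)
    return intensities[:, in_range].sum(axis=1)

# Hypothetical data: 5 scans x 4 masses. m/z 44 and 73 carry a constant
# background, the analyte ions (m/z 83, 85) elute around scan 2.
mz = np.array([44, 73, 83, 85])
scans = np.array([
    [500, 20,   0,   0],
    [500, 20,  40,  25],
    [500, 20, 400, 260],
    [500, 20,  60,  40],
    [500, 20,   0,   0],
])

tic_full = total_ion_current(scans, mz)               # scan range includes m/z 44
tic_narrow = total_ion_current(scans, mz, mz_min=50)  # scan range starts at m/z 50

print(tic_full)    # base line near 520
print(tic_narrow)  # base line near 20, same peak height above it
```

In this toy data set the peak height above the base line is identical in both traces; only the background level changes with the starting mass of the scan.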
SIM/MID analyses give a chromatogram in the same way but no mass spectrum is retrievable. The total ion current in this case is composed of the intensities of the selected ions. Analyses in which the selected masses are switched at fixed retention times often show clear jumps in the base line (see Fig. 2.138).
The appearance of a GC/MS chromatogram (TIC/RIC) showing the peak intensities is therefore strongly dependent on the mass range shown. The repeated GC/MS analysis of one particular sample employing mass scans of different widths leads to peaks of different heights above the base line of the total ion current. The starting mass of the scan has a significant effect here. The result is a more or less strong recording of an unspecific background which manifests itself in a higher or lower base line of the TIC chromatogram. Peaks of the same concentration are therefore shown with different signal/noise ratios in the total ion current at different scan ranges. In spite of differing representation of the substance peaks, the detection limit of the GC/MS system naturally does not change. Particularly in trace analysis the concentration of the analytes is usually of the same order of magnitude or even below that of the chemical noise (matrix) in spite of good sample processing so that the total ion current cannot represent the elution of these analytes. Only the use of selective information from the mass chromatogram (see Section 3.1.2) brings the substance peak sought on to the screen for further evaluation.
In the case of data acquisition using selected individual masses (SIM/MID/SRM), only the changes in intensity of the masses selected before the analysis are shown. Even during data acquisition, only those signals (ion intensities) from the total ion current are recorded which correspond to the settings made by the user. The greater part of the total ion current is therefore not detected using the SIM/MID/SRM technique (see Section 2.3.3). Only substances which give signals in the region of the selected masses as a result of fragment or molecular ions are shown as peaks. A mass spectrum for the purpose of checking identity is therefore not available. For confirmation this should be measured using an alternating full scan/SIM mode or separately in a subsequent analysis. The retention time and the relative intensities of two or three specific lines are used as qualifying features. In trace analysis unambiguous detection is never possible using this method. Positive results of an SIM/MID analysis in principle require additional confirmation by a mass spectrum. SRM analyses offer the recording of an MS/MS spectrum or require the monitoring of multiple transitions as additional qualifiers.
Mass Chromatograms
A meaningful assessment of signal/noise ratios of certain substance peaks can only be carried out using mass chromatograms of substance-specific ions (fragment/molecular ions). The three-dimensional data field of GC/MS analyses in the full scan mode does not only allow the determination of the total ion intensity at a point in the scan. To show individual analytes selectively the intensities of selected ions (masses) from the total ion current are shown and plotted as an intensity/time trace (chromatogram).
The evaluation of these mass chromatograms allows the exact determination of the detection limit above the signal/noise ratio of the substance-specific ion produced by a compound. With the SIM/MID mode this ion would be detected exclusively, but a complete mass spectrum for confirmation would not be available. In the case of complex chromatograms of real samples mass chromatograms offer the key to the isolation of co-eluting components so that they can be integrated perfectly and quantified.
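A minimal sketch of this idea follows, again assuming the full scan data are held as a scans-by-masses NumPy array. The function names, the intensity values and the signal/noise definition used here (peak height above the mean base line, divided by the standard deviation of a peak-free region) are assumptions chosen for illustration; data systems may define the ratio differently.

```python
import numpy as np

def mass_chromatogram(intensities, mz, target_mz, tol=0.5):
    """Intensity/time trace of the ions within target_mz +/- tol."""
    intensities = np.asarray(intensities, dtype=float)
    mz = np.asarray(mz, dtype=float)
    selected = np.abs(mz - target_mz) <= tol
    return intensities[:, selected].sum(axis=1)

def signal_to_noise(trace, peak_scans, noise_scans):
    """Peak height above the mean base line divided by the base line noise."""
    trace = np.asarray(trace, dtype=float)
    baseline = trace[noise_scans].mean()
    noise = trace[noise_scans].std(ddof=1)
    return (trace[peak_scans].max() - baseline) / noise

# Hypothetical data: m/z 73 is fluctuating chemical background,
# m/z 146 is the substance-specific ion eluting in scans 4 to 6.
mz = np.array([73, 146])
scans = np.array([
    [1000,   2],
    [1040,   4],
    [ 960,   2],
    [1020,   4],
    [ 980,  60],
    [1010, 120],
    [ 990,  50],
    [1000,   3],
])

noise_region = slice(0, 4)
peak_region = slice(4, 7)

tic = scans.sum(axis=1)
ion_trace = mass_chromatogram(scans, mz, 146)

sn_tic = signal_to_noise(tic, peak_region, noise_region)
sn_ion = signal_to_noise(ion_trace, peak_region, noise_region)
print(f"S/N in TIC: {sn_tic:.1f}, S/N in mass chromatogram: {sn_ion:.1f}")
```

With these invented numbers the peak is barely visible in the total ion current but stands out clearly in the mass chromatogram of the specific ion.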
In the analysis of lemons for residues from plant protection agents a co-elution situation was discovered by data acquisition in the full scan mode of the ion trap detector and was evaluated using a mass chromatogram.
The routine testing with an ion trap GC/MS system gives a trace which differs from that using an element-specific NPD detector (Fig. 3.2). A large number of different peaks appear in the retention region which indicates the presence of Quinalphos as the active substance in the NPD evaluation (Fig. 3.3). The Quinalphos peak has a shoulder on the left side and is closely followed by another less intense component. From the mass chromatograms of the characteristic individual masses (fragment ions) it can be deduced that a second active substance elutes within the total ion current peak (Fig. 3.4). Unlike NPD detection, with GC/MS analysis it becomes clear after evaluating the mass chromatogram and mass spectra that the co-eluting substance is Chlorfenvinphos.
Fig. 3.2 Analysis of a lemon extract using the NPD as detector. The chromatogram shows the elution of a plant protection agent component as well as the internal standard
Fig. 3.3 Confirmation of the identity of a lemon extract by GC/MS. The total ion current clearly shows the questionable peak with shoulder mass spectra. A co-eluting second active substance, Chlorfenvinphos, gives rise to the shoulder.
Fig. 3.4 Mass chromatograms for the specific masses show the co-elution of Quinalphos (m/z 146) and Chlorfenvinphos (m/z 267). The retention time range is identical with that in Fig. 3.3. The selective plot of the mass signals can be identified as part of the total ion current.
In routine analysis this evaluation is carried out by the data system. If the information on the retention time of an analyte, the mass spectrum, the selective quantifying mass and a valid calibration are supplied, a chromatogram can be evaluated in a very short time for a large number of components (Fig. 3.5, see also Section 3.3).
3.2 Substance Identification
Fig. 3.5 The phosphoric acid esters are successfully identified by a library comparison after extraction of the spectra by a background subtraction. (A) Quinalphos is confirmed by comparison with the NBS library (FIT value 863) (B) Chlorfenvinphos is confirmed by a comparison with the NBS library (FIT value 899)
Extraction of Mass Spectra
One of the great strengths of mass spectrometry is the immediate provision of direct information about an eluting component. The careful extraction of the substance-specific signals from the chromatogram is critical for reliable identity determination. For identification or confirmation of individual GC peaks recording mass spectra which are as complete as possible is an important basic prerequisite.
By plotting mass chromatograms co-elution situations can be discovered, as shown in Section 3.1.2. The mass chromatograms of selected ions give important information via their maximising behaviour. Only when maxima are shown at exactly the same time can it be assumed that the fragments observed originate from a single substance, i.e. from the same chemical structure. The only exception is the ideal simultaneous co-elution of compounds. If peak maxima with different retention times are shown by various ions, it must be assumed that there are co-eluting components (see Figs. 3.7 and 3.17).
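The maximising behaviour can be checked with a short sketch that locates the apex scan of each mass chromatogram and groups ions with a common apex. It is a simplified illustration, not the peak finder of any data system: the function names and intensities are hypothetical, and requiring the apex in exactly the same scan is a deliberately strict assumption.

```python
import numpy as np

def apex_scans(intensities, mz, scan_window):
    """Return {m/z: scan number of the maximum} within the scan window."""
    start, stop = scan_window
    window = np.asarray(intensities, dtype=float)[start:stop]
    return {int(m): start + int(np.argmax(window[:, j])) for j, m in enumerate(mz)}

def group_by_apex(apexes):
    """Group ions that maximise in the same scan."""
    groups = {}
    for m, scan in apexes.items():
        groups.setdefault(scan, []).append(m)
    return dict(sorted(groups.items()))

# Hypothetical intensities for four ions across one broad GC peak
mz = [110, 112, 117, 119]
scans = np.array([
    [ 5,  3,  2,  2],
    [40, 26,  5,  5],
    [90, 58, 30, 29],
    [60, 39, 80, 77],
    [20, 13, 50, 48],
    [ 5,  3, 10, 10],
])

groups = group_by_apex(apex_scans(scans, mz, (0, 6)))
print(groups)
```

Two groups with different apex scans, as in this invented example, point to two co-eluting components rather than a single substance.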
Manual Spectrum Subtraction
By subtraction of the background or the co-elution spectra before or behind a questionable GC peak the mass spectrum of the substance sought is extracted from the chromatogram as free as possible from other signals. All substances co-eluting with an unknown substance including the matrix components and column bleed are described in this context as chemical background. The differentiation between the substance signals and the background and its elimination from the substance spectrum is of particular importance for successful spectroscopic comparison in a library search. In the example of the GC/MS analysis of lemons for plant protection agents described above this procedure is used to determine the identity of the active substances.
The possibilities for subtraction of mass spectra are shown in the following real example of the analysis of volatile halogenated hydrocarbons by purge and trap-GC/MS. Figure 3.6 shows part of a total ion current chromatogram. The peak marked with X shows a larger width than that of the neighbouring components. On closer inspection of the individual spectra in the peak it can be seen that in the rising slope of the peak ions with m/z 39, 75, 110 and 112 dominate. As the elution of the peak continues other ion signals appear. The ions with m/z 35, 37, 82, 84, 117, 119 and 121 appear in increased strength, while the previously dominant signals decrease. Figure 3.7 shows this situation using the continuous presentation of the individual mass spectra in the marked GC peak.
Fig. 3.6 Chromatogram of an analysis of volatile halogenated hydrocarbons by purge and trap GC/MS. The component marked with X has a larger half width than the neighbouring peaks
Fig. 3.7 Continuous plotting of the mass spectra in the peak marked X in Fig. 3.6
From the individual mass spectra (Fig. 3.7) it can be recognised that some signals obviously belong together. Figure 3.8 shows the mass chromatograms of the ions m/z 110/112 and m/z 117/119 (as a sum in each case) above the total ion current from the detected mass range of m/z 33-260. The mass chromatograms show an intense GC peak at the questionable retention time in each case. The peak maxima are not superimposed but are slightly shifted relative to each other. This is an important indication of the co-elution of two components (Fig. 3.9).
Fig. 3.8 Mass chromatograms for m/z 110/112 and m/z 117/119 shown above a total ion current chromatogram
Fig. 3.9 Analysis of a co-elution situation by inclusion of other fragment ions
Fig. 3.10 Plot of a peak with selected areas for spectral subtraction
If other ions are included in this first mass analysis, it can be concluded from their common maximising behaviour which fragments belong together.
After the individual mass signals have been assigned to the two components, the extraction of the spectrum of each compound can be performed. Figure 3.10 shows the division of the peak into the front peak slope A and the back peak slope B. With the background subtraction function contained in all data systems the spectra within each of the areas A and B are added, and the two sums are then subtracted from one another.
The subtraction of the areas A and B gives the clean spectra of the co-eluting analytes. In Fig. 3.11 the subtraction A – B shows the spectrum of the first component, which is shown to be 1,3-dichloropropene (Fig. 3.12) from a library comparison. The reverse procedure, i.e. the subtraction B – A, gives the identity of the second component (Figs. 3.13 and 3.14).
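The subtraction of two peak areas can be sketched as follows. This is an illustration only and not the routine of any particular data system: the function name and intensities are hypothetical, the spectra of each area are averaged rather than summed so that areas with different numbers of scans remain comparable, and negative differences are set to zero.

```python
import numpy as np

def subtract_spectra(intensities, sample_scans, background_scans):
    """Average the spectra of each area and subtract background from sample."""
    data = np.asarray(intensities, dtype=float)
    sample = data[sample_scans].mean(axis=0)
    background = data[background_scans].mean(axis=0)
    # Negative intensities have no physical meaning, so clip them to zero
    return np.clip(sample - background, 0.0, None)

# Hypothetical intensities for m/z 110, 112, 117, 119 across one broad peak
mz = [110, 112, 117, 119]
scans = np.array([
    [ 5,  3,  2,  2],
    [40, 26,  5,  5],
    [90, 58, 30, 29],
    [60, 39, 80, 77],
    [20, 13, 50, 48],
    [ 5,  3, 10, 10],
])

area_a = slice(1, 3)  # front peak slope A (scans 1 and 2)
area_b = slice(3, 5)  # back peak slope B (scans 3 and 4)

a_minus_b = subtract_spectra(scans, area_a, area_b)  # first eluting component
b_minus_a = subtract_spectra(scans, area_b, area_a)  # second eluting component

print(dict(zip(mz, a_minus_b)))
print(dict(zip(mz, b_minus_a)))
```

The example also makes visible why substance signals can be reduced by subtraction: any ion present in both areas loses intensity in the resulting spectrum.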
A further frequent use of spectrum subtraction allows the removal of background signals caused by the matrix or column bleed. Figure 3.15 shows the elution of a minor component from the analysis of volatile halogenated hydrocarbons, which elutes in the region where column bleed begins. For background subtraction the spectra in the peak and from the region of increasing column bleed are subtracted from one another.
The result of the background subtraction is shown in Fig. 3.16. While the spectrum from the substance peak clearly shows column bleed with m/z 73 and 207 and a weak CO2 background at m/z 44, the resulting substance spectrum is free from the signals of the interfering chemical background after the subtraction. This clean spectrum can then be used for a library search in which it can be confirmed as 1,2-dibromo-3-chloropropane.
Fig. 3.11 Spectral subtraction of the areas A–B: SMP = spectra of the rising peak slope (A), sample; BKG = spectra of the falling peak slope (B), background; SMP-BKG = resulting spectrum of the component eluting first
Fig. 3.12 Identification of the first component by library comparison
In the subtraction of mass spectra it should generally be noted that in certain cases substance signals can also be reduced. In these cases it is necessary to choose another background area. If changes in the substance spectrum cannot be prevented in this way, the library search should be carried out with a small proportion of chemical noise left in the spectrum. In the library search programs of individual manufacturers there is also the possibility of editing the spectrum before the start of the search. In critical cases this option should also be used to remove known interfering signals resulting from the chemical noise from the substance spectrum.
Fig. 3.13 Spectral subtraction of the areas B–A: SMP = spectra of the falling peak slope (B), sample; BKG = spectra of the rising peak slope (A), background; SMP-BKG = resulting spectrum of the component eluting second
Fig. 3.14 Identification of the second component by library comparison
Fig. 3.15 The total ion current chromatogram of an analysis of volatile halogenated hydrocarbons shows the elution of a minor component at the beginning of column bleed (the areas of background subtraction are shown in black)
Fig. 3.16 Result of background subtraction from Fig. 3.15: SMP = spectra from the substance peak, sample; BKG = spectra from the background (column bleed); SMP-BKG = resulting substance spectrum
Deconvolution of Mass Spectra
The advancements in full scan sensitivity, especially in ion trap and time-of-flight MS instrumentation, as well as the increased application of fast and two-dimensional GC methods are creating a strong demand for post-acquisition deconvolution methods. The extraction of pure spectra of compounds co-eluting with other analytes or interferences by the conventional background subtraction methods of GC/MS data systems is of very limited use, because these methods cannot recognize the time-dependent behaviour of the ion intensities of multiple compounds eluting close together.
Fig. 3.17 Deconvolution example (Meruva 2006)
An automated mass spectra deconvolution and identification system (AMDIS) was developed at the National Institute of Standards and Technology (NIST) with the support of the Special Weapons Agency of the Department of Defense, for the critical task of verifying the Chemical Weapons Convention ratified by the United States Senate in 1997. In order to meet the rigorous requirements for this purpose, AMDIS was tested against more than 30,000 GC/MS data files accumulated by the EPA Contract Laboratory Program without a single false positive for the target set of known chemical warfare agents. While this level of reliability may not be required for all laboratories, this shows the degree to which the algorithms have been tested. After two years of development and extensive testing it has been made available to the general analytical chemistry community for download on the Internet.
The AMDIS program analyses the individual ion signals and extracts and identifies the spectrum of each component in a mixture analyzed by GC/MS. The software comprises an integrated set of procedures for first extracting the pure component spectra from the chromatogram and then identifying the compound by means of a reference library.
The overall process involves four sequential steps in spectrum purification and identification:
1. Noise analysis by a complete analysis of noise signals with the use of this information for component perception. A correction for baseline drift is done for each component, so that the chromatogram does not need to have a flat baseline.
2. Component perception identifies the location of each of the eluted components on the retention time scale by investigating the elution peak profile.
3. True spectral “deconvolution” of the data. Even if there is no available constant background for subtraction, AMDIS extracts clean spectra. The extraction of closely coeluting components is possible even for analytes that peak within a single scan of each other in a wide range of each component’s concentration.
4. Library search for compound identification to match each deconvoluted spectrum to a reference library spectrum.
Unlike a traditional identification algorithm, AMDIS includes uncertainties in the decon- volution, purity, and retention times in the match factor. The final match factor is a measure of both the quality of the match and of the confidence in the identification.
AMDIS can operate as a “black box” chemical identifier, displaying all identifications that meet a user-selectable degree of confidence. Identification can be aided by internal standards and retention times. Retention index windows can also be employed when identifying target compounds, as can internal and external standards maintained in separate libraries. AMDIS reads GC/MS raw data files in the formats of the leading GC/MS manufacturers or is already integrated in the instrument data systems.
Fig. 3.18 Disulfoton, spectrum pure compound, NIST#: 118988
Fig. 3.19 Diazinone, spectrum pure compound, NIST#: 118996
Fig. 3.20 Coelution spectrum 3 : 1 (at max Disulfoton in Fig. 3.17)
Fig. 3.21 Coelution spectrum 2 : 1 (at max Diazinone in Fig. 3.17)
Fig. 3.22 Coelution spectrum 1 : 1 (at right peakside in Fig. 3.17)
Fig. 3.23 Deconvolution of overlapping peaks in GCxGC/TOF-MS. Every vertical line indicates the peak of an identified component (Dimandja 2004, reprinted with permission from Analytical Chemistry, Copyright 2004 by American Chemical Society).
With its unique deconvolution algorithms AMDIS has proven its capabilities for the efficient removal of overlapping interferences in many GC/MS applications. The deconvolution process turned out to be independent from the type of analyzer and scan rate used to resolve overlapping peaks for substance identification as well as multi component residue analysis (Dimandja 2004, Mallard 2005, Zhang 2006). Without time consuming manual data evaluation AMDIS provides sensitive compound information even with complex background present.
AMDIS has been designed to reconstruct “pure component” spectra from complex GC/MS chromatograms even when components are present at trace levels. For this purpose, observed chromatographic behavior is used along with a range of noise-reduction methods. AMDIS is distributed with specialized libraries (environmental, flavor and fragrance, and drugs and toxins) that were derived from the NIST Library. AMDIS has a range of other features, including the ability to search the entire NIST Library with any of the spectra extracted from the original data file. It can also employ retention index windows when identifying target compounds and can make use of internal and external standards maintained in separate libraries. A history list of selected performance standards is also maintained.
As of version 2.62, AMDIS reads data files in the following formats:
Bruker (*.MSF)
Finnigan GCQ (*.MS)
Finnigan INCOS (*.MI)
Finnigan ITDS (*.DAT)
HP ChemStation (*.D)
HP MS Engine (*.MS)
INFICON GCMS (*.acq)
JEOL/Shrader (*.lrp)
Kratos Mach3 (*.run)
MassLynx NT (*.*)
Micromass (*.)
NetCDF (*.CDF)
PE Turbo Mass (*.raw)
Saturn SMS (*.sms)
Shimadzu MS (*.R##)
Shrader/GCMate (*.lrp)
Thermo Xcalibur Raw (*.raw)
Varian Saturn (*.MS)
The Retention Index
If the chromatographic conditions are kept constant, the retention times of the compounds remain the same. All identification concepts using classical detectors function on this basis. The retention times of compounds, however, can change through ageing of the column and more particularly through differing matrix effects.
The measurement of the retention times relative to a co-injected standard can help to overcome these difficulties. Fixed retention indices (RI) are assigned to these standards. An analyte is included in a retention index system with the RI values of the standards eluting before and after it. It is assumed that small variances in the retention times affect both the analyte and the standards so that the RI values calculated remain constant.
The first retention index system to become widely used was developed by Kovats. In this system a series of n-alkanes is used as the standard. Each n-alkane is assigned the value of the number of carbon atoms multiplied by 100 as the retention index (pentane 500, hexane 600, heptane 700 etc.). For isothermal operations the RI values for other substances are calculated as follows:
$$ RI = 100\,c + 100\,\frac{\log (t'_R)_x - \log (t'_R)_c}{\log (t'_R)_{c+1} - \log (t'_R)_c} \tag{24a} $$

Here x denotes the substance, and c and c+1 are the numbers of carbon atoms of the n-alkanes eluting before and after it. The $t'_R$ values give the retention times of the standards and the substance corrected for the dead time $t_0$ ($t'_R = t_R - t_0$). As the dead time is constant in the cases considered, uncorrected retention times are mostly used. The determination of the Kovats indices (Fig. 3.24) can be carried out very precisely and is reproducible within ±10 units when compared between different laboratories. In libraries of mass spectra the retention indices are also given (see the terpene library by Adams, the pesticide library by Ockels, the toxicology library by Pfleger/Maurer/Weber).
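The logarithmic interpolation between two n-alkanes can be written as a small function. This is a sketch under stated assumptions: the function name `kovats_index` and the retention times in the example are hypothetical, and the dead time is subtracted only if one is supplied.

```python
import math

def kovats_index(t_x, t_c, t_c1, c, dead_time=0.0):
    """Isothermal Kovats index of a substance eluting between two n-alkanes.

    t_x   retention time of the substance
    t_c   retention time of the n-alkane with c carbon atoms (elutes before)
    t_c1  retention time of the n-alkane with c + 1 carbon atoms (elutes after)
    """
    # Correct all retention times for the dead time
    tx, tc, tc1 = (t - dead_time for t in (t_x, t_c, t_c1))
    if not (0 < tc <= tx <= tc1) or tc == tc1:
        raise ValueError("substance must elute between the two n-alkanes")
    fraction = (math.log10(tx) - math.log10(tc)) / (math.log10(tc1) - math.log10(tc))
    return 100 * c + 100 * fraction

# Hypothetical adjusted retention times: hexane 4.0 min, heptane 8.0 min.
# A substance at the geometric mean of both times lies halfway on the log scale.
ri = kovats_index(4.0 * math.sqrt(2.0), 4.0, 8.0, c=6)
print(round(ri, 1))
```

A substance co-eluting with hexane would receive the index 600 and one co-eluting with heptane the index 700, in line with the definition of the scale.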
On working with linear temperature programs a simplification is used which was introduced by Van den Dool and Kratz, whereby direct retention times are used instead of the logarithmic terms used by Kovats:
$$ RI = 100\,c + 100\,\frac{(t'_R)_x - (t'_R)_c}{(t'_R)_{c+1} - (t'_R)_c} \tag{24b} $$
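For temperature-programmed runs the linear interpolation can be sketched together with a lookup of the bracketing n-alkanes. The function names, the dictionary layout of the alkane ladder and the retention times are assumptions for illustration; the sketch also assumes that consecutive n-alkanes are present in the standard mixture.

```python
from bisect import bisect_right

def linear_retention_index(t_x, t_c, t_c1, c):
    """Linear (Van den Dool and Kratz type) index between two n-alkanes."""
    if not (t_c <= t_x <= t_c1) or t_c == t_c1:
        raise ValueError("substance must elute between the two n-alkanes")
    return 100 * c + 100 * (t_x - t_c) / (t_c1 - t_c)

def index_from_ladder(t_x, ladder):
    """Find the bracketing n-alkanes in {carbon number: retention time}."""
    carbons = sorted(ladder)
    times = [ladder[c] for c in carbons]
    i = bisect_right(times, t_x) - 1
    if i < 0 or i >= len(carbons) - 1:
        raise ValueError("retention time lies outside the n-alkane ladder")
    if carbons[i + 1] != carbons[i] + 1:
        raise ValueError("sketch assumes consecutive n-alkanes")
    return linear_retention_index(t_x, times[i], times[i + 1], carbons[i])

# Hypothetical retention times (min) of decane, undecane and dodecane
ladder = {10: 8.0, 11: 10.0, 12: 12.5}

print(index_from_ladder(9.0, ladder))
print(index_from_ladder(11.25, ladder))
```

Because only differences of retention times enter the linear formula, a constant dead time cancels out, which fits the remark that uncorrected retention times are mostly used.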
Fig. 3.24 Determination of the Kovats index for a substance X by interpolation between two n-alkanes (after Schomburg)
The weakness of retention index systems lies in the fact that not all analytes are affected by variances in the measuring system to the same extent. For these special purposes homologous series of substances which are as closely related as possible have been developed. For use in trace analysis in environmental chemistry and particularly for the analysis of plant protection agents and chemical weapons, the homologous M-series (Fig. 3.25) of n-alkylbis(trifluoromethyl)phosphine sulfides has been synthesised.
Fig. 3.25 n-Alkylbis(trifluoromethyl)phosphine sulfides (M series with n = 6, 8, 10, ..., 20)
The molecule in the M-series contains active groups which also respond to the selective detectors ECD, NPD, FPD and PID and naturally also give good responses in FID and MS (detection limits: ECD ca. 1 pg, FID ca. 300 pg, Fig. 3.26). In the mass spectrometer all components of the M-series show intense characteristic ions at M-69 and M-101 and a typical fragment at m/z 147 (Fig. 3.27). The M-series can be used with positive and negative chemical ionisation.
Fig. 3.26 Chromatograms of the M series and of pesticides (phosphoric acid esters) on columns of different polarities (after HNU/Nordion). Carrier gas He, detector NPD, program: 50 °C (2 min), 150 °C (20°/min), 270 °C (6°/min).
Components: M series M6, M8, ..., M20, 1 Dimethoate, 2 Diazinon, 3 Fenthion, 4 Trichloronate, 5 Bromophos-ethyl, 6 Ditalimphos, 7 Carbophenothion
Fig. 3.27 M series: EI Mass Spectrum of the component M10 (HNU)
The use of retention indices in spite of, or perhaps because of, the wide use of GC/MS systems is now becoming more important again as a result of the outstanding stability of fused silica capillaries and the good reproducibility of gas chromatographs now available. The broadening of chromatography data systems with optional evaluation routines is just beginning. These are especially dedicated to the processing of retention indices, e.g. for two-column systems.
If the retention index of a compound is not known, it can also be estimated from empirical considerations of the elements and partial structures present in the molecule (Tables 3.1 and 3.2). A first approximation can already be made using the empirical formula of an analyte. This is particularly valuable for assessing suggestions from the GC/MS library search because, besides good correspondence to a spectrum, plausibility with regard to the retention behaviour can be tested (see also Section 4.14). According to Weber (1992) the values determined give a correct estimation within 10 %.
[Table: Comparison of calculated retention indices with empirically determined values. Columns: Substance, Determined RI, Calculated RI. Entries include Atrazine.]
[Table: Examples of retention index calculation. Columns: Contribution, Total. Rows: 14 C 100; 14 H –; 1 Cl; 1 tert-C; Sum.]
Polar groups with hydrogen bonding increase the boiling point of a compound and are thus responsible for stronger retention. For the second and every additional polar group the retention index increases by 150 units. Branches in the molecule increase the volatility. For each quaternary carbon atom present in a t-butyl group the retention index is reduced by 100 units. Values can be estimated with higher precision from retention indices for known structures by calculating structure elements according to Tables 3.1 and 3.2.
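The estimation rules in this paragraph can be turned into a small worked sketch. It is an illustration only: the function name is hypothetical, the element contributions must be supplied by the user from Table 3.1, and the values used in the example (100 per C, N or O atom, 0 per H or F atom) are an assumption based on the surviving fragments of the calculation example and should be checked against the original table. The corrections of +150 for the second and every further polar group and of –100 per quaternary carbon in a t-butyl group are taken from the text.

```python
def estimate_retention_index(atom_counts, contributions,
                             polar_groups=0, tert_butyl_groups=0):
    """Rough retention index estimate from the empirical formula.

    atom_counts    e.g. {"C": 14, "H": 14}
    contributions  index contribution per atom of each element (Table 3.1)
    """
    total = 0
    for element, count in atom_counts.items():
        if element not in contributions:
            raise KeyError(f"no index contribution supplied for {element}")
        total += count * contributions[element]
    # The second and every additional polar group adds 150 units
    total += 150 * max(polar_groups - 1, 0)
    # Each quaternary carbon in a t-butyl group lowers the index by 100 units
    total -= 100 * tert_butyl_groups
    return total

# Assumed contributions; values for other elements have to be added by the user
contributions = {"C": 100, "N": 100, "O": 100, "H": 0, "F": 0}

# Hypothetical hydrocarbon C14H14 with one t-butyl group
ri_a = estimate_retention_index({"C": 14, "H": 14}, contributions,
                                tert_butyl_groups=1)
# Hypothetical diol C8H18O2 with two polar groups
ri_b = estimate_retention_index({"C": 8, "H": 18, "O": 2}, contributions,
                                polar_groups=2)
print(ri_a, ri_b)
```

Such an estimate is only meant as a plausibility check for library search suggestions, within the accuracy of about 10 % quoted from Weber (1992).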
Table 3.1 Contributions for the determination of the retention index from the empirical formula (Weber 1992).
Element
Index contribution
H, F C, N, O
Si in Si(CH 3 ) 3 0
Table 3.2 Retention behaviour of structural isomers (Weber 1992).
Alkyl branches: tertiary < secondary < n-alkyl
Disubstituted aromatics: ortho < meta, para
Libraries of Mass Spectra
In electron impact ionisation (70 eV) a large number of fragmentation reactions take place with organic compounds. These are independent of the manufacturer’s design of the ion source. The focusing of the ion source has a greater effect on the characteristics of a mass spectrum, which leads to a particularly wide adjustment range, especially in the case of quadrupole analysers. The relative intensities of the higher and lower mass ranges can easily be reversed. In the early days of use of quadrupole instruments this possibility was highly criticised by those using the established magnetic instruments. The problem is resolved in so far as both the manual and the automatic tuning of the instruments are aimed at giving the intensities of a reference compound (in contrast SIM tuning aims to give high sensitivity within a specific mass range, see Section 2.3.5.2). Perfluorotributylamine (FC43) is used as the reference substance in all GC/MS systems. Other influences on the mass spectrum in GC/MS systems are caused by the changing substance concentration during the mass scan (beam instruments). On running spectra over a large mass range (e.g. in the case of methyl stearate with a scan of 50 to 350 u in 1 s) sharp GC peaks lead to a mismatch of intensities (skewing) between the front and back slopes of the peak. The skew of intensities is thus the opposite of the true situation. This effect can only be counteracted by the use of fast scan rates, which, however, result in lower sensitivity. In practice standardised spectra must be used for these systems in order to calculate the compensation (for background subtraction see Section 3.2.1). Ion trap mass spectrometers do not show this reversal of intensities because there is parallel storage of all the ions formed. A mass spectrum should therefore not be regarded as naturally constant, but as the result of an extremely complex process.
In practice the variations observed affect the relative intensities of particular groups of ions in the mass spectrum. The fragmentation processes themselves are not affected (the same fragments are found with all GC/MS instruments) nor are the isotope ratios which result from natural distribution. Only adherence to a parameter window which is as narrow as possible (so-called standard conditions) during data acquisition creates the desired independence from the external influences described.
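The skewing described above for beam instruments can be illustrated with a simple model. The sketch rests on several assumptions that are not taken from the text: the GC peak is Gaussian, the scan runs linearly from the lowest to the highest mass, and the two ions and their true relative intensities (100 : 10) are hypothetical. Only the scan range of 50 to 350 u and the scan time of 1 s follow the example given.

```python
import numpy as np

def recorded_intensities(true_intensities, mz, scan_start, scan_time,
                         peak_center, peak_sigma, mz_start=50.0, mz_end=350.0):
    """Intensities registered in one scan while the concentration changes."""
    mz = np.asarray(mz, dtype=float)
    # Time at which each mass is registered (assumed low-to-high linear scan)
    register_time = scan_start + scan_time * (mz - mz_start) / (mz_end - mz_start)
    # Relative substance concentration at that moment (Gaussian GC peak)
    concentration = np.exp(-0.5 * ((register_time - peak_center) / peak_sigma) ** 2)
    return np.asarray(true_intensities, dtype=float) * concentration

def high_to_low_ratio(spectrum):
    """Ratio of the high-mass ion to the low-mass ion."""
    return spectrum[-1] / spectrum[0]

mz = [74, 298]
true = [100.0, 10.0]  # hypothetical true ratio high/low = 0.1

# Sharp GC peak with its apex at 10.0 s and sigma = 1.0 s
front = recorded_intensities(true, mz, 8.5, 1.0, 10.0, 1.0)        # rising slope
back = recorded_intensities(true, mz, 10.5, 1.0, 10.0, 1.0)        # falling slope
front_fast = recorded_intensities(true, mz, 8.5, 0.1, 10.0, 1.0)   # 0.1 s scan

ratio_front = high_to_low_ratio(front)
ratio_back = high_to_low_ratio(back)
ratio_front_fast = high_to_low_ratio(front_fast)
print(ratio_front, ratio_back, ratio_front_fast)
```

Under these assumptions the high-mass ion is overrepresented on the rising slope and underrepresented on the falling slope, and the distortion shrinks when the scan time is shortened.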
The comparability of the mass spectra produced is thus ensured for building up libraries of mass spectra. All commercially available libraries were run under the standard conditions mentioned and allow the comparison of the fragmentation pattern of an unknown substance with those available from the library. For the large universal libraries it should be assumed that most of the spectra were initially not run with GC/MS systems, and that still today many reference spectra are run using a solid sample inlet or similar inlet techniques. For example, the reference spectrum of Aroclor 1260 (a mixture of PCBs with a 60 % degree of chlorination) can only be explained in this way. Information on the inlet system used is rarely found in library entries.
EI spectra are particularly informative because of their fragmentation patterns. All search processes through libraries of spectra are mainly based on EI spectra. With the introduction of the highly reproducible advanced chemical ionisation into ion trap mass spectrometers, the first commercial CI library with over 300 pesticides was produced (Finnigan, 1992). The introduction of substructure libraries (MS/MS product ion spectra) is currently ongoing (NIST). The commercially available libraries are divided into general extensive collections and special task-related collections with a narrow range of applications.
3.2.3.1 Universal Mass Spectral Libraries
NIST/EPA/NIH Mass Spectral Library
The NIST/EPA/NIH Mass Spectral Library is probably the most popular and most widely distributed library for GC/MS instruments. The 2005 edition considerably expands the number of EI mass spectra and adds Kovats retention indices. MS/MS mass spectra are increasingly included. The new edition of the NIST database involved extensive spectrum evaluation and quality control. Each spectrum was critically examined by experienced mass spectrometrists, and each chemical structure has been examined for correctness and consistency, using both human and computer methods. Spectra of stereoisomers have been intercompared, chemical names have been examined by experts and IUPAC names provided. CAS registry numbers have been verified.
The NIST library is available with the full-featured NIST MS Search Program, which also includes integrated tools for GC/MS spectral deconvolution (see Section 3.2.1, AMDIS), mass spectral interpretation tools with thermodynamics-based interpretation of fragmentation and chemical substructure analysis. The binary format has not changed from the 2002 version, although several new files have been added that associate equivalent compounds and link individual compounds to the retention index library. Raw data files are provided both in an SDFile format (structure and data together) and in earlier formats. The SDFile format holds the chemical structure as a MOLFile and the data in a simple ASCII format. The NIST MS Search Program is also part of many commercial instrumental GC/MS software suites.
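How structure and data sit together in an SDFile can be sketched with a minimal reader. It relies only on the general SDFile layout (a MOLFile block closed by "M  END", data items introduced by "> <FIELD>" lines, records terminated by "$$$$"). The field names NAME and PEAKS and the empty placeholder MOLFile in the sample are hypothetical and are not the field names used in the NIST files.

```python
def parse_sdfile(text):
    """Split SDFile text into (molfile_block, data_fields) records."""
    records = []
    for chunk in text.split("$$$$"):
        if not chunk.strip():
            continue
        if chunk.startswith("\n"):
            chunk = chunk[1:]  # drop the line break that follows "$$$$"
        molfile, _, data_part = chunk.partition("M  END")
        fields = {}
        name = None
        for line in data_part.splitlines():
            if line.startswith(">"):
                # Data header such as "> <NAME>": the field name sits in <...>.
                name = line[line.find("<") + 1:line.rfind(">")]
                fields[name] = []
            elif not line.strip():
                name = None  # a blank line closes the current data item
            elif name is not None:
                fields[name].append(line.strip())
        data = {key: "\n".join(values) for key, values in fields.items()}
        records.append((molfile + "M  END", data))
    return records

# Hypothetical record: placeholder structure block and invented field names.
SAMPLE = """benzene
  hypothetical example

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
> <NAME>
Benzene

> <PEAKS>
78 999
77 250

$$$$
"""

records = parse_sdfile(SAMPLE)
for molfile, data in records:
    print(data["NAME"], data["PEAKS"].splitlines())
```

A production reader would normally use a cheminformatics toolkit rather than string splitting; the sketch only shows how the two parts of a record are separated.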
The 2005 edition of the NIST database is characterized by (NIST 2005):
190,825 EI spectra of 163,198 unique compounds
111 average peaks/spectrum
98 median peaks/spectrum
163,198 compounds with EI spectra
18,592 compounds with replicate spectra
27,627 replicate spectra from high quality sources
163,195 chemical structures
11% increase in coverage to earlier version
5,191 MS/MS spectra of 1,943 unique ions
1,920 ions (1,628 cations and 292 anions)
25,728 compounds with 121,112 retention indices
120,786 experimental values with references
Structure-based RI estimates
The increase in the number of spectra was accomplished primarily by the addition of complete, high quality spectra either measured specifically for the library or taken from major practical collections, including:
Chemical Concepts – including Prof Henneberg’s industrial chemicals collection (Max-Planck-Institute for Coal Research, Mülheim, Germany), see details below
Georgia and Virginia Crime Laboratories
TNO Flavors and Fragrances
AAFS Toxicology Section, Drug Library, see details below
Association of Official Racing Chemists
St. Louis University Urinary Acids
VERIFIN & CBDCOM Chemical Weapons
The new addition of Kovats retention index values contains 121,112 Kovats retention index values for 25,893 compounds on non-polar columns, 12,452 of which are compounds represented in the electron ionization library. Full annotation is provided, including literature source and measurement conditions. These are provided in a format accessible by the NIST Search Program and separately as an ASCII SDFile.
The new addition of MS/MS spectra provides 5,191 spectra of 1,943 different ions (1,671 positive and 341 negative ions). A range of instruments is represented, including ion trap and triple quadrupole mass spectrometers. Spectra have been provided by contributors, measured at NIST and extracted from the literature. The collection also documents spectrum variations between instrument classes at different conditions. It was found that at sufficiently high signal-to-noise measurement conditions, modern instruments are capable of providing very reproducible, library searchable spectra. While collision energy can be an important variable, spectra vary in an understandable way depending on compound and instrument class and conditions. This library is provided in formats equivalent to the electron ionization library but with new fields added to describe the instrument and analysis conditions. A small number of MS1 spectra are also included for reference purposes. These generally contain the ions used for MS/MS.
Wiley Registry of Mass Spectral Data
The Wiley Registry (TM) of Mass Spectral Data has recently been published in its 8th Edition (Wiley 2006). It is one of the most comprehensive mass spectral libraries and is available in the file formats of many mass spectral data systems for applications in forensics, environmental analysis, toxicology, and homeland security.
The 8th edition of the Wiley Registry contains nearly 400,000 mass spectra with over 183,000 searchable structures sourced from leading laboratories throughout the world. Most spectra are accompanied by the structure and trivial name, molecular formula, molecular weight, nominal mass and base peak. New in the edition are:
Chemical warfare precursors
Combinatorial library compounds
High molecular diversity for fragmentation analysis
New high resolution organics
Structure and substructure searchable
Spectra with retention indices
Also available is the combination of the large Wiley Registry with the current NIST database. The Wiley Registry 8th Edition/NIST 2005 (W8/N05) provides the most extensive mass spectral library with:
532,573 mass spectra
319,256 mass spectra with assigned, searchable structures
More than 2 million names and synonyms
High resolution spectra with most new spectra containing over 125 peaks per spectrum
The W8/N05 provides comprehensive coverage of small molecule organics, pharmaceutical drugs, illegal drugs, poisons, pesticides, steroids, natural products, organic compounds, and chemical warfare agents for different applications:
Toxicology/Forensics/Public Health: The library contains a wide scope of spectra covering drugs, poisons, pesticides, and metabolites.
Industrial R&D/Quality Assurance: The library contains a comprehensive collection of small organic compounds and their metabolites, including a combinatorial library appropriate for fragmentation analysis.
Research/Teaching: The library contains data for fragmentation analysis as well as comprehensive coverage of most compounds measurable by GC/MS.
Environmental: The library contains most known pesticides and includes precursors being used in the production of new pesticide classes.
Available formats and compatibility: The library is available in two formats: Chemstation (Agilent) and NIST MS Search (Bruker, JEOL, LECO, Perkin Elmer, Thermo, Varian, Waters). Other formats are available on request.
The Palisade Complete Mass Spectra Library
The world’s largest commercial mass spectral database is offered by Palisade with more than 600,000 mass spectra. All commercially available reference spectra, including those in the NIST and Wiley libraries, are contained in a single library. New spectra and structure updates will be distributed annually and new spectral collections are announced on the Internet for downloading by subscribers.
The Palisade Complete 600K includes all spectra of the NIST 2002 and Wiley Registry collections plus over 150,000 new spectra available only through Palisade:
606,000 total spectra
Spectra type EI 70 eV
495,000 unique compounds
327,000 CAS assignments
437,000 compounds with structures
985,000 chemical names
350 Mb Disk storage space
3.2.3.2 Application Libraries of Mass Spectra
Mass Spectra of Geochemicals, Petrochemicals and Biomarkers
This database is focused on organic, geochemical, and petrochemical applications and comprises (De Leeuw 2004):
1,100 mass spectra of well-defined compounds.
Information including mass spectra, chemical structure, chemical name, molecular formula, molecular weight (nominal mass), base peak, reference, and measurement condition.
Chemical structures elucidated, if necessary, by a variety of techniques including NMR spectroscopy and single-crystal X-ray structure analysis (Wiley).
Chemical Concepts Library of Mass Spectra
The CC Mass Spectral Data collection (Chemical Concepts 2006) has been recently updated and consists of mass spectra of more than 40,000 compounds. It is included in the new release of the NIST 2005 library.
The main part of this mass spectra reference library comes from the Industrial Chemicals Collection of Prof. Henneberg, Max-Planck-Institut for Coal Research, Mülheim, Germany. Universities and institutes such as ETH, Zürich, Switzerland and ISAS, Dortmund, Germany have also contributed their research spectra to this collection. Prior to being included in the library the data pass consistency and quality checks performed at the Max-Planck-Institut.
Additional information included with the mass spectra are (Wiley):
Chemical structure
Chemical name
Molecular formula
Molecular weight (Nominal mass)
Base peak
Reference
Measurement condition
Alexander Yarkov – Mass Spectra of Organic Compounds
The new specialized data collection contains 37,055 mass spectra of physiologically active organic compounds. The data resulted from quality control in combinatorial synthesis and cover a wide range of compound classes.
Additional information included with the mass spectra are (Wiley):
Chemical structure
Chemical name
Molecular formula
Molecular weight (nominal mass)
Base peak
Reference
Measurement condition
Mass Spectra of Designer Drugs
This mass spectrum collection edited by Peter Rösner covers the entire range of designer drugs up to December 2006. It is the first database featuring systematic structures in depth. Carefully compiled by the mass spectral experts at the Regional Departments of Criminal Investigation in Kiel, Hamburg, and the Federal Criminal Laboratory in Wiesbaden, Germany, this database includes 67,321 mass spectra of 5,789 chemical compounds such as designer drugs and medicinal drugs. Chemical warfare agents are added due to the recent interest in homeland security. All data has been taken from both legal and underground literature, providing the most comprehensive picture of these compounds available worldwide. Highly potent hallucinogens such as Bromo-DragonFLY are covered (Wiley).
Mass Spectra of Volatiles in Food
This mass spectral database is dedicated to the application areas of the food and flavour industries, and was selected and quality controlled by the mass spectral experts at the Central Institute of Nutrition and Food Research in the Netherlands. The database is now available in its 2nd edition (Wiley).
Mass Spectral and GC Data of Drugs, Poisons, Pesticides, Pollutants and Their Metabolites
This specialized collection is dedicated to environmental and forensic analysis, occupational toxicology and food analysis and contains data obtained from clinical samples over the course of more than 20 years. It encompasses 7,500 potentially harmful substances, from simple analgesics to designer drugs, and from pesticides and pollutants to chemical warfare agents, including metabolites to allow the identification of the parent substance. Karl Pfleger is the former, and Hans H. Maurer the current, head of the Clinical Toxicology Laboratory at the clinical campus of Saarland University in Homburg, Germany. Together with Armin Weber, they have developed this unique and most comprehensive toxicological database.
In 2000, the 2nd edition was expanded by 2,000 new mass spectra to more than 6,300. Covered also in the printed hard cover edition parts 1–4:
Data of nearly all the new drugs relevant to clinical and forensic toxicology, doping control, food chemistry, etc.
Nearly complete coverage of trimethylsilylated, perfluoroacylated, perfluoroalkylated and methylated compounds.
Sections on sample preparation and GC-MS methods.
The new 3rd edition 2007 gives all spectra in order of their molecular mass, since this has become the prime benchmark criterion for the identification of unknown substances (Wiley).
Mass Spectra of Drugs, Pharmaceuticals and Metabolites
The collection edited by Rolf Kuehnle contains 2,200 mass spectra of drugs, pharmaceuticals and metabolites. The inclusion of the silylated derivatives as used for GC/MS analyses is of special value in this collection. The additional information included comprises chemical structure, chemical name, molecular formula, molecular weight (nominal mass), base peak and the reference (Wiley).
Mass Spectra of Pharmaceuticals and Agrochemicals 2006
The collection of 4,563 unreduced spectra includes compounds that are subject to the drug trafficking laws as well as their precursors, by-products and metabolites. Other compound groups included are medical drugs, drugs with psychotropic effect, anabolics and pesticides. Chemical structures, synonym and systematic name, molecular weight, formula and experimental conditions complete the data record (Wiley).
Mass Spectra of Androgenes, Estrogens and other Steroids
The collection edited by Hugh L. J. Makin contains 2,979 EI mass spectra of androgens and estrogens and their trimethylsilyl-, O-methoxyoxime- and acetal derivatives. Each spectrum is accompanied by the structure and trivial name, molecular formula, molecular weight, nominal mass and base peak. All spectra of androgens and estrogens have been obtained on the same mass spectrometer under identical conditions (Wiley).
AAFS Drug Library
The American Academy of Forensic Sciences, Toxicology Section, committee was set up to coordinate the generation of reliable mass spectra of new drugs and metabolite standards, and to make these available to the profession on a timely basis. The mass spectral database as zip file and the list of entries is available for download from the Internet.
This library is a “subset” of one that has been compiled over a period of many years by Dr. Graham Jones and colleagues in Edmonton, Alberta, Canada. Pure drug spectra, plus GC breakdown products and pure metabolite standards have been edited into this compilation of over 2,300 mass spectra, including many replicate entries. All spectra were run on Agilent quadrupole GC/MS instruments tuned against PFTBA. The current version of the full spectra library was last updated March 2006 (AAFS).
The Lipid Library
The Archives of Mass Spectra by W.W. Christie comprise approx. 1670 mass spectra in total. They are made available on the web for study but without interpretation for the following compound groups:
Methyl esters of fatty acids
Picolinyl esters
DMOX derivatives
Pyrrolidine derivatives
Miscellaneous fatty acid derivatives, lipids, artefacts, etc.
All the mass spectra illustrated in these pages were obtained by electron-impact ionization at an ionization potential of 70 eV on quadrupole mass spectrometers. The website also offers the Bibliography of Mass Spectra with lists of references mainly concerning the use of mass spectrometry for structural analysis of fatty acids.
3.2.4 Library Search Procedures
In general it is expected that the identity of an unknown compound will be found in a library search procedure. However, it is better to consider the results of a search procedure from the aspect of similarity between the reference and the unknown spectrum. Other information for confirmation of identity, such as retention time, processing procedure, and other spectroscopic data, should always be consulted. A short review in the journal Analytical Chemistry (W. Warr, 1993) began with the sentence ‘Library searching has limitations and can be dangerous in novice hands’. Examples of critical cases are different compounds which have the same spectra (isomers), the same compounds with different spectra (measuring conditions, reactivity, decomposition) or the fact that a substance being searched for is not in the library but similar spectra are suggested. In particular the limited scope of the libraries must be taken into account. According to a short press publication by the Chemical Abstracts Service in 1994 the total of CAS registry numbers had passed the 12 million mark. Every year ca. 600 000 new compounds are added!
As a result of the different software equipment used in current benchtop GC/MS systems two search procedures have become widely established: INCOS and PBM. The SISCOM procedure (Search for Identical and Similar Compounds) developed by Henneberg/Weimann is also available on stand-alone workstations. It stands out on account of its excellent performance for data-system-supported interpretation of mass spectra. The procedures for determination of similarity between spectra are based on very different considerations. The INCOS and PBM procedures aim to give suggestions of possible substances to explain an unknown spectrum. Both algorithms dominate in the qualitative evaluation using magnetic sector, quadrupole and ion trap GC/MS systems. Other search procedures, such as the Biemann search, have been replaced by newer developments and broadening of the algorithms by the manufacturers of spectrometers.
The newest development in the area of computer-supported library searches, a further development of the INCOS procedure, has been presented by Steven Stein (NIST). It relies on targeted optimisation of the weighting and combination with probability values. An improvement in the hit rate was demonstrated in a comparison of test procedures with more than 12,000 spectra.
3.2.4.1 The INCOS/NIST Search Procedure
At the beginning of the 1970s the INCOS company (Integrated Control Systems) presented a search procedure which operated both on the principles of pattern recognition and with the components of classical interpretation techniques and which could reliably process data from different types of mass spectrometer. The early years of GC/MS were characterised by the rapid development of quadrupole instruments which were ideal for coupling with gas chromatographs, because of their scan rates, which were high compared with the magnetic sector instruments of that time. The spectral libraries then available had been drawn up from spectra run on magnetic sector instruments.
From the beginning the INCOS procedure was able to take into account the relatively low intensities of the higher masses in spectra run on quadrupole systems, besides the typical high mass intensity magnetic sector spectra. The INCOS search has remained virtually unchanged since the 1970s. The search is known for its high hit probability, even with mass spectra with a high proportion of matrix noise obtained in residue analysis, and its complete independence from the type of instrument.
After a significance weighting (square root of the product of the mass and the intensity) and data reduction by a noise filter and a redundancy filter, the extensive reference database is searched for suitable candidates for a pattern comparison in a rapid pre-search. The presence of up to eight of the most significant masses counts as an important starting criterion. The intensity ratios are not yet considered. Only those reference spectra which contain at least eight of the most significant masses of the unknown spectrum are initially considered. Depending on the requirements of the user, spectra with fewer than eight matching masses are also further processed (Fig. 3.28 E: pre-search report, the number of candidates is marked with **). For this purpose the parameter ‘minimum number to search’ must be adapted by the user at the start of the search. Reference spectra which contain only a small number of matching masses or none at all, or whose molecular weight does not match an optional suggestion, are excluded from the list of possible candidates and are not further processed.
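The weighting and pre-search steps described above can be sketched in a few lines. This is an illustration under stated assumptions, not the INCOS implementation: the noise and redundancy filters are omitted, spectra are plain dictionaries of m/z to intensity, and the function names, the library entries and all peak values are hypothetical.

```python
import math

def significance(spectrum):
    """Weight each peak as sqrt(mass * intensity), as stated in the text."""
    return {mz: math.sqrt(mz * inten) for mz, inten in spectrum.items()}

def most_significant_masses(spectrum, n=8):
    """Return the n masses with the highest significance weight."""
    weights = significance(spectrum)
    return sorted(weights, key=weights.get, reverse=True)[:n]

def pre_search(unknown, library, min_matches=8, molecular_weight=None):
    """Select candidates by counting shared significant masses.

    Intensity ratios are deliberately ignored at this stage.
    """
    key_masses = set(most_significant_masses(unknown))
    candidates = []
    for name, entry in library.items():
        if molecular_weight is not None and entry["mw"] != molecular_weight:
            continue  # optional molecular weight suggestion does not match
        matches = len(key_masses & set(entry["peaks"]))
        if matches >= min_matches:
            candidates.append((name, matches))
    return sorted(candidates, key=lambda c: c[1], reverse=True)

# Hypothetical unknown spectrum and reference entries (m/z: intensity).
UNKNOWN = {50: 150, 51: 200, 52: 180, 74: 100, 75: 90,
           76: 300, 77: 60, 152: 999, 153: 130, 154: 20}
LIBRARY = {
    "reference A": {"mw": 152, "peaks": {50: 140, 51: 210, 52: 170, 74: 110,
                                         75: 80, 76: 310, 152: 999, 153: 125}},
    "reference B": {"mw": 154, "peaks": {51: 120, 76: 250, 77: 300,
                                         153: 400, 154: 999}},
    "reference C": {"mw": 92, "peaks": {39: 150, 65: 120, 91: 999, 92: 600}},
}

print(pre_search(UNKNOWN, LIBRARY))                 # strict: 8 matching masses
print(pre_search(UNKNOWN, LIBRARY, min_matches=3))  # relaxed minimum number
```

Lowering `min_matches` plays the role of the ‘minimum number to search’ parameter: more reference spectra pass into the main search at the cost of a longer candidate list.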
The main search is the critical step in the INCOS algorithm, in which the candidates found in the pre-search are compared with the unknown spectrum and arranged in a prioritised list of suggestions. Of critical importance for the tolerance of the INCOS procedure for different types of mass spectrometer and marginal conditions of data acquisition (and thus for the high hit rate) is a process known as local normalisation.
Local normalisation introduces an important component into the search procedure which is comparable to the visual comparison of two patterns (Fig. 3.29). Individual clusters of ions and isotope patterns are compared with one another in a local mass window. The central mass of such a window from the reference spectrum is compared with the intensity of this mass in the unknown spectrum in order to assess the matching of the line pattern in windows a few masses to the left and right. In this way the nearby region of each mass signal is examined and, for example, the matching of isotope patterns (Cl, Br, S, Si for example) and cleavage reactions are assessed.
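The idea of comparing patterns within a local mass window can be illustrated as follows. The scoring formula, the window half-width and the spectra are assumptions chosen for illustration; the text does not give the actual INCOS calculation. The unknown spectrum is scaled so that its intensity at the central mass equals that of the reference, and only the neighbouring masses are then compared.

```python
def local_window_match(reference, unknown, center, half_width=2):
    """Similarity (0..1) of the peak pattern around `center` after local scaling."""
    ref_c = reference.get(center, 0.0)
    unk_c = unknown.get(center, 0.0)
    if ref_c == 0.0 or unk_c == 0.0:
        return 0.0  # central mass missing in one of the spectra
    scale = ref_c / unk_c  # local normalisation on the central mass
    diff = 0.0
    total = 0.0
    for mz in range(center - half_width, center + half_width + 1):
        r = reference.get(mz, 0.0)
        u = unknown.get(mz, 0.0) * scale
        diff += abs(r - u)
        total += max(r, u)
    return 1.0 - diff / total

# Hypothetical monochloro compound: isotope cluster at m/z 112/114.
REFERENCE = {77: 450, 112: 999, 114: 325}
# Same compound with the high-mass region attenuated (skewed spectrum).
SKEWED = {77: 999, 112: 500, 114: 163}
# Different isotope pattern in the same window.
OTHER = {112: 500, 114: 480}

print(round(local_window_match(REFERENCE, SKEWED, 112), 3))
print(round(local_window_match(REFERENCE, OTHER, 112), 3))
```

In this sketch the skewed spectrum still scores close to 1 in the window around m/z 112 even though its global relative intensities differ strongly from the reference, whereas the deviating isotope pattern is penalised.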
Fig. 3.28 (A) Complex chromatogram from the analysis of the soil at the location of a coking plant
Fig. 3.28 (B) A large peak with a mass spectrum with intense ions at m/z 180 and 186 appears at a retention time of ca. 20 min
Fig. 3.28 (C) The mass chromatograms m/z 180 and 186 show peak maxima at the same retention time. Both peaks show the same intensity pattern
Fig. 3.28 (D) The spectrum of the peak is compared with the NIST library: INCOS sorting according to purity, all molecular weights permitted
Fig. 3.28 (E) The first suggestion of the INCOS search with tables showing the pre-search (left) and the spectrum comparison (main search, right). The pre-search table shows that 207 candidates were taken over into the main search. The main search table shows the order of rank of the first 10 hits sorted according to purity. All suggestions have high FIT values and low RFIT values
Fig. 3.28 (F) Spectra of the first three suggestions. Isomeric compounds are identified
Fig. 3.28 (G) Difference spectrum of the first suggestion compared with the unknown spectrum. The positive part of the difference can be used for a new search
Fig. 3.28 (H) Search result from the difference spectrum with a high FIT and PURity value
The advantage of this procedure lies in the fact that deviating relative intensities caused by a high proportion of chemical noise or the type of data acquisition do not have any effect on the result of the search. A variance in the relative signal intensities in a mass spectrum can be caused by varying the choice of spectra from the rising or falling slopes of the peak in the case of quadrupole and magnetic sector instruments and by changes in the tuning parameters of the ion source or its increasing contamination. Furthermore, local normalisation has a positive effect on spectra with a high proportion of noise (trace analysis, chemical background).
Local normalisation is the reason why spectral libraries which are searched using INCOS only require one mass spectrum per substance entry.
Two values are determined for spectral comparison as a result of local normalisation. The FIT value gives a measure of how well the reference spectrum is represented in terms of its masses in the unknown spectrum (reverse search procedure). The reversed mode of viewing, whereby the presence of the unknown spectrum in the reference spectrum is examined (forward search procedure), is expressed as the RFIT value (reversed fit). The combination of the two values gives information on the purity of the unknown spectrum (Fig. 3.29). If the FIT value is high and the RFIT value much lower, it can be assumed that the spectrum measured contains considerably more lines than the reference spectrum used for comparison. Using mass chromatograms or background subtraction it would be necessary to find out whether a co-eluate, chemical noise, the presence of a homologous substance, or another reason is responsible for the appearance of the additional lines.
All the candidates found in the pre-search are processed in the main search as described. As a result sorted lists according to PURity, FIT and RFIT are available (Table 3.3). The initial sorting according to purity is recommended because with this value the best estimation of the possible identity is achieved. Further sorting according to FIT values gives additional solutions which generally supplement the further steps towards identification, with valuable information on partial structures or identifying a particular class of compound.
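The sorting of the hit list and the reading of the three values according to Table 3.3 can be sketched as follows. The value scale of 0 to 1000, the threshold of 800 separating high from low, and the hit list itself are assumptions for illustration only; the text does not define where high ends and low begins.

```python
from typing import NamedTuple

class Hit(NamedTuple):
    name: str
    fit: int
    rfit: int
    pur: int

def sort_hits(hits, key="pur"):
    """Sort suggestions in descending order of PUR (default), FIT or RFIT."""
    return sorted(hits, key=lambda h: getattr(h, key), reverse=True)

def assess(hit, high=800):
    """Map a hit onto the three cases of Table 3.3 (threshold is hypothetical)."""
    fit_high = hit.fit >= high
    rfit_high = hit.rfit >= high
    pur_high = hit.pur >= high
    if fit_high and rfit_high and pur_high:
        return "identification (or that of an isomer) very probable"
    if fit_high and not rfit_high and not pur_high:
        return "identification possible, but homologues, co-elution or noise present"
    if not fit_high and rfit_high and not pur_high:
        return "possibly an incomplete spectrum"
    return "combination not covered by Table 3.3"

# Hypothetical search result.
HITS = [
    Hit("candidate 1", fit=950, rfit=420, pur=410),
    Hit("candidate 2", fit=930, rfit=900, pur=880),
    Hit("candidate 3", fit=300, rfit=910, pur=290),
]

for hit in sort_hits(HITS, key="pur"):
    print(hit.name, "->", assess(hit))
```

Calling `sort_hits(HITS, key="fit")` afterwards gives the second view recommended in the text, which brings candidates with matching partial structures to the top.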
For the subsequent manual processing the difference between a reference spectrum and the measured unknown spectrum can be established (see Fig. 3.28 G).
Fig. 3.29 Diagram showing local normalisation. FIT value high: All masses in the library spectrum are present in the unknown spectrum and the isotope pattern also fits after ‘local’ normalisation of the intensities. RFIT value low: Only a few masses from the unknown spectrum are present in the library spectrum.
Table 3.3 Results of the INCOS spectral comparison
FIT    RFIT   PUR    Assessment
high   high   high   Identification or that of an isomer very probable
high   low    low    Identification possible, but homologues, co-elution, noise present
low    high   low    Possibly an incomplete spectrum
Sorting of the suggestions should first be carried out according to PUR and then according to FIT
A new library search is possible with the remaining portion of the spectrum. In certain cases the co-elution of components at identical times can thus be established which, even with careful capillary gas chromatographic procedures, is observed with complex samples, for example in environmental analysis. Using the example of a particular case (Figs. 3.28 A–H) the individual steps of the INCOS spectral comparison are shown.
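Forming a difference spectrum for a second search can be sketched as a scaled subtraction that keeps only the positive remainder. Scaling the reference to the unknown at the reference base peak is an assumption made here for simplicity, and the peak values are hypothetical; only the two intense ions at m/z 180 and 186 are borrowed from the example in Fig. 3.28.

```python
def difference_spectrum(unknown, reference, anchor_mz=None):
    """Return the positive part of (unknown - scaled reference)."""
    if anchor_mz is None:
        # Assumption: scale on the base peak of the reference spectrum.
        anchor_mz = max(reference, key=reference.get)
    scale = unknown.get(anchor_mz, 0.0) / reference[anchor_mz]
    residual = {}
    for mz, inten in unknown.items():
        rest = inten - scale * reference.get(mz, 0.0)
        if rest > 0:
            residual[mz] = rest
    return residual

# Hypothetical co-elution: first suggestion explains m/z 152, 180 and 181 only.
UNKNOWN_MIX = {152: 200, 180: 999, 181: 140, 186: 800, 187: 110}
FIRST_HIT = {152: 200, 180: 999, 181: 140}

remainder = difference_spectrum(UNKNOWN_MIX, FIRST_HIT)
print(remainder)
```

The remaining peaks can then be submitted as a new unknown spectrum to the same search.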
In the 1990s Steven Stein from NIST took the INCOS approach and extended it to the most common situations when unknown compounds are not present in the library. Structurally similar compounds can appear in the NIST library search hit list. By using the results of the library search Stein added probabilities to the hit list that give information about common substructures which may be present or absent in the unknown compound. Based on this advanced performance, the NIST library search is recommended as the first step in the structural elucidation of compounds not found in reference libraries.
INCOS Library Search Principle
Pattern recognition (after Joel Karnovsky, INCOS)
Course
1. Significance weighting
2. Noise filter: window ± 50 u, 640 masses
3. Redundancy filter: window ± 7 u, 6 6 masses
4. Pre-search: 8 masses + molecular weight
5. Main search: local normalisation; FIT, RFIT and PUR calculation
6. Sorting and display