Applied and Environmental Microbiology, March 2005, p. 1626-1637, Vol. 71, No. 3
0099-2240/05/$08.00+0 doi:10.1128/AEM.71.3.1626-1637.2005
Copyright © 2005, American Society for Microbiology. All Rights Reserved.
Institut für Physikalische Chemie, Friedrich-Schiller-Universität Jena, Jena,1 Lehrstuhl für Mustererkennung und Bildverarbeitung, Institut für Informatik, Albert-Ludwigs-Universität Freiburg, Freiburg,2 Schering AG,3 rap.ID Particle Systems GmbH, Berlin,4 Kayser-Threde GmbH, Munich, Germany5
Received 10 June 2004/ Accepted 29 September 2004
An alternative approach to the analysis of microorganisms is the application of vibrational spectroscopic techniques (infrared [IR] and Raman spectroscopy), which have a long tradition because the vibrational spectrum provides a fingerprint of the chemical composition of each bacterium (35). While an IR spectroscopic investigation of microorganisms requires a few hundred cells from controlled cultivation conditions for an analysis and a drying step (37), this is not necessary when applying Raman spectroscopy (36). When only a small amount of sample is available, a special Raman technique called SERS spectroscopy (surface-enhanced Raman scattering spectroscopy) is especially well suited. For the investigation of bacteria, various SERS substrates or SERS microchips in combination with antibodies have been tested (14, 15, 19, 24, 25, 53, 54, 63). By applying UV-resonance Raman spectroscopy, direct investigation of macromolecules such as proteins or DNA becomes possible. However, this Raman technique involves considerable experimental cost and extremely careful sample handling (4, 7, 8, 13, 18, 26, 28-30, 38, 39, 58, 59). In 1990, Puppels et al. developed a confocal Raman microscope capable of recording Raman spectra of single human cells and polytene chromosomes (42). Since then, many biological phenomena in single human cells have been studied by Puppels' group; e.g., Raman spectra of the cell nucleus and the cell cytoplasm in human white blood cells were obtained (43-46). Various groups have reported the classification of bacteria by means of Raman spectroscopy (3, 11, 12, 17, 27, 31-34, 47). Very recently, Maquelin et al. (31) performed for the first time a clinical Fourier transform IR and near-IR-Raman study of bacterial contamination in blood cultures by using microcolonies obtained after 6 to 8 h of cultivation. Other papers have reported Raman and SERS investigations of single yeast cells, bacteria, or spores (1, 10, 21, 60-62).
Various investigations of cell components of single bacteria or spores by means of Raman spectroscopy have also been reported (16, 20, 22, 40, 50, 51). In this paper we describe, to the best of our knowledge for the first time, a fast, nondestructive, and very reliable approach to the identification of bacteria on a single-microparticle level by means of a combination of a micro-Raman analysis together with a data classification approach, the so-called support vector machine (SVM) technique.
Bacteria and growth conditions.
The microorganisms were chosen according to the conditions present in clean rooms. The microorganisms Micrococcus luteus (DSM 348 and DSM 20030), Micrococcus lylae (DSM 20315 and DSM 20318), Bacillus subtilis (DSM 10 and DSM 347), Bacillus pumilus (DSM 27 and DSM 361), Bacillus sphaericus (DSM 28 and DSM 396), Escherichia coli (DSM 423, DSM 498, and DSM 499), Staphylococcus cohnii (DSM 6669, DSM 20260, DSM 6718, and DSM 6719), Staphylococcus warneri (DSM 20036 and DSM 20316), and Staphylococcus epidermidis (RP 62A) were purchased from the Deutsche Sammelstelle für Mikroorganismen und Zellkulturen and from the Institut für Infektionsbiologie, Universität Würzburg. They were cultivated on a standard or nutrition agar (Micrococcus and Bacillus) or on casein-peptone soymeal-peptone agar (Staphylococcus) under different growth conditions, such as growth time and temperature. To simulate samples from clean rooms, the Raman measurements were performed directly on single cells from smears on fused silica plates.
SVMs.
The analysis of Raman spectra was performed in two steps: (i) preprocessing of the spectra and (ii) classification by using SVMs. Different preprocessing methods were tested, namely baseline correction, normalization, first derivative, and median filtering. Normalization for bulk data and median filtering for both bulk and single-bacterium analysis gave the best results. The classification was based on the regions from 850 to 1,750 cm⁻¹ and from 2,650 to 3,150 cm⁻¹. The limits of those regions were chosen by using the optimization procedure of linear SVMs described below.
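The preprocessing chain described above can be sketched in a few lines. This is only an illustration under stated assumptions: the window size, the order of the steps, and the use of a unit Euclidean norm are our choices and are not specified in the text, and the function names are hypothetical.

```python
from statistics import median

# Spectral windows used for classification, in cm^-1 (taken from the text).
REGIONS = ((850, 1750), (2650, 3150))

def median_filter(intensities, window=3):
    """Replace each point by the median of a sliding window.
    The window is truncated at the edges (an assumption)."""
    half = window // 2
    n = len(intensities)
    return [median(intensities[max(0, i - half):min(n, i + half + 1)])
            for i in range(n)]

def select_regions(wavenumbers, intensities, regions=REGIONS):
    """Keep only points whose wavenumber lies inside one of the regions."""
    return [y for x, y in zip(wavenumbers, intensities)
            if any(lo <= x <= hi for lo, hi in regions)]

def normalize(intensities):
    """Scale to unit Euclidean norm (one possible normalization)."""
    norm = sum(y * y for y in intensities) ** 0.5
    return [y / norm for y in intensities] if norm > 0 else list(intensities)

def preprocess(wavenumbers, intensities, window=3, use_normalization=True):
    """Median filter, restrict to the spectral windows, optionally normalize."""
    filtered = median_filter(intensities, window)
    selected = select_regions(wavenumbers, filtered)
    return normalize(selected) if use_normalization else selected

if __name__ == "__main__":
    wn = [800, 900, 1000, 1100, 2000, 2700, 3200]
    ys = [1, 2, 50, 2, 9, 4, 4]          # the value 50 mimics a single-point spike
    print(preprocess(wn, ys, use_normalization=False))
```

Setting `use_normalization=False` mirrors the single-bacterium case, for which the text reports median filtering without normalization.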
The classification step was achieved by using SVM. These large-margin classifiers are widely used in pattern analysis and are already well understood (5, 57). It has been shown that standard SVMs perform as well as or better than neural networks (NN) even in domains such as hand-written character recognition, where several years of research were spent to optimize the NN for a certain problem (48).
An SVM solves only a two-class problem, but this is no restriction, since a classification task can always be broken down into several two-class problems (using a one-versus-one approach). The basic idea is as follows. The traditional approach to classification usually tries to build a model during the training step for each class independently from the other classes. The classifier then tests how well an unknown spectrum matches the different models and assigns the spectrum to the best-fitting one. This could also be interpreted as first using the two models for calculating a border between the two classes and second testing on which side of this border the spectrum lies. The idea of an SVM is therefore to combine these two steps and directly model the border between the two classes. This omits modeling of irrelevant parts and therefore needs much less training data. Since there are many possible hyperplanes which separate the two classes in the feature space (Fig. 1A), the distance to the training samples is introduced as a quality criterion. With this criterion, there is only one global optimum, the hyperplane with the largest margin, which can be found reliably in the training process. This is a big advantage of SVMs compared to NN, where several suboptimal solutions are found during the training process. The samples that are touched by the margin of the hyperplane are called "support vectors." Therefore, for training and classification, only these support vectors are necessary, while all other vectors could be removed from the training set without changing the result. If an SVM is trained, for example, to classify yeasts and bacteria, it will select the most yeast-like bacteria and the most bacterium-like yeasts as support vectors and will use only those to classify an unknown microorganism. The optimal separation plane with the largest margin and the support vectors (adjacent training samples marked by circles) are shown in Fig. 1B.
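The one-versus-one decomposition mentioned above can be illustrated with a short voting sketch. The pairwise decision function used here is a hypothetical stand-in (a nearest-center rule on a single number), not a trained SVM, and the tie-breaking rule is simply whatever `Counter.most_common` returns first.

```python
from collections import Counter
from itertools import combinations

def one_vs_one_predict(classes, pairwise_decide, x):
    """Let every pair of classes vote once and return the class with most votes.
    pairwise_decide(a, b, x) must return either a or b."""
    votes = Counter()
    for a, b in combinations(classes, 2):
        votes[pairwise_decide(a, b, x)] += 1
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    # Hypothetical class centers on a one-dimensional feature.
    centers = {"A": 0.0, "B": 1.0, "C": 2.0}

    def decide(a, b, x):
        return a if abs(x - centers[a]) <= abs(x - centers[b]) else b

    print(one_vs_one_predict(list(centers), decide, 0.9))
    # Number of two-class problems needed for 20 strains: 20 * 19 / 2
    print(len(list(combinations(range(20), 2))))
```

For the 20 strains considered in this study, such a scheme would require 190 pairwise classifiers.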
In cases where the training set includes outliers, i.e., samples that are beyond the separating plane, a cost value is introduced to give those data points a disadvantage. In that way, SVM can model a real-world training data set very efficiently.
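For readers who prefer a formula, the largest-margin criterion and the cost value can be written in the standard textbook soft-margin form. The notation below is ours and does not appear in the original text: x_i is a preprocessed spectrum, y_i ∈ {+1, −1} its class label, w the normal vector of the hyperplane, b its offset, ξ_i the slack of sample i, and C the cost value.

```latex
\min_{w,\,b,\,\xi}\;\; \frac{1}{2}\,\lVert w \rVert^{2} \;+\; C \sum_{i=1}^{N} \xi_i
\qquad \text{subject to} \qquad
y_i \left( w^{\top} x_i + b \right) \;\ge\; 1 - \xi_i , \quad \xi_i \ge 0 , \quad i = 1, \dots, N .
```

The width of the margin is 2/‖w‖, so minimizing ‖w‖ maximizes the margin. An unknown spectrum x is assigned to the class given by the sign of w^T x + b, and the support vectors are the training samples for which y_i (w^T x_i + b) ≤ 1.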
FIG. 1. (A) Possible planes for separating the two classes. (B) The optimal separation plane has the largest margin and is defined only by the adjacent training samples. Support vectors are marked by circles. (C) Classification of simulated spectra. The SVM automatically detects relevant and irrelevant peaks. The third peaks of class +1 differ in size, and so compared to the third peak in class −1, those peaks contain no discriminative information and are irrelevant for the SVM classification.
The output of linear SVMs can be interpreted geometrically, so that one can find out which parts of the spectrum were used for the classification by looking at the direction of the normal vector of the separating hyperplane. In that way, one can identify relevant and irrelevant peaks; this is one of the main advantages of using SVM for classification of spectral data.
This is shown in a simple simulation in Fig. 1C, where three spectra (two of class +1 and one of class −1) and a plot of the normal vector are given. For the spectra, only the first two peaks contain relevant information, while the third does not. When an SVM is trained with those three spectra, it will automatically find the relevant parts of the spectrum and ignore the irrelevant parts. The height of the peaks in the hyperplane plot shows how important this peak is for the classification, whereas the sign of the peak tells whether it belongs to class +1 or class −1.
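A toy version of this simulation can be checked by hand. In the sketch below, each spectrum is reduced to three hypothetical peak heights (the numbers are ours, not those of Fig. 1C). For these three samples the largest-margin hyperplane works out to w = (1, −1, 0) and b = 0; the code does not train an SVM but verifies that this plane separates the samples with all of them on the margin, and then reads off the relevance of each peak from w.

```python
def decision_value(w, b, x):
    """Signed score w.x + b; its sign is the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def rank_peaks(w, names):
    """Sort peaks by |w|. The sign tells which class a high peak votes for."""
    ranked = sorted(zip(names, w), key=lambda item: abs(item[1]), reverse=True)
    return [(name, weight, "+1" if weight > 0 else "-1" if weight < 0 else "none")
            for name, weight in ranked]

# Hypothetical peak heights: (peak 1, peak 2, peak 3) and class label.
spectra = [([1.0, 0.0, 0.50], +1),
           ([1.0, 0.0, 1.00], +1),
           ([0.0, 1.0, 0.75], -1)]

# Largest-margin solution for this toy data set, derived by hand (assumption:
# hard-margin case, no outliers).
w, b = [1.0, -1.0, 0.0], 0.0

margins = [y * decision_value(w, b, x) for x, y in spectra]
ranking = rank_peaks(w, ["peak 1", "peak 2", "peak 3"])

if __name__ == "__main__":
    print(margins)   # every sample lies exactly on the margin
    print(ranking)   # peak 3 has weight 0 and is ignored
```

Because the third peak varies within class +1 while the class −1 value lies in between, it receives zero weight, which is the behavior described for Fig. 1C.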
Leave-one-out test.
For the estimation of the classification error probability of the final system (which will use all recorded spectra as the training set), the leave-one-out error was chosen (which uses N − 1 samples as the training set) instead of the widely used "holdout method" (which uses only a certain fraction, e.g., 50%, of the samples as the training set). While it is mathematically proven that the leave-one-out error is an "almost unbiased" estimate for the real classification error probability (49), the holdout method is proven to always return a biased (too high) estimate of the classification error probability (55). (The term "almost" refers to the fact that the leave-one-out error provides an estimate for training on sets of size N − 1 rather than N [49].)
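The leave-one-out procedure itself is independent of the classifier. The following sketch shows the loop with a nearest-centroid classifier as a simple stand-in for the SVM; the classifier, the function names, and the toy data are assumptions made for illustration only.

```python
def nearest_centroid_fit(samples, labels):
    """Stand-in classifier: store the mean vector of each class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for j, v in enumerate(x):
            acc[j] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def nearest_centroid_predict(centroids, x):
    """Assign x to the class with the closest mean vector."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sqdist(centroids[y]))

def leave_one_out_predictions(samples, labels, fit, predict):
    """Train on N - 1 samples and predict the one left out, for every sample."""
    predictions = []
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        predictions.append(predict(model, samples[i]))
    return predictions

if __name__ == "__main__":
    xs = [[0.0], [0.2], [1.0], [1.2], [0.7]]
    ys = ["a", "a", "b", "b", "a"]
    preds = leave_one_out_predictions(xs, ys, nearest_centroid_fit,
                                      nearest_centroid_predict)
    errors = sum(p != t for p, t in zip(preds, ys))
    print(preds, errors / len(ys))
```

Each sample is classified by a model that has never seen it, so the error count estimates the performance of a system trained on N − 1 samples.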
Since the a priori probability for the occurrence of each cell species may vary from clean room to clean room, the reported "average recognition rate" is always the arithmetic mean of the recognition rates for each species and therefore equalizes the varying number of samples per species in our database.
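The difference between this class-averaged rate and a rate pooled over all spectra can be made concrete with a small sketch. The function names and the toy labels are hypothetical.

```python
from collections import defaultdict

def per_class_rates(true_labels, predicted_labels):
    """Fraction of correctly recognized samples for each true class."""
    total, correct = defaultdict(int), defaultdict(int)
    for t, p in zip(true_labels, predicted_labels):
        total[t] += 1
        correct[t] += (t == p)
    return {c: correct[c] / total[c] for c in total}

def average_recognition_rate(true_labels, predicted_labels):
    """Arithmetic mean of the per-class rates (every class counts equally)."""
    rates = per_class_rates(true_labels, predicted_labels)
    return sum(rates.values()) / len(rates)

def pooled_recognition_rate(true_labels, predicted_labels):
    """Fraction of all samples recognized correctly (large classes dominate)."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)

if __name__ == "__main__":
    true = ["A"] * 8 + ["B"] * 2
    pred = ["A"] * 8 + ["B", "A"]
    print(average_recognition_rate(true, pred))   # 0.75
    print(pooled_recognition_rate(true, pred))    # 0.9
```

In the toy example, the rarely sampled class B is recognized only half of the time; the class-averaged rate reflects this, while the pooled rate is dominated by the well-sampled class A.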
Bulk spectra.
Since the origin of the microorganisms present in a clean room is unknown, a single-bacterium analysis requires careful testing of various parameters such as different culture media or growth times. In the experiments described below, typical clean-room samples are modeled as smears of various microorganisms on fused silica plates. In a first approach, 20 different strains of nine bacterial species which are typical of clean-room contaminants (Motzkus, personal communication) were chosen. Among the chosen bacteria, both colored and noncolored species can be found. Colored bacteria can be easily recognized by the presence of carotenoids, which are the pigments in most colored microorganisms. However, this does not allow a distinct identification of the species or strain that is present. Identification of the noncolored bacteria is expected to be even more difficult. Different regions can be found on such a smear: (i) multilayer regions, which can be used to record bulk spectra, and (ii) regions with isolated single cells, where single-bacterium investigations can be performed.
In a first attempt, Raman measurements within a multilayer region on the smear were recorded in order to obtain bulk Raman spectra. Figure 2 shows the Raman spectra of nine different strains, typical of each species. The spectra were recorded with an integration time of 60 s on different multilayer regions on the plate (10 to 20 repetitions [Table 1]). The spectrum of the colored strain (M. luteus DSM 20030) is dominated by the carotene bands at 1,525, 1,154, and 1,002 cm⁻¹. Almost no vibrations due to the cell matrix can be seen. For the noncolored strains, the Raman spectra of the four genera are very similar. The Raman spectrum of E. coli DSM 499 reveals higher intensities of the amide I band than do the spectra of the Bacillus strains, as well as a very intense C-H band (around 2,900 cm⁻¹). The three Bacillus strains (B. subtilis DSM 10, B. pumilus DSM 361, and B. sphaericus DSM 28) exhibit very similar spectra. The signal-to-noise ratio of the three Staphylococcus spectra is very low, which is due to the high fluorescence background of these strains (S. warneri DSM 20316, S. cohnii DSM 20260, and S. epidermidis RP 62A). For a distinct identification of the strains, a reliable data analysis method is required. Therefore, a chemometric data analysis was performed.
FIG. 2. Micro-Raman spectra of nine different strains (bulk). The numbers in the figure are the strain numbers.
TABLE 1. Recognition rate for bulk Raman spectra of various bacterial strains
The recognition rate (median filtering, normalization, linear SVM, leave-one-out test) is very good (98.0%; Table 1) for all bacterial strains. Of the 339 Raman spectra, 7 were misclassified; e.g., within E. coli, one DSM 498 strain spectrum was not classified correctly but was assigned as a spectrum of E. coli DSM 499 (i.e., the species was identified correctly whereas the assigned strain was wrong). Therefore, the overall recognition rate at the species level is 98.9%. These results show that a reliable identification can be obtained for microcultures. However, it requires up to 6 h to obtain those microcultures by cultivation. It would be greatly preferable to identify single bacteria by means of micro-Raman spectroscopy. Therefore, experiments with isolated single cells were performed to test if a reliable identification of the bacteria is possible on the single-cell level.
Single-cell spectra.
Bulk Raman spectra are the result of an averaging over several bacteria. However, for isolated single bacteria, individual variations within the various cells need to be considered. Before performing a single-bacterium analysis, various experiments are needed to investigate if and how different parameters influence the identification of microorganisms on the single-cell level. As already mentioned, no information about the origin of the bacteria is available. Therefore, to create a reliable data set, the variation of different parameters, e.g., nutrition, temperature, and growth time, needs to be considered. Furthermore, it is also necessary to investigate the spatial heterogeneity within a single microorganism, i.e., whether there are any variations within the Raman spectra if the laser focus slightly shifts on the investigated single bacterium.
Raman mapping (heterogeneity effect).
Figure 3A shows a micrograph of an isolated single B. sphaericus DSM 28 cell on a fused silica plate. Figure 3B displays a Raman spectrum of this bacterium in comparison with a background spectrum, which was recorded beside the microorganism. To test if the Raman spectrum depends on the spatial position of the focus within the bacterium, Raman mapping experiments over the area displayed by the white square in Fig. 3A were performed. For the mapping experiments, a step size of 0.3 by 0.3 µm² (total, 20 by 28 points) was chosen. This step size is smaller than the spatial resolution of the Raman microscope (0.7 µm) but was chosen to increase the spatial overlap of the Raman mapping experiments. Each spectrum was measured with an integration time of 120 s, which leads to a total measuring time of 20 h. To minimize the background of fused silica, a pinhole of 500 µm was used.
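As a quick plausibility check of the quoted measuring time, the pure integration time follows directly from the number of grid points and the integration time per spectrum (symbols are ours):

```latex
N_{\text{points}} = 20 \times 28 = 560 , \qquad
t_{\text{integration}} = 560 \times 120\ \text{s} = 67{,}200\ \text{s} \approx 18.7\ \text{h} .
```

This is of the same order as the stated total of 20 h; the text does not itemize the remainder, which may include stage movement and readout between points.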
FIG. 3. Raman mapping experiment of a single bacterium (B. sphaericus DSM 28). (A) Micrograph of a single bacterium. The white frame indicates the mapping area (0.3 by 0.3 µm²) for taking the Raman images shown in panel C. (B) Micro-Raman spectra from selected positions within the marked scan area. The marked bands are used to calculate the Raman images plotted in panel C. (C) Raman maps for three different wavenumber regions labeled in panel B: a, 2,851 to 2,964 cm⁻¹; b, 1,604 to 1,671 cm⁻¹; and c, 1,410 to 1,455 cm⁻¹. For details, see the text.
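One common way to turn a grid of spectra into a Raman image is to integrate the intensity of each spectrum over a wavenumber window. The sketch below illustrates this for the windows named in the caption of Fig. 3; it is an assumption that the images were computed this way, since the text does not describe the procedure, and the function names and data are hypothetical. The wavenumber axis is assumed to be sorted in ascending order.

```python
def band_intensity(wavenumbers, intensities, low, high):
    """Trapezoidal integral of one spectrum between low and high (cm^-1)."""
    pts = [(x, y) for x, y in zip(wavenumbers, intensities) if low <= x <= high]
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(pts, pts[1:]))

def raman_map(wavenumbers, grid_spectra, low, high):
    """grid_spectra[row][col] holds the intensity list recorded at that position.
    Returns one integrated band intensity per grid position."""
    return [[band_intensity(wavenumbers, spectrum, low, high) for spectrum in row]
            for row in grid_spectra]

if __name__ == "__main__":
    wn = [1400, 1410, 1430, 1455, 1460]
    on_cell = [5, 1, 3, 1, 5]       # hypothetical spectrum on the bacterium
    background = [0, 0, 0, 0, 0]    # hypothetical spectrum beside it
    print(raman_map(wn, [[on_cell, background]], 1410, 1455))
```

A spatially homogeneous cell would give nearly constant values at all grid positions inside the cell and low values outside it.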
Figure 3C shows Raman maps for the C-H stretching vibration at 2,900 cm⁻¹ (a), the amide I vibration at 1,650 cm⁻¹ (b), and the CH₂ deformation vibration at 1,420 cm⁻¹ (c). As can be clearly seen in the Raman images, no dependency on the spatial position of the measurement could be observed; i.e., the bacterium shows spatial homogeneity. This can be explained by the fact that bacteria normally have no compartments. Some bacteria might contain vesicles where, for example, sulfur or poly-β-hydroxybutyric acid is stored. As was shown by Schuster et al. (50, 51), Raman spectra of single Clostridium cells differ with different amounts of starchlike granulose. When line scans over the cell axis were used, no variations with the measuring position could be observed. Another example of structured bacteria involves resistant dormant bodies (spores), which are known to be more complex than vegetative cells since they exhibit several layers, which are schematically displayed in Fig. 4A. The marker substance of bacterial spores is calcium dipicolinate (CaDPA; the structure is shown in Fig. 4B). In Fig. 4C, two spectra of isolated B. sphaericus DSM 28 cells (vegetative cell and spore) are displayed. Distinct differences due to CaDPA can be observed in the spectra. The Raman spectrum of the spore shows a band at 1,651 cm⁻¹, which is due to the amide I band. The very intense signals at 1,565, 1,440, 1,383, and 1,007 cm⁻¹ can be assigned to CaDPA (6, 16, 18, 30, 41, 53). According to Carmona (6), the vibration at 1,007 cm⁻¹ can be assigned to the ring-breathing vibration of the pyridine ring. The C-O-C stretching vibration can be observed at 1,385 cm⁻¹, whereas both signals at 1,440 and 1,565 cm⁻¹ can be assigned to ring vibrations.
FIG. 4. (A) Schematic diagram of a spore. (B) Chemical structure of CaDPA, which is a marker substance and can be found in all spores. (C) Raman spectra of a vegetative cell and a spore of B. sphaericus DSM 28.
FIG. 5. Raman mapping experiment on single spores and vegetative cells (B. sphaericus DSM 28). (A) Micrograph of two spores surrounded by vegetative cells. The white frame indicates the mapping area. (B) Micro-Raman spectra from selected positions within the marked scan area. The spectrum at the bottom corresponds to a background spectrum, while the other spectra are taken from a vegetative cell or at two different positions within the spore, respectively. The marked bands are used to calculate the Raman images plotted in panel D. (D) Raman maps for the two different wavenumber regions labeled in panel B (a, 2,871 to 2,991 cm⁻¹; b, 993 to 1,034 cm⁻¹) for three different depth positions indicated by the three horizontal lines within the schematic sketch of a spore shown in panel C. Position 3, 1.0 µm; position 2, 0.5 µm; position 1, 0 µm. For details, see the text.
Cultivation conditions.
Another issue which might be of relevance for an analysis at the single-cell level is that of different nutritional conditions. This has also been tested by measuring Raman spectra of various single bacteria of different strains (not shown here) in different culture media and included in the identification data set.
Furthermore, the influence of the growth time on the Raman spectrum needs to be evaluated. Figure 6 shows representative Raman spectra of single B. subtilis DSM 10 (Fig. 6A) and M. luteus DSM 348 (Fig. 6B) cells recorded for different growth times as indicated. The spectra of single cells from very young cultures exhibit a low signal-to-noise ratio with broad bands, while Raman spectra of cells of older cultures show sharp distinct signals. These signals belong to vegetative cells and not to spores (compare Fig. 4C). For an unambiguous analysis on the single-cell level, these variations have to be taken into account, being included in the applied database for the chemometric identification (see below).
FIG. 6. (A) Raman spectra of single B. subtilis cells recorded for different growth times as indicated. (B) Raman spectra of single colored M. luteus bacteria for various growth times as indicated.
Photobleaching.
The concentration variation of colored bacteria is accompanied by bleaching effects, which occur exclusively when working with a single colored bacterium and when the Raman excitation laser lies within the absorption band of the bacterium's chromophore. This is illustrated in Fig. 7A, where Raman spectra of a single M. luteus cell (cell marked by a circle in Fig. 7B, showing a microphotograph of several M. luteus DSM 348 cells) are plotted for three different irradiation times, as indicated. Each Raman spectrum was recorded with an integration time of 60 s. The top spectrum shows the initial Raman spectrum, while the two other spectra were recorded directly after irradiating the same single M. luteus bacterium for 60 and 360 s, respectively, with the 532-nm laser. The intensity of the carotenoid bands at 1,532, 1,157, and 1,005 cm⁻¹ decreases with increasing irradiation time. Figure 7C shows the dependency of the intensity of three different bands labeled by a, b, and c in Fig. 7A as a function of the irradiation time. As can be clearly seen from the marked signals, only the mode corresponding to the C=C stretch vibration of sarcinaxanthin at 1,532 cm⁻¹ (band a), which is resonantly enhanced, shows a time dependency. The intensities of the other two bands, b and c, which are not chromophore vibrations, are almost unaffected by the irradiation process. These measurements have been repeated several times, and the same time behavior could always be observed; i.e., this irradiation time behavior is reproducible. However, this bleaching effect of the chromophore modes is advantageous for a single-cell analysis, since after the bleaching has taken place, only the characteristic Raman bands due to the cell matrix are left. This is especially important since many bacteria produce pigment structures of the carotenoid type and since the production of these structures depends strongly on the cultivation and growth state. Thus, pigmentation of bacteria alone is generally of little use for microbiological identification, but when it is used in combination with the information about the cell matrix obtained after bleaching, an exact identification can be made.
FIG. 7. (A) Micro-Raman spectra of M. luteus DSM 348 recorded after irradiating the single M. luteus cell for 0, 60, and 360 s with the 532-nm laser, which is resonant with an electronic absorption of the chromophore sarcinaxanthin of this microorganism. (B) Micrograph of various single M. luteus bacteria. The cell with which the Raman spectra in panel A were obtained is marked by a circle. (C) Dependency of the three bands labeled a, b, and c in panel A on the irradiation time. Only the band which corresponds to the C=C stretch vibration (band a) shows a time dependency.
Single-cell identification.
In Fig. 8, examples of micro-Raman spectra of single cells of nine representative different species are shown, with an integration time of 60 s per spectrum. The spectra show characteristic differences from the corresponding bulk spectra plotted in Fig. 2: a lower background, a lower signal-to-noise ratio, or additional signals due to the fused silica plate (asterisk in Fig. 8), all of which result from the very low sample volume of a single bacterium, which ranges from 0.5 µm³ (Micrococcus and Staphylococcus) to 2.5 µm³ (Bacillus and E. coli). The poor signal-to-noise ratio for each Raman spectrum of the various single cells is a result of the short integration time. However, the quality of the single-cell spectra shown in Fig. 8 is sufficient for an identification of the bacteria by means of an SVM (see below) (Table 2). Since time is a critical issue for the analysis of clean-room samples, the overall investigation time should be kept as short as possible.
FIG. 8. Micro-Raman spectra of single living bacteria of nine different strains. The numbers in the figure are the strain numbers. *, fused silica.
TABLE 2. Recognition rate for Raman spectra of single bacteria
For an identification, spectra from single vegetative cells of different agar types and growth times are used and each single bacterium is represented by one spectrum; however, for M. luteus, three spectra recorded in a row were always used for the identification. Performing a classification with 2,257 spectra of the 20 different strains from nine species (median filtering, no normalization, linear SVM, leave-one-out test), we obtain 2,136 correctly identified spectra at the strain level (average recognition rate, 89.2%) and 2,180 correctly identified spectra at the species level (average recognition rate, 93.6%). The lowest recognition rate for strains was obtained for B. sphaericus DSM 362, with 76.7%, and that for species was for B. sphaericus DSM 27, with 82.5%. All the results at the single-cell level are summarized in Table 2. The decrease in recognition rate for single cells compared to the bulk samples was mostly because of less characteristic spectra and low signal-to-noise ratios.
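The counts quoted above can be turned into pooled rates with a few lines of arithmetic. The pooled rates differ from the reported average recognition rates because the latter are arithmetic means over the per-strain or per-species rates; the sketch only reproduces the pooled figures from the stated counts and makes no claim about the contents of Table 2.

```python
def pooled_rate(correct, total):
    """Percentage of all spectra identified correctly, regardless of class."""
    return 100.0 * correct / total

total_spectra = 2257
strain_pooled = pooled_rate(2136, total_spectra)    # strain level
species_pooled = pooled_rate(2180, total_spectra)   # species level

if __name__ == "__main__":
    print(f"strain level:  {strain_pooled:.1f}% pooled, 89.2% reported class average")
    print(f"species level: {species_pooled:.1f}% pooled, 93.6% reported class average")
    print("misclassified at strain level: ", total_spectra - 2136)
    print("misclassified at species level:", total_spectra - 2180)
```

That the pooled rates are higher than the class averages suggests that the less well recognized strains contribute comparatively few spectra to the data set.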
With this first approach, it can be shown that micro-Raman spectroscopy in connection with SVMs is capable of rapid identification of bacteria at the single-cell level. When going from a bulk environment to a single-cell analysis, several points need to be considered. For this reason, different culture methods, which include different growth times and different agar types, are used to maximize the variation in the single strains. Furthermore, possible heterogeneity effects within single cells were evaluated, and it could be shown that single bacteria exhibit a spatial homogeneity. This is not the case for spores. If spores or bacteria with, for example, poly-β-hydroxybutyric acid inclusions are included in the data set, more than one sample spot is necessary. It is possible, for example, to identify the principal axis of spores and to measure three times along this axis. For an investigation of single colored bacteria, where the Raman laser is resonant with the electronic absorption of the pigment, possible bleaching effects must be taken into account. However, it could be shown that such bleaching effects are advantageous since, after the bleaching process, only the Raman bands corresponding to the cell matrix, which are necessary for an unambiguous identification of single cells, are left. Overall, a total of 2,257 Raman spectra of single cells were used to differentiate among 20 strains belonging to nine different species, and a recognition rate of 89.2% for strains and 93.6% for species could be achieved using an SVM technique. These results make us hopeful that by increasing the number of species, a reliable database allowing for a rapid identification of bacteria in clean rooms can be established.
Within the scope of the main research "Biophotonic" supported by the German Ministry of Education and Research, we are currently developing a technique for the rapid detection of airborne biological contaminations within clean rooms (S. Hofer et al., 23 February 2004, German Patent Office). In this technique, the airborne microparticles are deposited on special filters and, in a successive monitoring step, the particles are differentiated into biological and nonbiological particles by means of fluorescence detection. Once the biological particles have been identified on the filter, the actual identification step by means of Raman spectroscopy and SVM can take place. The investigated basic principles of this method are supported by the companies Kayser-Threde (Munich) and RapId (Berlin). With these companies, a first functional model has already been realized.
The presented method can be readily used for all fields where a limited number of bacteria need to be identified. The ultimate goal of our work, however, is a generalization of the technique to all applications, e.g., food-processing technologies and medical applications, where microorganism contaminations are troublesome. To reach this goal, the diversity of microorganisms covered needs to be extended.
We thank W. Kiefer for helpful scientific discussions. We are most grateful to D. Naumann and G. Puppels for many helpful and fruitful discussions and for the thorough review of the manuscript.
cjlin/libsvm.