Applied and Environmental Microbiology, October 2008, p. 5891-5897, Vol. 74, No. 19
0099-2240/08/$08.00+0 doi:10.1128/AEM.00791-08
Copyright © 2008, American Society for Microbiology. All Rights Reserved.

National Center for Immunizations and Respiratory Diseases,1 National Center for Environmental Health,2 Centers for Disease Control and Prevention, Atlanta, Georgia 30333
Received 7 April 2008/ Accepted 6 August 2008
Pneumococcal conjunctivitis, an infection of the conjunctiva, is of significant public health concern in highly populated environments such as college campuses, nursing homes, and day care centers. Over the years, large outbreaks of conjunctivitis have occurred in various regions of the United States, including New York, California, New Hampshire, New Jersey, and Maine. Martin et al. (15) and Carvalho et al. (3) reported microbiological, biochemical, or genetic evidence that all of the Pnc strains from these outbreaks lacked a detectable polysaccharide capsule. The lack of a capsule, together with the insensitivity of pneumococcal culture and diagnostic assays, makes pneumococcal conjunctivitis difficult to diagnose correctly.
Molecular and immunological technologies (real-time PCR and enzyme-linked immunosorbent assays) detecting expression of Pnc genes or antibodies in bodily fluids have been used with a limited degree of sensitivity for detection and diagnosis of pneumococcal disease (4, 23). However, advances in the field of proteomics and bioinformatics have now made it possible to identify novel diagnostic targets or biomarkers aimed at improved detection. These expressed-gene or protein targets could prove useful in differentiating infectious strains that have been associated with previous conjunctivitis outbreaks and could reduce transmission of this infection.
Mass spectrometry (MS), a rapid, powerful, and sensitive analytical tool, has been used recently for the differentiation, identification, and characterization of microbial pathogens. In particular, MS techniques such as matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) MS have been used to analyze whole bacterial cells that have not been modified chemically or by mechanical disruption (6). In recent years, MALDI-TOF MS has been used to differentiate significant human pathogens such as Helicobacter pylori, Bacillus cereus, Escherichia coli, and Coxiella burnetii (1, 6, 9, 11-14, 16, 20, 21, 24, 25). Studies by Friedrich and colleagues employed MALDI-TOF MS for rapid identification of 10 different species of viridans streptococci (7). Additionally, MALDI technology has been used to identify Mycobacterium and, moreover, to distinguish between multiple strains within a species (18). With high-throughput methods such as MALDI-TOF, protein/peptide fingerprints can be generated based on a proteomic profile. These proteins or patterns could serve as uniquely expressed pathogen-specific peptide or protein biomarkers that may prove useful for diagnostic purposes.
In this report, we describe a differential proteomic analysis using MALDI-TOF MS of representative Pnc conjunctival (cPnc) U.S. outbreak isolates. The unique cPnc outbreak isolates were compared with other nonconjunctival, pneumococcal and streptococcal isolates and a limited number of nonstreptococcal strains and species. Additionally, statistical algorithms as well as traditional cluster analysis were used to identify similarities among these isolates, in particular the cPnc isolates. A list of peptides/proteins found among the isolates was compiled in which at least one peptide/protein was common and exclusively expressed in the cPnc isolates. These cPnc proteomic signatures or biomarkers could ultimately be useful in the diagnosis of this infection.
Bacterial strains.
All strains were from the CDC Streptococcus Reference Laboratory. Study strains consisted of 13 cPnc outbreak isolates as well as controls Streptococcus pneumoniae serotype 4, Pnc TIGR4, and Streptococcus pneumoniae unencapsulated strain R6; other streptococcal species, including Streptococcus oralis, Streptococcus mitis, Streptococcus pseudopneumoniae, and Streptococcus pyogenes (group A); and strains from heterologous genera Escherichia coli (group B), Staphylococcus aureus (group C), and Enterococcus faecalis (group D). In addition, pneumococcal serotypes contained within the 7-valent pneumococcal conjugate vaccine and NT pneumococcal sterile-site isolates were also used in the study for comparison (Table 1). The controls used in the study were not associated with the conjunctivitis outbreaks and were used to validate the methods' abilities to differentiate at the species and genus level. Groups A, B, C, and D were included as outgroups for statistical purposes. The 13 cPnc isolates described in this study are a limited sampling population and are considered representatives of all the clinical conjunctival isolates from the aforementioned U.S. outbreaks (New York in 1980, California in 1981, New Hampshire in 2002, New Jersey in 2002, and Maine in 2003).
TABLE 1. Bacterial strains used in this study
0.4) at 37°C with 5% CO2 for 4 to 5 h. The bacterial suspension was centrifuged at 4,600 x g for 10 min at 4°C. The supernatant was decanted, and the pellet was washed twice in sterile distilled water, followed by centrifugation at 10,000 x g at room temperature for 10 min. The pellet (~10^12 cells) was resuspended in 50 µl of water, aliquoted (2 µl) in microcentrifuge tubes, and stored at –70°C until further use. To ensure purity among the isolates, the resuspended bacterial inoculum was streaked on a Trypticase soy agar blood plate and incubated overnight at 37°C with 5% CO2. All strains were cultured and grown three separate times over a 3-day period. The strains were grown to the same OD (mid-log phase at OD420 of ~0.4) to ensure consistency in growth.
Preparing bacterial cell suspensions for MALDI-TOF analysis.
The MALDI matrix consisted of saturated solutions (20 mg/ml) of 3,5-dimethoxy-4-hydroxycinnamic acid (sinapinic acid [SA]) (Sigma-Aldrich). SA was mixed with 50% acetonitrile and Milli-Q-grade water containing 10% trifluoroacetic acid. A 192-well stainless steel MALDI target plate (Applied Biosystems [AB], Framingham, MA) was used in the study. The plates were washed with Milli-Q-grade water, treated with methanol, and allowed to dry at room temperature. When dry, 0.5 µl of premixed suspensions containing matrices and whole bacterial forms or mass standards for calibration (Sequazyme peptide mass standards kit; AB) were spotted in four separate wells to create quadruplicates of samples and controls. In addition, 0.5 µl of bovine cytochrome c (1 mM) was added to one well of each sample and used as an internal standard. After air drying, the plates were inserted into the instrument for MALDI-TOF MS analysis.
MALDI-TOF MS analysis.
Mass spectra were acquired using a MALDI-TOF/TOF mass spectrometer (AB 4700 Proteomics Analyzer) equipped with a nitrogen laser (Nd:YAG) at 337 nm and a 200-Hz repetition rate. Analyses were performed on at least 3 different days in linear delayed-extraction positive-ion mode at an accelerating voltage of 20 kV. The instrument was calibrated and checked before analysis with several calibration mixtures from either the peptide mass standards kit or the 4700 standard kit (AB), depending on the analysis mass range. Mass accuracy for each standard was within 0.05% of the corresponding average molecular weight. After initial manual laser intensity optimization and baseline data acquisition, spectra were acquired in automatic control mode, using uniform parameters to improve consistency and reproducibility. For optimum data quality of mass spectra in the m/z range of 2,000 to 14,000, SA was used as the matrix. The instrument was programmed to examine signals from at least 12 to a maximum of 100 randomly positioned nonoverlapping locations in each sample well, and the signals from the first 10 acquisitions for each spot that met the acceptance criteria were accumulated into one final-profile mass spectrum. A minimum of 11 individual spectra representing 10 accumulated subspectra were obtained from each well. The acceptance criteria, based on 1,000 laser shots per spot, were signal intensities between 2,000 and 55,000 counts and a signal/noise ratio of 10 or greater.
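The acceptance step described above can be illustrated with a minimal Python sketch. It is not the instrument's acquisition software. The function and constant names are hypothetical, the intensity window is assumed to apply to the most intense point of each acquisition, and the signal/noise value is assumed to be supplied by the instrument rather than computed here.

```python
from typing import List, Optional, Sequence, Tuple

# Thresholds quoted in the text; the names are hypothetical.
MIN_INTENSITY = 2_000
MAX_INTENSITY = 55_000
MIN_SIGNAL_TO_NOISE = 10.0
SUBSPECTRA_PER_PROFILE = 10

# One acquisition: (intensity values on a shared m/z scale, signal/noise ratio).
Acquisition = Tuple[Sequence[float], float]


def is_acceptable(max_intensity: float, signal_to_noise: float) -> bool:
    """Return True if an acquisition meets the stated acceptance criteria."""
    return (MIN_INTENSITY <= max_intensity <= MAX_INTENSITY
            and signal_to_noise >= MIN_SIGNAL_TO_NOISE)


def accumulate_profile(acquisitions: Sequence[Acquisition]) -> Optional[List[float]]:
    """Sum the first 10 acceptable acquisitions into one profile spectrum.

    Returns None when fewer than 10 acquisitions pass the criteria
    (an assumed convention for a failed spot).
    """
    accepted = [list(ints) for ints, snr in acquisitions
                if ints and is_acceptable(max(ints), snr)]
    if len(accepted) < SUBSPECTRA_PER_PROFILE:
        return None
    chosen = accepted[:SUBSPECTRA_PER_PROFILE]
    # Point-by-point sum across the chosen subspectra.
    return [sum(values) for values in zip(*chosen)]
```

Returning None for a spot with too few acceptable acquisitions mirrors the later step in which spectra that fail the quality requirements are discarded.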
Data processing.
Mass spectra from three harvestings were processed in the following manner. Spectral data were exported as text format m/z-intensity lists with a unified m/z scale, using custom Microsoft Visual Basic for Applications (VBA) macros in Data Explorer, the AB viewing application. The text data were further processed and viewed by use of a suite of custom Microsoft Visual Basic .NET (VB.NET) programs. One custom program, MultiSpec Viewer, was designed to display hundreds of spectra at once in a number of formats, including a simulated gel view for visual analysis of the data set, which comprised several thousand individual spectra. Spectra failing to meet the quality requirements (usually containing no recognizable peaks due to failures of the automatic acquisition algorithms [approximately 10% of the total]) were discarded. The remaining spectra were subjected to background subtraction and then were summed by MALDI well or by organism (to give ~12 spectra or 1 representative high-quality spectrum, respectively); normalized to the base peak; smoothed using a 21-point, 2-pass Gaussian algorithm; and finally standardized and denoised using a custom Fortran program (22). The output of the standardizing and denoising programs was a set of profile spectra containing relative intensities of only the statistically significant peaks (22), with zeros at all other m/z values. Thus, these data sets were in an ideal format for further analysis by a range of commercial statistical and data-mining applications. To decrease the time required for statistical analyses, the summed spectra were typically compressed by a factor of 20, reducing ~18,000 points to ~900 for a typical m/z 2,000 to 14,000 spectrum. We used PAST software v1.34 (http://folk.uio.no/ohammer/past/doc1.html) for hierarchical cluster analysis, with the single summed spectra (one summed spectrum representing each organism) for input. We used a Fortran program, Random Forest (RF) v 5.1 (2; http://www.stat.berkeley.edu/users/breiman/RandomForests/cc_home.htm) for classification and identification, in this case with ~9 summed spectra from three harvestings of each organism as a training set and ~3 separate summed spectra as unknowns. Recompiling the Fortran RF code for each experimental condition was automatically driven by VB.NET programs, and custom viewing applications were developed to aid in interpreting the RF results.
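Three of the processing steps named above (normalization to the base peak, 21-point 2-pass Gaussian smoothing, and 20-fold compression) can be sketched in plain Python. This is an illustration only, not the custom VB.NET or Fortran code used in the study. The kernel width `sigma`, the edge handling, and the use of bin summing for compression are assumptions, since the text does not specify them.

```python
import math
from typing import List, Sequence


def normalize_to_base_peak(intensities: Sequence[float]) -> List[float]:
    """Scale a spectrum so that its most intense point (base peak) is 100."""
    base = max(intensities)
    if base <= 0:
        return [0.0 for _ in intensities]
    return [100.0 * value / base for value in intensities]


def gaussian_kernel(points: int = 21, sigma: float = 3.5) -> List[float]:
    """Build a normalized Gaussian kernel; sigma is an assumed width."""
    half = points // 2
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(intensities: Sequence[float], points: int = 21,
           passes: int = 2, sigma: float = 3.5) -> List[float]:
    """Apply the Gaussian kernel `passes` times, clamping indices at the edges."""
    kernel = gaussian_kernel(points, sigma)
    half = points // 2
    data = list(intensities)
    n = len(data)
    for _ in range(passes):
        smoothed = []
        for i in range(n):
            acc = 0.0
            for k, weight in enumerate(kernel):
                j = min(max(i + k - half, 0), n - 1)  # repeat edge values
                acc += weight * data[j]
            smoothed.append(acc)
        data = smoothed
    return data


def compress(intensities: Sequence[float], factor: int = 20) -> List[float]:
    """Reduce the point count by summing consecutive bins of `factor` points."""
    return [sum(intensities[i:i + factor])
            for i in range(0, len(intensities), factor)]
```

With an 18,000-point spectrum, `compress(smooth(normalize_to_base_peak(spectrum)))` yields 900 points, which matches the reduction quoted in the text.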
Tentative peak matching and database searching.
A tentative identification of prominent peaks was done using the Tag-ident proteomics tool or ExPASy sequence retrieval system (http://us.expasy.org). In addition, "MS DB Filter," a custom VB.NET algorithm, was used to construct a CDC-modified database filtered from UniProt (http://www.ebi.ac.uk/uniprot/index.html). MS DB Filter excludes any Swiss-Prot and TrEMBL or UniProt entry described as a fragment, strips out signal and prepeptide sequences, and applies a rule to add or remove initial methionine as described by Pineda (19). The CDC-modified filtered database was used for data mining the deduced proteome from several bacterial species used in this study which have had the whole genome sequenced. As of April 2008, information for TIGR4 and R6 species/isolates used in this study could be found in the Swiss-Prot and TrEMBL databases (UniProt). Custom algorithms within MultiSpec Viewer were also used to generate peak lists from the acquired mass spectra. In addition, a manual screen of an extensive Microsoft Excel spreadsheet consisting of the 45 isolates from 2 to 14 kDa was used to correlate generated peaks with the CDC-modified database in order to provide tentative protein identifications.
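The filtering and peak-matching logic described above can be illustrated with a short Python sketch. It is not the MS DB Filter program. The `Entry` record, all function names, and the matching tolerance are hypothetical. The methionine rule shown (remove an initial Met when the second residue is G, A, S, T, C, P, or V) is an assumed simplification that covers only removal, and the match compares an observed m/z with the singly protonated average mass.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

# Average residue masses (Da) for the 20 standard amino acids.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153
PROTON = 1.0073
SMALL_RESIDUES = set("GASTCPV")  # assumed trigger for Met removal


@dataclass
class Entry:
    """Hypothetical, simplified database record."""
    name: str
    description: str
    sequence: str
    signal_length: int = 0  # residues annotated as signal peptide or propeptide


def mature_sequence(entry: Entry) -> Optional[str]:
    """Return the processed sequence, or None for entries flagged as fragments."""
    if "fragment" in entry.description.lower():
        return None
    seq = entry.sequence[entry.signal_length:]
    # Assumed rule: drop the initial Met when the next residue is small.
    if (entry.signal_length == 0 and len(seq) > 1
            and seq[0] == "M" and seq[1] in SMALL_RESIDUES):
        seq = seq[1:]
    return seq


def average_mass(sequence: str) -> float:
    """Average molecular mass; raises KeyError for nonstandard residue codes."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER


def match_peak(mz: float, entries: Sequence[Entry],
               tolerance: float = 0.0005) -> List[str]:
    """Names of entries whose [M+H]+ average mass lies within the tolerance."""
    hits = []
    for entry in entries:
        seq = mature_sequence(entry)
        if not seq:
            continue
        expected = average_mass(seq) + PROTON
        if abs(mz - expected) <= tolerance * expected:
            hits.append(entry.name)
    return hits
```

The default tolerance of 0.05% borrows the mass accuracy reported for the calibration standards. A match on mass alone is only tentative, which is consistent with the wording used for the identifications in this study.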
FIG. 1. Differentiation of cPnc outbreak isolates and nonconjunctival bacterial controls by MALDI MS. The mass spectrum (A) and simulated-gel (B) views were prepared using a custom program, MultiSpec Viewer. The peak masses (2,000 to 14,000) in the spectrum and simulated-gel views are represented as m/z, and the relative intensity (0 to 100 [white to blue]) is expressed as a percentage. The three distinct colored lines along the right y axis are illustrated to easily distinguish the three main groups in the study (cPnc isolates, red; pneumococcal and streptococcal control isolates, green; control isolates for heterologous genera, blue). Lanes 1 to 13, cPnc outbreak isolates Sp 165, Sp 166, Sp 168, Sp 169, Sp 170, Sp 245, Sp 246, Sp 247, Sp 248, Sp 263, Sp 264, Sp 265, and Sp 266, respectively. Lanes 14 to 22, Pnc TIGR4, Pnc R6, S. mitis, S. oralis, S. pseudopneumoniae, E. coli, S. pyogenes, S. aureus, and E. faecalis, respectively. Each trace is the sum of all individual spectra (typically 10 to 20) for that organism, after background subtraction and smoothing.
FIG. 2. Strain differentiation among cPnc isolates and identification of tentative ribosomal proteins present in cPnc isolates by MALDI MS. The spectrum view was prepared using a custom program, MultiSpec Viewer. The peak masses (2,000 to 14,000) in the spectrum are represented as m/z, and the relative intensity (0 to 100) is expressed as a percentage. Black arrows indicate the absence of ion peaks in isolate Sp 246. In addition, an overlay representing ribosomal proteins, obtained from a UniProt Pnc protein mass library, is illustrated (orange lines). Lanes 1 to 8, cPnc outbreak isolates Sp 165, Sp 169, Sp 170, Sp 246, Sp 247, Sp 248, Sp 263, and Sp 265, respectively. Each trace is the sum of all individual spectra (typically 10 to 20) for that organism, after smoothing.
TABLE 2. Tentative peak list (representatives) of conjunctival and nonconjunctival isolates
FIG. 3. Hierarchical cluster analysis of cPnc outbreak isolates and nonconjunctival bacterial controls. The PAST program, using the Jaccard similarity coefficient (expressed as a percentage), was used to assess the relatedness of the cPnc outbreak isolates and controls. A dendrogram of cPnc outbreak isolates compared with pneumococcal, streptococcal, and nonstreptococcal species is presented. Input data had been summed (all spectra for each organism), background subtracted, smoothed, standardized, and denoised. Shown are results for cPnc outbreak isolates (group 1), S. mitis (group 2), S. oralis (group 3), S. pseudopneumoniae (S. pseudopn. [group 4]), Pnc R6 and TIGR4 (group 5), Pnc sterile-site isolated strains (group 6), and Pnc 7-valent vaccine serotypes (group 7) and heterologous genera, including E. coli, S. pyogenes (SMIC), S. aureus, and E. faecalis (group 8).
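The Jaccard similarity coefficient named in the Fig. 3 legend can be illustrated with a small Python sketch. It is not the PAST implementation. It assumes that each denoised profile is treated as presence/absence data, with any nonzero intensity counted as a peak, and that two profiles with no peaks at all score 0. The function names are hypothetical.

```python
from typing import List, Sequence


def jaccard_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Jaccard coefficient, as a percentage, for two profiles on one m/z scale.

    Shared peaks are divided by peaks present in either profile.
    """
    if len(a) != len(b):
        raise ValueError("profiles must share the same m/z scale")
    both = sum(1 for x, y in zip(a, b) if x > 0 and y > 0)
    either = sum(1 for x, y in zip(a, b) if x > 0 or y > 0)
    return 100.0 * both / either if either else 0.0


def similarity_matrix(profiles: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise Jaccard similarities, the usual input to hierarchical clustering."""
    return [[jaccard_similarity(p, q) for q in profiles] for p in profiles]
```

For example, profiles `[0, 5, 0, 3, 0, 1]` and `[0, 2, 4, 0, 0, 7]` share two peaks out of four present in either profile, giving a similarity of 50%.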
Previous studies using molecular techniques, such as pulsed-field gel electrophoresis, multilocus sequence typing, and PCR, have revealed that the cPnc isolates are similar genotypically (3, 15). Using MS, proteins are the most characteristic macromolecules that can be assessed without the extraction, separation, or amplification (6) required by the aforementioned technologies. In this proteomic study, which confirms previous genetics-based investigations (3, 15), MALDI-TOF MS analysis, as shown by visual spectrum analyses and hierarchical cluster analysis, also demonstrated that the cPnc outbreak isolates are very similar. The conjunctival isolate clustering reflects unique strain characteristics of cPnc within the subset of proteins examined in this study. Moreover, uniquely expressed genes that are identified will make ideal candidates for biomarker evaluation.
Additionally, RF was able to separate the strains in this study into groups at the genus, species, and, to a certain extent, strain level (Sp 246) with minimal error. The low error rate of 3.13% among the cPnc isolates indicates that the RF algorithm is able to correctly identify mass spectra and assign them to the appropriate class (individual strains or isolates) or group (similar strains, i.e., specific cPnc outbreaks). Spectra that were consistently misclassified after successive screenings, and thus contributed to the error rate, may have been low-quality spectra that were not filtered appropriately. Interestingly, from a biological perspective, misclassification is not necessarily a negative. In our case, mismatched spectra among the cPnc isolates can simply imply that these isolates are biologically related and are too similar for the algorithm to distinguish.
MALDI-TOF MS is a tool with great promise for the medical, public health, and scientific communities. Mass spectral fingerprinting using MALDI-MS has been used to detect biomarkers from whole unfractionated microorganisms, including viruses, prokaryotes, and a few unicellular eukaryotes (1, 6, 9, 11-14, 16, 20, 21, 24, 25). These biomarkers have proven useful for rapidly identifying and differentiating microbial pathogens. For instance, small acid-soluble proteins have been used to characterize Bacillus species (5). Additionally, Shaw et al. reported the identification of biomarkers in unfractionated C. burnetii cells phase I purified from embryonic egg yolk sac preparations (24). Furthermore, spectral markers in the mass range of 2,000 to 8,000 Da were obtained from MALDI-TOF MS analysis of four human microsporidian isolates (16). Biomarkers for Mycobacterium species have also been detected by MALDI primarily in the 500- to 2,000-Da range, most likely representing lipid molecules or small polypeptides (18).
Protein biomarkers identified by MALDI-TOF MS are often basic, such as the highly conserved and abundant ribosomal protein families (19). In the present study, several ribosomal proteins, as illustrated in Fig. 2, were tentatively identified in the range of 2,000 to 14,000 Da by database searching and spectrum overlay. The tentative proteins appeared to be conserved, based on mass, among the cPnc isolates as well as in other pneumococcal strains. In addition, there was a peak at m/z 2,944 that was common to and uniquely expressed in the cPnc isolates relative to other strains tested. This biomarker candidate will require amino acid sequencing for validation as a clinical diagnostic marker.
In conclusion, MALDI-TOF MS, a rapid and sensitive methodology, was successfully utilized for differentiating cPnc U.S. outbreak isolates. Through statistical algorithms and hierarchical clustering, it was demonstrated that the cPnc outbreak isolates from California and the northeastern United States are very similar. Based on their MALDI-TOF MS fingerprints, putative peptide/protein biomarkers were tentatively identified, one of which was common and exclusively expressed in cPnc isolates. These cPnc proteomic signatures or biomarker candidates could ultimately be fruitful in the diagnosis of this infection. These expressed biomarkers are advantageous compared to genetic markers that would provide only information based on their expressive potential. Conjunctival isolate protein biomarkers would be a true indication of the organisms' ability to cause disease. Moreover, MALDI-TOF MS, with its high sensitivity, may also prove useful in gaining insight into the pathogenic mechanisms of disease, in particular mechanisms by which these NT cPnc strains cause large sporadic outbreaks. For instance, cPnc surface proteins associated with adherence or attachment to host cells that would subsequently initiate infection could be used as biomarkers. Furthermore, understanding how and why these cPnc strains cause disease can aid in the development of better treatments and even prophylactic measures to minimize the spread of infection during future outbreaks.
We thank Rickard Facklam for insight.
The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.
Published ahead of print on 15 August 2008.