JOURNAL OF CLINICAL MICROBIOLOGY, Oct. 2010, p. 3614–3623 Vol. 48, No. 10 0095-1137/10/$12.00 doi:10.1128/JCM.00157-10 Copyright © 2010, American Society for Microbiology. All Rights Reserved.
Genomic Signatures of the Haarlem Lineage of Mycobacterium tuberculosis:
Implications of Strain Genetic Variation in Drug and Vaccine Development
Andrés Cubillos-Ruiz,1† Andrea Sandoval,1† Viviana Ritacco,2 Beatriz López,2 Jaime Robledo,3,4 Nidia Correa,3,4 Iván Hernandez-Neuta,1 Maria Mercedes Zambrano,1 and Patricia Del Portillo1,4*
1 Corporación Corpogen, Carrera 5 no. 66A-34, Bogotá, Colombia; 2 Instituto Nacional de Enfermedades Infecciosas ANLIS “Carlos G. Malbrán,” Velez Sarsfield 563, Buenos Aires, Argentina; 3 Corporación para Investigaciones Biológicas, CIB, Universidad Pontificia Bolivariana, Carrera 72a no. 78B-141, Medellín, Colombia; and 4 Centro Colombiano de Investigación en Tuberculosis, Medellín, Colombia
Received 25 January 2010/Returned for modification 5 April 2010/Accepted 7 July 2010
Tuberculosis is the world’s leading cause of death due to a single infectious agent, and efforts aimed at its control require a better understanding of host, environmental, and bacterial factors that govern disease outcome. Growing evidence indicates that certain Mycobacterium tuberculosis strains of distinct phylogeographic lineages elicit unique immunopathological events. However, identifying the genetic basis of these phenotypic peculiarities has proven difficult. Here we report the presence of six large sequence polymorphisms which, together with two single-nucleotide changes previously described by our group, consistently differentiate Haarlem strains from the remaining M. tuberculosis lineages. The six newly found Haarlem-specific genetic events are four deletions, which altogether involve more than 13 kb, and two intragenic insertions of the element IS6110. The absence of the genes involved in these polymorphisms could have an important physiological impact on Haarlem strains, i.e., by affecting key genes, such as Rv1354c and cyp121, which have been recently proposed as plausible drug targets. These lineage-specific polymorphisms can serve as genetic markers for the rapid PCR identification of Haarlem strains, providing a useful tool for strain surveillance and molecular epidemiology studies. Strain variability such as that described here underscores the need for the definition of a core set of essential genes in M. tuberculosis that are ubiquitously present in all circulating lineages, as a requirement in the development of effective antituberculosis drugs and vaccines.
* Corresponding author. Mailing address: Corporación Corpogen, Carrera 5 no. 66a-34, Bogotá, D.C., Colombia. Phone: 57-1-8050106. Fax: 57-1-3484607. E-mail: email@example.com.
† These authors contributed equally to this work.
Published ahead of print on 14 July 2010.

Mycobacterium tuberculosis is the causative agent of tuberculosis, the leading cause of death by a single bacterial agent in the world (36). Infection with M. tuberculosis has historically been shown to result in a variety of clinical outcomes that are usually associated with host inherited susceptibility and environmental risk factors (2, 31, 32). Moreover, increasing evidence suggests that genetic variation in the tubercle bacilli also plays an important role in the outcome of the disease (4, 19, 33). Due to the absence of exchange of genetic material with a global microbial gene pool, M. tuberculosis had long been considered to have a clonal population structure. However, significant strain-to-strain genetic variation within M. tuberculosis has recently been unveiled (11, 19).

Changes in neutral regions of the chromosome, such as the direct repeat (DR) locus, and in the mycobacterial interspersed repetitive units (MIRUs) are useful in epidemiological and phylogenetic analyses and in describing the most conspicuous M. tuberculosis lineages (3, 21). In addition to the variation in neutral regions, genetic polymorphisms involving coding regions have been described to occur through single-nucleotide changes and through deletion and insertion events, the latter mediated mainly by the IS6110 element (23, 30). Although these genomic alterations are thought to be among the principal sources of phenotypic variation in M. tuberculosis, the specific genomic changes that define each lineage have not yet been fully defined.

There are currently six phylogeographic lineages that make up the M. tuberculosis global population (10). One is the Euro-American group, which includes all the spoligotype families predominating in the Western world, such as Haarlem, LAM, and the ill-defined T group (3). In particular, the Haarlem genotype is ubiquitous worldwide (15) and represents about 25% of the isolates in Europe, Central America, and the Caribbean, suggesting a link with the post-Columbus European colonization (8). Haarlem strains are actively transmitted in urban settings in Colombia, causing major public health problems (N. E. Correa, E. Zapata, V. Gómez, G. E. Mejia, A. Restrepo, J. Robledo, and CCITB, presented at the 107th General Meeting of the American Society for Microbiology, Toronto, Canada, 2007) and have also been responsible for a prolonged outbreak of multidrug-resistant tuberculosis in Argentina (26, 29).

An intriguing question is whether M. tuberculosis strains differ in terms of pathogenic characteristics as a consequence of long-standing interactions of particular lineages with specific human populations. Animal models that take advantage of an identical genetic background, and therefore a uniform host immune response, have given insight regarding the contribution of strain genetic diversity to the outcome of the infectious process (7, 20). It is currently accepted that genetically different M. tuberculosis strains produce markedly different immunopathological events in isogenic mice (4, 18). Thus, understanding genotypic differences and mechanisms underlying infection variability and identifying specific changes or genes associated with both virulence and immunopathogenicity of the different M. tuberculosis lineages have important implications for the future effective control of tuberculosis (7, 33).

In a recent bioinformatic study using multiple genome alignments of six fully sequenced M. tuberculosis strains belonging to different lineages, we showed a trend toward accumulation of a limited number of genome-specific polymorphisms preferentially associated with circulating strains and underrepresented in laboratory strains. This suggests that such polymorphisms arise as active mechanisms of adaptation to the human host (5). We speculated that some of these genome-specific polymorphisms might be common to strains of a particular lineage rather than being an exclusive property of the isolate examined. To test this, in the present study we examined whether genome-specific polymorphisms previously identified in fully sequenced strains were present in a broader group of strains and could thus represent a lineage-wide condition. In particular, we explored whether polymorphisms identified as specific to the sequenced M. tuberculosis Haarlem strain (5) were prevalent in additional members of the Haarlem lineage and absent from other lineages. In the present paper, we report the presence of eight genomic signatures highly exclusive to the M. tuberculosis Haarlem lineage that can prove important for the rapid identification of these strains and also contribute to our understanding of the genetic variations underlying phenotypic differences among the various lineages of the tubercle bacilli.

MATERIALS AND METHODS

M. tuberculosis isolates. A set of 40 M. tuberculosis clinical isolates belonging to the Haarlem lineage and 62 non-Haarlem isolates, including LAM, S, T, X, EIA, and Beijing, were selected from the collection of the Instituto Nacional de Enfermedades Infecciosas ANLIS “Carlos G. Malbrán” in Buenos Aires, Argentina, and from the collection of the Centro Colombiano de Investigación en Tuberculosis (CCITB) held at the Corporación para Investigaciones Biológicas (CIB) in Medellín, Colombia. Isolates were selected based on different IS6110 restriction fragment length polymorphism (RFLP) patterns to ensure that they represented the most conspicuous patterns of strains circulating in both settings between 1997 and 2005. Laboratory strain H37Rv was also included in the non-Haarlem group (see Fig. 2). DNA was obtained from culture lysates as described previously (25).

IS6110 RFLP typing and phylogenetic analysis. IS6110 RFLP and spoligotype patterns (14, 35) were available at genotype databases in the Buenos Aires and Medellín laboratories. Computer-assisted analysis of IS6110 RFLP patterns was performed with the software BioNumerics 5.1 (Applied Maths, Sint-Martens-Latem, Belgium) as described previously (12). Similarity between patterns was calculated using the Dice coeffi

of insertions, PCRs were carried out for 35 cycles consisting of 45 s of denaturation at 94°C, 45 s of annealing at 66°C for HSI1 and 71°C for HSI2, and 120 s of extension at 72°C. PCR products were verified by 1.5% agarose gel electrophoresis for the presence of a single amplification band. Five randomly chosen products for each region were sequenced using the BigDye terminator cycling conditions (Macrogen, South Korea) in order to confirm that the target region was amplified. For the detection of single-nucleotide polymorphisms (SNPs) in the ogt and ung genes, the primers and conditions reported previously for allelic discriminatory PCR were used (25).

Statistical analysis. The Fisher exact test was applied to determine significant associations between polymorphisms and M. tuberculosis lineages.

RESULTS

Specific polymorphisms in strains of the Haarlem lineage of M. tuberculosis. Of 12 deletions and 6 insertions identified in a previous bioinformatic study as unique to the sequenced Haarlem strain (5) (www.broadinstitute.org/), we selected the most conspicuous to investigate if they were lineage-wide mutations. Specifically, the IS6110 insertions and the largest deletion polymorphisms were chosen for a preliminary analysis using PCR with four Haarlem strains. Polymorphisms spanning repetitive regions, such as Pro-Pro-Glu (PPE) family genes, were excluded from the present analysis in order to avoid possible misinterpretation. Likewise, polymorphisms of <200 bp were excluded because they cannot be unequivocally differentiated from intrinsic errors that occurred during sequencing and finishing of the Haarlem strain genome. Results from this preliminary analysis indicated that only six polymorphisms were in fact present in the four analyzed Haarlem strains (Table 2). The occurrence of these mutations was therefore further inspected using a larger panel of epidemiologically unrelated isolates from Argentina and Colombia. For this analysis we used these six large-sequence polymorphisms and two additional SNPs located in the ogt and ung DNA repair genes previously reported by our group as specific to the Haarlem lineage (25). The analysis of the 102 strains indicated that the presence of all eight studied polymorphisms correlated highly with the Haarlem lineage (Table 3). For this reason, the regions displaying deletions were designated Haarlem-specific deletions (HSD1 to HSD4), and the two IS6110 element insertions were named Haarlem-specific insertions (HSI1 and HSI2). Likewise, SNPs present in genes ogt and ung were named Haarlem-specific SNPs (HSSNP1 and HSSNP2, respectively). When analyzed individually, each of these genetic events showed a highly significant association with the Haarlem lineage: HSD1 was found in 37/40 Haarlem versus 1/62 non-Haarlem strains (P < 0.00001), HSD2 and HSD3 were found in 38/40 Haarlem