Catherine M. Stein
As reviewed recently by Möller et al. [22], the body of work showing statistical associations between candidate genes and TB continues to grow. This does not include potential unpublished studies that failed to find significant associations and are not readily available due to publication bias [22]. Even in the published body of literature, however, there is a great deal of inconsistency between marker-trait associations, so we are far from reaching a consensus regarding genes involved in TB risk.
This review focused on methodological reasons for inconsistency across studies. One important factor is the diagnostic criteria for TB disease, which have differed dramatically across studies. Resources available for TB diagnosis differ by country, a disparity that is compounded in settings affected by conflict [72]. Differences in diagnostic criteria across studies can reflect differences in TB severity and may lead to misclassification of cases as controls; this would have a significant impact on the type I and type II error rates of studies. It is impossible to standardize the diagnostic definitions used across all study sites, but researchers should be mindful of such differences when interpreting their findings. We strongly recommend that researchers characterize the level of exposure to Mtb in individuals without disease, which should include TST/IGRA and careful epidemiological characterization. New studies could utilize the household contact design, which facilitates the characterization of all stages of Mtb exposure, infection, and disease [41]. When the household contact study design is not feasible, spousal controls are also well suited because of their persistent and prolonged exposure.
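The effect of misclassifying cases as controls can be made concrete with a small calculation. The following is a minimal sketch, not a method used in any of the reviewed studies: it assumes a single binary risk factor (for example, carrying a risk allele), and the carrier frequency of 0.30, the true odds ratio of 1.5, and the misclassification fractions are all hypothetical values chosen for illustration. It shows how the observed odds ratio shrinks toward 1 as a larger share of the nominal control group consists of undiagnosed cases.

```python
# Illustrative sketch: undiagnosed cases among "controls" attenuate an odds ratio.
# All numeric inputs are hypothetical and are not taken from any TB study.


def odds(p):
    """Convert a probability (0 < p < 1) to odds."""
    return p / (1.0 - p)


def case_frequency(p_control, true_or):
    """Carrier frequency in true cases implied by the control frequency and the true OR."""
    case_odds = odds(p_control) * true_or
    return case_odds / (1.0 + case_odds)


def observed_or(p_control, true_or, misclassified):
    """Odds ratio observed when a fraction `misclassified` of the control group are really cases."""
    p_case = case_frequency(p_control, true_or)
    # The contaminated control group is a mixture of true controls and missed cases.
    p_mixed = (1.0 - misclassified) * p_control + misclassified * p_case
    return odds(p_case) / odds(p_mixed)


if __name__ == "__main__":
    for m in (0.0, 0.1, 0.2, 0.4):
        print(f"misclassified={m:.1f}  observed OR={observed_or(0.30, 1.5, m):.3f}")
```

Under these assumptions the attenuation is monotone: the more contaminated the control group, the closer the observed odds ratio is to 1, which reduces power (inflating type II error) for a fixed sample size.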
Recall that TB follows two stages of pathogenesis, and LTBI precedes TB disease. Recent studies suggest that LTBI may have unique genetic influences [15], [28], [29]. Persons with LTBI constitute a major impediment to TB control efforts [73]. Since many ongoing vaccine development efforts will focus on preventing either LTBI or progression to TB, it is important to understand host factors that influence containment of Mtb infection. However, the study of the genetics of LTBI is also not trivial. Indication of T cell memory response via positive TST and/or IGRA does not necessarily imply the presence of viable Mtb bacilli. In the US, as in other public health systems, individuals with positive TST are treated as though there are viable organisms present, adding further confusion to this phenotype. According to Parrish et al., there is a 2%–23% lifetime probability of developing TB after acquisition of Mtb infection (LTBI) [73]. This illustrates the heterogeneity in this clinical group, since the risk of progression to active TB may depend on a variety of known and unknown risk factors. Furthermore, prophylaxis of LTBI with isoniazid (INH) is the standard of care in many research settings, so that many individuals classified as having “LTBI” based on positive TST/IGRA who are genetically predisposed to develop TB may never do so. One way to investigate the role of host genetics in LTBI would be to compare TST (or IGRA) positive individuals who develop incident TB to those who do not. Ideally, such a study would not include individuals on INH prophylaxis, though withholding prophylaxis is unethical in many settings. For these reasons, some may argue that it is more relevant to study TB genetics, and not LTBI, from a public health standpoint.
Thus, it is essential to take a multidisciplinary approach [74] to develop an all-encompassing picture of the natural history of Mtb infection and disease. Few studies have examined the genetics of TB immunology [15], [31], [75]–[77]. Gene expression studies using microarrays may also shed light on host responses to Mtb [78]. Proteomic studies will further elucidate host factors involved in pathogenesis. Data from these various approaches should be analyzed together with the aim of identifying more meaningful clinical groups. For example, genomic, proteomic, and immunologic data, collectively, may better capture the heterogeneity in latently infected individuals.
Additional complicating factors in comparing geographically diverse studies are potential population substructure and LD differences among populations. We recommend that future studies analyze enough SNPs to capture LD in their study population. Analyses of a few markers within a gene no longer advance the field, particularly in light of LD differences between populations. Even with advances in genotyping, many studies of “old” markers continue to be published. The choice of a reference population for tag SNP selection is not trivial [62]; thus, dense SNP mapping may be necessary, particularly in studies of African populations. If it is impossible to rigorously examine genes in this way, publishing the LD patterns in the study data [28], [45], [48], [49] is a good start. Furthermore, studies in admixed populations should attempt to examine population substructure to minimize this source of bias. Populations also differ in the Mtb strain lineage that caused TB; future studies examining host gene by Mtb gene interaction are warranted. Finally, as in all genetic epidemiological studies of complex traits, genes may act in complex ways. Genes may interact with other genes and/or epidemiological factors; these potential relationships should not be overlooked. Furthermore, too many researchers (authors and journal reviewers alike) focus too much on p-values. All p-values must be reported, even if greater than 0.05. Markers with p-values greater than 0.05 may still be important in their interaction with other markers or environmental factors. Researchers should collect sufficient data to explore these meaningful biological effects.
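To illustrate why LD differences between populations matter for marker choice, the pairwise LD measure r² can be computed directly from haplotype and allele frequencies. The sketch below is a simplified illustration only: the two "populations" and all of their frequencies are hypothetical, and the r² ≥ 0.8 cutoff is a commonly used tagging convention adopted here as an assumption rather than a value taken from the studies cited above.

```python
# Illustrative sketch: the same SNP pair can be in strong LD in one population
# and weak LD in another. All frequencies below are hypothetical.


def r_squared(p_ab, p_a, p_b):
    """Pairwise LD (r^2) from the A-B haplotype frequency and the two allele frequencies."""
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    if p_ab > min(p_a, p_b) or p_ab < max(0.0, p_a + p_b - 1.0):
        raise ValueError("haplotype frequency is inconsistent with the allele frequencies")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


if __name__ == "__main__":
    TAG_THRESHOLD = 0.8  # assumed tagging convention
    populations = {
        "population 1 (hypothetical)": (0.28, 0.30, 0.30),
        "population 2 (hypothetical)": (0.15, 0.30, 0.35),
    }
    for name, (p_ab, p_a, p_b) in populations.items():
        r2 = r_squared(p_ab, p_a, p_b)
        verdict = "tags" if r2 >= TAG_THRESHOLD else "does not tag"
        print(f"{name}: r^2 = {r2:.3f}; the marker {verdict} the second SNP")
```

In this toy example a marker that serves as an adequate proxy in the first population carries almost no information about the second SNP in the other population, which is the situation in which a tag SNP set chosen from an ill-matched reference population would miss an association.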
There are GWAS of TB forthcoming. Given the issues discussed in this review, we must interpret the findings of those GWAS cautiously. Will these studies be underpowered due to the heterogeneity among TB cases and controls? A recent summary analysis of published GWAS found the reported SNP–trait associations attaining significance (p<10⁻⁵) had a median odds ratio of 1.33, with an interquartile range of 1.20–1.61 [54]; thus, the effect sizes of SNPs identified through GWAS are relatively small. Furthermore, the proportion of heritability explained by these variants ranges between 1% and 50% [79]. TB GWAS may provide new clues into the host biology of TB pathogenesis, but the overall clinical relevance of these SNPs will be limited. In addition, GWAS of other complex traits have often merged data across ongoing research studies. Because of the dramatic heterogeneity among studies described in this review, meta-analyses of TB genetic association studies should be conducted with care.
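A rough calculation shows what effect sizes of this magnitude imply for study size. The following sketch uses the standard normal-approximation sample size formula for comparing two proportions, applied to allele counts; it is an approximation under stated assumptions, not a substitute for a formal power analysis. The control allele frequency of 0.20, the 80% power target, and the use of p<10⁻⁵ as the two-sided significance level are assumptions made for illustration, as is treating each individual as contributing two independent alleles.

```python
# Illustrative sketch: approximate sample size needed to detect a modest odds ratio
# with an allelic (two-proportion) test. Inputs other than the odds ratios are assumptions.
from math import ceil
from statistics import NormalDist


def case_allele_freq(p_control, odds_ratio):
    """Risk-allele frequency in cases implied by the control frequency and the odds ratio."""
    case_odds = p_control / (1.0 - p_control) * odds_ratio
    return case_odds / (1.0 + case_odds)


def alleles_per_group(p_control, odds_ratio, alpha=1e-5, power=0.80):
    """Alleles needed per group under the normal approximation for two proportions."""
    p_case = case_allele_freq(p_control, odds_ratio)
    z = NormalDist()
    z_alpha = z.inv_cdf(1.0 - alpha / 2.0)  # two-sided significance level
    z_beta = z.inv_cdf(power)
    variance = p_case * (1.0 - p_case) + p_control * (1.0 - p_control)
    return (z_alpha + z_beta) ** 2 * variance / (p_case - p_control) ** 2


def individuals_per_group(p_control, odds_ratio, alpha=1e-5, power=0.80):
    """Individuals per group, assuming each contributes two independent alleles."""
    return ceil(alleles_per_group(p_control, odds_ratio, alpha, power) / 2.0)


if __name__ == "__main__":
    for odds_ratio in (1.20, 1.33, 1.61):
        n = individuals_per_group(0.20, odds_ratio)
        print(f"OR={odds_ratio:.2f}: about {n} cases and {n} controls")
```

Because the required sample size grows rapidly as the odds ratio approaches 1, any heterogeneity or misclassification that dilutes the observable effect raises the number of cases and controls needed, which is one way to frame the concern about underpowered TB GWAS.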
In sum, we have barely scratched the surface in understanding the genetic determinants of TB pathogenesis. Because of the significant public health impact of TB, additional studies are necessary, and should be multidisciplinary in nature. Future studies should carefully consider phenotype definition and genetic epidemiological principles when designing, analyzing, and interpreting findings. Ideally, culture confirmation for pulmonary TB should be conducted where feasible, thorough epidemiological data should be collected in individuals without TB to better understand LTBI and risk of progression to TB, and population genetic factors should be carefully characterized and considered in the analysis.
http://www.plospathogens.org/article/info%3Adoi%2F10.1371%2Fjournal.ppat.1001189