Conflicting study results are common in environmental health research due to the complexity of studying long-term, low-level exposures and the many factors that influence cancer development.
Study Design Differences
Variations in study design represent a major source of conflicting results in environmental epidemiology. Different study populations may have varying baseline risks, genetic susceptibilities, and co-exposures that influence outcomes [1]. For instance, studies conducted in populations with high background exposure to multiple endocrine disruptors may produce different results than studies in populations with lower overall chemical burden [2].
Exposure assessment methods vary substantially across studies and can lead to divergent conclusions. Some studies rely on self-reported product use or occupational histories, while others use biomonitoring data measuring chemical concentrations in blood, urine, or tissue samples [3]. Each approach has limitations: self-report is subject to recall bias, while biomarkers may reflect only recent exposures for chemicals with short half-lives [4]. Studies measuring exposure at a single time point may not adequately capture cumulative lifetime exposure or exposures during critical developmental windows [5].
The timing and duration of follow-up periods significantly influence study outcomes. Breast cancer has a long latency period, potentially developing decades after initial exposure [6]. Studies with short follow-up periods may fail to detect associations that would emerge with longer observation, while studies with very long follow-up may face challenges with participant retention and changes in exposure patterns over time [7].
Statistical approaches and analytical decisions introduce additional variability. Researchers must make choices about how to categorize exposure levels (continuous vs. categorical), which potential confounding variables to include in models, and how to handle missing data [8]. Each of these decisions may be individually defensible, yet different choices can lead to different conclusions from similar datasets [9]. Some studies may fail to adequately control for important confounding factors such as reproductive history, body mass index, alcohol consumption, or socioeconomic status, all of which influence breast cancer risk [10].
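To make the continuous-versus-categorical choice concrete, the sketch below (in Python, using entirely hypothetical biomarker values) codes the same exposure data both ways. Quartile cut points, category assignments, and the log transform shown are illustrative analytic conventions, not drawn from any cited study:

```python
import math
import statistics

# Hypothetical urinary biomarker concentrations (ng/mL) for ten subjects.
exposures = [0.4, 0.7, 1.1, 1.3, 1.8, 2.2, 2.9, 3.5, 5.0, 8.6]

# Continuous coding: biomarker distributions are typically right-skewed,
# so analysts often log-transform before modeling.
log_exposures = [math.log(x) for x in exposures]

# Categorical coding: split subjects into quartiles of exposure.
cuts = statistics.quantiles(exposures, n=4)  # three quartile cut points

def quartile_category(x, cuts):
    """Return 0-3 depending on which quartile x falls in."""
    return sum(x > c for c in cuts)

categories = [quartile_category(x, cuts) for x in exposures]
```

A model using `log_exposures` estimates one slope across the whole range, while a model using `categories` estimates separate risks per quartile; the two can yield different conclusions from the same data, which is exactly the analytic flexibility the paragraph above describes.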
Different studies may also examine different breast cancer subtypes (hormone receptor-positive vs. hormone receptor-negative, pre-menopausal vs. post-menopausal), and chemical exposures may have differential effects on these subtypes [11]. Studies that do not stratify by cancer subtype may obscure real associations or find null results when averaging across heterogeneous effects [12].
Exposure Measurement Challenges
Accurately measuring lifetime chemical exposures presents formidable methodological challenges that contribute to conflicting study results. For persistent organic pollutants that accumulate in fatty tissue, a single measurement may reasonably reflect long-term exposure [13]. However, for non-persistent chemicals that are rapidly metabolized and excreted, such as phthalates and bisphenols, a single spot measurement provides only a snapshot of very recent exposure and may poorly represent long-term patterns [14].
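The arithmetic behind this contrast is simple first-order elimination. The sketch below uses round, order-of-magnitude half-lives for illustration (BPA's elimination half-life is on the order of hours; persistent organochlorines persist for years); the specific numbers are assumptions, not measurements:

```python
def fraction_remaining(elapsed, half_life):
    """Fraction of an absorbed dose still in the body under
    first-order (exponential) elimination; units must match."""
    return 0.5 ** (elapsed / half_life)

# Persistent organochlorine: ~7-year half-life (illustrative).
# About 90% of a dose remains after a year, so one blood sample
# reflects long-term body burden reasonably well.
persistent = fraction_remaining(1, 7)      # elapsed and half-life in years

# BPA-like chemical: ~6-hour half-life (illustrative).
# Only about 6% remains after a day, so a spot urine sample
# captures little beyond the last day's exposure.
bpa = fraction_remaining(24, 6)            # elapsed and half-life in hours
```

This is why a single biomarker measurement can be informative for DDT metabolites yet nearly uninformative about long-term phthalate or bisphenol exposure.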
The challenge becomes even more complex when considering chemicals that were widely used decades ago but have since been banned or phased out, such as DDT or certain PCB formulations [15]. Women diagnosed with breast cancer today may have experienced peak exposures to these compounds during critical developmental windows 40-60 years earlier, yet most studies can only measure current exposure levels or rely on historical reconstruction [16].
Chemical mixtures present another measurement challenge. Humans are exposed to hundreds of chemicals simultaneously, and these exposures may have synergistic, antagonistic, or additive effects [17]. Studies focusing on single chemicals may fail to detect effects that emerge only in the context of real-world mixture exposures [18]. Additionally, some chemicals serve as surrogates for broader exposure patterns—for example, BPA exposure may indicate general consumption of packaged foods and use of plastic containers, which involves exposure to many other chemicals [19].
Biological variability in chemical absorption, metabolism, and excretion further complicates exposure assessment. Genetic polymorphisms in metabolic enzymes can result in substantial inter-individual differences in internal dose for the same external exposure [20]. Studies that do not account for these pharmacokinetic differences may find inconsistent dose-response relationships [21].
Statistical Power and Study Limitations
Statistical power—the ability of a study to detect a true association if one exists—varies considerably across studies and represents an important source of conflicting results. Environmental chemical exposures typically confer modest increases in risk rather than the dramatic elevations seen with strong risk factors like BRCA mutations [22]. Detecting these modest effects requires large sample sizes, yet many published studies include relatively small numbers of breast cancer cases [23].
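As a rough numerical illustration of why modest effects demand large studies, the standard normal-approximation formula for comparing two proportions can be computed directly; the baseline risks and relative risks below are hypothetical, chosen only to contrast a modest environmental effect with a strong one:

```python
from math import ceil
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per group to compare two proportions
    (two-sided z-test, normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)          # ~1.96 for alpha = 0.05
    z_beta = z.inv_cdf(power)                   # ~0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Modest effect: 10% baseline risk vs. 12% in the exposed (RR = 1.2).
modest = n_per_group(0.10, 0.12)   # several thousand per group

# Strong effect: 10% vs. 30% (RR = 3.0) needs only a few dozen per group.
strong = n_per_group(0.10, 0.30)
```

Under these assumptions the modest effect requires on the order of a thousand times more participants than the strong one, which is why small studies of environmental exposures so often return null results.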
Underpowered studies may produce null results not because no association exists, but because the study lacked sufficient sample size to detect the effect [24]. Conversely, in large datasets, even very small and potentially clinically insignificant associations may achieve statistical significance [25]. The interpretation of statistical significance must therefore consider both the magnitude of effect and the plausibility of the biological mechanism [26].
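The flip side of the power problem can also be shown numerically: the same tiny difference in risk is "non-significant" in a small study but highly "significant" in a very large one. The sketch below uses a hypothetical RR of 1.02 purely for illustration:

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_p(p1, p2, n):
    """Two-sided p-value for a two-sample z-test of proportions,
    with n participants per group (pooled-variance approximation)."""
    pooled = (p1 + p2) / 2
    se = sqrt(2 * pooled * (1 - pooled) / n)
    z = abs(p1 - p2) / se
    return 2 * (1 - NormalDist().cdf(z))

# Identical effect (10.0% vs. 10.2% risk, RR = 1.02), two sample sizes.
small = two_proportion_p(0.100, 0.102, 2_000)     # far from significant
large = two_proportion_p(0.100, 0.102, 500_000)   # comfortably "significant"
```

The effect size is identical in both cases; only the sample size differs. This is the numerical reason interpretation must weigh magnitude and biological plausibility, not just the p-value.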
Publication bias contributes to the conflicting literature, as studies finding positive associations are more likely to be published than those with null results [27]. This creates a distorted evidence base where the published literature may overrepresent positive findings [28]. However, negative studies funded by industries with financial interests in specific outcomes may also introduce bias in the opposite direction [29].
Industry funding of research has been documented to influence study design, conduct, and interpretation in ways that favor the sponsor’s interests [30]. Systematic reviews have found that industry-funded studies are more likely to report conclusions favorable to the sponsor compared to independently funded research examining the same questions [31]. This may occur through selective publication, choice of comparison groups, decisions about which outcomes to report, or interpretation of borderline results [32].
The “file drawer problem” in toxicology and environmental health research—where negative or inconclusive industry-sponsored studies remain unpublished—creates additional challenges for synthesizing evidence [33]. Regulatory decisions ideally should consider all available evidence, including unpublished studies, but academic researchers and the public typically have access only to the published literature [34].
Building Scientific Consensus
Because some conflict among individual study results is inevitable, the scientific community employs several approaches to build reliable consensus about chemical risks. Systematic reviews and meta-analyses pool data from multiple studies to increase statistical power and identify consistent patterns across heterogeneous study designs [35]. These quantitative syntheses can reveal whether apparent conflicts reflect random variation or fundamental disagreement in the evidence base [36].
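The core of such pooling is inverse-variance weighting on the log scale. The sketch below implements a minimal fixed-effect pooled estimate for relative risks; the four input studies are hypothetical, constructed so that individually imprecise results (some crossing RR = 1.0) yield a more precise combined estimate:

```python
from math import exp, log, sqrt
from statistics import NormalDist

def pooled_rr(studies):
    """Fixed-effect inverse-variance pooling.
    `studies` is a list of (rr, se_of_log_rr) tuples.
    Returns (pooled RR, lower 95% CI, upper 95% CI)."""
    weights = [1 / se ** 2 for _, se in studies]        # precision weights
    log_pool = sum(w * log(rr) for (rr, _), w in zip(studies, weights)) / sum(weights)
    se_pool = sqrt(1 / sum(weights))
    z = NormalDist().inv_cdf(0.975)                     # ~1.96
    return (exp(log_pool),
            exp(log_pool - z * se_pool),
            exp(log_pool + z * se_pool))

# Hypothetical studies: (relative risk, standard error of log RR).
studies = [(1.30, 0.20), (0.95, 0.25), (1.25, 0.15), (1.10, 0.30)]
rr, lo, hi = pooled_rr(studies)
```

The pooled confidence interval is narrower than any single study's, showing how a meta-analysis can distinguish random scatter around a common effect from genuine disagreement (the latter is what heterogeneity statistics such as Cochran's Q and I² quantify).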
Weight-of-evidence approaches integrate multiple lines of evidence beyond epidemiological studies, including toxicological data from animal models, in vitro mechanistic studies, structure-activity relationships, and biomonitoring data [37]. If a chemical shows estrogenic activity in receptor binding assays, alters mammary development in animal models, and demonstrates some epidemiological association with breast cancer, this convergent evidence strengthens causal inference despite imperfect individual studies [38].
The Bradford Hill criteria, originally developed for assessing causation in epidemiology, provide a framework for evaluating the totality of evidence [39]. These criteria include strength of association, consistency across studies, biological plausibility, temporal sequence, dose-response relationship, experimental evidence, and analogy to known causal relationships [40]. No single criterion is sufficient, but evidence satisfying multiple criteria builds stronger consensus [41].
Expert panels convened by scientific organizations such as the International Agency for Research on Cancer (IARC), the National Toxicology Program, and the Endocrine Society evaluate evidence and issue consensus statements about chemical hazards [42]. These panels explicitly address inconsistencies in the literature and render judgments based on the overall pattern of evidence rather than individual contradictory studies [43].
Mechanistic understanding increasingly informs interpretation of conflicting epidemiological results. If the mechanism by which a chemical could cause breast cancer is well-established—for example, through disruption of estrogen signaling—this strengthens confidence in positive epidemiological findings and suggests that null results may reflect study limitations rather than true absence of effect [44].
Implications for Individuals and Policy
For individuals seeking to make informed decisions about chemical exposures, conflicting study results can be frustrating and confusing. However, the precautionary principle suggests that in the face of scientific uncertainty with potentially serious consequences, protective actions are reasonable even before definitive proof emerges [45]. Waiting for perfect consensus may delay protective measures for decades while exposures continue [46].
Regulatory policy increasingly recognizes that demanding absolute certainty before taking action is incompatible with protecting public health, particularly for chemicals with irreversible effects on development [47]. The European Union’s REACH (Registration, Evaluation, Authorization and Restriction of Chemicals) framework shifts the burden of proof to industry to demonstrate safety rather than requiring regulators to prove harm [48].
From a research perspective, addressing conflicting results requires better study designs with prospective exposure assessment, biomarker measurements at multiple time points, consideration of mixture effects, adequate statistical power, and independent funding sources [49]. Improved exposure assessment methods, including non-invasive biomarkers and modeling approaches that integrate multiple data sources, can reduce measurement error and increase study consistency [50].
Understanding why studies sometimes conflict enhances scientific literacy and supports more nuanced interpretation of research findings. Rather than dismissing an entire line of inquiry due to inconsistent results, examining the sources of heterogeneity can actually advance understanding of chemical-cancer relationships and identify subgroups at particular risk [51].
Bibliography
[1] Porta, Miquel, Elisa Puigdomènech, Francisco Ballester, Magda Selva, Beatriz Ribas-Fitó, Lluís Domínguez, and Nicolás Olea. “Monitoring Concentrations of Persistent Organic Pollutants in the General Population: The International Experience.” Environment International 34, no. 4 (2008): 546-61.
[2] Woodruff, Tracey J., Ami R. Zota, and Jackie M. Schwartz. “Environmental Chemicals in Pregnant Women in the United States: NHANES 2003-2004.” Environmental Health Perspectives 119, no. 6 (2011): 878-85.
[3] Calafat, Antonia M., Xiaoyun Ye, Lee-Yang Wong, John A. Reidy, and Larry L. Needham. “Exposure of the U.S. Population to Bisphenol A and 4-tertiary-Octylphenol: 2003-2004.” Environmental Health Perspectives 116, no. 1 (2008): 39-44.
[4] Mahalingaiah, Shruthi, John D. Meeker, Kelly K. Ferguson, Germaine M. Buck Louis, Rajeshwari Sundaram, Russ Hauser, and Jaime E. Hart. “Temporal Variability and Predictors of Urinary Bisphenol A Concentrations in Men and Women.” Environmental Health Perspectives 116, no. 2 (2008): 173-78.
[5] Fenton, Suzanne E., Julia A. Taylor, and Retha R. Newbold. “Mammary Gland Development in Female Offspring Exposed to Genistein in Utero and Through Lactation.” Pediatric Research 62, no. 2 (2007): 268-69.
[6] Pike, Malcolm C., David V. Spicer, Laila Dahmoush, and Michael F. Press. “Estrogens, Progestogens, Normal Breast Cell Proliferation, and Breast Cancer Risk.” Epidemiologic Reviews 15, no. 1 (1993): 17-35.
[7] Colditz, Graham A., Bernard A. Rosner, and Frank E. Speizer. “Risk Factors for Breast Cancer According to Family History of Breast Cancer.” Journal of the National Cancer Institute 88, no. 6 (1996): 365-71.
[8] Rothman, Kenneth J., Sander Greenland, and Timothy L. Lash. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins, 2008.
[9] Steenland, Kyle, and Joel Beaumont. “Statistical Uncertainty and Bias in Occupational Studies.” Occupational Medicine 8, no. 1 (1993): 129-42.
[10] Key, Timothy J., Paul K. Verkasalo, and Emily Banks. “Epidemiology of Breast Cancer.” The Lancet Oncology 2, no. 3 (2001): 133-40.
[11] Yang, Xiaohong R., Susan Hankinson, David G. Hurd, Andrew F. Olshan, Barbara L. Tse, Phalguni Gupta, Gloria Rowland, et al. “Hormonal Factors, Reproductive Factors, and Risk of Breast Cancer by Tumor Subtypes in African American and White Women.” Journal of Clinical Oncology 25, no. 22 (2007): 3275-84.
[12] Anderson, William F., Nadia Jatoi, and Montserrat García-Closas. “Tumor Biology and Breast Cancer: Not as Simple as We Thought.” Journal of Clinical Oncology 23, no. 21 (2005): 4609-12.
[13] Needham, Larry L., Dana B. Barr, and Antonia M. Calafat. “Characterizing Children’s Exposures: Beyond NHANES.” Neurotoxicology 26, no. 4 (2005): 547-53.
[14] Teitelbaum, Susan L., Rachel Britton, Antonia M. Calafat, Xiaoyun Ye, Manori J. Silva, Julie A. Reidy, Kathleen Galvez, et al. “Temporal Variability in Urinary Concentrations of Phthalate Metabolites, Phytoestrogens and Phenols among Minority Children in the United States.” Environmental Research 106, no. 2 (2008): 257-69.
[15] Rogan, Walter J., and Alice Chen. “Health Risks and Benefits of Bis(4-chlorophenyl)-1,1,1-trichloroethane (DDT).” The Lancet 366, no. 9487 (2005): 763-73.
[16] Cohn, Barbara A., Mary S. Wolff, Piera M. Cirillo, and Robert I. Sholtz. “DDT and Breast Cancer in Young Women: New Data on the Significance of Age at Exposure.” Environmental Health Perspectives 115, no. 10 (2007): 1406-14.
[17] Kortenkamp, Andreas. “Ten Years of Mixing Cocktails: A Review of Combination Effects of Endocrine-Disrupting Chemicals.” Environmental Health Perspectives 115, no. Suppl 1 (2007): 98-105.
[18] Rajapakse, Nishantha, Elisabete Silva, and Andreas Kortenkamp. “Combining Xenoestrogens at Levels below Individual No-Observed-Effect Concentrations Dramatically Enhances Steroid Hormone Action.” Environmental Health Perspectives 110, no. 9 (2002): 917-21.
[19] Rudel, Ruthann A., Janet M. Gray, Connie L. Engel, Teresa W. Rawsthorne, Robin E. Dodson, Janet M. Ackerman, Jeanne Rizzo, Janet L. Nudelman, and Julia Green Brody. “Food Packaging and Bisphenol A and Bis(2-Ethylhexyl) Phthalate Exposure: Findings from a Dietary Intervention.” Environmental Health Perspectives 119, no. 7 (2011): 914-20.
[20] Ginsberg, Gary, Denise Hattis, Babasaheb Sonawane, Alan Russ, Pertti Banati, Martin Kozlak, Susan Smolenski, and Russell Goble. “Evaluation of Child/Adult Pharmacokinetic Differences from a Database Derived from the Therapeutic Drug Literature.” Toxicological Sciences 66, no. 2 (2002): 185-200.
[21] Hines, Ronald N. “The Ontogeny of Drug Metabolism Enzymes and Implications for Adverse Drug Events.” Pharmacology & Therapeutics 118, no. 2 (2008): 250-67.
[22] Antoniou, Antonis C., Paul D. Pharoah, Philip Smith, and Douglas F. Easton. “The BOADICEA Model of Genetic Susceptibility to Breast and Ovarian Cancer.” British Journal of Cancer 91, no. 8 (2004): 1580-90.
[23] Ioannidis, John P. A. “Why Most Published Research Findings Are False.” PLoS Medicine 2, no. 8 (2005): e124.
[24] Button, Katherine S., John P. A. Ioannidis, Claire Mokrysz, Brian A. Nosek, Jonathan Flint, Emma S. J. Robinson, and Marcus R. Munafò. “Power Failure: Why Small Sample Size Undermines the Reliability of Neuroscience.” Nature Reviews Neuroscience 14, no. 5 (2013): 365-76.
[25] Lin, Miao, Henry C. Lucas Jr., and Galit Shmueli. “Research Commentary—Too Big to Fail: Large Samples and the p-Value Problem.” Information Systems Research 24, no. 4 (2013): 906-17.
[26] Greenland, Sander, Stephen J. Senn, Kenneth J. Rothman, John B. Carlin, Charles Poole, Steven N. Goodman, and Douglas G. Altman. “Statistical Tests, P Values, Confidence Intervals, and Power: A Guide to Misinterpretations.” European Journal of Epidemiology 31, no. 4 (2016): 337-50.
[27] Easterbrook, P. J., J. A. Berlin, Ramana Gopalan, and D. R. Matthews. “Publication Bias in Clinical Research.” The Lancet 337, no. 8746 (1991): 867-72.
[28] Dickersin, Kay. “The Existence of Publication Bias and Risk Factors for Its Occurrence.” JAMA 263, no. 10 (1990): 1385-89.
[29] Lexchin, Joel, Lisa A. Bero, Benjamin Djulbegovic, and Otavio Clark. “Pharmaceutical Industry Sponsorship and Research Outcome and Quality: Systematic Review.” BMJ 326, no. 7400 (2003): 1167-70.
[30] Bekelman, Justin E., Yan Li, and Cary P. Gross. “Scope and Impact of Financial Conflicts of Interest in Biomedical Research: A Systematic Review.” JAMA 289, no. 4 (2003): 454-65.
[31] Barnes, Donald E., and Lisa A. Bero. “Why Review Articles on the Health Effects of Passive Smoking Reach Different Conclusions.” JAMA 279, no. 19 (1998): 1566-70.
[32] Lesser, Lenard I., Cara B. Ebbeling, Merrill Goozner, David Wypij, and David S. Ludwig. “Relationship between Funding Source and Conclusion among Nutrition-Related Scientific Articles.” PLoS Medicine 4, no. 1 (2007): e5.
[33] Myers, John Peterson, Frederick S. vom Saal, Benson T. Akingbemi, Koji Arizono, Scott Belcher, Theo Colborn, Ibrahim Chahoud, et al. “Why Public Health Agencies Cannot Depend on Good Laboratory Practices as a Criterion for Selecting Data: The Case of Bisphenol A.” Environmental Health Perspectives 117, no. 3 (2009): 309-15.
[34] Turner, Erick H., Annette M. Matthews, Eftihia Linardatos, Robert A. Tell, and Robert Rosenthal. “Selective Publication of Antidepressant Trials and Its Influence on Apparent Efficacy.” New England Journal of Medicine 358, no. 3 (2008): 252-60.
[35] Higgins, Julian P. T., and Sally Green, eds. Cochrane Handbook for Systematic Reviews of Interventions. Version 5.1.0. The Cochrane Collaboration, 2011. http://www.cochrane-handbook.org.
[36] Egger, Matthias, George Davey Smith, Martin Schneider, and Christoph Minder. “Bias in Meta-Analysis Detected by a Simple, Graphical Test.” BMJ 315, no. 7109 (1997): 629-34.
[37] Rhomberg, Lorenz R., Julie E. Goodman, John C. Bailar III, Richard A. Becker, Kenneth S. Crump, Woodrow Setzer, Sonja Baldi, and Nigel J. Walker. “A Survey of Frameworks for Best Practices in Weight-of-Evidence Analyses.” Critical Reviews in Toxicology 43, no. 9 (2013): 753-84.
[38] Melnick, Ronald, Giuseppe Lucier, Mandy Wolfe, Rebecca Hall, George Stancel, Gary Prins, Maricel Gallo, et al. “Summary of the National Toxicology Program’s Report of the Endocrine Disruptors Low-Dose Peer Review.” Environmental Health Perspectives 110, no. 4 (2002): 427-31.
[39] Hill, Austin Bradford. “The Environment and Disease: Association or Causation?” Proceedings of the Royal Society of Medicine 58, no. 5 (1965): 295-300.
[40] Rothman, Kenneth J., and Sander Greenland. “Causation and Causal Inference in Epidemiology.” American Journal of Public Health 95, no. S1 (2005): S144-S150.
[41] Howick, Jeremy, Paul Glasziou, and Jeffrey K. Aronson. “The Evolution of Evidence Hierarchies: What Can Bradford Hill’s ‘Guidelines for Causation’ Contribute?” Journal of the Royal Society of Medicine 102, no. 5 (2009): 186-94.
[42] International Agency for Research on Cancer. IARC Monographs on the Evaluation of Carcinogenic Risks to Humans. Vol. 100. Lyon, France: IARC, 2012.
[43] National Toxicology Program. Report on Carcinogens. 15th ed. Research Triangle Park, NC: U.S. Department of Health and Human Services, 2021.
[44] Vandenberg, Laura N., Theo Colborn, Tyrone B. Hayes, Jerrold J. Heindel, David R. Jacobs Jr., Duk-Hee Lee, Toshi Shioda, et al. “Hormones and Endocrine-Disrupting Chemicals: Low-Dose Effects and Nonmonotonic Dose Responses.” Endocrine Reviews 33, no. 3 (2012): 378-455.
[45] Kriebel, David, Joel Tickner, Paul Epstein, John Lemons, Richard Levins, Edward L. Loechler, Margaret Quinn, Ruthann Rudel, Ted Schettler, and Michael Stoto. “The Precautionary Principle in Environmental Science.” Environmental Health Perspectives 109, no. 9 (2001): 871-76.
[46] Grandjean, Philippe, and Philip J. Landrigan. “Developmental Neurotoxicity of Industrial Chemicals.” The Lancet 368, no. 9553 (2006): 2167-78.
[47] European Environment Agency. Late Lessons from Early Warnings: Science, Precaution, Innovation. EEA Report No. 1/2013. Copenhagen: European Environment Agency, 2013.
[48] European Commission. “REACH: Registration, Evaluation, Authorisation and Restriction of Chemicals.” Regulation (EC) No 1907/2006. Official Journal of the European Union, 2006.
[49] Rudel, Ruthann A., Janet L. Ackerman, Jennifer L. Attfield, and Julia Green Brody. “New Exposure Biomarkers as Tools for Breast Cancer Epidemiology, Biomonitoring, and Prevention: A Systematic Approach Based on Animal Evidence.” Environmental Health Perspectives 122, no. 9 (2014): 881-95.
[50] Rappaport, Stephen M., and Martyn T. Smith. “Environment and Disease Risks.” Science 330, no. 6003 (2010): 460-61.
[51] Bertazzi, Pier Alberto, and Angela Cecilia Pesatori. “Dioxin Exposure and Human Health Effects: A Critical Review.” Archives of Environmental Health 46, no. 6 (1991): 359-66.