Chemical toxicity prediction for major classes of industrial chemicals: Is it possible to develop universal models covering cosmetics, drugs, and pesticides?
Graphical abstract
Distribution of cosmetics, drugs, and pesticides in the chemical space.
Introduction
Chemical toxicity assessment is a critical step in regulatory decision making on the release of drugs or industrial chemicals into production, which leads to human or environmental exposure (Parasuraman, 2011). There is also a variety of natural and synthetic substances to which humans and/or the environment are exposed that have never been evaluated in any toxicity testing protocol (Chuprina et al., 2010, Egeghy et al., 2012). Over the years, society has tolerated the use of animals in laboratory toxicity testing. However, in recent years, there has been increased pressure on scientists and regulatory agencies to replace potentially hazardous chemicals with safer alternatives (Collins, 2003, Schulte et al., 2013). In addition, regulatory agencies such as the FDA and EPA in the United States and their counterparts around the world have pushed strongly to avoid animal testing of every chemical, as such testing has become increasingly unsustainable in terms of both the cost and the time needed to conduct animal trials (Burden et al., 2015).
The development of alternative in vitro and in silico approaches has been encouraged and supported by both NIH and EPA through large-scale programs such as the ToxCast project (Dix et al., 2007) and the Tox21 consortium (Tice et al., 2013). Similar programs, such as the Endocrine Disruptors Prioritization List (http://ec.europa.eu/environment/chemicals/endocrine/index_en.htm) and the priority substances for water safety (European Union, 2013), have been funded by the European Union. Since the adoption of the Registration, Evaluation, Authorization, and Restriction of Chemicals (REACH) legislation in 2006 by the European Union (European Union, 2007, Nicolotti et al., 2014), the use of structural alerts and statistical QSAR models (often collectively referred to as (Q)SAR) has become a major computational approach to chemical safety assessment and regulatory decision support.
The majority of publicly available models for toxicity prediction have been built for drugs or drug candidates (Benfenati et al., 2009, Melnikov et al., 2016) or environmental chemicals (Naven and Louise-May, 2015). In contrast, computational toxicity models for another large group of industrial chemicals, namely cosmetic products, have been developed to a much lesser extent because animal testing has been the preferred approach. However, with recent regulations banning the use of animals for testing cosmetic products (European Commission, 2013), there has been a resurgence of interest in employing computational models for their toxicity assessment (Bois et al., 2016, Cronin et al., 2012).
Naturally, one can ask whether toxicity prediction models built for environmental chemicals or drug molecules could be employed for cosmetic products. The answer depends on the overlap of the chemical spaces occupied by cosmetics, drugs, and environmental chemicals and on the size of the applicability domain (AD) of the respective models. The AD is commonly defined as the threshold of similarity between a new chemical and the molecules in the training set used to develop the respective QSAR model (Netzeva et al., 2005, Tropsha, 2010, Tropsha and Golbraikh, 2007); only predictions for new molecules within the AD of a QSAR model, i.e., molecules relatively similar to the modeling set, are considered reliable. Importantly, the size of the AD is fully defined by the size and diversity of the modeling set and by the computational method used to develop the QSAR model. For instance, it is known that the chemical space of drugs has been changing over the past few decades (Deng et al., 2013), which challenges the ability of “old” models to evaluate new compounds. The applicability of current models to many new compounds has also been questioned because of the limited size and diversity of the data publicly available for model building (Kulkarni et al., 2016).
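The similarity-threshold idea behind the AD can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from this study: molecules are represented as sets of "on" fingerprint bits, similarity is measured with the Tanimoto coefficient, a query is considered inside the AD when its nearest training-set neighbor reaches a cutoff, and the cutoff of 0.5, the function names, and the toy fingerprints are all hypothetical. Published AD definitions often use distance-based criteria instead.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)


def in_domain(query: frozenset, training_set: list, threshold: float = 0.5) -> bool:
    """Return True if the query's most similar training compound
    reaches the (hypothetical) similarity threshold."""
    nearest = max(tanimoto(query, t) for t in training_set)
    return nearest >= threshold


# Toy fingerprints (invented bit indices, not real molecules).
training = [frozenset({1, 2, 3, 4}), frozenset({3, 4, 5, 6})]
similar_query = frozenset({1, 2, 3, 5})   # shares 3 of 5 bits with the first compound
distant_query = frozenset({7, 8, 9})      # shares no bits with any training compound

print(in_domain(similar_query, training))  # True
print(in_domain(distant_query, training))  # False
```

In this sketch, enlarging or diversifying the training set can only raise the nearest-neighbor similarity of a query, which is one way to see why the size of the AD depends on the size and diversity of the modeling set.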
The considerations above capture both the significant advantages and the challenges associated with using models developed for one group of industrial chemicals to evaluate the toxicity of another group. The obvious advantage is the significant saving in time and effort afforded by using previously developed models of multiple toxicity endpoints relevant to drugs and/or environmental chemicals (e.g., pesticides) to evaluate the toxicity of cosmetic products. However, since chemicals used in different areas of commerce, such as the drug, chemical, or cosmetic industries, are developed with very different applications in mind, there is no a priori reason to expect that their respective chemical spaces overlap. Taking the issue of the AD into account, investigations of the degree of such overlap and of the applicability of models developed for one group of chemicals to predict the toxicity of another group are potentially highly impactful for the respective industries, especially cosmetics. To the best of our knowledge, such investigations have not been conducted in the public domain with large groups of industrial chemicals.
Herein, we aimed to compare the chemical spaces occupied by cosmetics, drugs, and pesticides, and to analyze whether current computational models of different toxicity endpoints can be universally applied to all chemicals. To achieve these aims, we (i) compiled, curated, and integrated chemical structures of known cosmetics, drugs, and pesticides; (ii) analyzed the distribution of these compounds in chemical space and estimated the structural similarity between the datasets; (iii) performed cluster analysis followed by comparison of toxicity annotations for structurally similar compounds in the same clusters; (iv) predicted the toxicities of the investigated compounds with QSAR models for endpoints developed earlier by us; and (v) analyzed the coverage of these models separately for drugs, cosmetics, and pesticides. We observed that, with some exceptions, the majority of compounds in all three groups of industrial chemicals were found within the AD of QSAR models built previously for twenty different toxicity endpoints. These findings open the door to the development and employment of global toxicity models applicable to the majority of chemicals in commerce, while suggesting the need to develop local models that could capture AD outliers of the global models.
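Step (v), the per-class coverage of a model, can be read as the fraction of compounds in each group that fall inside the model's AD. The following is a minimal sketch of that bookkeeping only, not the authors' workflow: the function name, the record layout, and the toy data are hypothetical, and the inside/outside flags are assumed to come from whatever AD criterion the model uses.

```python
from collections import defaultdict


def coverage_by_class(records):
    """Compute the fraction of compounds inside the AD for each class.

    `records` is an iterable of (class_label, inside_ad) pairs, where
    `inside_ad` is a bool produced by some AD check (assumed, not shown).
    """
    totals = defaultdict(int)
    inside = defaultdict(int)
    for label, is_inside in records:
        totals[label] += 1
        inside[label] += int(is_inside)
    return {label: inside[label] / totals[label] for label in totals}


# Invented toy data: (use class, inside the AD of one model?)
toy_records = [
    ("cosmetics", True), ("cosmetics", False),
    ("drugs", True), ("drugs", True),
    ("pesticides", True), ("pesticides", True),
    ("pesticides", True), ("pesticides", False),
]

coverage = coverage_by_class(toy_records)
print(coverage)
```

A compound that is used in more than one way would simply appear once per use class in `records`, so it contributes to the coverage of each class it belongs to.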
Cosmetic ingredients (Dataset A)
The cosmetic ingredients were retrieved from CosIng, the European Commission database for information on cosmetic substances and ingredients (https://ec.europa.eu/growth/sectors/cosmetics/cosing_en). This dataset included 5166 chemical records with a defined chemical structure. After curation (vide infra), 3930 unique chemical substances were kept for this study.
Drugs (Dataset B)
We retrieved 7000 chemical records from the 2014 Leadscope Marketed Drugs Database (//www.leadscope.com/marketed_drugs_database/
Analysis of chemical space of cosmetics, drugs, and pesticides
A plot of calculated logP (ClogP) vs. molecular weight (MW) is shown in Fig. 2. There is substantial overlap among all the industrial classes of compounds, as well as with compounds from the datasets used to develop historical QSAR models. At higher MW, drugs and cosmetics separate from pesticides. Drugs keep the same range of ClogP even at higher MW, while cosmetics tend to have higher ClogP, i.e., they include compounds with low solubility. In Fig. 2B, the difference between drugs and
Compounds simultaneously used as cosmetics, drugs, or pesticides
We note that compounds labeled as cosmetics, drugs, and pesticides may not be the active ingredients, but rather excipients used in the formulations of final products, e.g., mannitol or stearic acid. This explains the large overlap between these three categories. In addition, we fully realize that treating these labels as “categories” is an oversimplification, since these terms reflect not chemical classes but final use. For instance, several compounds such as methane,
Conclusions
The vast majority of current QSAR models of various toxicity endpoints have been developed to predict the toxicity of drugs, drug-like compounds, and, less frequently, pesticides or other environmental chemicals. The ability of these models to predict toxicity for another large class of industrial chemicals, cosmetics, had not been examined previously. The analysis of chemical space revealed a very large overlap between cosmetics, drugs, and pesticides. Our results also show that drugs and cosmetics are more
Conflict of interests
The authors declare no actual or potential conflict of interests.
Abbreviations
AD, applicability domain; AhR, aryl hydrocarbon receptor; AR, androgen receptor; ARE, nuclear factor (erythroid-derived 2)-like 2/antioxidant responsive element; AR_LBD, androgen receptor—ligand binding domain; ATAD5, ATPase family AAA Domain containing 5; ClogP, calculated logP; ER, estrogen receptor alpha—full; ER_LBD, estrogen receptor alpha—ligand binding domain; HSE, heat shock factor response element; MMP, mitochondrial membrane potential; MW, molecular weight; PASS, prediction of
Acknowledgements
This study was supported in part by NIH (grant 1U01CA207160). VA thanks FAPEG (grant 201310267001095), CNPq (grant 400760/2014-2), and CAPES. A.Z. acknowledges the Intramural Research Program, National Center for Advancing Translational Sciences, National Institutes of Health (1ZIATR000058-02). The authors express sincere gratitude to Drs. Glenn Myatt and Nora Aptula for providing datasets used in this study. The authors are also grateful to Drs. Vladimir Poroikov and Dmitri Filimonov for providing
References (57)
- et al. Predicting chemically-induced skin reactions. Part I: QSAR models of skin sensitization and their application to identify potentially hazardous compounds. Toxicol. Appl. Pharmacol. (2015)
- et al. Predicting chemically-induced skin reactions. Part II: QSAR models of skin permeability and the relationships between skin permeability and skin sensitization. Toxicol. Appl. Pharmacol. (2015)
- et al. The exposure data landscape for manufactured chemicals. Sci. Total Environ. (2012)
- et al. Induction of glutathione S-transferase by phenobarbital and pesticides in various house fly strains and its effect on toxicity. Pestic. Biochem. Physiol. (1982)
- et al. REACH and in silico methods: an attractive opportunity for medicinal chemists. Drug Discov. Today (2014)
- et al. QSAR models of human data can enrich or replace LLNA testing for human skin sensitization. Green Chem. (2016)
- et al. Predictive models for carcinogenicity and mutagenicity: frameworks, state-of-the-art, and perspectives. J. Environ. Sci. Health. C. Environ. Carcinog. Ecotoxicol. Rev. (2009)
- et al. Multiscale modelling approaches for assessing cosmetic ingredients safety. Toxicology (2016)
- et al. Pred-hERG: a novel web-accessible computational tool for predicting cardiac toxicity. Mol. Inf. (2015)
- et al. Tuning HERG out: antitarget QSAR models for drug development. Curr. Top. Med. Chem. (2014)