Quantitative structure–activity relationships predict toxicity of substances

Published: September 25, 2017

Stefan Pudenz, Expert Consultant at Envigo, explains what makes a good (Q)SAR model, and how these tools can be used, as an alternative to animal testing, to predict toxicity.

Structure–activity relationship (SAR) and quantitative structure–activity relationship (QSAR) models, collectively referred to as (Q)SARs, are theoretical models that predict a chemical’s biological activity or physicochemical properties.¹ They are recognised worldwide as alternatives to animal toxicity testing because they help predict the potential health and safety hazards of chemicals, cosmetics and pharmaceuticals. The increased use of in silico modelling techniques, such as (Q)SARs, has been driven both by worldwide regulation and by the desire to reduce, refine and replace the use of animals in toxicity testing.

What makes a good (Q)SAR model?

First rule – it’s not about a single model

Different models may have different accuracies depending on the structural features of the substance in question and those in the training set used to build the model. In most cases, predictions require the use of several models. The number used is often dictated by budget, so it is important to have a clear understanding of what can be achieved with individual systems and how best to combine them efficiently to achieve a robust prediction in a cost-effective manner.

All (Q)SAR models are validated using well-known criteria

The OECD (Organisation for Economic Co-operation and Development) and the JRC (Joint Research Centre) of the European Commission have published guidance on the validation of (Q)SAR models, establishing five principles for regulatory decision-making on the reliability of a prediction:²

- A defined endpoint
- An unambiguous algorithm
- A defined domain of applicability
- Appropriate measures of goodness-of-fit, robustness and predictivity
- A mechanistic interpretation, if possible.
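When several models are being compared, the five principles can be tracked as a simple checklist. The sketch below is illustrative only: the class and field names are hypothetical, and treating the mechanistic interpretation as optional is an assumption based on the "if possible" wording.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class ValidationChecklist:
    """Hypothetical record of the five OECD principles for one model."""
    defined_endpoint: bool
    unambiguous_algorithm: bool
    defined_applicability_domain: bool
    performance_measures: bool        # goodness-of-fit, robustness, predictivity
    mechanistic_interpretation: bool  # assumed optional ("if possible")

    def unmet_principles(self) -> List[str]:
        """Return the names of required principles that are not satisfied."""
        optional = {"mechanistic_interpretation"}
        return [
            f.name
            for f in fields(self)
            if f.name not in optional and not getattr(self, f.name)
        ]


# Example: a model with no documented applicability domain.
checklist = ValidationChecklist(True, True, False, True, False)
print(checklist.unmet_principles())  # ['defined_applicability_domain']
```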

Approaches to (Q)SAR modelling at Envigo

At Envigo, specialists from different disciplines work together, which means we can leverage the expertise of chemists, chemometrics experts, toxicologists and biologists to plan a comprehensive and creative approach to addressing specific questions about the predicted toxicity of a substance. We have also invested in dedicated computing hardware, and we use a wide range of freely and commercially available (Q)SAR models.

The starting point for any model is the structure of the substance and the endpoints to be identified. Typically, the selection of endpoints is determined by the authority/regulation. For example, in the specific case of pesticides, all impurities with a content of 0.1% w/w in the technical grade active ingredient need to be investigated.
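As a rough illustration of the pesticide example, the screening step can be written as a filter over an impurity profile. This is a minimal sketch: the function name and the sample data are invented, and reading the 0.1% w/w figure as an inclusive lower bound ("at or above") is an assumption, not something the regulation text is quoted as saying here.

```python
from typing import Dict, List

# Threshold quoted in the text, in % w/w of the technical grade active ingredient.
THRESHOLD_PERCENT_WW = 0.1


def impurities_to_investigate(contents: Dict[str, float]) -> List[str]:
    """Return impurity names whose content is at or above the threshold.

    Treating the threshold as inclusive ("at or above") is an assumption.
    """
    return sorted(
        name
        for name, percent in contents.items()
        if percent >= THRESHOLD_PERCENT_WW
    )


# Hypothetical impurity profile (names and values are invented).
profile = {"impurity_A": 0.25, "impurity_B": 0.04, "impurity_C": 0.1}
print(impurities_to_investigate(profile))  # ['impurity_A', 'impurity_C']
```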

Typical toxicity testing (animal testing) required to register an active substance includes developmental/reproductive toxicity, carcinogenicity, genotoxicity/mutagenicity, acute toxicity (oral, dermal, inhalation), skin sensitization, skin irritation, eye irritation and several ecotoxicity endpoints. For an impurity within that active substance, some, but not all, of the same endpoints will need to be predicted.

Utilize a battery of modelling systems to improve predictive power

Various modelling systems are available, which differ in terms of numbers of endpoints covered, cost (some are free), their modelling approach (SAR versus QSAR) and the availability of reliability indices. Choice is determined by the question being asked, the structure of the chemical, and the endpoint; for example, we have one model for bee toxicity but over 10 for mutagenicity.

For each in silico assessment, we employ a range of (Q)SAR models to predict the requested endpoints for the query substance. The results are then compared to assess the predictions. If a weight of evidence can be established, the results are reported; if not, additional procedures and models are applied. For example, a model’s training set can be extended with experimental data from structurally similar compounds, if available.
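One way to picture the comparison step is as a reliability-weighted vote across models. The sketch below is not Envigo's procedure: the numeric weights, the `margin` threshold and all names are assumptions made for illustration, and in practice the assessment relies on expert judgement rather than a fixed formula. The example data mirror the pattern of predictions reported in Box 1.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical reliability weights; the article gives no numeric values.
RELIABILITY_WEIGHT = {"low": 1.0, "moderate": 2.0, "high": 3.0}


@dataclass(frozen=True)
class Prediction:
    model: str        # name of the (Q)SAR system
    positive: bool    # True if the model predicts the toxic effect
    reliability: str  # "low", "moderate" or "high"


def weight_of_evidence(predictions: List[Prediction], margin: float = 0.5) -> str:
    """Return 'positive', 'negative' or 'inconclusive'.

    Signed, reliability-weighted votes are summed and normalised to [-1, 1].
    A call is made only when the score reaches `margin` in either direction
    (an illustrative threshold, not a regulatory one).
    """
    total = sum(RELIABILITY_WEIGHT[p.reliability] for p in predictions)
    if total == 0:
        return "inconclusive"
    signed = sum(
        RELIABILITY_WEIGHT[p.reliability] * (1 if p.positive else -1)
        for p in predictions
    )
    score = signed / total
    if score >= margin:
        return "positive"
    if score <= -margin:
        return "negative"
    return "inconclusive"


# Pattern from Box 1: three low-reliability positives, one moderate negative.
box1 = [
    Prediction("TOPKAT", positive=False, reliability="moderate"),
    Prediction("DEREK", positive=True, reliability="low"),
    Prediction("VEGA", positive=True, reliability="low"),
    Prediction("TEST", positive=True, reliability="low"),
]
print(weight_of_evidence(box1))  # inconclusive
```

With these assumed weights the score is (3 - 2) / 5 = 0.2, which falls short of the margin, so no call is made and further models or data would be needed.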

Box 1 – Case study: prediction of Ames mutagenicity for an impurity found in sulfentrazone

Envigo was asked to predict the mutagenicity of an impurity found in sulfentrazone, a reportedly non-mutagenic herbicide.

We used four different QSAR models (TOPKAT, DEREK, VEGA and TEST), which gave three positive predictions and one negative prediction; however, the reliability of the positive predictions was assessed as ‘low’. The TOPKAT system yielded the negative prediction and was associated with moderate reliability, so we applied the TOPKAT model extension to improve prediction reliability. Since the impurity is structurally similar to sulfentrazone, the training set was extended to include the structure and experimental results for this substance and for other compounds belonging to the same, and similar, chemical classes. The extension increased confidence in the reliability of the prediction and strengthened the weight-of-evidence argument.

However, the impurity contained an alkyl chloride moiety that returned an alert in DEREK and VEGA, but not in TOPKAT. Although the low confidence of the DEREK and VEGA results could lead one to argue that the likelihood of a mutagenic effect was low, such an effect could not be excluded. We therefore also employed the OECD Toolbox, but could not create a coherent category of compounds for read-across.

Despite predictions from five different (Q)SAR systems, results were inconclusive, and the mutagenic potential of sulfentrazone, due to the presence of the impurity, could not be excluded. Formal laboratory experiments may therefore be advisable to increase the breadth of results used in the weight-of-evidence discussion.

Box 1 above gives an example of this multi-model approach, including the extension of a model’s training set.


To support formal regulatory decision-making, it is important to include detailed results and the reliability assessments on which predictions and conclusions are based. The OECD have proposed two templates. The (Q)SAR Prediction Reporting Format (QPRF) describes how an estimate was obtained and includes the model prediction(s), the endpoint, identification of the substance modelled, the relationship between the modelled substance and the defined applicability domain, and the identities of close analogues. The (Q)SAR Model Reporting Format (QMRF) is a harmonized template that contains information on the model algorithm, descriptors, the structures and data used for internal and external validation, and the results of the validation.
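The items a QPRF is described as covering can be held in a small record so that gaps are visible before a report is written. This is only a sketch: the class, field names and sample values are hypothetical and do not reproduce the section headings of the official template.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PredictionReport:
    """Hypothetical container for the items a QPRF is described as covering."""
    substance_id: str                    # identification of the substance modelled
    endpoint: str                        # endpoint being predicted
    model_predictions: Dict[str, str] = field(default_factory=dict)
    applicability_domain_note: str = ""  # relationship to the model's domain
    close_analogues: List[str] = field(default_factory=list)

    def missing_items(self) -> List[str]:
        """Return the names of items that are still empty."""
        items = {
            "substance_id": self.substance_id,
            "endpoint": self.endpoint,
            "model_predictions": self.model_predictions,
            "applicability_domain_note": self.applicability_domain_note,
            "close_analogues": self.close_analogues,
        }
        return [name for name, value in items.items() if not value]


# Invented example: a draft report with only one model result recorded.
draft = PredictionReport(
    substance_id="impurity_A",
    endpoint="Ames mutagenicity",
    model_predictions={"TOPKAT": "negative"},
)
print(draft.missing_items())  # ['applicability_domain_note', 'close_analogues']
```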

QMRFs are useful sources of information on the models themselves. The JRC have created a QSAR Model Database to which developers can submit QMRFs. Currently, this database is poorly populated, with few QMRFs available even for platforms such as VEGA, which has been developed specifically for regulatory use and public availability. We encourage all model developers to upload their QMRFs to help advance innovation in the development of (Q)SARs.

Looking forward

Although we are some way from a situation where submissions will be accepted based only on (Q)SAR-derived predictions for a specific endpoint, the importance of (Q)SARs as a tool is set to increase worldwide. As we build models, share QMRFs and expand training sets with better data, so our (Q)SAR models and predictions will improve.

What will impact our ability to deliver really good (Q)SAR predictions is progress in mechanistic modelling, that is, in our understanding of the mechanisms that lead to toxicity. This progress is being driven by computer simulation of biological systems, which gives us greater insight into complex processes and pathways.

Together, such improvements will impact regulation, driving innovation and changing the role of (Q)SARs. Overall, the future for (Q)SAR modelling is predicted to be a bright one.

Envigo has published a white paper, based on a recent journal publication,³ about (Q)SAR for toxicity screening. For more information, visit www.envigo.com.

1. European Chemicals Agency. Practical guide – How to use and report (Q)SARs. Version 3.1. Helsinki: ECHA, July 2016.
2. OECD. Guidance document on the validation of (quantitative) structure-activity relationship [(Q)SAR] models. Paris: OECD Publishing, 2007 (www.oecd-ilibrary.org).
3. S Pudenz, A Ruzgyte Frère. Toxicological & Environmental Chemistry 2017; [Epub ahead of print; dx.doi.org/10.1080/02772248.2016.1265649].



Copyright © 2017 Mack Brooks Speciality Publishing.