Assessment of chemicals

Introduction to (Quantitative) Structure Activity Relationships




Structure-Activity Relationship (SAR) is an approach designed to find relationships between the chemical structure (or structure-related properties) and the biological activity (or target property) of the compounds studied. As such, it is the concept of linking chemical structure to a chemical property (e.g., water solubility) or a biological activity, including toxicity (e.g., fish acute mortality). Qualitative SARs and quantitative SARs are collectively referred to as (Q)SARs. Qualitative relationships are derived from non-continuous data (e.g., yes or no data), while quantitative relationships are derived from continuous data (e.g., toxic potency data). The approach is not new: in 1863, A.F.A. Cros noted in “Action de l’alcool amylique sur l’organisme” the relationship between the toxicity of primary aliphatic alcohols and their water solubility.


The central axiom of SAR is that the activity of molecules is reflected in their structure. Hence, similar molecules have similar activities. The SAR approach therefore assumes that the structure of a molecule (e.g., its geometric and electronic properties) contains the features responsible for its physical, chemical, and biological properties. It relies on the ability to represent the chemical by one or more descriptors, of which the two-dimensional structure is one. The underlying problem is how to define differences at the molecular level, since each kind of activity might depend on different molecular similarities.


Biological activity (e.g., toxicity) of substances is governed by their properties, which in turn are determined by their chemical structure. The objectives of SAR are two-fold. First, to determine as accurately as possible the limits of variation in the structure of a chemical that are consistent with the production of a specific effect (e.g., whether a chemical can elicit a specific toxic endpoint). Second, to define the ways in which alterations in structure, and thereby in the overall properties of a compound, influence endpoint potency.


(Q)SARs are also models, or mathematical relationships (often statistical correlations), that relate a structure-related property to the presence, absence, or potency of another property or activity of interest. The most basic mathematical form of a (Q)SAR is:


Activity = f (physicochemical or structural properties)


The development of a (Q)SAR model requires three components:

1) A data set that provides activity (usually measured experimentally) for a group of chemicals (i.e., the dependent variable).  This group of chemicals is typically defined by some selection criteria.

2) A set of structural criteria or structure-related property data for the same group of chemicals (i.e., the independent variables).

3) A means of relating these two data arrays (usually a statistical analysis method). Methods for relating structure to activity range from the simple, such as linear regression, through more complex approaches, such as partial least squares analysis, to the most complex, such as machine learning techniques like neural networks.
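The three components above can be sketched with the simplest of the methods named, linear regression. The descriptor and activity values below are hypothetical and invented purely for illustration, and `fit_line` is an illustrative helper, not part of any (Q)SAR software package.

```python
# Component 1 (dependent variable): measured activity for a group of chemicals.
# Component 2 (independent variable): a structure-related descriptor.
# All numbers are hypothetical and serve only to illustrate the workflow.
descriptor = [1.0, 2.0, 3.0, 5.0]
activity = [10.2, 14.8, 20.1, 29.9]


def fit_line(x, y):
    """Component 3: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(descriptor, activity)
print(f"Activity = {slope:.2f} * descriptor + {intercept:.2f}")
```

A real model would also need validation against chemicals not used in the fit; this sketch covers only the fitting step.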


Uses of (Q)SAR to fill data gaps
(Q)SARs may be used to predict properties and activities of untested compounds that belong to the same group of chemicals.



Table: Compounds A, B, C, D and E (columns) described by Structure X, Property Y, Activity Z and Activity T (rows).







The data in the table above demonstrate how (Q)SAR approaches are used. An examination of the data for Structure X reveals that chemicals A, B, D, and E form a group of similar chemicals, as Structure X is common to all four compounds (but not to chemical C).


For this group of chemicals, a qualitative relationship is observed between Structure X and Activity Z. Using this relationship, the measured values of Activity Z for compounds A, B, and D can be used to fill the data gap of Activity Z for the untested compound E. This is done by reading across from compounds A, B, and D to compound E (predicting Activity Z to be positive for compound E).
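The read-across step can be sketched as a small lookup. The presence of Structure X and the positive results for A, B, and D follow the text; the function name `read_across` and the rule "predict only when all measured analogues agree" are assumptions made for this illustration, not a prescribed procedure.

```python
# Presence of Structure X per compound, as described in the text.
has_structure_x = {"A": True, "B": True, "C": False, "D": True, "E": True}

# Measured Activity Z (True = positive). None marks the data gap for E.
# Compound C is left out because it falls outside the group.
activity_z = {"A": True, "B": True, "D": True, "E": None}


def read_across(target, has_structure, measured):
    """Predict a qualitative activity for `target` from its group members.

    Returns the shared result of the measured analogues, or None when the
    target is outside the group or the analogues disagree (assumed rule).
    """
    if not has_structure.get(target, False):
        return None  # target does not share the defining structure
    analogues = [
        c for c, present in has_structure.items()
        if present and c != target and measured.get(c) is not None
    ]
    values = {measured[c] for c in analogues}
    return values.pop() if len(values) == 1 else None


print(read_across("E", has_structure_x, activity_z))  # True (positive)
```

Returning None for compound C reflects the grouping logic: without Structure X, the analogues give no basis for a prediction.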


For this same group of similar chemicals, the relationship between Property Y and Activity T is quantitative and is modeled as [Activity T = 5.0 (Property Y) + 5.0]. Using this (Q)SAR model, the potency of Activity T for compound D is predicted to be 25.
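The quantitative prediction is a direct substitution into the model. In the sketch below, the Property Y value of 4.0 for compound D is an assumption back-calculated from the stated prediction of 25 (since 5.0 × 4.0 + 5.0 = 25); the function name is illustrative.

```python
def predict_activity_t(property_y):
    """Apply the model stated in the text: Activity T = 5.0 * Property Y + 5.0."""
    return 5.0 * property_y + 5.0


# Assumed value: implied by the stated prediction of 25 for compound D.
property_y_compound_d = 4.0

print(predict_activity_t(property_y_compound_d))  # 25.0
```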


Further readings on (Q)SAR


QSAR models:


• Selassie CD. 2003. History of Quantitative Structure-Activity Relationships. In: Abraham DJ (ed.) Burger’s Medicinal Chemistry and Drug Discovery, Sixth Edition, Volume 1: Drug Discovery. John Wiley & Sons, Inc., pp. 1–48.


• Cronin MTD, Walker JD, Jaworska JS, Comber MHI, Watts CD, and Worth AP. 2003. Use of QSARs in international decision-making frameworks to predict ecological effects and environmental fate of chemical substances. Environ. Health Perspect. 111:1376–1390.


• Cronin MTD, Jaworska JS, Walker JD, Comber MHI, Watts CD, and Worth AP. 2003. Use of QSARs in international decision-making frameworks to predict health effects of chemical substances. Environ. Health Perspect. 111:1391–1401.


• OECD Guidance Document on the Validation of (Q)SAR Models

• Web site of 


Grouping of chemicals:


• OECD Guidance on Grouping of Chemicals

• Web site of the former European Chemicals Bureau

• Environment, health and safety brief on (Q)SARs

