CADD Lecture

COMPUTER-AIDED
DRUG DESIGN
Thursday, April 19, 2012

Kazi Shefaet Rahman
k.s.rahman@gatech.edu

Outline

• The pharmaceutical pipeline

• Structure-based drug design: docking

• Ligand-based drug design: pharmacophore modeling
and QSAR

What is a drug?

A substance that, when absorbed,
alters normal bodily function

In pharmacology: FDA-approved for the diagnosis,
treatment, or prevention of disease.

Major Drug Classes

• Small molecule: Aspirin, Lovastatin, Amprenavir

• Protein: Insulin, Trastuzumab, Erythropoietin

• Vaccine

Where do drugs come from?
• Synthetic: 40%
• Natural derivatives: 23%
• Biologics: 14%
• Synthetic natural mimics: 14%
• Natural: 5%
• Vaccines: 4%

Newman & Cragg. J Nat Prod 70, 461–477 (2007)

Natural Sources

• Aspirin is derived from willow bark
• Taxol was discovered in the Pacific yew tree
• Morphine was purified from opium poppies

Success Rate in Pharma
10,000 → 1,000 → 100 → 10 → 1 marketable drug

Compound Libraries

• Commercial, government, or academic
• 10,000 – 10,000,000 compounds

Target Identification

Farooque & Lee. Annu Rev Physiol 71, 465–487 (2009)

Assay Development

Stockwell. Nature 432, 846-854 (2004)

Ligand Design Pathway

1. Target Discovery: identify, validate, develop assay
2. Lead Discovery: compound libraries, high-throughput screening
3. Lead Optimization: in vitro, in vivo

Computer-Aided Drug Design

• Enrich existing compound
libraries

• Reduce amount of chemical
waste

• Faster progress

• Lower costs

Computers in the Ligand Design Pathway

1. Target Discovery: identify, validate, develop assay
2. Lead Discovery: compound libraries, high-throughput screening
3. Lead Optimization: in vitro, in vivo

Computational methods along the pathway: Bioinformatics, Computer-Aided Drug Design

Structure-based and Ligand-based

Receptor structure?

• Known: Structure-Based Design (e.g. docking)
• Unknown: Ligand-Based Design (e.g. pharmacophore modeling & QSAR)

Video: Molecular Docking using Glide

Protein-Ligand Docking

• For a receptor-ligand complex, we want to predict:
1. Preferred orientation (pose)
2. Binding affinity (score)

• A docking program has at least two functional components
1. Search algorithm
2. Scoring function

• Docking can be used for virtual screening, lead
optimization, or de novo design of ligands

Molecular Representations in Docking
1. Atomic
• Every atom represented
• Usually used with a molecular mechanics scoring function
• Computationally complex

2. Surface
• Molecules represented by solvent excluded surface
• Align points by minimizing surface angle
• Commonly used in protein-protein docking

3. Grid
• Receptor’s energetic contributions stored in grid points
• van der Waals, electrostatic, H-bonding terms
• May be combined with atomistic representation at binding
site surface

Search Algorithms: Ligand Flexibility
1. Systematic
• Cycle through values for each degree of freedom
• Quickly leads to combinatorial explosion!
• Often implemented in anchor-and-grow algorithms
• Examples: DOCK, FlexX, Glide, Hammerhead, FLOG

2. Stochastic
• Make random changes, and evaluate using Monte Carlo or genetic algorithms
• Tabu search algorithms minimize repetition of dead-ends
• Examples: AutoDock, MOE-Dock, GOLD, PRO_LEADS

3. Simulation
• Molecular dynamics
• Energy minimization often used with other search techniques
• Examples: DOCK, Glide, MOE-Dock, AutoDock, Hammerhead
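
The combinatorial explosion in systematic search can be illustrated with a small sketch. It assumes each rotatable bond is sampled on a regular torsion grid; the function names and the 30° step are illustrative choices, not taken from any particular docking program.

```python
from itertools import product


def count_conformations(n_rotatable_bonds, step_degrees):
    """Number of torsion-angle combinations on a regular grid."""
    values_per_bond = 360 // step_degrees
    return values_per_bond ** n_rotatable_bonds


def enumerate_torsions(n_rotatable_bonds, step_degrees):
    """Lazily yield every combination of torsion angles (degrees)."""
    angles = range(0, 360, step_degrees)
    return product(angles, repeat=n_rotatable_bonds)


# With a 30 degree step there are 12 values per bond, so the count
# grows as 12**n with the number of rotatable bonds n.
for n in (2, 5, 10):
    print(n, count_conformations(n, 30))
```

Going from 2 to 10 rotatable bonds multiplies the number of grid conformations by 12 to the 8th power, which is why anchor-and-grow schemes build the ligand incrementally instead of enumerating everything.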

Alchemical Free Energy Calculations

Thermodynamic cycle:

L(aq) + P(aq) → (PL)(aq): ΔG_bind
L(aq) + P(aq) → P(aq) + L(g): ΔG_I
(PL)(aq) → P(aq) + L(g): ΔG_II

ΔG_bind = ΔG_I – ΔG_II

Scoring Functions
1. Force-field-based
• Quantify sum of receptor-ligand interaction energy and internal ligand energy
• Ligand-receptor potential contains van der Waals and Coulomb electrostatic terms (and H-bonding, in some cases)
• Limited by lack of solvation and entropic terms
• Examples: D-Score, G-Score, GOLD, AutoDock, DOCK

2. Empirical
• Binding energies as sums of uncorrelated terms, similar to but simpler than force-field terms
• Parameterized to fit regression analysis of experimental data
• Often contain terms to approximate (de)solvation and entropic penalties
• Examples: LUDI, F-Score, ChemScore, SCORE, Fresno, X-Score

3. Knowledge-based
• Use potentials of mean force derived from libraries of protein-ligand complexes
• Computationally simple
• Examples: PMF, DrugScore, SMoG
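
As a rough illustration of the force-field-based class, the sketch below sums a Lennard-Jones 12-6 term and a Coulomb term over all ligand-receptor atom pairs. It is a generic form, not the scoring function of any program listed here; the single `epsilon` and `sigma` values, the atom tuples, and the function names are assumptions made for the example.

```python
import math

# Converts q1*q2/r (charges in e, r in Angstrom) to kcal/mol.
COULOMB_CONSTANT = 332.06


def pair_energy(r, epsilon, sigma, q1, q2, dielectric=1.0):
    """Lennard-Jones 12-6 plus Coulomb energy for one atom pair (kcal/mol)."""
    sr6 = (sigma / r) ** 6
    vdw = 4.0 * epsilon * (sr6 * sr6 - sr6)
    elec = COULOMB_CONSTANT * q1 * q2 / (dielectric * r)
    return vdw + elec


def interaction_score(ligand, receptor, epsilon=0.1, sigma=3.4):
    """Sum pair energies over all ligand-receptor atom pairs.

    Each atom is (x, y, z, partial_charge). One illustrative epsilon/sigma
    pair is used for every atom type, which a real force field would not do.
    """
    total = 0.0
    for lx, ly, lz, lq in ligand:
        for rx, ry, rz, rq in receptor:
            r = math.dist((lx, ly, lz), (rx, ry, rz))
            total += pair_energy(r, epsilon, sigma, lq, rq)
    return total


# Hypothetical one-atom "ligand" and "receptor" with opposite charges.
score = interaction_score([(0.0, 0.0, 0.0, -0.5)], [(4.0, 0.0, 0.0, 0.5)])
print(score)
```

Note that nothing in this sum accounts for solvation or entropy, which is the limitation named in the list above.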

Force-field-based Scoring Function
AutoDock v4.0

Kitchen et al. Nat Rev Drug Discov 3, 935–949 (2004)

Empirical Scoring Functions
• LUDI
Böhm. J Comput Aided Molec Des 8, 243-256 (1994)

• ChemScore
Eldridge et al. J Comput Aided Molec Des 11, 425-444 (1997)

Knowledge-Based Scoring Functions

Example: ligand carboxyl O to protein histidine N

Procedure:
1. Find all PDB structures with ligand carboxyl O
2. Compute all distances to protein histidine N’s
3. Plot histogram of all O-N distances: p(r_O-N)
4. Calculate E(r) using inverse Boltzmann

Boltzmann: p(r) ~ exp[ -E(r)/(RT) ]
Inverse Boltzmann: E(r) = -RT ln[ p(r) ]
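
The procedure above can be sketched in a few lines. This follows the simplified form E(r) = -RT ln[p(r)] shown here and omits the reference-state normalization that published potentials typically include; the bin width, cutoff, and distance values are made-up illustrations.

```python
import math

RT = 0.592  # kcal/mol at roughly 298 K


def inverse_boltzmann(distances, bin_width=0.5, r_max=6.0):
    """Histogram distances and return {bin_center: E(r)} with E = -RT ln p."""
    n_bins = int(r_max / bin_width)
    counts = [0] * n_bins
    for d in distances:
        idx = int(d / bin_width)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    total = sum(counts)
    energies = {}
    for i, c in enumerate(counts):
        if c == 0:
            continue  # ln(0) is undefined, so empty bins are left out
        p = c / total
        energies[(i + 0.5) * bin_width] = -RT * math.log(p)
    return energies


# Hypothetical O-N distances (Angstrom) collected from a structure library.
observed = [2.8] * 6 + [3.2] * 3 + [5.1]
for r, e in sorted(inverse_boltzmann(observed).items()):
    print(f"r = {r:.2f} A  E = {e:.3f} kcal/mol")
```

Frequently observed distances get the lowest energies, so the most populated bin becomes the minimum of the derived potential.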

Muegge & Martin. J Med Chem 42, 791-804 (1999)

Scoring: General Caveats
• Ligand flexibility and size
• For rigid molecules, correct pose predicted 90-100% of the time
• Drops to 45-80% for molecules with more rotatable bonds and higher MW

• Binding strength
• Most strong binders (Kd < 100 nM) correctly predicted
• Difficult to predict weaker-binding ligands (Kd ~ 1 μM)

• Binding site
• Hydrophobic binding sites yield better results than hydrophilic ones
• Placement of water molecules plays an important part
• Active sites that require a conformational change (induced fit) fare poorly in rigid protein models
• Ideally want to start with holo structure

Lead Optimization and de novo Design

• Lead Optimization
• Evaluate small changes in structure to distinguish between a μM and a
nM compound
• Need highly accurate docking and scoring functions
• Typically implemented as an “anchored search” to reduce number of
analogues
• Can be used to prioritize sites for experimental
modification

• De novo Design:
• Multiple-copy simultaneous search (MCSS):
small fragments simultaneously docked, and
preferred fragments combined
• Difficult to predict synthetic availability of
designed molecule

Docking and ADME Evaluation

• Absorption, distribution, metabolism and excretion
• Concentrate on drug interaction with models of
cytochrome P450

Lipinski’s Rule of Five

A rule of thumb to evaluate whether a drug candidate is likely to be orally active

1. ≤ 5 hydrogen-bond donors
2. ≤ 10 hydrogen-bond acceptors
3. MW < 500 Da
4. log P ≤ 5 (P = octanol-water partition coefficient)
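
The four criteria translate directly into a filter. The sketch below assumes the descriptors have already been computed elsewhere; the `Descriptors` class, the function name, and the example values (approximate for aspirin, invented for the second compound) are illustrative.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    h_bond_donors: int
    h_bond_acceptors: int
    molecular_weight: float  # Da
    log_p: float             # octanol-water partition coefficient (log10)


def lipinski_violations(d):
    """Return the list of Rule of Five criteria that a compound violates."""
    violations = []
    if d.h_bond_donors > 5:
        violations.append("more than 5 hydrogen-bond donors")
    if d.h_bond_acceptors > 10:
        violations.append("more than 10 hydrogen-bond acceptors")
    if d.molecular_weight >= 500:
        violations.append("MW not below 500 Da")
    if d.log_p > 5:
        violations.append("log P above 5")
    return violations


aspirin = Descriptors(1, 4, 180.16, 1.2)         # approximate values
large_lipophile = Descriptors(7, 12, 650.0, 6.1)  # hypothetical compound

print(lipinski_violations(aspirin))
print(lipinski_violations(large_lipophile))
```

Returning the list of violated criteria, instead of a single pass/fail flag, makes it easier to see which property to change during lead optimization.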

Pharmacophore modeling

• Pharmacophore: Set of
features common to all known
ligands of a particular target

• Methodology:
1. Model conformational space of
ligands in training set to
simulate flexibility
2. Align generated conformations
3. Extract common features

Richmond et al. J Comput Aided Mol Des 20, 567-587 (2006)

Pharmacophore-based Virtual Screening

• Screen compound library for ligands containing
pharmacophore of interest

• Methodology
1. Generate ensemble of conformations for each ligand to be tested
2. Perform pharmacophore pattern matching (“substructure
search”) on every conformer
• Procedures from graph theory: Ullmann algorithm, backtracking algorithm, GMA algorithm
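
A minimal sketch of pattern matching on a single conformer, using plain backtracking over inter-feature distances (not the Ullmann or GMA algorithms). The data layout, the function name, and the 1 Å tolerance are assumptions for illustration.

```python
import math


def match_pharmacophore(query_types, query_dists, features, tol=1.0):
    """Backtracking search that maps query points onto conformer features.

    query_types: feature labels, e.g. ["donor", "acceptor", "aromatic"]
    query_dists: {(i, j): distance in Angstrom} for query pairs with i < j
    features:    list of (label, (x, y, z)) for one conformer
    Returns one feature index per query point, or None if nothing matches.
    """
    assignment = []

    def consistent(k, cand):
        # The candidate must have the right type and must not be reused.
        if features[cand][0] != query_types[k] or cand in assignment:
            return False
        # Every constrained distance to already placed points must fit.
        for j, prev in enumerate(assignment):
            target = query_dists.get((j, k))
            if target is None:
                continue
            d = math.dist(features[prev][1], features[cand][1])
            if abs(d - target) > tol:
                return False
        return True

    def extend(k):
        if k == len(query_types):
            return True
        for cand in range(len(features)):
            if consistent(k, cand):
                assignment.append(cand)
                if extend(k + 1):
                    return True
                assignment.pop()  # dead end: undo and try the next feature
        return False

    return list(assignment) if extend(0) else None


# Hypothetical three-point pharmacophore and one conformer with a decoy.
query = ["donor", "acceptor", "aromatic"]
dists = {(0, 1): 3.0, (0, 2): 5.0, (1, 2): 4.0}
conformer = [
    ("acceptor", (10.0, 0.0, 0.0)),
    ("donor", (0.0, 0.0, 0.0)),
    ("acceptor", (3.0, 0.0, 0.0)),
    ("aromatic", (3.0, 4.0, 0.0)),
]
print(match_pharmacophore(query, dists, conformer))
```

In a screening run this check would be repeated for every conformer of every ligand, and a ligand counts as a hit as soon as one conformer matches.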

Challenges in Pharmacophore Modeling
• Modeling ligand flexibility
• Conformers pre-enumerated or generated on-the-fly
• Systematic torsional grids, genetic algorithms, Monte Carlo

• Molecular alignment
• Point-based: superimpose pairs of atoms. Anchor points need to be
defined
• Property-based: use molecular field descriptors to align

• Choosing a training set
• Choice of training set has a big impact on the generated pharmacophore model

• New research: Pharmacophore-based de novo design

Quantitative Structure/Activity Relationships

• A QSAR is a mathematical relationship between the
geometrical and chemical characteristics of a molecule
and its biological activity

• Chemical descriptors are correlated with biological activity
in terms of an equation

• A valid QSAR should allow prediction of the biological
activity of new ligands prior to synthesis and in vitro and in
vivo assays

QSAR Requirements
1. Dataset

• Experimental measurements of the biological activity of a group of
chemicals

2. Descriptors

• Numerical values that encode relevant structure and property data
for this group of chemicals

3. Statistical methods

• To find relationships between these two sets of data

Molecular Descriptors in QSAR
1. Constitutional
• Total number of atoms, atoms of a certain type, number of bonds, number of rings

2. Topological
• Molecular shape, degree of branching

3. Electronic
• Partial atomic charges, dipole moments

4. Geometrical
• van der Waals volume, molecular surface

5. Quantum Mechanical
• Total energy, interaction energy between two atoms, nuclear repulsion between atoms

6. Physicochemical
• Liquid solubility, log P, boiling point

QSAR Methodology
• Thousands of descriptors can be generated for each molecule

• Several descriptors will be correlated
• E.g. MW and boiling point

• Statistically analyze descriptors to isolate 3 to 5 independent
descriptors that best correlate with biological activity

• Use regression methods to express activity in terms of
descriptors
• BA = a + bX1 + cX2 + …

• This model can now be used to predict activity in other test
compounds
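
The regression step can be sketched with ordinary least squares. The example below is a pure-Python illustration that solves the normal equations directly; the descriptor values and activities are synthetic (generated from BA = 1 + 2·X1 - 0.5·X2), and the function names are assumptions. A real study would use a statistics package and validate on held-out compounds.

```python
def pearson(x, y):
    """Pearson correlation, useful for spotting redundant descriptors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def fit_linear(X, y):
    """Least-squares fit of y = a + b*X1 + c*X2 + ... (normal equations)."""
    rows = [[1.0] + list(r) for r in X]  # leading 1.0 gives the intercept a
    p = len(rows[0])
    # Augmented matrix [A^T A | A^T y].
    M = []
    for i in range(p):
        lhs = [sum(r[i] * r[j] for r in rows) for j in range(p)]
        rhs = sum(r[i] * t for r, t in zip(rows, y))
        M.append(lhs + [rhs])
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(p):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][p] / M[i][i] for i in range(p)]


def predict(coeffs, descriptors):
    """Evaluate BA = a + b*X1 + c*X2 + ... for one compound."""
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], descriptors))


# Synthetic training set: two descriptors per compound.
X_train = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8)]
y_train = [2.0, 4.5, 4.5, 7.5, 7.0]

coeffs = fit_linear(X_train, y_train)
print(coeffs)
print(predict(coeffs, (6, 4)))
```

Checking `pearson` between descriptor columns before fitting is one simple way to drop near-duplicates such as MW and boiling point, which would otherwise make the fitted coefficients unstable.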

3D QSAR

• If the structures of the ligands are known, one can map
important chemical descriptors into 3D space

• This will generate a 3D pattern of functionally significant
regions of the ligand

• Visual identification of regions responsible for
(un)favorable interactions

Comparative Molecular Field Analysis

• Principle: Differences in binding/activity often due to
differences in the shape of the non-covalent fields
surrounding the molecule

• Methodology:
1. Align all test molecules
2. Place in a 3D grid (2 Å spacing)
3. Measure steric (van der Waals) and electrostatic (Coulomb)
energy for each molecule with a probe atom
4. Correlate energies with activity to generate 3D-QSAR
5. Display QSAR as colors and/or contours around molecular
structures
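
Steps 2 and 3 can be sketched as follows: build a regular grid around an aligned molecule and record the steric and electrostatic energy felt by a probe at each point. The Lennard-Jones parameters, the +1 probe charge, the 30 kcal/mol cap, and all names are illustrative assumptions, not the settings of a specific CoMFA implementation.

```python
import math
from itertools import product


def grid_points(lo, hi, spacing=2.0):
    """All points of a regular 3D grid between corners lo and hi."""
    axes = []
    for a, b in zip(lo, hi):
        n = int((b - a) / spacing) + 1
        axes.append([a + i * spacing for i in range(n)])
    return list(product(*axes))


def probe_fields(atoms, points, probe_charge=1.0,
                 epsilon=0.1, sigma=3.4, cap=30.0):
    """Return (steric, electrostatic) energies, one value per grid point.

    atoms: list of (x, y, z, partial_charge) for one aligned molecule.
    Energies are truncated at +/- cap so points inside atoms do not dominate.
    """
    steric, elec = [], []
    for p in points:
        s = e = 0.0
        for x, y, z, q in atoms:
            r = max(math.dist(p, (x, y, z)), 0.1)  # avoid division by zero
            sr6 = (sigma / r) ** 6
            s += 4.0 * epsilon * (sr6 * sr6 - sr6)   # van der Waals term
            e += 332.06 * probe_charge * q / r       # Coulomb term, kcal/mol
        steric.append(min(s, cap))
        elec.append(max(-cap, min(e, cap)))
    return steric, elec


# Hypothetical single-atom "molecule" in a 4 x 4 x 4 Angstrom box.
points = grid_points((0.0, 0.0, 0.0), (4.0, 4.0, 4.0))
steric, elec = probe_fields([(0.0, 0.0, 0.0, -0.5)], points)
print(len(points), steric[0], elec[0])
```

Repeating this for every aligned molecule gives one long vector of field values per compound; those vectors are the descriptors that step 4 correlates with activity.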

CoMFA Example

3D alignment

CoMFA
QSAR
44 compounds
(37 training, 7 test)

Shagufta et al. J Mol Model 13, 99-109 (2006)
