Docking & Designing Small Molecules within Rosetta Code Framework
1. Tuesday September 18th 2012
DEVELOPMENT OF METHODS FOR DOCKING AND DESIGNING SMALL MOLECULES WITHIN THE ROSETTA CODE FRAMEWORK
A doctoral dissertation defense presented by
GORDON HOWARD LEMMON
2. Outline of presentation
A. What is structural biology?
B. Protein modeling and ligand docking
C. Introduction to Rosetta software
D. HIV-1 PR/PI binding affinity prediction
E. Rosetta software development
F. Ligand docking with waters using improved
Rosetta ligand docking code
4. What is structural biology?
Proteins, DNA
Structural Biology is the study of structure and function of
biological molecules such as DNA, RNA, and proteins
5. How big are proteins?
Water: 1.51 Å
Amprenavir: ~17 Å, 72 atoms
HIV-1 Protease (PR): ~54 Å, 3163 atoms
1 Angstrom (Å) = one ten-millionth of a millimeter
11. What is protein modeling?
Prediction of protein structure from
1. Sequence alone (de novo folding)
HIV-1 PR
Amino Acid Sequence
ANPCCSNPCQNRGECMSTGFDQ
YKCDCTRTGFYGENCTTPEFLTRI
KLLLKPTPNTVHYILTHFKGVWNIV
NNIPFLRSLIMKYVLTSRSYLIDSP
PTYNVHYGYKSWEAFSNLSYYTR
ALPPVADDCPTPMGVKGNKELPD
SKEVLEKVLLRREFIPDPQGSNM
MFAFF…
12. What is protein modeling?
Prediction of protein structure from
2. Sequence similarity (Comparative modeling)
HIV-1 PR Sequence
PQITLWKRPLVTIRIGGQL
KEALLDTGADDTVLEEMN
LPGRWKPKMIGGIGGFIK
VRQYDQIPIEICGHKAIGT
VLVGPTPTNVIGRNLLTQI
GCTLNF…
HIV-2 PR
HIV-1 PR
13. What is ligand docking?
Prediction of structure of protein/ligand interface
Prediction of ligand binding affinity
29. RosettaLigand PR/PI ΔΔGs predictions
RosettaLigand docking
A. Random 5 Å translation and complete rotation of PI
B. Repeated x6:
   0.1 Å, 5° PI movements
   Side chain and ligand rotamer sampling
   Minimization of PR side chain and PI torsion angles
   MC accept
C. Minimize backbone torsion angles
D. Energy filter
PR/PI ΔΔGs prediction workflow
A. 171 PR template structures x 176 sequence/PI pairs = 30,096 Rosetta inputs
B. 10 Rosetta relaxed models per input (300,960 models)
C. 1000 RosettaLigand docked models per relaxed model (300,960,000 docked models)
D. Top 10% of models by total score for each sequence/PI pair
E. Top models by interface score for each sequence/PI pair
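The model counts and the two-stage selection on this slide can be checked with a short sketch. The counts come from the slide; the function name `select_top_models`, the tuple layout, and the `n_final` parameter are illustrative assumptions, not part of Rosetta.

```python
# Counts taken from the workflow slide.
N_TEMPLATES = 171            # PR template structures
N_PAIRS = 176                # sequence/PI pairs
RELAXED_PER_INPUT = 10       # relaxed models per Rosetta input
DOCKED_PER_RELAXED = 1000    # docked models per relaxed model

n_inputs = N_TEMPLATES * N_PAIRS              # 30,096
n_relaxed = n_inputs * RELAXED_PER_INPUT      # 300,960
n_docked = n_relaxed * DOCKED_PER_RELAXED     # 300,960,000


def select_top_models(models, top_fraction=0.10, n_final=1):
    """Two-stage filter for the models of one sequence/PI pair.

    models: list of (total_score, interface_score) tuples, lower is better.
    Stage 1 keeps the best `top_fraction` of models by total score.
    Stage 2 returns the `n_final` survivors with the lowest interface score.
    `n_final` is a hypothetical parameter; the slide only says "top models".
    """
    if not models:
        raise ValueError("no models to select from")
    keep = max(1, int(len(models) * top_fraction))
    by_total = sorted(models, key=lambda m: m[0])[:keep]
    return sorted(by_total, key=lambda m: m[1])[:n_final]
```

Filtering on total score first discards models that are poor overall, so that a favorable interface score on an otherwise strained model is not selected.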
33. Previous PR/PI ΔΔG predictions failed
Experimental vs Predicted HIV-1 PR ΔΔG (N=112)
Score Function                   Correlation
Number of non-hydrogen atoms     0.172
X-Score::HPScore                 0.341
SYBYL::ChemScore                 0.276
DS::PMF04                        0.183
DrugScorePDB::PairSurf           0.225
AutoDock                         0.38
RosettaLigand                    0.71
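A correlation between experimental and predicted ΔΔG values can be computed as in the sketch below. The slide does not say which coefficient was used, so the Pearson coefficient here is an assumption, and the function name `pearson_r` is illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Called as `pearson_r(experimental, predicted)` on two lists of ΔΔG values, it returns a number between -1 and 1, where values near 1 mean the predictions rank and scale with the experiments.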
40. Fragment-based screening can greatly expand sampling space
Congreve, M. et al. Drug Discov. Today 2003, 8, 876-877
Traditional screening vs. fragment-based screening
41. Common drug-based fragments
Hartshorn, M.J.; Murray, C.W. et al. J. Med. Chem. 2005, 48, 403-413
[Figure: chemical structures of common drug fragments]
42. RosettaLigandDesign
A. Library of small molecule fragments
B. Place fragments in protein binding site (energies shown: -10, -12, 3, -7, -5)
C. Select low energy models for refinement
D. Dock ligand with flexible protein side-chains and backbone
43. RosettaLigandDesign
A. Library of small molecule fragments
B. Place fragments in protein binding site (energies shown: -8, -15, -18, -10, -12)
C. Select low energy models for refinement
D. Dock ligand with flexible protein side-chains and backbone
46. Rosetta ligand design in action
A. Low-res search for starting fragment
B. Refine (dock) starting fragment
C. Grow small-molecule using fragment library
D. Refine (dock) 2-fragment complex
E. Grow small-molecule using fragment library
F. Refine (dock) 3-fragment complex
G. Add hydrogens to unsatisfied connection points
47. Protein binding sites are complex
Dethiobiotin (DTB)
Inorganic phosphate
Mg ions
ADP
48. Multiple ligand docking may capture induced fit effects
Serial Docking
Simultaneous Docking
60. Conclusions
Binding affinity predictions can be improved by
   Optimizing Rosetta score term weights
   Ignoring the unbound state
New RosettaLigand code allows
   Multiple ligand docking
   Fragment-based rotamers for greater flexibility
   Fragment-based design of ligands
Docking with waters helps in spacious binding cavities, hurts in crowded binding cavities
61. Professional acknowledgements
Meiler Lab
Jens Meiler
Kristian Kaufmann
Sam Deluca
Steven Combs
Committee
David Tabb
Richard D'Aquila
Brian Bachmann
Jarrod Smith
Molecular Biophysics Training Grant (NIH)
RosettaCommons
The Meiler lab focuses on proteins… There are thousands of different proteins, each with a unique role to play. These range from proteins that form muscle, hair, and skin to proteins that perform chemical reactions, forming and breaking chemical bonds.
Explain here that most drugs that you pick up at the pharmacy work by binding to specific proteins. Proteins are very large. How is a molecule this large constructed?
How are molecules as large as proteins created?
This protein has 198 amino acids; it is actually two chains of 99 amino acids each. How can the sequence determine something as complex as 3-D structure? It has to do with the way that amino acids interact with each other.
Sequence determines structure, which determines function. These mature proteins play a role in the activity of the HIV virus.
Determining sequence is easy, determining structure is hard. If we can predict structure we can understand function.
Using EXPERIMENTAL structures as comparison
Structure means the position of the small molecule with respect to the protein. Predicting binding affinity is more difficult. If we can predict ligand binding affinity, then we can make predictions about how tightly a potential drug will bind to its target and how specific that binding will be.
Point out that this is
We predict that the lowest-scoring model will be closest to the true position that the small molecule assumes.
I’ve talked about H-bonding but there are many terms and each has a default weight.
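A weighted score of this kind can be sketched as a sum of term values multiplied by their weights. The term names, weights, and values below are made up for illustration and are not Rosetta's defaults.

```python
def total_score(term_values, weights):
    """Weighted sum of score terms: sum over terms of weight * value."""
    missing = set(term_values) - set(weights)
    if missing:
        raise KeyError(f"no weight for terms: {sorted(missing)}")
    return sum(weights[name] * value for name, value in term_values.items())


# Hypothetical weights and per-term values for one docked model.
example_weights = {"attractive": 0.8, "repulsive": 0.4, "hbond": 2.0}
example_terms = {"attractive": -10.0, "repulsive": 3.0, "hbond": -1.5}
example_total = total_score(example_terms, example_weights)
```

Changing a weight changes how much that term contributes to the total, which is what optimizing score term weights adjusts.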
Mutations lead to drug resistance. WHO keeps track of these mutations…
Medicine: As HIV-1 PR mutates, a patient being treated with one of these PIs stops responding to treatment, so they are switched to a different PI.
Explain the hypothesis about effect of mutation on flexible vs. rigid structure.
Experimental vs Predicted!
Explain that the ligand moves as well. This is very important!
The idea is that instead of screening libraries of millions of larger compounds, one could screen libraries of several hundred fragments to find several independent fragments that bind, then link these together.
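The arithmetic behind this idea can be sketched as follows. It assumes, for illustration only, that any fragment can occupy any sub-site and that fragments are chosen independently; the library size and number of sites are hypothetical.

```python
def linked_compounds_covered(n_fragments, n_sites):
    """Number of linked compounds implied by choosing one fragment per site."""
    if n_fragments < 0 or n_sites < 0:
        raise ValueError("counts must be non-negative")
    return n_fragments ** n_sites


# Hypothetical example: a 300-fragment library and 3 sub-sites.
covered = linked_compounds_covered(300, 3)
```

Under these assumptions, screening a few hundred fragments samples a space of linked compounds that would take tens of millions of whole-molecule measurements to cover directly.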
for example a protein binding pocket can have…
Induced fit means that the protein changes its shape as it interacts with the small molecule. Enzymes that catalyze chemical reactions, either creating or breaking bonds, are good examples.
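RMSD (root-mean-square deviation) is an average distance between two structures: the square root of the mean squared distance over all pairs of corresponding atoms.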
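As a concrete sketch, RMSD is the square root of the mean squared distance between corresponding atoms in two structures. The version below assumes the two coordinate lists are already in the same reference frame and in the same atom order; it does no superposition.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) tuples.

    Atom i in coords_a is paired with atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

For a docked ligand, `coords_a` would hold the predicted atom positions and `coords_b` the experimental ones.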
Talk about how important these results are for PI development
RMSD is on the X axis and Rosetta interface score is on the Y axis. With water we consistently produce low-scoring models below 2 Å RMSD.