Jean-Claude Bradley presents on Open Notebook Science: Transparency in Research on October 23, 2012 at Georgia Tech for Open Access Week. Topics include solubility, melting points, a recrystallization app, the Chemical Information Retrieval class at Drexel University and the Open Chemical Property Matrix (OCPM). YouTube recording here: http://youtu.be/XpRyfdNuMrQ
Bradley Open Notebook Science Georgia Tech OA week
1. Open Notebook Science: Transparency in Research
Georgia Tech Library
Open Access Week
Jean-Claude Bradley
Associate Professor of Chemistry
Drexel University
October 23, 2012
4. “Simple” aldol condensation synthesis
Top hit (no reports of synthesis)
In top ten (a few reports of synthesis)
(Andrew Lang)
5. What is the current standard for “sufficient information” in communicating organic chemistry?
By definition, all peer-reviewed published documentation has been approved as sufficient by authors, editors and reviewers.
6. Searching for aldol condensations of acetone
in the Reaction Attempts database
(Andrew Lang)
30. What is the melting point of 4-benzyltoluene?
American Petroleum Institute: 5 C
PHYSPROP: -30 C
PHYSPROP: 125 C
peer-reviewed journal (2008): 97.5 C
government database: -30 C
government database: 4.58 C
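One way to see how badly these sources disagree is to compute the spread of the reported values. The following is a minimal sketch, not part of the original talk: the values are the six listed on the slide, while the function name `summarize` and the `tolerance_c` threshold are hypothetical choices made for illustration.

```python
from statistics import median

# Melting points (in C) for 4-benzyltoluene as listed on the slide.
reported_mp_c = [
    ("American Petroleum Institute", 5.0),
    ("PHYSPROP", -30.0),
    ("PHYSPROP", 125.0),
    ("peer-reviewed journal (2008)", 97.5),
    ("government database", -30.0),
    ("government database", 4.58),
]


def summarize(values, tolerance_c=5.0):
    """Summarize reported values and flag them when sources disagree.

    tolerance_c is an assumed, illustrative threshold: any spread larger
    than this marks the compound as needing a fresh measurement.
    """
    temps = [temp for _source, temp in values]
    spread = max(temps) - min(temps)
    return {
        "min": min(temps),
        "max": max(temps),
        "spread": spread,
        "median": median(temps),
        "needs_review": spread > tolerance_c,
    }


summary = summarize(reported_mp_c)
print(summary)
```

A spread this large cannot be averaged away, which is why the next slides turn to a direct measurement recorded in an open notebook.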
31. The quest to resolve the melting point of 4-benzyltoluene: liquid at room temperature and can be frozen below -30 C
32. Open Lab Notebook page measuring the
melting point of 4-benzyltoluene
36. There are NO FACTS,
only measurements embedded
within assumptions
Open Notebook Science maintains
the integrity of data provenance by
making assumptions explicit
37. Open Random Forest modeling of Open Melting Point
data using CDK descriptors
(Andrew Lang)
R² = 0.78, TPSA and nHdon most important
44. Comparison of model with triple validated measurements
Straight chain carboxylic acids from 1 to 10 carbons
Straight chain alcohols from 1 to 10 carbons
45. Cyclic primary amines from 3 to 6 carbons (cyclobutylamine flagged for
validation – only single source available)
46. Open Melting Points in Supplementary Data Pages
of Wikipedia (Martin Walker)
51. The importance of recrystallization
• Generally preferred if there is a known
solvent that gives a good yield
• Scales much more easily and cheaply than
chromatography
• However, for new compounds much trial and
error may be needed
52. How does it work?
1. Look up the solvent boiling point
2. Look up the room temperature solubility or predict it via
Abraham descriptors predicted from a model using the
CDK
3. Look up the solute melting point or predict it via a model
using the CDK
4. Use the melting point and the solubility at room
temperature to predict the solubility at boiling
5. Calculate the predicted recrystallization yield
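Steps 4 and 5 can be sketched as follows. This is an illustration under stated assumptions, not the app's actual code: it assumes an ideal-solubility (van 't Hoff) temperature dependence, estimates the enthalpy of fusion from the melting point with Walden's rule (about 56.5 J/(mol K) for the entropy of fusion), and treats concentration as proportional to mole fraction. The function names and the example numbers are hypothetical placeholders, not measured values.

```python
import math

GAS_CONSTANT = 8.314       # J/(mol K)
ENTROPY_OF_FUSION = 56.5   # J/(mol K); Walden's rule, an assumption of this sketch


def solubility_at_temperature(sol_room, temp_room_k, temp_k, melting_point_k):
    """Scale a room-temperature solubility to temp_k (step 4).

    Uses ln(S2/S1) = (dHfus/R) * (1/T1 - 1/T2), with dHfus estimated as
    ENTROPY_OF_FUSION * melting point. Temperatures are in kelvin.
    """
    if melting_point_k <= temp_room_k:
        raise ValueError("solute is not a solid at room temperature")
    enthalpy_of_fusion = ENTROPY_OF_FUSION * melting_point_k
    # Above the melting point the solid no longer limits solubility,
    # so the temperature used in the relation is capped there.
    temp_k = min(temp_k, melting_point_k)
    exponent = (enthalpy_of_fusion / GAS_CONSTANT) * (1.0 / temp_room_k - 1.0 / temp_k)
    return sol_room * math.exp(exponent)


def recrystallization_yield(sol_room, sol_boiling):
    """Fraction recovered on cooling a solution saturated at boiling (step 5)."""
    if sol_boiling <= 0:
        raise ValueError("solubility at boiling must be positive")
    return max(0.0, 1.0 - sol_room / sol_boiling)


# Placeholder inputs standing in for the look-ups in steps 1 to 3.
sol_room = 0.5            # mol/L at room temperature (hypothetical)
temp_room_k = 298.15
solvent_bp_k = 351.15     # hypothetical solvent boiling point
solute_mp_k = 395.0       # hypothetical solute melting point

sol_boiling = solubility_at_temperature(sol_room, temp_room_k, solvent_bp_k, solute_mp_k)
predicted_yield = recrystallization_yield(sol_room, sol_boiling)
print(f"solubility at boiling: {sol_boiling:.2f} mol/L, yield: {predicted_yield:.0%}")
```

The yield depends only on the ratio of the two solubilities, so a solvent is attractive when the solute is sparingly soluble at room temperature but much more soluble at the boiling point.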
53. The Recrystallization App produces and uses
Open Data:
• Open Solubility Collection and Models
• Open Melting Point Collection and Models
• Modeling depends mainly on CDK (Open Source Software with Open Descriptors)
• Open Notebook Science
54. What are good solvents to recrystallize benzoic acid?
(Andrew Lang)
55. Click on the solvent to see temp curve
(Andrew Lang)
67. Open Chemical Property Matrix (OCPM)
Boiling point
Vapor pressure
Flash point
Abraham descriptors
Melting point
logP
Aqueous solubility
Octanol solubility
71. Conclusions
More openness in chemistry can make science more efficient
Provide interfaces that make sense to the end users:
Open Data, Open Models and Open Source Software to modelers
Apps (smartphones, Google Apps Script, etc.) for chemists at the bench
Acknowledgements
Andrew Lang (code, modeling)
Bill Acree (modeling, solubility data contribution)
Antony Williams (ChemSpider services, mp data curation)
Matthew McBride and Rida Atif (recrystallization and synthesis)
Kayla Gogarty (OCPM)