1. Supporting Scientific Sensemaking
Anita de Waard
VP Research Data Collaborations, Elsevier
a.dewaard@elsevier.com
Visit Microsoft Research, January 23, 2013
2. Outline
• A model of scientific sensemaking:
– Stories, that persuade with data
– Discourse segments and verb tense
• Towards extracting claim-evidence networks:
– Hedging in science
– Creating claim-evidence networks
• Data:
– Why life is so complicated
– Connecting biological experiments into collaboratories
3. A paper is a story…
Story Grammar: The Story of Goldilocks and the Three Bears
Paper Grammar: The AXH Domain of Ataxin-1 Mediates Neurodegeneration through Its Interaction with Gfi-1/Senseless Proteins
• Setting
– Time: Once upon a time
  Background: The mechanisms mediating SCA1 pathogenesis are still not fully understood, but some general principles have emerged.
– Character: a little girl named Goldilocks
  Objects of study: the Drosophila Atx-1 homolog (dAtx-1) which lacks a polyQ tract,
– Location: She went for a walk in the forest. Pretty soon, she came upon a house.
  Experimental setup: studied and compared in vivo effects and interactions to those of the human protein
• Theme
– Goal: She knocked and, when no one answered,
  Research goal: Gain insight into how Atx-1's function contributes to SCA1 pathogenesis. How these interactions might contribute to the disease process and how they might cause toxicity in only a subset of neurons in SCA1 is not fully understood.
– Attempt: she walked right in.
  Hypothesis: Atx-1 may play a role in the regulation of gene expression
• Episode
– Name: At the table in the kitchen, there were three bowls of porridge.
  Name: dAtx-1 and hAtx-1 Induce Similar Phenotypes When Overexpressed in Flies
– Subgoal: Goldilocks was hungry.
  Subgoal: test the function of the AXH domain
– Attempt: She tasted the porridge from the first bowl.
  Method: overexpressed dAtx-1 in flies using the GAL4/UAS system (Brand and Perrimon, 1993) and compared its effects to those of hAtx-1.
– Outcome: This porridge is too hot! she exclaimed.
  Results: Overexpression of dAtx-1 by Rhodopsin1(Rh1)-GAL4, which drives expression in the differentiated R1-R6 photoreceptor cells (Mollereau et al., 2000 and O'Tousa et al., 1985), results in neurodegeneration in the eye, as does overexpression of hAtx-1[82Q]. Although at 2 days after eclosion, overexpression of either Atx-1 does not show obvious morphological changes in the photoreceptor cells
– Attempt: So, she tasted the porridge from the second bowl.
– Outcome: This porridge is too cold, she said
  Data: (data not shown),
– Attempt: So, she tasted the last bowl of porridge.
  Results: both genotypes show many large holes and loss of cell integrity at 28 days
– Outcome: Ahhh, this porridge is just right, she
  Data: (Figures 1B-1D).
4. …that persuades…
Aristotle / Quintilian / Scientific Paper
• prooimion / Introduction (exordium) / Introduction: positioning
– The introduction of a speech, where one announces the subject and purpose of the discourse, and where one usually employs the persuasive appeal to ethos in order to establish credibility with the audience.
• prothesis / Statement of Facts (narratio) / Introduction: research question
– The speaker here provides a narrative account of what has happened and generally explains the nature of the case.
• Summary (propositio) / Summary of contents
– The propositio provides a brief summary of what one is about to speak on, or concisely puts forth the charges or accusation.
• pistis / Proof (confirmatio) / Results
– The main body of the speech where one offers logical arguments as proof. The appeal to logos is emphasized here.
• Refutation (refutatio) / Related Work
– As the name connotes, this section of a speech was devoted to answering the counterarguments of one's opponent.
• epilogos / peroratio / Discussion: summary, implications
– Following the refutatio and concluding the classical oration, the peroratio conventionally employed appeals through pathos, and often included a summing up.
Goal of the paper is to be published; it uses author/journal as a host
Format has co-evolved: predator-prey relationship with reviewers
6. In defense of the clause as the unit of thought:
1. Importantly, our results so far indicate that the expression of miR-372&3 did not reduce the activity of RASV12, as these cells were still growing faster than normal cells and were tumorigenic, for which RAS activity is indispensable (Hahn et al, 1999 and Kolfschoten et al, 2005).
2. To shed more light on this aspect, we examined the effect of miR-372&3 expression on p53 activation in response to oncogenic stimulation.
3. We used for this experiment BJ/ET cells containing p14ARFkd because, following RASV12 treatment, in those cells p53 is still activated but more clearly stabilized than in parental BJ/ET cells (Voorhoeve and Agami, 2003), resulting in a sensitized system for slight alterations in p53 in response to RASV12.
4. Figure 4A shows that following RASV12 stimulation, p53 was stabilized and activated, and its target gene, p21cip1, was induced in all cases, indicating an intact p53 pathway in these cells.
• More than one ‘thought unit’ per sentence.
• Verb tense changes within sentence (several times).
• Attribution, actions/states, and prepositions all contained within a sentence.
7. In defense of the clause as the unit of thought:
• Head: premise, motivation, attribution (matrix clause)
• Middle: main biological statement
• End: interpretation, elaboration, attribution (reference)
8. In defense of the clause as the unit of thought:
Clause types: Regulatory clause, Fact, Goal, Method, Result, Implication
9. Clause, realm and tense:
Realms: Conceptual knowledge, Experimental Evidence
– Fact: Both seminomas and the EC component of nonseminomas share features with ES cells.
– Goal: To exclude that
– Hypothesis: the detection of miR-371-3 merely reflects its expression pattern in ES cells,
– Method: we tested by RPA miR-302a-d, another ES cells-specific miRNA cluster (Suh et al, 2004).
– Result: In many of the miR-371-3 expressing seminomas and nonseminomas, miR-302a-d was undetectable (Figs S7 and S8),
– Reg-Implication: suggesting that
– Implication: miR-371-3 expression is a selective event during tumorigenesis.
10. Clause, realm and tense:
• Concepts, models, ‘facts’: Present tense
– Fact: (1) Both seminomas and the EC component of nonseminomas share features with ES cells.
– Problem: (2) b. the detection of miR-371-3 merely reflects its expression pattern in ES cells,
– Implication: (3) c. miR-371-3 expression is a selective event during tumorigenesis.
• Transitions: present tense
– Goal: (2) a. To exclude that
– Regulatory-Implication: (3) b. suggesting that
• Experiment: Past tense
– Method: (2) c. we tested by RPA miR-302a-d, another ES cells-specific miRNA cluster (Suh et al, 2004).
– Result: (3) a. In many of the miR-371-3 expressing seminomas and nonseminomas, miR-302a-d was undetectable (Figs S7 and S8),
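The realm and tense assignments on this slide can be read as a small lookup table. The following is a minimal sketch in Python, assuming only the segment labels listed on the slide; the names `SEGMENT_REALM` and `expected_tense` are hypothetical and do not come from any published tool.

```python
from typing import Dict, Tuple

# Segment type -> (realm, expected tense), transcribed from the slide.
# The dictionary name and layout are illustrative assumptions.
SEGMENT_REALM: Dict[str, Tuple[str, str]] = {
    "Fact": ("Concepts, models, 'facts'", "present"),
    "Problem": ("Concepts, models, 'facts'", "present"),
    "Implication": ("Concepts, models, 'facts'", "present"),
    "Goal": ("Transitions", "present"),
    "Regulatory-Implication": ("Transitions", "present"),
    "Method": ("Experiment", "past"),
    "Result": ("Experiment", "past"),
}


def expected_tense(segment_type: str) -> str:
    """Return the tense the slide associates with a segment type."""
    return SEGMENT_REALM[segment_type][1]


def realm(segment_type: str) -> str:
    """Return the realm the slide associates with a segment type."""
    return SEGMENT_REALM[segment_type][0]


print(expected_tense("Result"), "|", realm("Goal"))
```

A classifier could use such a table in either direction, for instance treating an observed past-tense verb as weak evidence that a clause is a Method or Result segment.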
11. Tense use in science and mythology:
• Facts in the eternal present
– Science: Endogenous small RNAs (miRNAs) regulate gene expression by mechanisms conserved across metazoans.
– Mythology: I sing of golden-throned Hera whom Rhea bare. Queen of the immortals is she, surpassing all in beauty: she is the sister and the wife of loud-thundering Zeus, --the glorious one whom all the blessed throughout high Olympus reverence and honor.
• Events in the simple past
– Science: Vehicle-treated animals spent equivalent time investigating a juvenile in the first and second sessions in experiments conducted in the NAC and the striatum: T1 values were 122 ± 6 s and 114 ± 5 s.
– Mythology: Now the wooers turned to the dance and to gladsome song, and made them merry, and waited till evening should come; and as they made merry dark evening came upon them.
• Events with embedded facts
– Science: We also generated BJ/ET cells expressing the RASV12-ERTAM chimera gene, which is only active when tamoxifen is added (De Vita et al, 2005).
– Mythology: And she took her mighty spear, tipped with sharp bronze, heavy and huge and strong, wherewith she vanquishes the ranks of men-of warriors, with whom she is wroth, she, the daughter of the mighty sire.
• Attribution in the present perfect
– Science: miRNAs have emerged as important regulators of development and control processes such as cell fate determination and cell death (Abrahante et al., 2003, Brennecke et al., 2003, Chang et al., 2004, Chen et al., 2004, Johnston and Hobert, 2003, Lee et al., 1993)
– Mythology: In this book I have had old stories written down, as I have heard them told by intelligent people, concerning chiefs who have held dominion in the northern countries, and who spoke the Danish tongue; and also concerning some of their family branches, according to what has been told me.
• Implications are hedged, and in the present tense
– Science: These results indicate that although miR-372&3 confer complete protection to oncogene-induced senescence in a manner similar to p53 inactivation, the cellular response to DNA damage remains intact
– Mythology: Now it is said that ever since then whenever the camel sees a place where ashes have been scattered, he wants to get revenge with his enemy the rat and stomps and rolls in the ashes hoping to get the rat
12. From fiction to fact: Hedging
“[Y]ou can transform .. fiction into fact just by adding or subtracting references”, Bruno Latour [1]
• Voorhoeve et al., 2006: These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct inhibition of the expression of the tumor suppressor LATS2.
• Kloosterman and Plasterk, 2006: In a genetic screen, miR-372 and miR-373 were found to allow proliferation of primary human cells that express oncogenic RAS and active p53, possibly by inhibiting the tumor suppressor LATS2 (Voorhoeve et al., 2006).
• Yabuta et al., 2007: [On the other hand,] two miRNAs, miRNA-372 and -373, function as potential novel oncogenes in testicular germ cell tumors by inhibition of LATS2 expression, which suggests that Lats2 is an important tumor suppressor (Voorhoeve et al., 2006).
• Okada et al., 2011: Two oncogenic miRNAs, miR-372 and miR-373, directly inhibit the expression of Lats2, thereby allowing tumorigenic growth in the presence of p53 (Voorhoeve et al., 2006).
13. Hedging in science:
• Why do authors hedge?
– Make a claim ‘pending […] acceptance in the community’ [2]
– ‘Create A Research Space’: hedging allows authors to insert themselves into the discourse in a community [3]
– ‘the strongest claim a careful researcher can make’ [4]
• Hedging cues, speculative language, modality/negation:
– Light et al [5]: finding speculative language
– Wilbur et al [6]: focus, polarity, certainty, evidence, and directionality
– Thompson et al [7]: level of speculation, type/source of the evidence and level of certainty
• Sentiment detection (e.g. Kim and Hovy [8] a.m.o.):
– Holder of the opinion, strength, polarity as ‘mathematical function’ acting on main propositional content
– Wide applications in product reviews; but not (yet) in science!
14. A model for epistemic evaluations:
For a Proposition P, an epistemically marked clause E is an evaluation of P, written E_{V,B,S}(P), with:
– V = Value: 3 = Assumed true, 2 = Probable, 1 = Possible, 0 = Unknown (-1 = possibly untrue, -2 = probably untrue, -3 = assumed untrue)
– B = Basis: Reasoning, Data
– S = Source:
  A = speaker is author A, explicit
  IA = speaker is author A, implicit
  N = other author N, explicit
  NN = other author NN, implicit
Model suggested by Eduard Hovy, Information Sciences Institute, University of Southern California
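The three-part evaluation (Value, Basis, Source) maps naturally onto a small record type. The sketch below is one possible Python encoding, not the authors' implementation: the class names, the `notation` method and its output format are assumptions made for illustration. The example instance uses a statement whose evaluation the deck gives elsewhere as Value = Probable, Source = Author, Basis = Data; treating that source as explicit (`A`) is an assumption.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum


class Value(IntEnum):
    """Epistemic value scale from the slide, -3 to 3."""
    ASSUMED_UNTRUE = -3
    PROBABLY_UNTRUE = -2
    POSSIBLY_UNTRUE = -1
    UNKNOWN = 0
    POSSIBLE = 1
    PROBABLE = 2
    ASSUMED_TRUE = 3


class Basis(Enum):
    REASONING = "Reasoning"
    DATA = "Data"


class Source(Enum):
    A = "speaker is author, explicit"
    IA = "speaker is author, implicit"
    N = "other author, explicit"
    NN = "other author, implicit"


@dataclass(frozen=True)
class EpistemicEvaluation:
    """E_{V,B,S}(P): an evaluation of proposition P."""
    proposition: str
    value: Value
    basis: Basis
    source: Source

    def notation(self) -> str:
        # Hypothetical plain-text rendering of E_{V,B,S}(P).
        return "E[V={}, B={}, S={}]({})".format(
            int(self.value), self.basis.value, self.source.name, self.proposition
        )


# Illustrative instance; Source.A (explicit author) is an assumption.
example = EpistemicEvaluation(
    proposition="secretion of nesfatin-1 increased during differentiation",
    value=Value.PROBABLE,
    basis=Basis.DATA,
    source=Source.A,
)
print(example.notation())
```

Using an integer enum for Value keeps the scale ordered, so evaluations can be compared or thresholded directly.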
15. Reporting verbs vs. epistemic value:
• Value = 0 (unknown): establish, (remain to be) elucidated, be (clear/useful), (remain to be) examined/determined, describe, make difficult to infer, report
• Value = 1 (hypothetical): be important, consider, expect, hypothesize (5x), give insight, raise possibility that, suspect, think
• Value = 2 (probable): appear, believe, implicate (2x), imply, indicate (12x), play a role, represent, suggest (18x), validate (2x)
• Value = 3 (presumed true): be able/apparent/important/positive/visible, compare (2x), confirm (2x), define, demonstrate (15x), detect (5x), discover, display (3x), eliminate, find (3x), identify (4x), know, need, note (2x), observe (2x), obtain (success/results, 3x), prove to be, refer, report (2x), reveal (3x), see (2x), show (24x), study, view
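A table like this can serve as a lookup from reporting verb to candidate epistemic value. The sketch below, in Python, covers only a subset of the single-word verbs listed on the slide; `REPORTING_VERBS` and `epistemic_values` are hypothetical names. Because "report" appears under both Value = 0 and Value = 3, the function returns every matching value rather than a single one.

```python
from typing import Dict, List

# Subset of the slide's verb lists, keyed by epistemic value.
# Multi-word and bracketed entries are left out of this sketch.
REPORTING_VERBS: Dict[int, List[str]] = {
    0: ["establish", "describe", "report"],
    1: ["consider", "expect", "hypothesize", "suspect", "think"],
    2: ["appear", "believe", "implicate", "imply", "indicate",
        "represent", "suggest", "validate"],
    3: ["confirm", "define", "demonstrate", "detect", "discover",
        "find", "identify", "observe", "report", "reveal", "show"],
}


def epistemic_values(verb: str) -> List[int]:
    """Return all values under which the verb is listed (may be empty)."""
    verb = verb.strip().lower()
    return sorted(
        value for value, verbs in REPORTING_VERBS.items() if verb in verbs
    )


print(epistemic_values("suggest"))
print(epistemic_values("report"))
```

An ambiguous result such as the one for "report" is a signal that context (tense, modality, attribution) is needed before a value is assigned.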
16. Most prevalent clause type: These results suggest that...
• Adverb/Connective: thus, therefore, together, recently, in summary
• Determiner/Pronoun: it, this, these, we/our
• Adjective: previous, future, better
• Noun phrase: data, report, study, result(s); method or reference
• Modal: form of ‘to be’, may, remain
• Adverb: often, recently, generally
• Verb: show, obtain, consider, view, reveal, suggest, hypothesize, indicate, believe
• Preposition: that, to
18. Adding metadiscourse to triples:
• Biological statement with BEL/epistemic markup:
  These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct inhibition of the expression of the tumor-suppressor LATS2.
– BEL representation:
  r(MIR:miR-372) -| (tscript(p(HUGO:Trp53)) -| kin(p(PFH:”CDK Family”)))
  Increased abundance of miR-372 decreases abundance of LATS2: r(MIR:miR-372) -| r(HUGO:LATS2)
– Epistemic evaluation: Value = Possible, Source = Unknown, Basis = Unknown
• Biological statement with MedScan/epistemic markup:
  Furthermore, we present evidence that the secretion of nesfatin-1 into the culture media was dramatically increased during the differentiation of 3T3-L1 preadipocytes into adipocytes (P 0.001) and after treatments with TNF-alpha, IL-6, insulin, and dexamethasone (P 0.01).
– MedScan Analysis: IL-6 → NUCB2 (nesfatin-1); Relation: MolTransport; Effect: Positive; CellType: Adipocytes; Cell Line: 3T3-L1
– Epistemic evaluation: Value = Probable, Source = Author, Basis = Data
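Attaching an epistemic evaluation to a triple can be sketched as a record that carries both the biological statement and its Value/Source/Basis fields. The Python below is an illustrative assumption, not a BEL or MedScan API: `AnnotatedTriple` and `from_simple_bel` are hypothetical names, and the helper only splits a flat statement on the `-|` (decreases) operator; it is not a BEL parser and does not handle nested statements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedTriple:
    """A subject-relation-object triple plus its epistemic evaluation."""
    subject: str
    relation: str
    obj: str
    value: str
    source: str
    basis: str


def from_simple_bel(bel: str, value: str, source: str, basis: str) -> AnnotatedTriple:
    """Split a flat 'X -| Y' statement; nested statements are not supported."""
    subject, sep, obj = bel.partition(" -| ")
    if not sep or " -| " in obj:
        raise ValueError("expected exactly one ' -| ' operator: " + bel)
    return AnnotatedTriple(
        subject.strip(), "decreases", obj.strip(), value, source, basis
    )


# The flat BEL statement and its evaluation as given on the slide.
triple = from_simple_bel(
    "r(MIR:miR-372) -| r(HUGO:LATS2)",
    value="Possible",
    source="Unknown",
    basis="Unknown",
)
print(triple)
```

Keeping the evaluation on the triple itself means a downstream claim-evidence network can filter edges by certainty without returning to the source text.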
19. Claim-Evidence example: Data2Semantics
Goal: improve speed of integration of research practice
A. Philips’ Electronic Patient Records
B. Elsevier-published Clinical Guideline
C. Elsevier (or other publisher’s) Research Report or Data
Step 1: Patient data + diagnosis link to Guideline recommendation
Step 2: Guideline recommendation links to evidence in report or data
20. Claim-Evidence Chains in Drug-drug Interactions
Step 1: Manually identify DDIs and drug names in a wide collection of content sources
Step 2: Develop a model of Drug-Drug Interaction and define candidates
Step 3: Automate this process and store as Linked Data
21. Claimed Knowledge Updates
Definition:
1) A CKU expresses a proposition about biological entities
2) A CKU is a new proposition
3) The authors present the CKU as factual: ⇒ Strength = Certainty
4) A CKU is derived from experimental work described in the article: ⇒ Basis = Data
5) The ownership is attributed to the author(s) of the article. ⇒ Source = Author, Explicit
Sandor/de Waard, [13]
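The five criteria can be read as a conjunction of checks on an annotated claim. A minimal Python sketch follows, assuming a claim has already been annotated by hand or by an upstream tool; the `Claim` fields and the `is_cku` function are hypothetical names, not part of the cited work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """An annotated claim; all field names are illustrative assumptions."""
    about_biological_entities: bool  # criterion 1
    is_new: bool                     # criterion 2
    strength: str                    # criterion 3, e.g. "Certainty"
    basis: str                       # criterion 4, e.g. "Data"
    source: str                      # criterion 5, e.g. "Author, Explicit"


def is_cku(claim: Claim) -> bool:
    """True only when all five criteria from the definition hold."""
    return (
        claim.about_biological_entities
        and claim.is_new
        and claim.strength == "Certainty"
        and claim.basis == "Data"
        and claim.source == "Author, Explicit"
    )


factual = Claim(True, True, "Certainty", "Data", "Author, Explicit")
hedged = Claim(True, True, "Possible", "Data", "Author, Explicit")
print(is_cku(factual), is_cku(hedged))
```

Under this reading, a hedged statement fails criterion 3 and is not counted as a CKU even when it is new and data-based.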
22. A corpus for citation analysis:
• Method
– Voorhoeve text: We subsequently created a human miRNA expression library (miR-Lib) by cloning almost all annotated human miRNAs into our vector (Rfam release 6) (Figure S3)
– Citing text: Voorhoeve et al. (116) employed a novel strategy by combining an miRNA vector library and corresponding bar code array
– Citing text: Using a novel retroviral miRNA expression library, Agami and co-workers performed a cell-based screen
• Result
– Voorhoeve text: we identified miR-372 and miR-373, each permitting proliferation and tumorigenesis of primary human cells that harbor both oncogenic RAS and active wild-type p53.
– Citing text: miR-372 and miR-373 were consequently found to permit proliferation and tumorigenesis of these primary cells carrying both oncogenic RAS and wild-type p53,
– Citing text: Voorhoeve et al. (2006) identified miR-372 and miR-373
– Citing text: miR-372 has been recently described as potential oncogene that collaborate with oncogenic RAS in cellular transformation
• Interpretation
– Voorhoeve text: These miRNAs neutralize p53-mediated CDK inhibition, possibly through direct inhibition of the expression of the tumor suppressor LATS2.
– Citing text: probably through direct inhibition of the expression of the tumor-suppressor LATS2 and subsequent neutralization of the p53 pathway.
– Citing text: Compromised Lats2 functionality might reduce the selective pressure for p53 inactivation during tumor progression.
Work done with Lucy Vanderwende
23. Data sharing in biology
• Interspecies variability: A specimen is not a species!
• Gene expression variability: Knowing genes is not knowing how they are expressed!
• Microbiome: An animal is an ecosystem!
• Systems biology: Whole is more than the sum of its parts!
• Models vs. experiment: Are we talking about the same things? In a way we can all use?
• Dynamics: Life is not in equilibrium!
= Life is complicated! Reductionism doesn’t work for living systems.
http://en.wikipedia.org/wiki/File:Duck_of_Vaucanson.jpg
24. Statistics to the rescue!
With enough observations, trends and anomalies can be detected:
• “Here we present resources from a population of 242 healthy adults sampled at 15 or 18 body sites up to three times, which have generated 5,177 microbial taxonomic profiles from 16S ribosomal RNA genes and over 3.5 terabases of metagenomic sequence so far.”
  The Human Microbiome Project Consortium, Structure, function and diversity of the healthy human microbiome, Nature 486, 207–214 (14 June 2012) doi:10.1038/nature11234
• “The large sample size — 4,298 North Americans of European descent and 2,217 African Americans — has enabled the researchers to mine down into the human genome.”
  Nidhi Subbaraman, Nature News, 28 November 2012, High-resolution sequencing study emphasizes importance of rare variants in disease.
25. Enable ‘incidental collaboratories’:
• Collect: store data at the level of the experiment:
– Accessible through a single interface
– Add enough metadata to know what was done/seen
• Connect: allow analyses over:
– Similar experiment types
– Experiments done with/on similar biological ‘things’ (species, strains, systems, cells etc.)
– In a way that can be used by modelers!
• Keep:
– Long-term preservation of data and software
– Fulfill Data Management Plan requirements
– Allow ‘gated’ access when and to whom researcher wants
26. Let’s look at a typical lab:
• How to get the right antibody IDs
• And messy bits
• From the lab notebook
• Into the PI’s command center?
27. Objections and rebuttals re. data sharing
• Objection: “But our lab notebooks are all on paper”
– Rebuttal: Develop smart phone/tablet apps for data input
• Objection: “I need to see a direct benefit from something I spend my time on”
– Rebuttal: Develop ‘data manipulation dashboard’ for PI to allow better access to full experimental output for his/her lab
• Objection: “I want things to be peer reviewed before I expose them”
– Rebuttal: Allow reviewers access to experimental database before publication (of data or paper)
• Objection: “I don’t really trust anyone else’s data – well, except for the guys I went to Grad School with…”
– Rebuttal: Add a social networking component to this data repository so you know who (to the individual) created that data point.
• Objection: “I am afraid other people might scoop my discoveries”
– Rebuttal: Reward system moves from a competition to a ‘shared mission’
28. Problem: biological research is quite insular
• Biology is small: size 10^-5 – 10^2 m, scientist can work alone (‘King’ and ‘subjects’).
• Biology is messy: it doesn’t happen behind a terminal.
• Biology is competitive: many people with similar skill sets, vying for the same grants
• In summary: the structure of biological research does not inherently promote collaboration (vs., for instance, big physics or astronomy).
[Figure: research cycle with stages labeled Prepare, Ponder, Observe, Communicate, Analyze]
29. So we can do joint experiments:
Across labs, experiments: track reagents and how they are used
[Figure: linked research cycles with stages labeled Prepare, Observations, Analyze, Communicate]
30. So we can do joint experiments:
Compare outcome of interactions with these entities
[Figure: linked research cycles with stages labeled Prepare, Observations, Analyze, Communicate]
31. So we can do joint experiments:
Build a ‘virtual reagent spectrogram’ by comparing how different entities interacted in different experiments
[Figure: linked research cycles with stages labeled Prepare, Observations, Analyze, Communicate]
32. Elsevier Research Data Services:
1. Help increase the amount of data shared from the lab, enabling incidental collaboratories
2. Help increase the value of the data shared by increasing annotation, normalization, provenance, enabling enhanced interoperability
3. Help measure and deliver credit for shared data to the researchers, the institute, and the funding body, enabling more sustainable platforms
33. Summary – Possible Collaborations?
• A model of scientific sensemaking:
– Stories, that persuade with data
– Discourse segments and verb tense
  Thesis: joint research?
• Towards claim-evidence networks:
– Hedging in science
– Creating claim-evidence networks
  Labs: research collaborations?
• Data:
– Why life is so complicated
– Connecting experiments into collaboratories
  RDS: joint development?
34. References:
[1] J Am Med Inform Assoc. 2010 September; 17(5): 514–518. http://dx.doi.org/10.1136/jamia.2010.003947
[2] Quanzhi Li, Yi-Fang Brook Wu (2006): Identifying important concepts from medical documents, Journal of Biomedical Informatics 39 (2006) 668–679
[3] Useful list of resources in bioinformatics: http://www.bioinformatics.ca/
[4] Biological Expression Language: http://www.openbel.org
[5] Latour, B. and Woolgar, S., Laboratory Life: the Social Construction of Scientific Facts, 1979, Sage Publications
[6] Light M, Qiu XY, Srinivasan P. (2004). The language of bioscience: facts, speculations, and statements in between. BioLINK 2004: Linking Biological Literature, Ontologies and Databases 2004:17-24.
[7] Wilbur WJ, Rzhetsky A, Shatkay H (2006). New directions in biomedical text annotations: definitions, guidelines and corpus construction. BMC Bioinformatics 2006, 7:356.
[8] Thompson P., Venturi G., McNaught J, Montemagni S, Ananiadou S. (2008). Categorising modality in biomedical texts. Proc. LREC 2008 Wkshp Building and Evaluating Resources for Biomedical Text Mining 2008.
[9] Kim, S-M. Hovy, E.H. (2004). Determining the Sentiment of Opinions. Proceedings of the COLING conference, Geneva, 2004.
[10] de Waard, A. and Schneider, J. (2012) Formalising Uncertainty: An Ontology of Reasoning, Certainty and Attribution (ORCA), Semantic Technologies Applied to Biomedical Informatics and Individualized Medicine workshop at ISWC 2012 (submitted)
[11] Data2Semantics project: http://www.data2semantics.org/
[12] Boyce R, Collins C, Horn J, Kalet I. (2009) Computing with evidence Part I: A drug-mechanism evidence taxonomy oriented toward confidence assignment. J Biomed Inform. 2009 Dec;42(6):979-89. Epub 2009 May 10, see also http://dbmi-icode-01.dbmi.pitt.edu/dikb-evidence/front-page.html