6. MRC Stratified Medicine: Security and Data Sharing Models
“Controlled Data Hub”
• Data can be downloaded
• Full audit trail
• Analysis at partner sites
• Background data by CDA
Pros
• Available now
• More agile analysis
• Simpler admin
• Analyst freedom locally
Cons
• Less secure
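One way to picture the "full audit trail" for downloads is an append-only log in which each entry is chained to the previous one by a hash, so later tampering is detectable. This is a minimal sketch only, not the hub's actual mechanism; the class `AuditTrail`, its methods, and the user and data set names are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


class AuditTrail:
    """Append-only download log (hypothetical); entries are hash-chained."""

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(entry):
        # Hash every field except the stored hash itself.
        payload = {k: v for k, v in entry.items() if k != "hash"}
        return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()

    def record_download(self, user, dataset, version):
        # Link the new entry to the hash of the previous one.
        prev_hash = self.entries[-1]["hash"] if self.entries else ""
        entry = {
            "user": user,
            "dataset": dataset,
            "version": version,
            "time": datetime.now(timezone.utc).isoformat(),
            "prev": prev_hash,
        }
        entry["hash"] = self._digest(entry)
        self.entries.append(entry)
        return entry

    def verify(self):
        # Recompute the chain; any edited entry breaks it.
        prev = ""
        for entry in self.entries:
            if entry["prev"] != prev or entry["hash"] != self._digest(entry):
                return False
            prev = entry["hash"]
        return True


trail = AuditTrail()
trail.record_download("analyst_a", "example_dataset", "v1")  # names are illustrative
trail.record_download("analyst_b", "example_dataset", "v2")
print(trail.verify())
```

A log like this records who downloaded what, but it cannot control what happens to the data afterwards, which is why the hub model is listed as less secure.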
“Analysis Safe Haven”
• Data cannot be downloaded
• All analysis runs on the safe haven
Pros
• Secure
Cons
• Complex server admin
• Less flexible analysis
• VM access can be slow
• Solution not available yet
• Resource intensive (high cost)
Open / Closed
Shared Principles
• Standardised, versioned data
• Access based on contribution “give more, get more”
• Access control by data set
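The three principles above can be combined into a single access check. This is a rough sketch under stated assumptions: the tier names and thresholds in `TIER_THRESHOLDS`, the `Partner` record, and the `catalogue` layout are all hypothetical, since the slide states only "give more, get more" and per-data-set control.

```python
from dataclasses import dataclass, field

# Hypothetical tiers: (minimum number of contributed data sets, tier name).
TIER_THRESHOLDS = [(0, "none"), (1, "summary"), (3, "full")]


@dataclass
class Partner:
    name: str
    contributed: set = field(default_factory=set)  # data sets this partner supplied
    granted: set = field(default_factory=set)      # data sets approved for this partner


def tier(partner):
    """Return the highest tier whose threshold the partner's contributions meet."""
    level = "none"
    for minimum, name in TIER_THRESHOLDS:
        if len(partner.contributed) >= minimum:
            level = name
    return level


def can_access(partner, dataset, version, catalogue):
    """Require a released version, a per-data-set grant, and some contribution."""
    if version not in catalogue.get(dataset, ()):
        return False  # only standardised, versioned releases are served
    if dataset not in partner.granted:
        return False  # access control by data set
    return tier(partner) != "none"  # "give more, get more"


catalogue = {"clinical": ("v1", "v2")}  # illustrative release catalogue
site_a = Partner("site_a", contributed={"genotypes"}, granted={"clinical"})
print(tier(site_a), can_access(site_a, "clinical", "v2", catalogue))
```

Keeping the grant check separate from the contribution tier means a partner can be approved for one data set without automatically gaining access to the others.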
7. PSORT/RAMAP Controlled Data Hub
TranSMART
• Data QC
• Secure Data Sharing
• Data Standardisation
• Data Integration
• Data Export
• Simple data analysis
• Pipelined analysis (plugins)
• R backend
Analysis on Data Exports using local compute resources
• R statistical analysis etc.
• Machine Learning
• GWAS
• High-dimensional analysis
• High Performance Computing
8. MATURA “Analysis Safe Haven”
Analysis on Data Exports using elastic compute resources on VM
• Standardised reference data sets
• R statistical analysis etc
• Machine Learning
• GWAS
• High-dimensional analysis
• High Performance computing
Data Exports
MATURA CMB Approval
Public Data
Analysis VM
9. MRC Stratified Medicine: Infrastructure Evolution
Now
• QMUL data centre
• TranSMART hosting
• Secure data access
• Limited analysis
Sept 2015: MRC eMEDLAB VM hosting
• Central data centre
• OpenStack architecture
• TranSMART hosting
• Secure data access
• Elastic compute for all analysis