LOD2 Webinar: SIREn
1. Creating Knowledge out of Interlinked
Data
LOD2 Webinar . 24.06.2014 . Page 1 http://lod2.eu
2. Creating Knowledge out of Interlinked
Data
LOD2 is a large-scale integrating project co-funded by the European
Commission within the FP7 Information and Communication Technologies
Work Programme. This 4-year project comprises leading Linked Open
Data technology researchers, companies, and service providers. Coming
from 12 countries, the partners are coordinated by the Agile
Knowledge Engineering and Semantic Web Research Group at the
University of Leipzig, Germany.
LOD2 will integrate and syndicate Linked Data with existing large-scale
applications. The project shows the benefits in the scenarios of Media and
Publishing, Corporate Data intranets and eGovernment.
3. Creating Knowledge out of Interlinked
Data
Once per month the LOD2 webinar series offers a free webinar about
tools and services along the Linked Open Data Life Cycle.
Stay with us and learn more about acquisition, editing, composing,
connected applications – and finally publishing Linked Open Data.
4. Creating Knowledge out of Interlinked
Data
Agenda
• Nested Data Model
• SIREn Overview
• Getting Started with the SIREn Elasticsearch Plugin
• Demo
5. Creating Knowledge out of Interlinked
Data
Schema-Less Nested Data Model
• Model becoming prevalent: JSON, XML, Avro, …
– Can be arbitrarily nested and large
– No strict schema / structure enforced
• Schema-less brings
– Flexibility
– Ease of development
• Developers do not have to invest significant modelling
effort upfront
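As an illustration of the schema-less point above, the following Python sketch decodes two JSON records with different structures and measures their nesting depth without declaring any schema first. The records, field names, and the `max_depth` helper are invented for this example.

```python
import json

# Two hypothetical records from the same collection with different structure.
records = [
    '{"name": "Alice", "age": 30}',
    '{"name": "Bob", "address": {"city": "Galway", "geo": {"lat": 53.27, "lon": -9.05}}}',
]

def max_depth(value, depth=0):
    """Return the nesting depth of a decoded JSON value (a scalar has depth 0)."""
    if isinstance(value, dict):
        return max((max_depth(v, depth + 1) for v in value.values()), default=depth + 1)
    if isinstance(value, list):
        return max((max_depth(v, depth + 1) for v in value), default=depth + 1)
    return depth

# No schema is declared: each record is decoded and inspected as it comes.
docs = [json.loads(r) for r in records]
depths = [max_depth(d) for d in docs]

for doc, depth in zip(docs, depths):
    print(sorted(doc.keys()), "depth:", depth)
```

The second record adds a nested `address` object that the first one lacks, and nothing had to be modelled upfront to read either of them.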
6. Creating Knowledge out of Interlinked
Data
Introducing SIREn
• Lucene, Solr and Elasticsearch plugin for indexing and
searching JSON
• Rich data model (JSON)
– Nested objects, nested arrays, datatypes
– Generic architecture compatible with various nested data models: JSON,
JSON-LD, XML, Avro, ...
• Schema-agnostic
– SIREn does not require any schema definition to index and search data
– Schema definition can change across records
• Designed from the ground up for high performance and
scalability
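To make "schema-agnostic" concrete, here is a minimal Python sketch of indexing records whose structure differs, using a toy in-memory inverted index keyed by (path, value). This is only an illustration of the idea; it is not SIREn's index format or API, and the records, the `path_values` helper, and the dotted path notation are assumptions made for the example.

```python
from collections import defaultdict

def path_values(value, path=""):
    """Yield (path, scalar) pairs for every leaf of a decoded JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from path_values(child, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        for child in value:
            # Assumption: array items share the path of the array itself.
            yield from path_values(child, path)
    else:
        yield path, value

# Hypothetical records: the second has a nested object the first lacks.
records = {
    1: {"name": "Alice", "tags": ["search", "json"]},
    2: {"name": "Bob", "employer": {"name": "SindiceTech", "tags": ["search"]}},
}

# (path, value) -> ids of the records that contain that pair.
index = defaultdict(set)
for rec_id, rec in records.items():
    for path, val in path_values(rec):
        index[(path, val)].add(rec_id)

print(sorted(index[("tags", "search")]))
print(sorted(index[("employer.tags", "search")]))
```

Both records are indexed by the same loop even though no schema was defined and their structures differ.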
7. Creating Knowledge out of Interlinked
Data
Introducing SIREn
8. Creating Knowledge out of Interlinked
Data
Elasticsearch - Overview
• Document-oriented search and analytics engine
– JSON Document
– Based on Apache Lucene
• Distributed, with replication
– High Performance and Availability
• REST API
9. Creating Knowledge out of Interlinked
Data
Elasticsearch – Basic Concepts
• Index = Collection of Documents
– Can have multiple shards and replicas
• Type = A set of documents sharing the same schema
– Similar to a DB table
• Document = JSON object
– Uniquely identified (index/type/id)
– Similar to a DB record
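The index/type/id identification above can be sketched in Python as follows. The `DocumentRef` class, the in-memory `store`, and the index, type, and id values are hypothetical; the path layout simply mirrors the index/type/id triple named on the slide.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DocumentRef:
    """Identifies one document by its (index, type, id) triple."""
    index: str     # collection of documents
    doc_type: str  # set of documents sharing a schema, like a DB table
    doc_id: str    # identifier of the document, like a DB record key

    def path(self) -> str:
        # Assumed resource path of the form /index/type/id.
        return f"/{self.index}/{self.doc_type}/{self.doc_id}"

# A toy in-memory store keyed by the triple, standing in for a real cluster.
store = {}
ref = DocumentRef(index="library", doc_type="book", doc_id="1")
store[ref] = {"title": "Linked Data", "year": 2014}

print(ref.path())
print(store[DocumentRef("library", "book", "1")])
```

Because the dataclass is frozen, two references with the same triple compare equal and hash identically, which models "uniquely identified".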
10. Creating Knowledge out of Interlinked
Data
SIREn – Basic Concepts
• JSON object = Tree
– Different mappings available
• Node = An element of the tree
– Can have a parent and one or more children
– Contains data: text, numeric, boolean
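One way to picture "JSON object = tree" is the Python sketch below, which turns a decoded JSON value into nodes that each know their parent and children and carry a scalar at the leaves. It is one possible mapping chosen for illustration, not necessarily the one SIREn uses; the `Node` class, the `$` root label, and the sample document are assumptions.

```python
from dataclasses import dataclass, field
from typing import Any, Iterator, List, Optional, Tuple

@dataclass
class Node:
    label: str                      # field name, array position, or "$" for the root
    value: Any = None               # scalar data (text, numeric, boolean) on leaves
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: List["Node"] = field(default_factory=list)

def build_tree(data: Any, label: str = "$", parent: Optional[Node] = None) -> Node:
    """Recursively map a decoded JSON value to a tree of Node objects."""
    node = Node(label=label, parent=parent)
    if isinstance(data, dict):
        for key, child in data.items():
            node.children.append(build_tree(child, key, node))
    elif isinstance(data, list):
        for position, child in enumerate(data):
            node.children.append(build_tree(child, str(position), node))
    else:
        node.value = data
    return node

def leaves(node: Node, path: Tuple[str, ...] = ()) -> Iterator[Tuple[str, Any]]:
    """Yield (path, value) for every leaf node under the given node."""
    path = path + (node.label,)
    if not node.children:
        yield "/".join(path), node.value
    for child in node.children:
        yield from leaves(child, path)

# Hypothetical document used only for this example.
doc = {"title": "SIREn", "authors": [{"name": "Renaud"}, {"name": "Harish"}], "open": True}
tree = build_tree(doc)
for leaf_path, leaf_value in leaves(tree):
    print(leaf_path, "=", leaf_value)
```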
12. Creating Knowledge out of Interlinked
Data
Getting Started with Elasticsearch & SIREn
http://sirendb.com/downloads/
(Elasticsearch Distribution Coming Soon)
28. Creating Knowledge out of Interlinked
Data
Searching: Twig Query
• Query operators for Ancestor-Descendant and Parent-Child
relationships
• Consists of a root query and one or more child and
descendant queries
• Can be nested to form a complex tree structure
[Diagram: example twig query tree with node labels Boolean, Phrase, Twig, Range, Boolean and clause labels MUST, NOT, MUST, SHOULD]
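The twig query idea described above can be sketched in Python as a small in-memory evaluator: a root predicate plus clauses that must, should, or must not match on the child or descendant axis, with twigs nested inside twigs. This is not SIREn's query API. The `Twig` and `Clause` classes, the helper predicates, and the sample documents are all hypothetical, and the MUST/SHOULD/NOT semantics are an assumption modelled on Boolean queries (a SHOULD clause is only required when there is no MUST clause).

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Iterator, List, Tuple

Node = Tuple[str, Any]  # (label, decoded JSON value)

def children(node: Node) -> Iterator[Node]:
    """Yield the direct children of a node (parent-child axis)."""
    _, value = node
    if isinstance(value, dict):
        yield from value.items()
    elif isinstance(value, list):
        for position, item in enumerate(value):
            yield (str(position), item)

def descendants(node: Node) -> Iterator[Node]:
    """Yield every node below a node (ancestor-descendant axis)."""
    for child in children(node):
        yield child
        yield from descendants(child)

@dataclass
class Clause:
    occur: str     # "MUST", "SHOULD" or "NOT"
    axis: str      # "child" or "descendant"
    query: "Twig"  # nested twig, which allows complex tree structures

@dataclass
class Twig:
    root: Callable[[Node], bool]  # the root query, as a predicate on a node
    clauses: List[Clause] = field(default_factory=list)

    def matches(self, node: Node) -> bool:
        if not self.root(node):
            return False
        results = []
        for clause in self.clauses:
            candidates = children(node) if clause.axis == "child" else descendants(node)
            results.append((clause.occur, any(clause.query.matches(n) for n in candidates)))
        if any(hit for occur, hit in results if occur == "NOT"):
            return False
        musts = [hit for occur, hit in results if occur == "MUST"]
        shoulds = [hit for occur, hit in results if occur == "SHOULD"]
        if musts:
            return all(musts)
        return any(shoulds) if shoulds else True

def field_equals(name: str, expected: Any) -> Callable[[Node], bool]:
    return lambda node: node[0] == name and node[1] == expected

def year_at_least(minimum: int) -> Callable[[Node], bool]:
    return lambda node: node[0] == "year" and isinstance(node[1], int) and node[1] >= minimum

# Hypothetical documents and query.
docs = [
    {"type": "talk", "title": "SIREn", "venue": {"name": "LOD2", "year": 2014}},
    {"type": "talk", "title": "Virtuoso", "venue": {"name": "LOD2", "year": 2011}},
]
venue_twig = Twig(lambda n: n[0] == "venue",
                  [Clause("MUST", "child", Twig(year_at_least(2012)))])
query = Twig(lambda n: True, [
    Clause("MUST", "child", Twig(field_equals("type", "talk"))),
    Clause("MUST", "child", venue_twig),
    Clause("NOT", "descendant", Twig(field_equals("name", "Unknown"))),
])

hits = [d["title"] for d in docs if query.matches(("$", d))]
print(hits)
```

Here the range-style condition on `year` sits inside a nested twig under `venue`, so only documents whose venue has a matching child are returned.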
38. Creating Knowledge out of Interlinked
Data
Demo
39. Creating Knowledge out of Interlinked
Data
Conclusion
• SIREn’s Key Features
– Dynamic/Schema-less Data Management
– Nested Data
– High performance and scalability
– Powerful search operators
– Elasticsearch/Solr integration
• Contact
– http://sirendb.com
– SindiceTech
– info@sindicetech.com
40. Creating Knowledge out of Interlinked
Data
Credits
Jingle: R.E.M., Martin Kaltenböck, Florian Kondert
Coordination: Thomas Thurner, Martin Kaltenböck
Moderation: Martin Kaltenböck
Presented by: Renaud Delbru, Harish Kumar
41. Creating Knowledge out of Interlinked
Data
Hope you enjoyed staying with us – if you need more detailed
information, visit us at www.lod2.eu and let us know how we can
improve to meet your expectations!
Don’t forget to register for our next webinar
20.12.2011 - Virtuoso (OpenLink Software)
24.01.2012 - OntoWiki (University of Leipzig, Germany)
Have a great day and don’t forget ...
Editor's notes
Binary including a full distribution of Elasticsearch, with SIREn pre-installed.
The full JSON document will be indexed both in Elasticsearch and in SIREn.
Need to give an example here