Distributed Analytics
Over the past decade, vast amounts of machine-readable structured
information have become available through the increasing popularity of
semantic knowledge graphs in a variety of application domains.
However, a major and still unsolved research challenge is to perform
scalable analytics (machine learning, inference and querying) over this
knowledge while taking its rich semantic structures into account.
To our knowledge, current analytics methods are either not fully aware of
the semantics and structure of knowledge graphs or do not scale sufficiently.
The aim of this research line, and in particular of our SANSA project,
is to investigate whether this severe limitation can be overcome by jointly
leveraging results from distributed analytics and semantic technologies.
To achieve this, SANSA will advance the state of the art by
developing foundational models and algorithms in (1) data distribution
techniques for semantic knowledge graphs, (2) semantics-aware distributed
computation of resource embeddings in knowledge graphs, (3) adaptive
distributed querying, (4) efficient self-optimising inference execution
plans and (5) distributed symbolic machine learning approaches.
These advancements will be implemented as a semantic analytics stack
which uses distributed in-memory computing models as the foundation and
includes further layers for (1) knowledge distribution and representation,
(2) querying and inference as well as (3) machine learning.
By design, each layer will be both semantics-aware and horizontally scalable.
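
As a rough illustration of how a distribution layer and an analytics layer can fit together, the following minimal sketch partitions RDF triples by subject and then aggregates predicate usage counts from per-partition partial results. It is written in plain Python rather than on a cluster framework, and it is not the SANSA API: the sample triples and the names partition_by_subject, local_predicate_counts and predicate_usage are hypothetical and introduced only for this example.

```python
from collections import Counter
from functools import reduce
import zlib

# Hypothetical toy triples (subject, predicate, object); not taken from SANSA.
TRIPLES = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:bob", "ex:worksAt", "ex:acme"),
    ("ex:acme", "rdf:type", "ex:Company"),
]


def partition_by_subject(triples, num_partitions):
    """Assign each triple to a partition using a stable hash of its subject."""
    partitions = [[] for _ in range(num_partitions)]
    for s, p, o in triples:
        # zlib.crc32 is deterministic across runs, unlike the built-in hash().
        index = zlib.crc32(s.encode("utf-8")) % num_partitions
        partitions[index].append((s, p, o))
    return partitions


def local_predicate_counts(partition):
    """Count predicate usage within one partition (the 'map' step)."""
    return Counter(p for _, p, _ in partition)


def predicate_usage(triples, num_partitions=3):
    """Merge per-partition counts into a global result (the 'reduce' step)."""
    partitions = partition_by_subject(triples, num_partitions)
    partials = map(local_predicate_counts, partitions)
    return reduce(lambda a, b: a + b, partials, Counter())


if __name__ == "__main__":
    for predicate, count in predicate_usage(TRIPLES).most_common():
        print(predicate, count)
```

Hashing on the subject keeps all statements about one resource in the same partition, so per-resource computations need no data exchange; a real system would also have to handle skewed subjects and might choose a different partitioning scheme.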
The synthesis of the above advancements can enable powerful analytics
that have an impact on several application areas, including life sciences
(e.g. improved therapy response prediction), media and publishing
(e.g. entity resolution and semantic querying) and the internet of
things (e.g. smart meter optimisation, traffic pattern detection).
Conference Publications
- "Sparklify: A Scalable Software Component for Efficient Evaluation of SPARQL Queries over Distributed RDF Datasets"
Claus Stadler, Gezim Sejdiu, Damien Graux, Jens Lehmann
In: Proceedings of the 18th International Semantic Web Conference (ISWC), 2019
- "A Scalable Framework for Quality Assessment of RDF Datasets"
Gezim Sejdiu, Anisa Rula, Jens Lehmann, Hajira Jabeen
In: International Semantic Web Conference, Springer, 2019
- "Towards a Scalable Semantic-Based Distributed Approach for SPARQL Query Evaluation"
Gezim Sejdiu, Damien Graux, Imran Khan, Ioanna Lytra, Hajira Jabeen, Jens Lehmann
In: Proceedings of the 15th Semantics - The power of AI and knowledge graphs (Semantics), Springer, 2019
- "Divided we stand out! Forging Cohorts fOr Numeric Outlier Detection in large scale knowledge graphs (CONOD)"
Hajira Jabeen, Rajjat Dadwal, Gezim Sejdiu, Jens Lehmann
In: Proceedings of the 21st International Conference on Knowledge Engineering and Knowledge Management, 2018
- "DistLODStats: Distributed Computation of RDF Dataset Statistics"
Gezim Sejdiu, Ivan Ermilov, Jens Lehmann, Mohamed Nadjib Mami
In: Proceedings of the 17th International Semantic Web Conference (ISWC), Springer, 2018
- "Managing Lifecycle of Big Data Applications"
Ivan Ermilov, Axel-Cyrille Ngonga Ngomo, Aad Versteden, Hajira Jabeen, Gezim Sejdiu, Giorgos Argyriou, Luigi Selmi, Jürgen Jakobitsch, Jens Lehmann
In: Proceedings of the 8th International Conference on Knowledge Engineering and Semantic Web (KESW), Springer, 2017
- "Distributed Semantic Analytics using the SANSA Stack"
Jens Lehmann, Gezim Sejdiu, Lorenz Bühmann, Patrick Westphal, Claus Stadler, Ivan Ermilov, Simon Bin, Nilesh Chakraborty, Muhammad Saleem, Axel-Cyrille Ngonga Ngomo
In: Proceedings of the 16th International Semantic Web Conference (ISWC), pages 147-155, Springer, 2017
- "The BigDataEurope Platform - Supporting the Variety Dimension of Big Data"
Sören Auer, Simon Scerri, Aad Versteden, Erika Pauwels, Angelos Charalambidis, Stasinos Konstantopoulos, Jens Lehmann, Hajira Jabeen, Ivan Ermilov, Gezim Sejdiu, Andreas Ikonomopoulos, Spyros Andronopoulos, Mandy Vlachogiannis, Charalambos Pappas, Athanasios Davettas, Iraklis A. Klampanos, Efstathios Grigoropoulos, Vangelis Karkaletsis, Victor de Boer, Ronald Siebes, Mohamed Nadjib Mami, Sergio Albani, Michele Lazzarini, Paulo Nunes, Emanuele Angiuli, Nikiforos Pittaras, George Giannakopoulos, Giorgos Argyriou, George Stamoulis, George Papadakis, Manolis Koubarakis, Pythagoras Karampiperis, Axel-Cyrille Ngonga Ngomo, Maria-Esther Vidal
In: Proceedings of the 17th International Conference on Web Engineering (ICWE), Lecture Notes in Computer Science, Springer, 2017
- "Distributed Big Data platform for Life Sciences"
Hajira Jabeen, Jens Lehmann
In: Proceedings of the 1st KAUST Research Conference on Computational and experimental interfaces of Big Data and Biotechnology (CBRC-Conf), 2016
Workshop Publications
Demos
- "Clustering Pipelines of Large RDF POI Data"
Rajjat Dadwal, Damien Graux, Gezim Sejdiu, Hajira Jabeen, Jens Lehmann
In: Extended Semantic Web Conference Posters and Demos, 2019
- "The Hubs and Authorities Transaction Network Analysis using the SANSA framework"
Danning Sui, Gezim Sejdiu, Damien Graux, Jens Lehmann
In: Joint Proceedings of the Posters and Demos Track of the International Conference on Semantic Systems 2019, 2019
- "SPIRIT: A Semantic Transparency and Compliance Stack"
Patrick Westphal, Javier Fernandez, Sabrina Kirrane, Jens Lehmann
In: Semantics 2018 Poster and Demos, 2018
- "STATisfy Me: What are my Stats?"
Gezim Sejdiu, Ivan Ermilov, Jens Lehmann, Mohamed Nadjib Mami
In: Proceedings of the 17th International Semantic Web Conference (ISWC), 2018 - Posters & Demos, 2018
- "Profiting from Kitties on Ethereum: Leveraging Blockchain RDF with SANSA"
Damien Graux, Gezim Sejdiu, Hajira Jabeen, Jens Lehmann, Danning Sui, Dominik Muhs, Johannes Pfeffer
In: Proceedings of 14th International Conference on Semantic Systems, 2018
- "The Tale of Sansa Spark"
Ivan Ermilov, Jens Lehmann, Gezim Sejdiu, Lorenz Bühmann, Patrick Westphal, Claus Stadler, Simon Bin, Nilesh Chakraborty, Henning Petzka, Muhammad Saleem, Axel-Cyrille Ngonga Ngomo, Hajira Jabeen
In: Proceedings of 16th International Semantic Web Conference, Poster & Demos, Springer, 2017