Damian Gjurovski

(AG Database and Information Systems, Prof. Michel)
hosted by PhD Program in CS @ TU KL

"Query processing over massive schema-free data"

Recent years have witnessed a major shift from traditional data management over mostly relational data toward application-tailored data formats without a fixed schema. This thesis therefore first focuses on computing natural joins over massive streams of JSON documents that do not adhere to a specific schema. An efficient and scalable partitioning algorithm, built on the main principles of association analysis, identifies patterns of co-occurrence among the attribute-value pairs within the documents. Documents are then forwarded accordingly and joined using a novel FP-tree-based join algorithm, which allows compact storage and efficient traversal. The talk also proposes broadening the research area in future work, including extending the query processing approaches to local joins or to other data formats, such as knowledge graphs. Extensive experiments demonstrate the purpose and performance of the developed algorithms, and the talk concludes with a discussion of future directions for the thesis.


Time: Monday, 22.06.2020, 15:30
Place: https://bbb.rlp.net/b/mid-wdt-qt2
Video:

Download the event as an iCal file and import it into your calendar.