# Best Thesis Award of the EMCL

## 2016: Medina Andresel

### Title of master thesis: A Compilation Technique for Interactive Ontology-mediated Data Exploration

**Abstract: **Ontologies have been extensively advocated in the last decade as a means to enable a shared understanding of resources between different users and applications. One of their championed applications is Ontology-based Data Access (OBDA), where ontologies mediate access to data sources, providing a high-level conceptual view of the data and making background knowledge available for reasoning at query time. Much research efforts in the last years have been dedicated to the core problem in OBDA: answering a query while leveraging the knowledge given by an ontology, sometimes called ontology mediated query answering (OMQA). However, most work focuses on providing algorithms for answering queries one-at-a-time. In this thesis, we make a first step towards interactive answering queries, that are syntactically related, and exploring targeted fragment of data, in the ontology-mediated setting. We introduce a technique to construct an offline compilation that allows, in online phase, to effciently answer (without accessing the original data source) different variations of queries that fall between two query bounds (given as input to capture user's information needs from above and below). We consider ontologies formalized using DL-Lite - the most popular ontology language, and we focus on conjunctive queries (CQs) that are tree-shaped and have one answer variable - relevant for practical purposes by allowing to capture objects that can have complex combinations of properties. The compilation represents the underpinning for constructing relevant query modiffications, which help the user analyze the data by reducing or increasing the number of answers in a minimal way, or identifying all most specific common properties that the answers of a given query share. The experiments carried out with our prototype implementation reveal an overall good performance of the query-answering procedure and, most importantly, our solution seems to be a promising first step towards flexible ontology-mediated data exploration.

## 2015: Adrian Rebola-Pardo

### Title of master thesis: Unsatisfiability Proofs in SAT Solving with Parity Reasoning

**Abstract:** SAT solvers have become very efficient in the last two decades, due to the use of many techniques. Because of the increase in code complexity, measures to increase reliability of results reported by SAT solvers were implemented. Currently, state-of-the-art SAT solvers are able to produce unsatisfiability proofs when confronted with an unsatisfiable instance. However, it is not known how to generate unsatisfiability proofs for a few very proficient techniques, and this situation forces these techniques to be disabled when unsatisfiability proofs are required. One such technique is parity reasoning, which is essential to the application of SAT solvers to cryptography. In this talk, we solve the problem of generating unsatisfiability proofs for SAT solvers with integrated parity reasoning. In doing so, some theoretical contributions are proposed, including a generalized framework for proof systems that is able to model the DRAT proof standard, currently used by state-of-the-art solvers.

## 2014: Tobias Kaminski

### Title of master thesis: Hybrid MKNF Knowledge Bases Under Paraconsistent Well-Founded Semantics

**Abstract:** Ontologies formalized by means of Description Logics (DLs) and rules in the form of Logic Programs (LPs) are two prominent formalisms in the field of Knowledge Repre- sentation and Reasoning. While DLs adhere to the Open World Assumption and are suited for taxonomic reasoning, LPs implement reasoning under the Closed World Assumption, so that default knowledge can be expressed. However, for many applications it is useful to have a means that allows reasoning over an open domain and expressing rules with exceptions at the same time. Hybrid MKNF knowledge bases make such a means avail- able by formalizing DLs and LPs in a common logic, the Logic of Minimal Knowledge and Negation as Failure (MKNF).Since rules and ontologies are used in open environments such as the Semantic Web, inconsistencies cannot always be avoided. This poses a problem due to the Principle of Ex- plosion, which holds in classical logics. Paraconsistent Logics offer a solution to this issue by assigning meaningful models even to contradictory sets of formulas. Consequently, paraconsistent semantics for DLs and LPs have been investigated intensively. Our goal is to apply the paraconsistent approach to the combination of DLs and LPs in hybrid MKNF knowledge bases.In this thesis, a new six-valued semantics for hybrid MKNF knowledge bases is intro- duced, extending the three-valued approach by Knorr et al., which is based on the well- founded semantics for logic programs. Additionally, a procedural way of computing paraconsistent well-founded models for hybrid MKNF knowledge bases by means of an alternating fixpoint construction is presented and it is proven that the algorithm is sound and complete w.r.t. the model-theoretic characterization of the semantics. Moreover, it is shown that the new semantics is faithful w.r.t. well-studied paraconsistent semantics for DLs and LPs, respectively, and maintains the efficiency of the approach it extends

## 2013: Fariz Darari

### Title of master thesis: Completeness Reasoning for Linked Data Queries

**Abstract:** With the availability of a huge amount of RDF data sources on the Web covering heterogeneous domains, it is increasingly important to provide descriptions of their data quality, which is varied each other. An important dimension of data quality is data completeness, which concerns about how complete an RDF data source is. With this measurement, users can then decide which RDF data sources meet their needs. In this thesis, we propose a completeness reasoning framework for Linked Data queries. We present a formalism to describe the completeness of RDF data sources and show how existing RDF data sources can be described with machine-readable completeness statements. We then investigate the problem of query completeness and study how completeness statements about RDF data sources can infer query completeness. We consider various data settings and query settings for completeness reasoning. Finally, we develop CoRner, a completeness reasoning system based on our completeness reasoning framework.

## 2012: Ronald de Haan

### Title of master thesis: Description Logic Based Reasoning on Programming Languages

**Abstract:** Imperative programming languages are ubiquitous in virtually all fields of technology, with programs specifying all sorts of computational behavior. For many practical reasons, an automated analysis of semantic properties of programs, such as termination and equivalence, is desirable. We provide a new approach to the automated semantic analysis of programs by encoding their behavior into formal logic. We consider a few syntactically simple imperative programming languages, and we encode programs of these languages into expressions of the description logic ALC(D) for a particular domain D. We do this in such a way that models of these encodings correspond to executions of the source programs. In other words, essentially, we assign a model-theoretic semantics to imperative programs. This encoding makes it possible to express semantic properties of programs (most notably termination and equivalence) in the formal logic language. Effectively, in this fashion, we reduce reasoning problems defined on the programs to description logic reasoning. Practically, this directly results in algorithms to perform automated reasoning on a number of restricted fragments of the programming languages (i.e. loop-free programs, or programs restricted to a finite numerical domain). Theoretically, our approach makes it possible to identify further, less restricted fragments of the programming languages for which certain reasoning tasks are decidable. We identify one such fragment, based on finite partitionings of the state space, and illustrate what class of programs belongs to this fragment.

## 2011: Giorgio Stefanoni

### Title of master thesis: Explaining Query Answers in Lightweight Ontologies: The DL-Lite Case

**Abstract:** Accessing data through ontologies expressed in Description Logics (DL) is important for the realization of the SemanticWeb. DL-LiteA, member of the DL-Lite family of DLs, is particularly suitable for answering Conjunctive Queries over large data sets. In order to meet usability requirements, most logic-based applications provide explanation facilities for reasoning services. This holds also for DL-LiteA, where research focused on the explanation of both TBox reasoning and, more recently, query answering. This thesis consists of a complete study on the latter problem, hence, we study both explanations for the absence and the presence of a tuple in the answers.

Unfortunately, it happens that users are not always provided with the result they are expecting when querying an ontology. Then, it is important to provide them with an high-level motivation for the absence of a tuple in the results, i.e., find an explanation for a negative answer. In fact, in contrast with standard database technology, in the ontology setting this problem is not easily solvable by looking at the structure of the data stored in the data-layer and, for example, relaxing the conditions imposed in the query. The reason is that the reasoning involved for answering queries, provides an additional layer obfuscating the way the data is accessed by the system. We address the problem of explaining negative answers for (conjunctive) query answering over DL-LiteA ontologies, by adopting abductive reasoning, that is, we look for additions to the ABox that force a given tuple to be in the result. As reasoning tasks, we consider existence and recognition of an explanation, and relevance and necessity of a certain assertion for an explanation. An important aspect in explanations is to provide the user with solutions that are simple to understand and free of redundancy, hence as small as possible. To address this requirement, we study various restrictions on solutions, in particular, we focus on subset minimal and cardinality minimal ones.

Besides explaining the absence of a tuple in a query answer, it is important to explain also why a given tuple is returned by the system. The presence of a tuple in the results of a query over a DL-LiteA ontology does not only depend on the data but also on the constraints expressed at the conceptual level. For this reason, it is important to provide users with insights on how conceptual information has been used to return a certain tuple in the answers. This problem has been tackled already in the literature, where an high-level procedure explaining positive answers to conjunctive queries over DL-LiteA ontologies is introduced. As now, the mentioned procedure has never been given a formal algorithmic counterpart and, therefore, it is not easily implementable in a real world system. For this reason, we close the gap between theory and practice by providing such an algorithmic solution to the problem. Also, we improve the given procedure by considering minimal explanations for positive answers. Further, this thesis contributes with a new use of this algorithm to solve the problem of explaining to domain users the inconsistency of a DL-LiteA knowledge base.

## 2010: Christian Drescher

### Title of master thesis: Symmetry Breaking for Answer Set Programming

**Abstract:** In the context of answer set programming, this work investigates symmetry detection and symmetry breaking to eliminate symmetric parts of the search space and, thereby, simplify the solution process. We contribute a reduction of symmetry detection to a graph automorphism problem which allows to extract symmetries of a logic program from the symmetries of the constructed coloured graph. The correctness of our reduction is rigorously proven. We also propose an encoding of symmetry-breaking constraints in terms of permutation cycles and use only generators in this process which implicitly represent symmetries and always with exponential compression. These ideas are formulated as preprocessing and implemented in a completely automated flow that first detects symmetries from a given answer set program, adds symmetry-breaking constraints, and can be applied to any existing answer set solver. We demonstrate computational impact on benchmarks versus direct application of the solver.

Furthermore, we explore symmetry breaking for answer set programming in two domains: first, constraint answer set programming as a novel approach to represent and solve constraint satisfaction problems, and second, distributed nonmonotonic multi-context systems. In particular, we formulate a translation-based approach to constraint answer set solving which allows for the application of our symmetry detection and symmetry breaking methods. To compare their performance with a-priori symmetry breaking techniques, we also contribute a decomposition of the global value precedence constraint that enforces domain consistency on the original constraint via the unit-propagation of an answer set solver. We prove correctness and evaluate both options in an empirical analysis. In the context of distributed nonmonotonic multi-context system, we develop an algorithm for distributed symmetry detection and also carry over symmetry-breaking constraints for distributed answer set programming.

## 2009: Carroline Dewi Puspa Kencana Ramli

### Title of master thesis: Logic Programs and Three-Valued Consequence Operators

**Abstract:** In this thesis, first we look for conditions under which both operators are continuous and when they acquire the property of being a contraction. We also introduce a level mapping characterisation of the new operator that puts it within the same framework with other three-valued semantics for logic programs, including the Fitting and well-founded semantics. Then we turn our attention to the underlying three-valued logic used to characterise these operators, dubbed the Fitting semantics. We will see that under this semantics, the model of completed program is not necessarily a model of the program itself. This happens because under Fitting semantics, the law of equivalence does not hold. We show that the {L}ukasiewicz semantics is a good candidate to replace Fitting semantics since it admits the law of equivalence while not changing the meaning or properties of logic programs. Further, we present the core method, a connectionist model generator for logic programs, that can easily be adapted to handle Stenning and van Lambalgen's approach. Finally, since under the new operator negative information is difficult to express in the program, we propose a number of approaches to add this kind of expressivity to the formalism.

## 2008: Jean Christoph Jung

### Title of master thesis: Value Orderings Based on Solution Counting

**Abstract:** Constraint satisfaction problems are typically solved using backtracking search. One key issue for backtracking is the selection of a good value, but comparably little effort was dedicated to this topic . Another important aspect of constraint programming besides looking for one solution is counting solutions either in the whole problem or in single constraints. Recent works give first ideas how to exploit the advances in solution counting in designing better value ordering heuristics. With this thesis we want to elaborate these in more depth.

We intend to make two main contributions. First, we provide a novel approach for solution counting in individual constraints, which results in a different view on counting in regular constraints and in counting algorithms for extensional and grammar constraints. Second, we consider the problem of a good value selection from different perspectives and derive several value ordering heuristics based on solution counting. We evaluate both the counting and the heuristics by solving rostering problems modelled with grammar and regular constraints.

## 2007: Novak Novakovic

### Title of master thesis: Proof-theoretic Approach to Deciding Subsumption and Computing Least Common Subsumer in EL w.r.t. Hybrid Tboxes

**Abstract:** In his master thesis he studied lightweight description logics which admit tractable reasoning and, in particular, he developed and implemented algorithms for deciding subsumption and computing least common subsumers in the description logic EL. As pointed out by one referee of his thesis, the contribution of Novak is non-trivial and beyond a standard master thesis. He used a number of diverse and novel techniques and showed significant creativity in applying techniques from theoretical computer science to novel cases and in developing completely new proof ideas. Novak thus sticks out as a highly talented and inspired student of computational logic, who is able to understand and solve difficult questions in an independent and efficient way.

## 2006: María Magdalena Ortíz de la Fuente

### Title of master thesis: Data Complexity of Query Answering in Expressive Description Logics

**Abstract:** DLs/ontologies are increasingly seen as a mechanism to query data repositories. This calls for investigating flexible query mechanisms over expressive knowledge bases. In particular, very expressive description logics must be considered (in oder to capture common data modeling constructs) in combination with well established query languages, such as those inspired by database technology. Moreover, since the extensional data (ABox) is usually much larger thanthe terminological component (TBox), it is crucial to study data complexity, i.e. to single out the contribution of the ABox in the overall complexity of reasoning, in order to optimize the inference techniques with respect to the data size. In this talk, we will present a tableaux-based algorithm for checking entailment of conjunctive queries and unions of conjunctive queries in the DLs SHIQ and SHOIQ. The standard blocking conditions of the SHIQ/SHOIQ tableau are replaced by some modified ones, which make it suitable for answering all queries that do not contain transitive roles. The algorithm provides a coNP upper bound in data complexity. This, together with a long known matching lower bound for logics even weaker than ALC, shows that a wide range of DLs are coNP complete in data complexity. Some other applications of the novel tableaux-based algorithm are discussed. Notably, since the algorithm is inspired by work on hybrid knowledge bases, it allows us to extend the CARIN family of languages to CARIN-SHOIQ, a hybrid knowledge representation language combining very expressive DLs and non-recursive Datalog rules, and provide a sound and complete reasoning algorithm for it.