Object Oriented Queries Over Software Systems
ACM SIGPLAN 2007 Workshop on Partial Evaluation and Program Manipulation
Invited Talk by Oege de Moor
Joint work with Elnar Hajiyev, and Mathieu Verbaere
Abstract
Code queries are useful for enforcing coding conventions, navigating a large code base, and
for identifying locations to refactor. The program understanding community has long advocated
the use of a relational database to facilitate such code queries, but relational queries over code
have not found widespread use.
We argue this is due to a lack of scalability (it takes too long to evaluate queries), and a lack
of a modern query language with all the tool support that entails (writing code queries in SQL
can take many pages). In this talk we demonstrate a new system that goes some way towards
alleviating both problems.
First, we demonstrate IQL, an object-oriented query language. It allows the concise expression of
complex queries; due to its object-oriented features, it is easy to extend and modify existing
queries. Another benefit of object-orientation is that it gives natural tool support, for instance
for auto-completion.
The semantics of IQL is defined by translation to Datalog. Datalog is a logic programming language
like Prolog, but it lacks data structures. As a consequence, all queries in Datalog are guaranteed
to terminate, and they have a very simple semantics, enabling aggressive optimisations.
While Datalog has been extensively studied in the theoretical database community, it lacks
a proper type system. We introduce a type system with subtyping to account for the class hierarchy
of IQL. Type inference is achieved via an abstract interpretation that maps each relation between runtime
values to a relation between binding sites.
We implement Datalog itself via a compiler to procedural SQL; a configuration file allows us one
target different relational database systems as the backend. The compiler performs many
optimisations, ranging from straightforward constant propagation to a form of the well-known
`magic sets' transformation. It also performs type specialisation, based on the abstract
interpretation in the type inference algorithm.