Module reformulation-core
Result Sets
Package answering.resultset provides interfaces defining the possible results of Ontop queries:
- any query result object (OBDAResultSet) extends Closeable and must be released after use;
- ignoring intermediate levels, there are three types of results: BooleanResultSet (a boolean value), TupleResultSet and SimpleGraphResultSet (iterators over OntopBindingSet and Assertion instances);
- interfaces OntopBindingSet and OntopBinding mirror RDF4J interfaces BindingSet and Binding and model a result tuple and a variable binding within that tuple;
- AbstractOntopBindingSet and OntopBindingImpl provide complete implementations of bindings, based on a LinkedHashMap (AbstractOntopBindingSet is declared abstract although fully implemented).
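The "must be released after use" contract maps naturally onto try-with-resources. The sketch below is only an illustration: SketchResultSet, SketchTupleResultSet and InMemoryTupleResultSet are hypothetical, simplified stand-ins whose method signatures are assumptions, not the actual interfaces of answering.resultset.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins: names echo the text above, signatures are assumed.
interface SketchResultSet extends Closeable {
    @Override
    void close(); // narrowed so that this sketch throws no checked exception
}

interface SketchTupleResultSet extends SketchResultSet {
    boolean hasNext();
    Map<String, String> next(); // one tuple: variable name -> value
}

final class InMemoryTupleResultSet implements SketchTupleResultSet {
    private final Iterator<Map<String, String>> rows;
    private boolean closed = false;

    InMemoryTupleResultSet(List<Map<String, String>> rows) {
        this.rows = rows.iterator();
    }

    @Override public boolean hasNext() { return !closed && rows.hasNext(); }
    @Override public Map<String, String> next() { return rows.next(); }
    @Override public void close() { closed = true; }
    boolean isClosed() { return closed; }
}

final class ResultSetSketch {
    // Reads every value bound to the given variable, then releases the result set.
    static List<String> collect(InMemoryTupleResultSet resultSet, String variable) {
        List<String> values = new ArrayList<>();
        try (InMemoryTupleResultSet rs = resultSet) {
            while (rs.hasNext()) {
                values.add(rs.next().get(variable));
            }
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("x", "http://example.org/alice");
        InMemoryTupleResultSet rs = new InMemoryTupleResultSet(List.of(row));
        System.out.println(collect(rs, "x") + " closed=" + rs.isClosed());
    }
}
```

The LinkedHashMap used for the row keeps variables in insertion order, which is the same property the text attributes to the binding implementations.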
Queries
Package answering.reformulation.input defines the possible queries in Ontop (based on result set interfaces) with factory interfaces and a low-level utility class:
- any InputQuery has a query string, a result type R (generic parameter), and the capability to be transformed into an intermediate representation IQ, which is done by method InputQuery.translate() with the support of an InputQueryTranslator that currently must be an RDF4JInputQueryTranslator;
- a ConstructQuery has a ConstructTemplate that reuses RDF4J types (ProjectionElemList and Extension) to encode the CONSTRUCT clause, plus apparently support for BIND clauses;
- there are two query factories, InputQueryFactory and RDF4JInputQueryFactory, that are nearly equivalent in capabilities, except that InputQueryFactory can also parse a query without knowing its type in advance;
- factories are available via dependency injection, and InputQueryFactory internally delegates to an instance of RDF4JInputQueryFactory;
- RDF4JInputQueryFactory implements each query interface X with a class RDF4JX that includes an RDF4J ParsedQuery object in addition to the query string; this object is then passed to the RDF4JInputQueryTranslator for translation to IQ;
- utility class SPARQLQueryUtility provides constants for SPARQL keywords, query identification methods, utility methods for DESCRIBE queries, and utility methods for CONSTRUCT queries.
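To illustrate what "parsing a query without knowing its type in advance" involves, here is a deliberately naive sketch that identifies the query form by scanning for the first SPARQL form keyword. QueryKindSketch and its detect() method are hypothetical and are not the methods of SPARQLQueryUtility or InputQueryFactory; a keyword scan can be fooled by a keyword occurring inside an IRI or a comment, so a real implementation would rely on a proper SPARQL parser.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: a naive keyword scan, not a SPARQL parser.
final class QueryKindSketch {
    enum Kind { SELECT, ASK, CONSTRUCT, DESCRIBE }

    // Matches the first query-form keyword that appears as a whole word.
    private static final Pattern FORM = Pattern.compile(
            "\\b(SELECT|ASK|CONSTRUCT|DESCRIBE)\\b", Pattern.CASE_INSENSITIVE);

    static Kind detect(String sparql) {
        Matcher matcher = FORM.matcher(sparql);
        if (!matcher.find()) {
            throw new IllegalArgumentException("No query form keyword found");
        }
        return Kind.valueOf(matcher.group(1).toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        String query = "PREFIX : <http://example.org/> SELECT ?x WHERE { ?x a :Person }";
        System.out.println(detect(query));
    }
}
```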
Query Reformulator
The QueryReformulator is the central component of Ontop. It provides an InputQueryFactory for creating Ontop queries, and a reformulateIntoNativeQuery() method that reformulates the input query into an IQ that can be submitted to the wrapped source (e.g., an IQ with a native SQL query for a relational DB), depending on how the various reformulation steps and sub-components have been configured.
A QueryReformulator can be obtained from an OBDASpecification through factory ReformulationFactory (built by Guice), which returns an instance of QuestQueryProcessor that is injected with other sub-components (e.g., a queryRewriter) and is initialized according to the following steps:
- queryRewriter.setTBox(obdaSpecification.getSaturatedTBox())
- queryUnfolder = translationFactory.create(obdaSpecification.getSaturatedMapping())
- nativeQueryGenerator = translationFactory.create(obdaSpecification.getDBParameters())
Given InputQuery q, method QueryReformulator.reformulateIntoNativeQuery() proceeds along the following steps:
- translation: iq1 = q.translate(inputQueryTranslator)
- rewriting: iq2 = queryRewriter.rewrite(iq1)
- unfolding: iq3 = queryUnfolder.optimize(iq2)
- optimization: iq4 = generalStructuralAndSemanticIQOptimizer.optimize(iq3, executorRegistry)
- planning: iq5 = queryPlanner.optimize(iq4, executorRegistry)
- generation: iq6 = nativeQueryGenerator.generateSourceQuery(iq5)
- caching: queryCache.put(q, iq6)
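The steps above form a linear pipeline in which each stage consumes the output of the previous one. The following sketch makes that order visible by modelling an IQ as a plain String and each stage as a function that tags its input; ReformulationPipelineSketch and its stand-in stages are hypothetical and do not reflect the real QuestQueryProcessor implementation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch: an "IQ" is a plain String so that the stage order is visible.
final class ReformulationPipelineSketch {
    private final List<UnaryOperator<String>> stages;
    private final Map<String, String> queryCache = new LinkedHashMap<>();

    ReformulationPipelineSketch(List<UnaryOperator<String>> stages) {
        this.stages = stages;
    }

    // Applies every stage in order, then records the result (the caching step).
    String reformulateIntoNativeQuery(String inputQuery) {
        String iq = inputQuery;
        for (UnaryOperator<String> stage : stages) {
            iq = stage.apply(iq);
        }
        queryCache.put(inputQuery, iq);
        return iq;
    }

    String cached(String inputQuery) {
        return queryCache.get(inputQuery);
    }

    // Stand-in stages that only tag the value with the name of the step.
    static ReformulationPipelineSketch withTracingStages() {
        return new ReformulationPipelineSketch(List.<UnaryOperator<String>>of(
                iq -> "translate(" + iq + ")",
                iq -> "rewrite(" + iq + ")",
                iq -> "unfold(" + iq + ")",
                iq -> "optimize(" + iq + ")",
                iq -> "plan(" + iq + ")",
                iq -> "generate(" + iq + ")"));
    }

    public static void main(String[] args) {
        System.out.println(withTracingStages().reformulateIntoNativeQuery("q"));
    }
}
```

Keeping the stages in a list mirrors the fact that each step can be configured or swapped independently, as long as it maps an IQ to an IQ.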
Method getRewritingRendering() performs only translation and rewriting and returns a string representation of the obtained IQ.
A QueryCache is used to cache and reuse previous reformulation results. Two implementations are provided and configured via dependency injection: DummyQueryCache, which does nothing, and GuiceBasedQueryCache, which is based on a Guava Cache.
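The two caching strategies can be pictured as two implementations behind one small interface. This is a sketch under stated assumptions: SketchQueryCache, NoOpQueryCache and LruQueryCache are hypothetical names, the get/put signatures are assumed, and a size-bounded LinkedHashMap stands in for the Guava Cache so that the example needs only the JDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical interface: the real QueryCache signatures may differ.
interface SketchQueryCache {
    String get(String query); // returns null when the query is not cached
    void put(String query, String reformulated);
}

// Counterpart of a "dummy" cache: stores nothing, so every lookup misses.
final class NoOpQueryCache implements SketchQueryCache {
    @Override public String get(String query) { return null; }
    @Override public void put(String query, String reformulated) { }
}

// Bounded cache evicting the least recently used entry (stand-in for a Guava Cache).
final class LruQueryCache implements SketchQueryCache {
    private final Map<String, String> entries;

    LruQueryCache(final int maxSize) {
        // accessOrder = true makes iteration order follow recency of access
        this.entries = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxSize;
            }
        };
    }

    @Override public synchronized String get(String query) { return entries.get(query); }
    @Override public synchronized void put(String query, String reformulated) {
        entries.put(query, reformulated);
    }
}

final class QueryCacheSketch {
    public static void main(String[] args) {
        SketchQueryCache cache = new LruQueryCache(2);
        cache.put("q1", "sql1");
        System.out.println(cache.get("q1"));
    }
}
```

Because callers see only the interface, switching between the two strategies is purely a matter of which implementation is bound at injection time.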
Query Translation
Interface InputQueryTranslator has no methods (it acts as a marker), whereas its sub-interface RDF4JInputQueryTranslator converts from the RDF4J SPARQL algebra to the Ontop algebra IQ, using Triple and Quad atoms for statement patterns. Two conversion methods, translate() and translateAskQuery(), are provided, the latter because an ASK query cannot be detected from its ParsedQuery object alone. The interface is implemented by RDF4JInputQueryTranslatorImpl.
Query Rewriter
Interface QueryRewriter specifies a component for rewriting queries accounting for the class and property subsumption axioms of a (classified, i.e., closed under subsumption) TBox, which is supplied via method setTBox().
Since mappings are already saturated with respect to the TBox, no actual rewriting of intensional Triple / Quad atoms is necessary. What is done, instead, is to drop atoms that can be inferred from (are in the chase of) other query atoms (e.g., drop ?x a :Agent if the query contains ?x a :Person and the TBox contains :Person rdfs:subClassOf :Agent). This rewriting is implemented by class DummyRewriter.
Sub-interface ExistentialQueryRewriter and its implementation TreeWitnessRewriter also provide support for existential reasoning.
Query Unfolder
Interface QueryUnfolder specifies a component for replacing intensional Triple / Quad atoms in a query IQ with the (unions of) IQs of their corresponding saturated mappings. The transformation is done via method optimize() inherited from IQOptimizer.
The interface is instantiated via factory TranslationFactory and implemented by class BasicQueryUnfolder, which extends AbstractIntensionalQueryMerger and at construction time accepts a Mapping object that provides all the saturated mapping definitions.
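Unfolding can be pictured as a substitution: each intensional atom is replaced by the union of its mapping definitions, and the atoms of the query stay joined. The sketch below renders the result as a String; UnfoldingSketch, the predicate keys and the table names are all hypothetical, and returning EMPTY for an atom with no mapping definition is an assumption about the intended semantics rather than a description of BasicQueryUnfolder.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: atoms and mapping definitions are plain strings.
final class UnfoldingSketch {
    // Replaces each intensional atom by the union of its definitions and joins the results.
    static String unfold(List<String> intensionalAtoms,
                         Map<String, List<String>> saturatedMapping) {
        if (intensionalAtoms.isEmpty()) {
            throw new IllegalArgumentException("The query must contain at least one atom");
        }
        List<String> unfolded = new ArrayList<>();
        for (String atom : intensionalAtoms) {
            List<String> definitions = saturatedMapping.getOrDefault(atom, List.of());
            if (definitions.isEmpty()) {
                // Assumption: an atom without any mapping makes the whole join empty.
                return "EMPTY";
            }
            unfolded.add(definitions.size() == 1
                    ? definitions.get(0)
                    : "UNION(" + String.join(", ", definitions) + ")");
        }
        return unfolded.size() == 1
                ? unfolded.get(0)
                : "JOIN(" + String.join(", ", unfolded) + ")";
    }

    public static void main(String[] args) {
        Map<String, List<String>> mapping = Map.of(
                ":Person", List.of("T_staff", "T_student"),
                ":name", List.of("T_names"));
        System.out.println(unfold(List.of(":Person", ":name"), mapping));
    }
}
```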
Native Query Generator
Interface NativeQueryGenerator, obtainable via factory TranslationFactory, specifies a component for rewriting a query IQ to a corresponding IQ including a NativeNode with the query to submit to the wrapped source (e.g., a SQL one). The interface implementation depends on the particular backend and is therefore implemented outside this module, specifically in module reformulation-sql for relational DBs. Within this module, a PostProcessingProjectionSplitter component with its implementation PostProcessingProjectionSplitterImpl is defined for use by other modules.
Remarks
- distinction between GraphResultSet and SimpleGraphResultSet is unclear
- it's unclear why AbstractOntopBindingSet is made abstract
- TupleSPARQLQuery is never used alone (only SelectQuery is used) so it may be removed
- why is an Extension needed in ConstructTemplate? why not include it in the query body?