2007-08-13
Introduction
There are a few areas where the SPARQL specification is unclear or has changed, necessitating a clarification of twinql's behavior. The purpose of this document is to detail each of these areas.
Clarifications and deviations
- Blank node labels in SPARQL are scoped to their enclosing basic graph pattern. It is an error to reference the same blank node label outside of that basic graph pattern. twinql does not signal an error in this circumstance: you are responsible for ensuring that the rules around blank node labels are followed. For example, against the following triples:

      rdf:type rdf:type rdfs:Property .
      rdf:type rdf:foo "foo" .
      rdfs:Property rdf:foo "bar" .

  This query:

      SELECT ?foo { { _:x a ?type } { _:x ?p ?foo } }

  will return the following results:

      <?xml version="1.0"?>
      <sparql xmlns="http://www.w3.org/2005/sparql-results#">
        <head>
          <variable name="foo"/>
        </head>
        <results ordered="false" distinct="false">
          <result>
            <binding name="foo">
              <uri>http://www.w3.org/2000/01/rdf-schema#Property</uri>
            </binding>
          </result>
          <result>
            <binding name="foo">
              <literal>foo</literal>
            </binding>
          </result>
        </results>
      </sparql>

  The correct behavior here is for an error to be signaled. If an error is not signaled, each use of _:x should still represent a different blank node, which is not the case in twinql: both uses are matched as the same node. You might find that the provided behavior is more useful.
- Blank nodes in earlier SPARQL documents could be placed in the predicate position. The editors' draft is explicit that they can only be used in subject and object positions. twinql conforms to the earlier specification, and allows you to use blank nodes in the predicate position in your queries. This is a strict superset of the current specification; all valid queries written against the current draft specification will execute correctly.
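The difference in the first point above can be reproduced with a toy pattern matcher. This is only a sketch in Python, not twinql's implementation: the names `TRIPLES`, `match`, and `solve` are hypothetical, terms are plain strings, and the two groups are simply joined. It compares a blank node label shared across both groups (what twinql does) with a distinct blank node per group (what scoping would imply if no error were signaled).

```python
# Illustrative only: a toy pattern matcher, not twinql's implementation.
TRIPLES = [
    ("rdf:type", "rdf:type", "rdfs:Property"),
    ("rdf:type", "rdf:foo", '"foo"'),
    ("rdfs:Property", "rdf:foo", '"bar"'),
]


def is_placeholder(term):
    # Both ?variables and _:blank labels are placeholders that get bound.
    return term.startswith("?") or term.startswith("_:")


def match(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`, or return None."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if is_placeholder(p):
            # Bind the placeholder, or check it against its earlier binding.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def solve(patterns):
    """Join a list of triple patterns against TRIPLES."""
    solutions = [{}]
    for pattern in patterns:
        extended_solutions = []
        for binding in solutions:
            for triple in TRIPLES:
                extended = match(pattern, triple, binding)
                if extended is not None:
                    extended_solutions.append(extended)
        solutions = extended_solutions
    return solutions


# Shared label: _:x denotes the same node in both groups (twinql's behavior).
shared = solve([("_:x", "rdf:type", "?type"), ("_:x", "?p", "?foo")])
# Scoped labels: each group gets its own blank node (modeled by renaming).
scoped = solve([("_:x1", "rdf:type", "?type"), ("_:x2", "?p", "?foo")])

shared_foo = [s["?foo"] for s in shared]
scoped_foo = [s["?foo"] for s in scoped]
print(shared_foo)
print(scoped_foo)
```

With the shared label the second group is restricted to triples whose subject is `rdf:type`, which gives the two results shown above. With per-group labels the second group matches every triple, so `"bar"` would appear as well.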
Other points of note
- There is an outstanding SPARQL issue regarding the semantics of unbound variables in OPTIONAL, described here. This has not yet been addressed by the DAWG. twinql will behave as follows:

      PREFIX ex: <http://example.org/>
      SELECT ?s FROM <data.rdf> WHERE { OPTIONAL { ex:a ex:b ?s } }

  → one result, { s : unbound }.

      PREFIX ex: <http://example.org/>
      SELECT ?s FROM <data.rdf> WHERE { OPTIONAL { ex:a ex:b ?s } FILTER BOUND(?s) }

  → no results.

  The reasoning is that the main body of the expression, including the optional, produces one result, in which ?s is not bound. The querier obviously wishes to remove all results in which ?s has no value, so twinql discards that result.

Similarly, the behavior of a SPARQL engine when an unbound variable is passed to a filter expression is not precisely defined. The SPARQL specification states that all arguments (apart from, obviously, those to BOUND) must be RDF terms, which suggests that it is an error to pass an unbound variable.

It is important to note that the empty graph pattern matches any graph. This manifests itself in the production of one completely empty result, if no other patterns apply.
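The OPTIONAL and empty-pattern behavior described above can be modeled in a few lines of Python. This is a sketch under stated assumptions, not twinql's code: `DATA` stands in for data.rdf and is assumed to contain no triple matching `ex:a ex:b ?s`, and `match_optional`, `left_join`, and `bound` are hypothetical helper names.

```python
# Illustrative only: a toy model of the semantics described above.
DATA = []  # assumption: data.rdf holds no triple matching ex:a ex:b ?s


def match_optional(data):
    """Solutions for the pattern { ex:a ex:b ?s } on its own."""
    return [{"s": o} for (s, p, o) in data if s == "ex:a" and p == "ex:b"]


def left_join(left, right):
    """OPTIONAL: keep each left solution even when nothing on the right joins."""
    out = []
    for lhs in left:
        compatible = [
            {**lhs, **rhs}
            for rhs in right
            if all(lhs[k] == rhs[k] for k in lhs.keys() & rhs.keys())
        ]
        # Fall back to the unextended left solution if nothing was compatible.
        out.extend(compatible or [lhs])
    return out


def bound(solution, name):
    """BOUND(?name): true only if the variable has a value in this solution."""
    return name in solution


# The empty graph pattern matches once and binds nothing.
empty_group = [{}]

without_filter = left_join(empty_group, match_optional(DATA))
with_filter = [s for s in without_filter if bound(s, "s")]

print(without_filter)
print(with_filter)
```

The single empty solution from the empty group survives the OPTIONAL with ?s left unbound, and the BOUND filter then removes it, which matches the two outcomes listed above.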
There is no prohibition in the specification on the matching of literals by blank nodes in the query pattern. In this respect, blank nodes behave exactly like variables, with the exception that they cannot be part of the result bindings.
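As a rough illustration of that last point, the Python sketch below treats a blank node label as a variable that takes part in matching but is dropped from the reported bindings. The triples, the `ex:` names, and the `solve` helper are all hypothetical and chosen only for this example.

```python
# Illustrative only: blank node labels modeled as hidden variables.
TRIPLES = [
    ("ex:a", "ex:name", '"Alice"'),  # object is a literal
    ("ex:a", "ex:knows", "ex:b"),    # object is a URI
]


def is_placeholder(term):
    return term.startswith("?") or term.startswith("_:")


def solve(pattern):
    """Match one triple pattern; report bindings for ?variables only."""
    results = []
    for triple in TRIPLES:
        binding = {}
        ok = True
        for p, t in zip(pattern, triple):
            if is_placeholder(p):
                if binding.setdefault(p, t) != t:
                    ok = False
                    break
            elif p != t:
                ok = False
                break
        if ok:
            # Blank node labels take part in matching but are not reported.
            results.append(
                {k: v for k, v in binding.items() if k.startswith("?")}
            )
    return results


# _:o matches the literal "Alice" but does not appear in the result row.
rows = solve(("?s", "ex:name", "_:o"))
print(rows)
```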
Reports of any unusual or incorrect behavior are welcome.