| Lehr- und Forschungseinheit für Programmier- und Modellierungssprachen |
The evaluation of XPath queries against XML streams can require in general memory linear in the size of the input XML stream [SPEX]. This happens in cases where potential answers to queries have to be buffered until the decision regarding their appartenance to the result can be met. However, such a time-limited buffering is not always possible, especially when the XML stream is unbounded or infinite. In such cases, the evaluation of queries could require unbounded (or even infinite) memory. Several approaches are proposed to ameliorate this situation, mainly focusing on enforcing hard memory constraints for the query evaluation, e.g., [SAQL], at the price of retrieving approximate instead of exact answers.
It would be interesting to identify classes of XPath queries, for which there can be always inferred a bound on the required memory for query evaluation. More precisely, such classes of queries should be syntactically characterizable and memory bounds should be inferrable as functions only of the input query, or also of the input stream. In this sense, some classes can be described independent of the structure of the stream, or dependent on the stream schema.
Well-ordered XPath Queries: As a prime target towards describing such classes, one can envisage as a first step the characterization of XPath queries, the evaluation of which does not require at all buffering of potential answers. Such a query property can be detected, e.g., by checking whether all constraints (or predicates) of a potential answer are fulfilled along the stream before encountering that potential answer, i.e., before fully evaluating the head path that leads to the answers. However, a statical analysis of the queries, i.e., without the inspection of the stream and before the query evaluation, could be sufficient in some cases to detect such a query property. As early investigation of the supervisors shows, such a property, called ad-hoc well-orderedness, could be detected statically for some classes of queries. The intuition behind ther property name well-orderedness resides in the informal observation that for a well-ordered query its predicates are always evaluated before before its head path, thus there exists an order of the evaluation of subqueries that ensures the buffer-freeness of the evaluation of the whole query.
The thesis should attack and provide solutions to the following problems