
Background

Unstructured collections, or unstructured data, do not conform to a predefined schema and hence must carry a description of their own structure. They are called semistructured when some degree of homogeneity can be recognized in them. Semistructured data (SSD) are currently quite common, often in the form of XML documents. Their partial regularity makes semistructured collections amenable to access through query languages, but not through query languages designed for fully structured databases. New languages are needed that tolerate irregularity in the data and that can query both data and structure at the same time. New type systems are required to describe SSD and to verify that queries match such data. New storage systems are needed to store such data efficiently.
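As a rough illustration of what tolerating irregularity and querying structure could mean, the following minimal sketch uses nested Python dictionaries as a stand-in for a self-describing collection. The records and the helper names select and labels are hypothetical, chosen only for this example; they do not correspond to any particular SSD query language.

```python
# Hypothetical records: each one carries its own field names,
# and no two of them have exactly the same shape.
people = [
    {"name": "Ada", "email": "ada@example.org"},
    {"name": "Bob", "phone": ["555-0100", "555-0101"]},
    {"name": {"first": "Eve", "last": "Moss"}, "email": "eve@example.org"},
]


def select(node, path):
    """Yield every value reachable from node along path.

    Records that lack a step of the path are skipped silently,
    which is one simple way of tolerating irregular data.
    """
    if not path:
        yield node
    elif isinstance(node, list):
        for item in node:
            yield from select(item, path)
    elif isinstance(node, dict) and path[0] in node:
        yield from select(node[path[0]], path[1:])


def labels(node):
    """Return the set of field names found anywhere in node.

    This is a query on the structure rather than on the data.
    """
    if isinstance(node, dict):
        found = set(node)
        for value in node.values():
            found |= labels(value)
        return found
    if isinstance(node, list):
        return set().union(*(labels(item) for item in node))
    return set()


emails = list(select(people, ["email"]))
first_names = list(select(people, ["name", "first"]))
all_labels = labels(people)
```

Here emails contains only the two addresses that exist, first_names contains only the one name that happens to be structured, and all_labels collects every field name in the collection. A query language for fully structured data would instead expect every record to have the same fields.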



Maria Simi 2006-10-23