XTech 2005: XML, the Web and beyond.
The Mozilla RDF (Resource Description Framework) engine was one of the first implementations in the field. Targeted at an early specification of RDF, it did not keep up with the changes to the RDF specifications and the challenges of web applications.
The current developments in the Mozilla RDF engine are described, including conformance work with current specifications, API design, and performance.
The Mozilla RDF implementation has gone through phases of lively development and periods of almost no progress at all. Being practically unowned until recently, the Mozilla implementation did not follow the revised RDF specifications published in 2004. In addition, the RDF API was designed for internal use within the Mozilla code base and has not had a thorough review of the security aspects that arise with web applications.
The current work on the RDF engine picks up both of these areas and tries to incrementally address them. This presentation exposes the directions of this work and calls for comments from the community outside Mozilla.
From an API perspective, the most prominent change to the specifications concerns nomenclature. The specification that the Mozilla implementation targeted described arcs in terms of
source -> property -> target,
the current specifications, however, use
subject -> predicate -> object.
This difference in nomenclature raises the entry barrier to the Mozilla RDF engine and makes mapping concepts in the specification to the API harder than necessary. Closing this gap requires a new set of APIs in the current terminology that does not break existing code. The extension mechanism inherent to the Mozilla products requires some level of backwards compatibility as well, so that extensions can choose to work with an RDF API that exists in earlier versions of Mozilla products.
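One way to offer the current terminology without breaking existing callers is a thin wrapper that delegates to the older API. The following is only a minimal sketch in Python, not the Mozilla interfaces: the class names `LegacyDataSource` and `TripleView`, their methods, and the example URNs are all hypothetical and chosen for illustration.

```python
class LegacyDataSource:
    """Stand-in for an existing data source that speaks the older
    source/property/target vocabulary (hypothetical, for illustration)."""

    def __init__(self):
        self._arcs = set()

    def assert_arc(self, source, prop, target):
        self._arcs.add((source, prop, target))

    def get_target(self, source, prop):
        # Return the first matching target, or None if there is no such arc.
        for s, p, t in self._arcs:
            if s == source and p == prop:
                return t
        return None


class TripleView:
    """Exposes the same data in subject/predicate/object terms.

    Every call is delegated to the legacy object, so code written
    against the old vocabulary keeps working unchanged."""

    def __init__(self, legacy):
        self._legacy = legacy

    def add_triple(self, subject, predicate, obj):
        self._legacy.assert_arc(subject, predicate, obj)

    def get_object(self, subject, predicate):
        return self._legacy.get_target(subject, predicate)


legacy = LegacyDataSource()
view = TripleView(legacy)
view.add_triple("urn:example:doc", "urn:example:title", "XTech 2005")

# Both vocabularies see the same arc.
new_style = view.get_object("urn:example:doc", "urn:example:title")
old_style = legacy.get_target("urn:example:doc", "urn:example:title")
```

Because the wrapper holds no state of its own, both sets of names can coexist for as long as older extensions need them.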
Conformance work regarding typed literals will be more involved, since it mixes design, compatibility, and code size concerns. The 2004 RDF specifications introduced XML Schema datatypes for typed literals, while the Mozilla RDF implementation uses proprietary attributes (http://home.netscape.com/NC-rdf#parseType). The work in this area should incrementally add support for XML Schema types while remaining backwards compatible, so that legacy data and profile migration keep working. In particular, it has to balance code size growth against extending conformant parsing to more XML Schema datatypes. Furthermore, the resulting code should be shared with other parts of Mozilla that handle XML Schema, and perhaps even be extensible. We expect an ongoing process of improvement here, starting small in both the number of supported datatypes and footprint.
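The incremental approach can be pictured as a small table of datatype parsers with a fallback for everything not yet supported. The sketch below is an illustration under stated assumptions, not the Mozilla parser: the function `parse_literal`, the two tables, and the legacy value `"Integer"` are assumptions made for the example; only the XML Schema namespace URI is taken from the specifications.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Assumed mapping from a legacy parseType value to an XML Schema datatype.
# The value "Integer" is an assumption for this sketch.
LEGACY_TO_XSD = {"Integer": XSD + "integer"}

# Deliberately small set of supported datatypes; more can be registered later.
# Note: Python's int() accepts some forms that xsd:integer does not.
PARSERS = {
    XSD + "integer": int,
    XSD + "boolean": lambda s: {"true": True, "1": True,
                                "false": False, "0": False}[s],
}


def parse_literal(lexical, datatype=None, legacy_parse_type=None):
    """Return a typed value, or the lexical form if the type is unsupported."""
    if datatype is None and legacy_parse_type is not None:
        # Legacy data: translate the old attribute into a datatype URI.
        datatype = LEGACY_TO_XSD.get(legacy_parse_type)
    parser = PARSERS.get(datatype)
    if parser is None:
        # Plain literal or a datatype outside the supported set.
        return lexical
    return parser(lexical.strip())


from_xsd = parse_literal("42", datatype=XSD + "integer")
from_legacy = parse_literal("42", legacy_parse_type="Integer")
unsupported = parse_literal("P1D", datatype=XSD + "duration")
```

Keeping unsupported datatypes as plain lexical forms lets the set of parsers grow one type at a time without rejecting existing data.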
A testbed for the RDF Test Cases will provide a detailed bug list to address in the RDF/XML parser. We hope to address these bugs in a backwards compatible manner.
There are different perspectives on the Mozilla RDF API, which suggest an API redesign. The first, updating the terminology of the API to the current specifications, has been mentioned above. Further perspectives include web applications and performance considerations, accompanied by back end restrictions.
For RDF to be exposable to web applications, it needs to undergo a thorough security review. Access to RDF APIs from the web requires security checks for creating RDF/XML data sources and non-RDF/XML data sources, as well as key objects like resources or literals. There needs to be a clear entry point for web applications to generate or access these objects, while at the same time, performance for core Mozilla code is essential and must not degrade.
Web applications will require at least read access to RDF/XML data sources and read and write access to temporary (in-memory) data sources. At a later stage, there should be write access to RDF storage on servers, if a standardized protocol exists.
As most data source objects within Mozilla are backed by hash tables, entry points to sequences of return values should not be interruptible by write operations. The current APIs generate array-based snapshots, which add CPU and memory load. The intention of the current work is to replace those APIs with a visitor pattern such as the following:
interface rdfITripleVisitor : nsISupports
{
    void visit(in nsIRDFResource aSubject, in nsIRDFResource aPredicate,
               in nsIRDFNode aObject, in boolean aTruthValue);
};
interface rdfIDataSource : nsISupports
{
    void getAllSubjects(in rdfITripleVisitor aVisitor);
};
Caching of the results to be used for write access can then be easily achieved while providing an efficient API for read access. Using optimized APIs like this can hopefully improve the performance of the Mozilla platform in general.
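To make the read and write split concrete, here is a minimal sketch in Python rather than XPIDL. It assumes one possible reading of the design, namely that the data source rejects writes while a visit is in progress and that a visitor caches what it wants to write for later. The names `InMemoryDataSource`, `PendingCopies`, and the example URNs are hypothetical.

```python
class InMemoryDataSource:
    """Hash-table backed store; keys are (subject, predicate, object)."""

    def __init__(self):
        self._triples = {}
        self._visiting = False

    def assert_triple(self, subject, predicate, obj, truth_value=True):
        # Assumption for this sketch: writes are refused during a visit,
        # so the traversal never sees a table that changes under it.
        if self._visiting:
            raise RuntimeError("data source is being visited; defer the write")
        self._triples[(subject, predicate, obj)] = truth_value

    def visit_all_triples(self, visitor):
        # No snapshot array is built; the visitor is called per entry.
        self._visiting = True
        try:
            for (s, p, o), truth in self._triples.items():
                visitor.visit(s, p, o, truth)
        finally:
            self._visiting = False


class PendingCopies:
    """Visitor that caches matching triples so writes can happen afterwards."""

    def __init__(self, old_predicate, new_predicate):
        self.old_predicate = old_predicate
        self.new_predicate = new_predicate
        self.pending = []

    def visit(self, subject, predicate, obj, truth_value):
        if predicate == self.old_predicate:
            self.pending.append((subject, self.new_predicate, obj))


ds = InMemoryDataSource()
ds.assert_triple("urn:example:a", "urn:example:name", "Alpha")
ds.assert_triple("urn:example:b", "urn:example:name", "Beta")

copies = PendingCopies("urn:example:name", "urn:example:label")
ds.visit_all_triples(copies)

# The visit is over, so the cached results can now be written.
for triple in copies.pending:
    ds.assert_triple(*triple)
```

Only visitors that intend to write pay for a cache; read-only visitors touch each entry once and allocate nothing extra.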
Up-to-date information on the design process of the Mozilla RDF APIs is found on the Mozilla wiki. It will be updated to reflect the feedback received during the XTech conference.
The Mozilla project thanks the XTech program committee for the opportunity to present the current development of its RDF engine to the interested community outside the project.
Axel Hecht
Mozilla Europe