XTech 2005: XML, the Web and beyond.

The Oracle XSLT Virtual Machine

Discuss this paper on the XTech wiki
View XML source for this paper

Keywords

Abstract

Abstract

This paper describes the architecture, implementation and high-performance characteristics of Oracle XSLT Virtual Machine (XVM) and Compiler. XVM was designed to work as stand-alone unit or in conjunction with Oracle XMLType, enabling XSLT and XPath computations inside Oracle database. It takes stylesheets and XPath expressions precompiled into platform-independent byte code and uses pre allocated stacks for intermediate data. A goal of this paper is to describe how a high-performance XSLT engine can be built by using the classical Virtual Machine approach.

Introduction

XSLT is a tree-oriented, rule-based language for transforming instances of XML documents into zero or more result trees. As specified in [XSLT-1.0][XSLT-2.0] a transformation in XSLT is expressed in the form of a stylesheet containing a set of rules in the form of templates. During the transformation process the patterns associated with template rules are matched against the current set of document nodes. The winning template is activated and the process continues until there are no more document nodes to be matched.

As we stated in an earlier paper [XSLTVM] an XSLT transformation, if properly compiled can be executed on a stack-based Virtual Machine. The common sense behind that statement is the tree-based nature of XSLT transformations. Also, we expected that the stack-based implementation would have significant performance advantages over non-ordered heap-based implementations, because of significantly less or no run-time memory allocations.

This paper covers the design, implementation and performance characteristics of the Oracle XSLT Virtual Machine and Compiler (XVM), which are already integrated into the Oracle Database 10g server. The major project requirements were improved performance and smaller memory footprint as compared with the first Oracle XSLT processor. Of course, it still needed to be fully standards-compliant. The project goals were easily achieved mostly because of the XVM architecture.

The XVM is ‘classically’ designed. There is a compiler (referred as Compiler), which compiles the input stylesheet, optimizes its intermediate representation and generates platform-independent byte-code. And there is an XSLT Virtual Machine (referred as VM) to execute that byte-code.

Compiler

XVM compilation consists of four processing phases. During the first phase the input stylesheet and all included stylesheets are parsed by Oracle XML Parser and converted into a DOM tree. Since both the Oracle XML Parser and the Oracle XVM share the same Error Messaging module, all parsing error are seamlessly mixed with XVM Compiler errors.

The second phase is a top-level declaration processing. This is needed because XSLT allows forward referencing – some entities can be referenced before they physically appear in the stylesheet. The Compiler scans all top-level variables, parameters, key declarations, format declarations, templates, … etc. and adds them into the Compiler’s Symbol-Table. XSLT is a static scoped language, which allows inner declarations to hide outer ones. This is why the Symbol-Table uses a stack-based design in order to push and pop inner scopes of visibility. For example, all XPath function declarations are pushed first and stay at the bottom of the Symbol-Table stack.

During the third phase, the stylesheet DOM tree is navigated top-down and left-right. Each node is semantically analyzed, XPath strings are parsed and an intermediate tree is generated.

The fourth phase includes intermediate tree optimizations and byte-code generation. Currently, the Compiler does only local optimizations. The rule-based nature of XSLT makes global data-flow analysis quite difficult and in many cases impossible. The code generator treats the <xsl:stylesheet> element as the stylesheet ‘main’ template and treats its activation code as the stylesheet byte-code entry point. The ‘main’ template body includes all top-level variable, parameter and attribute-set declarations.

All stylesheet variables and parameters reside in the VM stack quite like procedural language function parameters and local variables. That way, all variable references are compiled as stack offsets. Top-level variable offsets are from the bottom of the stack. Local template variable offsets are calculated from the beginning of current template stack’s frame.

Byte-code

XVM byte-code is a platform independent sequence of two byte units. It has a header and a body. The header contains byte-code descriptions (XSLT or XPath, version, … etc), lengths and offsets to each body section.

The body contains sections for the byte-code itself, strings, numbers (as strings), string-tables, patterns, pattern tables, external function tables, … etc. All byte-code references are in the form of relative offsets (as a number of units) from the beginning of the corresponding section or offsets from the current address.

When the VM loads the byte-code, it converts all numbers from their string representation into a digital one (float, double or integer). Numbers reside in tables and VM instructions refer to them with their table offsets calculated by the Compiler. Also, the VM populates tables with predefined or external function addresses. There is one table for each function namespace. Again, VM instructions refer to functions with table offsets prepared by Compiler. The VM approach for dynamically linking external entities is quite similar to DLL dynamic linkage in Windows.

VM Architecture

 

As it was mentioned previously VM has a stack-based architecture. Each machine instruction pops its operands from the stack and replaces them with the result. VM uses the following stacks:

The first three stacks logically comprise the main VM stack. The reason for that is better performance. Both three stacks grow and shrink in parallel. The node-stack and the string stack consist of segments dynamically allocated if needed. Also, VM maintains a number of pools for stylesheet byte-code, patterns, pattern-groups, constants, … etc.

VM instructions have the following format:

                opcode [mode] [operand ]*

Depending on the opcode and mode the operands are treated as one of the following entities:

The VM dynamic architecture is quite simple. There is a set of functions, one for each instruction, implementing the instruction semantics. Also, there is another set of service functions, providing services like node creation, stack maintenance, stream output, … etc. The VM main loop moves the instruction pointer over byte-code instructions and calls the corresponding function. The default instruction pointer step is one instruction. Only instructions like ‘branch’ or ‘call’ can change the instruction pointer according to their operand values. Each instruction takes its operands from the VM-stack and pushes back the result. A template stack frame is pushed into the VM-stack when a template is activated. The template frame contains the return address, current stack pointers, current node, template descriptor address plus (if needed) slots for template parameters and local variables.

The output is generated in a form of SAX-like events and depending on the current output mode it is directed to the DOM builder, Streamer or SAX event generator.

Example

Figure 1: Example XSLT Stylesheet
<?xml version="1.0"?>
 
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 
<xsl:template match="catalog">
  <xsl:for-each select="table/row[position() &lt; 10]">
   <xsl:choose>
     <xsl:when test="number(id)=1">
       <id1><xsl:value-of select="lastname"/></id1>
     </xsl:when>
     <xsl:when test="number(id)=2">
      <id2><xsl:value-of select="lastname"/></id2>
     </xsl:when>
   </xsl:choose>
  </xsl:for-each>
</xsl:template>
 
<xsl:template match="/">
  <xsl:apply-templates select="//catalog[@version=1]"/>
</xsl:template>
 
</xsl:stylesheet>

The stylesheet shown in Figure 1 finds all library catalogs with the attribute ‘version’ equal to 1 and outputs all ‘lastname’ elements in the first ten table rows, with child element ‘id’ equal to 1 or 2.

The Compiler has an option to print out the input stylesheet mixed with the generated machine instructions. The machine instructions are shown in the listing with the following format:

 

    29     FOREACH            code: @-12  
     |       |                     |
  address  opcode               operand

 

The machine instruction address is the offset (in 2 byte units) from the beginning of the byte-code pool. The operand “code: @-12” tells the VM where to jump when ‘FOREACH’ ends. The jump address is represented as an offset from the current instruction address.

The following fragment of compiler listing shows machine instructions generated for stylesheet statements 6 to 10.

 

Figure 2: Fragment of XVM Compiler Listing
6.  <xsl:for-each select="table/row[position() &lt; 10]">
  
  23  PUSHCUR        
  24  CHILD          table                   
  27  PUSHCTX        
  28  PUSHEMPTY      
  29  FOREACH        code: @-12   
  31  PUSHCUR        
  32  CHILD          row                     
  35  FILTERPOS      #0          #10        
  38  UNION          
  39  BRA            code: @10  
  41  PUSHCTX        
  42  FOREACH        code: @-52   
 
7.    <xsl:choose>
8.      <xsl:when test="number(id)=1">
 
  44  PUSHCUR        
  45  CHILD          id                      
  48  CALLFUNC       24         1          0          
  52  EQIMM          #1
  54  BNO            code: @-14  
 
9.        <id1><xsl:value-of select="lastname"/></id1>
  
  56  ELEMLIT        'id1'       'id1'      ''         
  60  PUSHCUR        
  61  CHILD          lastname                
  64  TEXT     
  65  ENDELEM        
  66  BRA            code: @24   
 
10.      </xsl:when>

 

The above stylesheet fragment loops over the first ten table rows finds rows with child ‘id’ equal to 1 or 2 and outputs the row ‘lastname’ child.

Machine instructions generated for statement 6 (<xsl:for-each>) evaluate the ‘select’ XPath expression. First, all ‘table’ child elements are collected (24). Next, an empty node-set is pushed into the VM-stack for collecting the nodes filtered by the expression predicate (28). The result node set is moved (pushed) into the context-stack (41).

<xsl:when> instructions find the ‘id’ child (45), convert it into a number (48) and compare the result with number ‘1’. The branch instruction (54) jumps to the next <xsl:when> if the result of that comparison is false. If not, the literal element ‘id1’ is outputted (56) with the child’s ‘lastname’ content (61, 64). Next, the control is returned (66) to the loop instruction.

When in debug mode the VM outputs the current machine instruction and the content of the VM-stack (top 2 objects). Figure 3 shows the fragment of the VM debug output illustrating the execution of machine instructions from Figure 1.

 

Figure 3: Fragment from VM Debug Output
  44  PUSHCUR        
|------ VM-stack ------------|
  23 -> SYSTEM
  22 -> SYSTEM
  
  45  CHILD          id                     
|------ VM-stack ------------|
  24 -> NDSET[1]<-(0)
  2   -> #elem: row(527)
  23 -> SYSTEM
  
  48  CALLFUNC       24         1          0         
|------ VM-stack ------------|
  24 -> NDSET[1]
  2   -> #elem: id(529)
  23 -> SYSTEM
 
  52  EQIMM          #1    
|------ VM-stack ------------|
  24 -> NUMBER: 1.000000
  23 -> SYSTEM
  
  54  BNO            code: @-20   
|------ VM-stack ------------|
  24 -> BOOLEAN: true
  23 -> SYSTEM
  
  56  ELEMLIT        'id1'       'id1'      ''         
|------ VM-stack ------------|
  23 -> SYSTEM
  22 -> SYSTEM

Oracle DB Integration

The XVM is part of Oracle XML Developer’s Kit (XDK). The XDK includes all Oracle XML tools (Java and C). It is tightly integrated with Oracle XML DB (XMLType). Oracle XMLType allows users to store their XML documents into the DB (with or without an XML Schema) and access them using relational, XPath, XSLT or DOM interfaces. Depending on the XML meta-context settings, the XVM can access XML DB documents or standalone DOM trees created by the XDK parser.

The flexibility to use different DOM tree representations doesn’t allow the XVM to optimize the DOM trees for faster search. But that limitation doesn’t prevent the XVM to compare quite favorably (because of the architecture) against other C XSLT processors.

Performance Results

As it was expected the XVM is significantly faster than XSLT Java processors (up to 6 times, depending on the platform) and faster than C processors we tested. It is more than two times faster than the older Oracle XSLT processor, which was our primary goal. We run all our tests using ‘UTF-8’ encoding. Since the XVM supports different code paths, depending on the encoding, results we got with explicitly set single-byte encoding were about 30% better than the numbers above.

Regarding memory footprints the XVM is a clear winner. The byte-code stylesheet representation occupies significantly less space than others equivalent in memory tree representations. Also, the XVM keeps its intermediate results in the stack. Other processors allocate memory dynamically, which results in increased memory fragmentation.

Conclusion

The Oracle XVM processor experimentally proves that is possible to express XSLT transformations as a stack-based computation. We consider the following XVM features as an important improvement over other XSLT implementations:

Bibliography

[XSLT-1.0]
[XSLT-2.0]
[XPath-1.0]
[XPath-2.0]
[XSLTVM]

Biography

Anguel Novoselsky

www.oracle.com

Anguel Novoselsky has been working in the area of Compiler and Language Design for more than 19 years with companies in USA, Canada and Bulgaria. Currently, he is a member of Oracle team developing Oracle XML components.