XTech 2005: XML, the Web and beyond.

A success story: How XML services were used to benefit a high-end visual effects company

Abstract

This presentation illustrates the real-world deployment of a complex XML-RPC service architecture within one of Europe's largest visual effects houses, ranging from integration with Filemaker Pro, to driving 35mm Laser Film Recorders.

Introduction

Framestore CFC is one of the largest post-production houses in Europe. We have a substantial body of staff, ranging from creatives, such as paint artists and editors, through to more organisational people such as production coordinators and producers. Alongside the staff is the technical infrastructure, consisting of a very fast core network, multi-terabyte storage servers, and a large render farm with hundreds of nodes. At any given time, we can be working on many different projects at varying stages of completion. It's an environment where any changes to working practices have to be carefully managed, as it's crucial that none of the ongoing work is disrupted in any way. Deadlines are tight, especially for those right at the end of the movie production chain. Once the digital images are put to 35mm film within Framestore CFC, they are sent for development and replication, then sent straight to the cinemas themselves. The technical aspects of the computing infrastructure are also very challenging. We have multiple Linux desktops which run a core of the tools, Linux storage servers, and a substantial number of Windows desktops, Mac OS X desktops, and more esoteric systems such as Silicon Graphics servers and workstations. Creating a system of workflow tools that operate across these disparate platforms is the task of the Systems Development team.

The work of the Systems Development team is split across two main sections of the company: the Visual Effects department (VFX) and the Digital Lab (DLab). VFX is responsible for creating virtual characters and sets, intricate compositing of "green screen" filmed material with computer-created effects, and even replacing people's heads on bodies. DLab is responsible for scanning raw 35mm film footage, performing dust and scratch removal on the scanned images, colour grading the film (which involves manipulating the images according to the director of photography's wishes), and then filming the resulting files back out to 35mm film using laser film recorders.

Over the past year and a half, we have engineered a rich set of XML-RPC services that offer a simple API into a variety of dynamic systems and databases employed throughout the company. This has ranged from providing an API to access data held within central Filemaker databases, to providing an interface to drive the film recorders. It has been an important step for us to establish these interfaces within the company, as there has been a gradual cultural shift towards getting the maximum amount of reuse out of common workflow features across the business.

History and Company Background

In 2001, around two years before the creation of the Systems Development team, there was a merger between Framestore, who worked mostly with commercials for television, and CFC, who worked exclusively on film. Uniting the two companies following the merger has been a slow process. As well as the shift to shared resources connecting the two companies, from stationery to file servers, there was also a desire to centralise some of the more core aspects of the work. Traditionally, each project would act independently of the others. As deadlines were so tight, each project was self-sufficient, resulting in duplication across projects, particularly of tools. Depending on the strength of talent of the individuals on each project, a tool could be well-written, or could be an awful one that caused more problems than it tried to solve. The problem was that none of the people who wrote these tools thought about how, when or why they might be reused. Those individuals working on the project were not there to be software designers; they were there to ensure the projects were delivered on time.

Soon after I joined, I was made head of a new, small team called Systems Development. We were not attached to any of the projects directly, and therefore could consider the longer-term interests of the company, in addition to the short-term immediacy of the projects going on at that time. Gradually, with the help of other key new staff members within the company, we started to identify the core ways in which projects were structured. A large proportion of this identification centered around the use of Filemaker, which is a tool that enables users to create databases and to create custom views upon that data. Filemaker is a very powerful tool which enables people to track large amounts of data with relative ease, but it is not a particularly open system, and is limited to providing clients only on Windows and Mac OS X. The only method of data integration that people had previously used was to perform a regular data export out to a CSV file, and then process this data accordingly.

Dynamic Data Interchange

Around November 2003, the Systems Development team happened to sit with the Pipeline Tools group, as a result of the large amount of body shuffling that goes on within Framestore CFC. The Pipeline Tools group were more responsible for integration and development within the highly technical aspects and applications used in the VFX department's work. They also created and managed fairly large amounts of data, and we began talking about the possibilities of data exchange, and what form that might take. In a coffee-based brainstorming session, we boiled it down to a schema that could contain any arbitrary primitives (string, int, float), and found that this had a remarkable similarity to the way that XML-RPC was designed. The very first place any Web Services were implemented was as a way to expose data provided by Filemaker, and this had its initial release as a system which expected GET requests, and returned simple XML data structures. A developer called David Stuart was responsible for this bridging, which revolves around using Claris Dynamic Markup Language (CDML) to form HTTP POST requests to Filemaker's built-in Web Companion technology, then parsing the XML that it returns and reformatting it according to the Framestore internal schema. This had to perform many ugly functions, such as parsing out all of the JavaScript code returned when an operation resulted in an error condition, but it worked as a simple way of reading data in from Filemaker. We subsequently replaced the original model of using GET requests with a list of discrete XML-RPC methods, using the PHP server implementation written by Edd Dumbill.
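The move from a single GET endpoint to a list of discrete XML-RPC methods can be illustrated with a minimal sketch. The production gateway was written in PHP; the Python below is only an illustration, and the record fields, the in-memory data standing in for Filemaker, and the method names shots.find and shots.count are all hypothetical.

```python
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCDispatcher

# Hypothetical records standing in for data held in Filemaker.
SHOTS = [
    {"shot": "sh010", "status": "in progress", "frames": 120},
    {"shot": "sh020", "status": "approved", "frames": 96},
]


def find_shots(criteria):
    """Return the records whose fields equal every key/value pair in criteria."""
    return [
        record for record in SHOTS
        if all(record.get(key) == value for key, value in criteria.items())
    ]


def count_shots():
    """Return the number of records available."""
    return len(SHOTS)


# Each operation is registered as its own named method, rather than being
# selected by parameters on a single GET URL.
dispatcher = SimpleXMLRPCDispatcher()
dispatcher.register_function(find_shots, "shots.find")
dispatcher.register_function(count_shots, "shots.count")


def call(method, *params):
    """Encode an XML-RPC request, dispatch it in-process, decode the reply."""
    request = xmlrpc.client.dumps(params, methodname=method).encode("utf-8")
    # _marshaled_dispatch is the entry point the standard library's own
    # request handlers use; calling it directly avoids opening a socket here.
    response = dispatcher._marshaled_dispatch(request)
    (result,), _ = xmlrpc.client.loads(response)
    return result


if __name__ == "__main__":
    print(call("shots.count"))
    print(call("shots.find", {"status": "approved"}))
```

Passing the search criteria as a struct keeps each method signature stable while the set of fields in the underlying database changes.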

Suddenly this offered a real-time way of interacting with Filemaker, rather than relying on forced regular synchronisation. The merger of Framestore and CFC had resulted in a number of systems that were responsible for tracking work in different areas of the business, and as mentioned previously, there were established periods during the day where the systems were updated to make them all consistent. Outside of these periods, users would regularly come across records present in some systems that were not present in others, and would lose faith in the databases being an accurate representation of the work performed. We now had access to a method of providing near-synchronous updates of all dependent databases as records were added, edited or removed.

Transparency and Accountability

At Framestore CFC, we have developed a daemon written in C++ that monitors all of the storage systems throughout the company, and logs the filesystem structure and asset metadata to a large MySQL database. It had to be fast, efficient and very lightweight, but also needed to have accountability. The way in which it operates on a server can make it compete for disk I/O and CPU resources when large-scale render jobs running on the render farm need to access data stored on that particular storage server. It was crucial to the successful deployment of the system that the daemon be as open and transparent as possible about what it was doing at any given time. Usually, this can be handled by activity logging, but in this case we were dealing with such vast quantities of data that logging each directory currently being examined out to a file was infeasible. It could index up to 500,000 files in 3 minutes, so we had to rethink the way in which the information could be made available.

We decided that a simple set of XML-RPC methods could be implemented that would enable the desired level of information accessibility. One of the most crucial methods was the status() call. This method took no parameters, and returned a structure using our standard key:value pair encoding, in which each key names the variable returned. The variables it returned indicated the daemon's uptime, directory and file counts, the path to the file being examined, how many pending requests it had in its queue, and what type of metadata it was gathering. This was then tied in with a simple 16-line Python script to perform a status query and print the results directly out:

{'uptime': 602027, 'directory_count': 105065, 'file_count': 819322, 'current_path': '/mnt/raid1/job/common/texures/firstpass/', 'queue_length': 0, 'second_pass':0}

Each variable is instantly understandable from the key string it is given in the hash structure.
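A status query of this kind might look like the following sketch. The real daemon is written in C++; here a small Python server stands in for it so that the example is self-contained, and the loopback address, ephemeral port, and fixed counter values are assumptions made purely for illustration.

```python
import threading
import time
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer

START = time.time()


def status():
    """Hypothetical stand-in for the daemon's status() method."""
    return {
        "uptime": int(time.time() - START),
        "directory_count": 105065,
        "file_count": 819322,
        "queue_length": 0,
        "second_pass": 0,
    }


def query_status(url):
    """Client side: call status() on the service at url and return the struct."""
    with ServerProxy(url) as proxy:
        return proxy.status()


def demo():
    """Start the stand-in server on a free local port and query it once."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    server.register_function(status)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        return query_status(f"http://{host}:{port}/")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for key, value in sorted(demo().items()):
        print(f"{key}: {value}")
```

Only query_status() corresponds to the client script; everything else exists to give it something to talk to.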

The status() method call became invaluable in diagnosing performance issues during the initial deployment within the company. At one point, the daemon became extremely slow on a particular server, and an examination of the server in question yielded no obvious issues. Then we examined what the daemon was currently indexing, and noticed that the path it was looking down had become, at a certain place in the directory hierarchy, a symbolic link to another server's NFS export. Since the daemon was never intended to follow symbolic links, this highlighted a design flaw, such that when a previously indexed directory was subsequently replaced with a symbolic link to a directory on another server, there was no explicit check to see if this change had occurred, and it was followed as normal. This issue was then fixed within the code, and a new version was deployed. This episode demonstrated how the ability to query the run-time state of the daemon was an invaluable asset.
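The missing check can be sketched as follows. This is not the daemon's C++ code, only a Python illustration of the idea: every entry is tested for being a symbolic link on every pass, so a directory that has been replaced by a link since the last pass is skipped rather than followed. The function names and the temporary directory layout are hypothetical.

```python
import os
import tempfile


def index_files(root):
    """Return sorted paths of regular files under root, skipping symlinks."""
    found = []
    pending = [root]
    while pending:
        current = pending.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                # Checked on every pass: a directory indexed earlier may since
                # have been replaced by a link to another server's export.
                if entry.is_symlink():
                    continue
                if entry.is_dir(follow_symlinks=False):
                    pending.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    found.append(entry.path)
    return sorted(found)


def demo():
    """Build a throwaway tree containing a symlinked directory and index it."""
    with tempfile.TemporaryDirectory() as base:
        local = os.path.join(base, "local")
        remote = os.path.join(base, "remote")  # stands in for another server
        os.makedirs(os.path.join(local, "shots"))
        os.makedirs(remote)
        for path in (os.path.join(local, "shots", "a.dpx"),
                     os.path.join(remote, "b.dpx")):
            with open(path, "w") as handle:
                handle.write("frame")
        # The link points outside the tree being indexed.
        os.symlink(remote, os.path.join(local, "linked"))
        return [os.path.relpath(p, local) for p in index_files(local)]


if __name__ == "__main__":
    print(demo())
```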

The ease with which the developer was able to enable his application to support XML-RPC was also surprising. The simplest implementation found was one written by Chris Morley [http://xmlrpcpp.sourceforge.net/]. It is small, self-contained, supports introspection, and is completely POSIX compliant, enabling us to maintain our target cross-platform compatibility. The indexer is currently deployed within Framestore CFC on Linux, Mac OS X and SGI IRIX systems.

The indexer project highlights just how easy it can be to open up applications/programs to be accountable for the way they operate.

Ease of integration

After the success of the Filemaker XML-RPC gateway, the team were subsequently able to create a very feature-rich web-based query tool that took the data that was present in Filemaker, and transformed it into a streamlined view tailored for VFX operators and artists. This was another crucial step in opening up Filemaker, as these operators were using Linux workstations, and had no traditional way of gaining access to that data. Once this central site was in regular use, there was another core aspect of the workflow that could be centralised and turned into a service. This was the provision of a system for versioning the work on each shot, and the automatic generation of Quicktime movie clips and thumbnail images from a directory on a server that contained each frame of the shot, at the native film resolution of 2048 x 1556. The Quicktime and thumbnails also had to contain the date and version numbers 'burnt in' to the images that made up that shot.

This was to be achieved using another in-house tool developed by one of the Pipeline Tools developers which would turn these sequences of images into the movie files. This tool was a Perl script that supported a large number of flags that governed all aspects of the processing, from the resolution of the Quicktime movie file generated, to the location and font of the textual data embedded within each frame. We approached the original developer and suggested that these large numbers of flags available at run-time could also be expressed as XML, and came up with a schema that represented all of the currently available options. Once this step was done, we could concentrate on providing the necessary database calls to guarantee correct versioning upon submission of image sequences to the system.

This XML file could contain any number of tags, depending on how far the user wanted to deviate from the default settings when creating the Quicktime and thumbnail. This arbitrary number of tags mapped extremely well to our use of hashes as inputs to the XML-RPC methods, so we mapped the key-value pairs from the method invocation straight through to the backend program. This effectively let us support any new options that became available in the backend program's subsequent revisions and development.
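The pass-through mapping described above can be sketched in a few lines. The element names, the default values, and the quicktime_job root tag are hypothetical, since the paper does not give the actual schema; the sketch also assumes every key is a valid XML element name.

```python
import xml.etree.ElementTree as ET

# Hypothetical defaults; the real tool's options are not listed in the paper.
DEFAULTS = {"resolution": "2048x1556", "font": "Helvetica"}


def options_to_xml(options):
    """Merge caller options over the defaults and emit one tag per key.

    Keys are not checked against a fixed list, so an option added to the
    backend later passes straight through without changes here.
    """
    merged = {**DEFAULTS, **options}
    root = ET.Element("quicktime_job")
    for key, value in sorted(merged.items()):
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(options_to_xml({"resolution": "1024x778", "version": 3}))
```

Because the function never enumerates the permitted keys, the XML-RPC layer does not need to be redeployed when the backend gains a new flag.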

When this system was released, we visited each of the projects and introduced them to the ability to form method calls through whatever language they wanted. We prepared a Perl script, a Python script, and a shell script wrapper that in turn invoked the Perl script. Armed with these three implementations, we could win over the end users by demonstrating the principles involved, in whatever language they were most comfortable working with. We also pointed out the system.listMethods and system.methodHelp functions to them so that they could feel empowered to investigate the services that we had created, and work out a way in which they could integrate with the system as a whole.
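The two introspection calls mentioned above can be tried against any XML-RPC server that supports them. The sketch below starts a small stand-in server locally so that it is self-contained; the method name sequence.submit and its behaviour are hypothetical and are not the actual service interface.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def submit_sequence(options):
    """Submit an image sequence for versioning; options is a key/value struct."""
    return {"accepted": True, "option_count": len(options)}


def explore():
    """List the server's methods and fetch the help text for one of them."""
    server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
    # Enables system.listMethods, system.methodHelp and system.methodSignature.
    server.register_introspection_functions()
    server.register_function(submit_sequence, "sequence.submit")
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        host, port = server.server_address
        with ServerProxy(f"http://{host}:{port}/") as proxy:
            methods = proxy.system.listMethods()
            help_text = proxy.system.methodHelp("sequence.submit")
    finally:
        server.shutdown()
        server.server_close()
    return methods, help_text


if __name__ == "__main__":
    names, text = explore()
    print(names)
    print(text)
```

With this server the help text comes straight from the function's docstring, so documenting a method in the code also documents it for remote callers.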

This approach paid off: six months down the line, we have had little or no interaction with the developers who rely on these methods, and this has been considered a success by all involved.

Hands-off development

The Systems Development team, as mentioned earlier, is not the only development team within the company responsible for managing databases. In particular, there is one database of growing importance called FCDB. This system was developed by one of the Pipeline Tools members, and is used within the VFX department. It is effectively a file store and revision control system for components of a 3D scene, be it a model, a texture, or a 2D matte image. Each of these elements can be stored within FCDB and carries much the same information as would be held for files within CVS, namely a filename, version, datestamp, author, and notes field.

We asked the author of this system if he would consider writing an XML-RPC service that could query the database and he said that should be possible. Some weeks late