XTech 2005: XML, the Web and beyond.

Periodical Reports from Local Governments to the Japanese Central Government

Abstract

This project is concerned with periodical reports from local governments to the Ministry of Internal Affairs and Communications (Japan). To fully exploit data in these periodical reports, this project is designing XML documents, schemas, and user interfaces.

Introduction

The Ministry of Internal Affairs and Communications (MIC) in Japan created the E-Local Governments System Development Panel. The goal of this panel is to allow e-local governments in Japan to be developed easily and consistently. One of the working groups reporting to this panel is the Working Group for liaison between the central government and local governments. Under the auspices of this working group, a number of developers and officials are conducting a project to build such a liaison system.

This project is initially concerned with periodical reports from local governments to the Ministry of Internal Affairs and Communications, but is expected to cover other information interchange between local governments and ministries. The focus in 2004 was the design and analysis of periodical reports and the workflow around them. In 2005, we plan to continue the design and analysis and to implement a pilot system for liaison between the central government and local governments.

Simply put, local governments in Japan are Prefectures, Government Ordinance Cities, or Municipalities. Municipalities report to prefectures, while government ordinance cities report to prefectures as well as the central government. Local governments depend on the central government for most funding.

Data standardization

Document analysis

There are 293 periodical reports from local governments to the Ministry of Internal Affairs and Communications. Moreover, each of them has one version for prefectures and another for municipalities.

We analyzed two periodical reports. Both are dominated by numerical data and have mostly tabular structures, although other periodical reports contain prose as well as numerical data. As is usual in document analysis, we eliminated layout information and focused on structural and semantic information.

We find that periodical reports for municipalities and those for prefectures share common structures, but they have many differences. Some of the differences are fundamental and required, but others are caused by mistakes or layout constraints. We have attempted to make the two types of reports more similar by adopting logical decompositions of reports.
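One way to make such a comparison concrete is to list the element paths that occur in each version of a report and then look at what is shared and what is not. The following is a minimal sketch under stated assumptions: the two sample documents and all element names are invented for illustration and are not taken from the actual reports.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified report instances (names are invented).
PREFECTURE = """
<report>
  <header><name/><fiscalYear/></header>
  <revenue><tax/><bonds/></revenue>
</report>
"""

MUNICIPALITY = """
<report>
  <header><name/><fiscalYear/><prefecture/></header>
  <revenue><tax/></revenue>
</report>
"""


def element_paths(xml_text):
    """Return the set of slash-separated element paths found in a document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(element, prefix):
        path = prefix + "/" + element.tag
        paths.add(path)
        for child in element:
            walk(child, path)

    walk(root, "")
    return paths


def compare(first, second):
    """Return (shared paths, paths only in first, paths only in second)."""
    a, b = element_paths(first), element_paths(second)
    return sorted(a & b), sorted(a - b), sorted(b - a)


if __name__ == "__main__":
    shared, only_prefecture, only_municipality = compare(PREFECTURE, MUNICIPALITY)
    print("shared:", shared)
    print("prefecture only:", only_prefecture)
    print("municipality only:", only_municipality)
```

Paths that appear in only one version are candidates for review: each is either a required difference or an accident of the original layout.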

We also find that some pieces of information repeatedly appear in several periodical reports. Common schema components are appropriate for such common pieces of information.

Schemas and schema languages

Three grammar-based schema languages have come to be widely recognized: DTD, W3C XML Schema, and RELAX NG. We have adopted RELAX NG, since (1) it is simple and powerful, and (2) RELAX NG schemas can be automatically converted to W3C XML Schema and DTD by trang. We created our schemas by first writing XML documents by hand and then generating RELAX NG schemas from them with trang. We then converted these RELAX NG schemas to DTD and W3C XML Schema, again with trang.
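The schema workflow described above can be sketched as three trang invocations. This is a minimal sketch, not the project's actual build script: it assumes trang is available as trang.jar in the working directory, and the file names (report-sample.xml, report.rng, and so on) are hypothetical. Trang takes the input file or files first and the output file last, and infers both formats from the file extensions.

```python
import subprocess

# Assumption: trang is available as a jar file; adjust the path as needed.
TRANG = ["java", "-jar", "trang.jar"]


def trang_command(inputs, output):
    """Build a trang command line: input file(s) first, output file last."""
    return TRANG + list(inputs) + [output]


def build_pipeline(sample_documents, basename):
    """Return the three trang invocations described in the text."""
    rng = basename + ".rng"
    return [
        trang_command(sample_documents, rng),     # hand-written XML -> RELAX NG
        trang_command([rng], basename + ".xsd"),  # RELAX NG -> W3C XML Schema
        trang_command([rng], basename + ".dtd"),  # RELAX NG -> DTD
    ]


def run_pipeline(commands):
    """Run each command in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only; nothing is executed here.
    for command in build_pipeline(["report-sample.xml"], "report"):
        print(" ".join(command))
```

A schema inferred from sample documents reflects only what those samples happen to contain, so it would normally be reviewed and tightened by hand before conversion.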

Schematron

Periodical reports contain a large amount of numerical data, and some of the data are computationally dependent on others. To capture such dependencies, we adopted Schematron. Schematron rules use XPath expressions to refer to elements and attributes in XML documents.
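A typical dependency of this kind is a total that must equal the sum of its items. The sketch below imitates such a rule in Python rather than in Schematron itself: the path selecting each subsidy element plays the role of a rule context, and the comparison plays the role of an assertion. The element names (report, subsidy, item, total) and the sample figures are invented for illustration and do not come from the actual reports.

```python
import xml.etree.ElementTree as ET

# Hypothetical report fragment: element names are invented for illustration.
SAMPLE = """
<report>
  <subsidy>
    <item>1200</item>
    <item>800</item>
    <total>2000</total>
  </subsidy>
  <subsidy>
    <item>500</item>
    <item>250</item>
    <total>700</total>
  </subsidy>
</report>
"""


def check_totals(xml_text):
    """Return one message per <subsidy> whose <total> differs from the sum of its <item>s."""
    root = ET.fromstring(xml_text)
    failures = []
    # The path below plays the role of a Schematron rule context.
    for index, subsidy in enumerate(root.findall("./subsidy"), start=1):
        items = sum(int(item.text) for item in subsidy.findall("item"))
        total = int(subsidy.findtext("total"))
        # This comparison plays the role of a Schematron assertion,
        # roughly the XPath test: total = sum(item)
        if total != items:
            failures.append(
                f"subsidy {index}: total is {total}, but items sum to {items}"
            )
    return failures


if __name__ == "__main__":
    for message in check_totals(SAMPLE):
        print(message)
```

In this sample the first subsidy passes and the second is reported, since 500 + 250 is 750 rather than 700.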

Datatype library

We have developed a set of common datatypes. These datatypes handle the Japanese currency (yen), the Japanese calendar, and so forth. These datatypes are defined in RELAX NG but are derived from those in W3C XML Schema Part 2.
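To give a feel for what such datatypes have to handle, the following sketch parses a yen amount and converts a Gregorian date to a Japanese era year. It is an illustration only and does not reproduce the project's RELAX NG definitions: treating a yen amount as a non-negative integer is an assumption, the function names are hypothetical, and the era table lists only three eras.

```python
import re
from datetime import date

# Era start dates in the Gregorian calendar, newest first.
# Only three eras are listed; this partial table is an illustrative assumption.
ERAS = [
    ("Heisei", date(1989, 1, 8)),
    ("Showa", date(1926, 12, 25)),
    ("Taisho", date(1912, 7, 30)),
]

# Assumption: a yen amount is a non-negative integer written with ASCII digits.
YEN_PATTERN = re.compile(r"[0-9]+")


def parse_yen(text):
    """Parse a yen amount, ignoring surrounding whitespace."""
    value = text.strip()
    if not YEN_PATTERN.fullmatch(value):
        raise ValueError(f"not a valid yen amount: {text!r}")
    return int(value)


def to_japanese_era(day):
    """Convert a Gregorian date to (era name, year within the era)."""
    for name, start in ERAS:
        if day >= start:
            # The first (partial) calendar year of an era counts as year 1.
            return name, day.year - start.year + 1
    raise ValueError(f"date {day} precedes the eras in the table")


if __name__ == "__main__":
    print(parse_yen("1500"))
    print(to_japanese_era(date(2005, 5, 25)))
```

For example, a date in May 2005 falls in Heisei 17, while 7 January 1989 is still Showa 64 because the Heisei era began the following day.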

EGIX

We plan to use EGIX for representing glyphs that are not available in Unicode.

Future work

Although our schemas use common schema components, we feel that they are not yet fully modularized. For example, they do not directly capture the similarities between periodical reports from prefectures and those from municipalities. Moreover, some groups of periodical reports (e.g., those about government subsidies) appear to be similar. We plan to modularize our schemas so that they capture such similarities directly.

We also plan to implement user interfaces for editing periodical reports. We are strongly interested in the use of XForms. However, as a short-term solution, we are also interested in using OpenOffice.org by converting our XML documents to the OpenDocument format and vice versa.

Bibliography

[Prefectures] Prefectures of Japan
[Government Ordinance Cities] Government Ordinance Cities
[Municipalities] Municipality of Japan
[RELAX NG] Document Schema Definition Languages -- Regular-grammar-based validation - RELAX NG
[W3C XML Schema] W3C XML Schema
[Schematron] Document Schema Definition Languages -- Rule-based validation - Schematron
[EGIX] Embedding Glyph Identifiers in XML Documents, 20 December 2002
[trang] Trang

Biography

Makoto Murata

Affiliated Researcher, International University of Japan http://www.iuj.ac.jp

MURATA is a co-editor of the RELAX NG specifications of OASIS, the editor of its predecessor, RELAX Core of ISO/IEC, and the editor of ISO/IEC DSDL Part 4. He has been promoting hedge or tree automata as formal models for schemas since 1994. He was a member of the original XML WG, a co-editor of the media-type RFCs for XML, and the editor of the Japanese XML Profile. He is now involved in e-Japan as a member of the Working Group for system integration between the central government and local governments and of the Working Group for Data Standardization, both of which are subordinate to the E-Local Governments System Development Panel. He graduated from Kyoto University.