XTech 2005: XML, the Web and beyond.
Most web content is authored for PC-style screens of a certain size. Reformatting content for screens of other sizes is necessary in order to extend the reach of the web beyond the desktop. The presentation will describe methods for making content available on small – and not-so-small – devices without incurring significant costs for authors. On the client side, the browser can alter the size, position and presentation of various elements based on heuristics. On the author side, writing style sheets for different classes of devices is a simple but powerful way of making content available on screens of different sizes. As more non-PC devices are used to access the web, markup should evolve to support, for example, the need for better forms and menus. Finally, the presentation will describe how web applications can be authored for mobile devices using common web standards.
HTML is, 15 years after it was created, the dominant content format on the web. The definition and conventions for using HTML have evolved over time. In the beginning, HTML was a simple structured document format with markup tags added between text strings to indicate the role of the text. For example, a string of text could be marked as a paragraph, while another string could be marked as a clickable link. The elements in early HTML were logical rather than presentational. For example, HTML would mark some text as a heading but would not describe how the heading was to be presented. The presentation of text – including the font, color and size to be used – was primarily determined by the browser.
The web quickly gained momentum. With the launch of the National Center for Supercomputing Applications (NCSA) Mosaic browser in 1993 [Andreessen 1993a], users suddenly had an attractive browser to surf a steadily growing set of interlinked documents. With an increasing number of users, more authors were attracted to the web, and content proliferated.
As the web attracted attention outside of scientific environments, authors started complaining that they did not have enough influence over the appearance of their pages. One of the most frequent questions asked by authors new to the web was how to change fonts and colors of elements. This excerpt from a message sent to the www-talk [www-talk] mailing list early in 1994 [Andreessen 1994a] gives a sense of the tension between authors and browser implementors:
In fact, it has been a constant source of delight for me over the past year to get to continually tell hordes (literally) of people who want to -- strap yourselves in, here it comes -- control what their documents look like in ways that would be trivial in TeX, Microsoft Word, and every other common text processing environment: 'Sorry, you're screwed.'
The author of the message was Marc Andreessen, one of the programmers behind the popular NCSA Mosaic browser. He later became a co-founder of Netscape, which fulfilled authors' requests by introducing presentational tags in HTML. In 1994, Netscape announced [Andreessen 1994b] the first beta release of their browser. The Netscape browser supported a set of new presentational HTML tags (e.g. CENTER to center text) and more were to follow.
As a result of these developments, many HTML pages on the web are presentation-oriented. Pages are authored on and for PC-type devices, and the use of tables to achieve certain layouts (rather than to present tabular data) is widespread.
The introduction of new kinds of devices, in particular mobile devices with small screens, has created new challenges. First, browser vendors have been challenged to develop browsers that display legacy content on small screens. Second, the web community has been challenged to find new ways of encoding content which are independent of devices. This paper will describe how the challenges have been addressed.
Mobile devices are developed to fit into pockets and handbags. This limits their physical size and restricts the size of the screens, and presenting typical web pages on these units is a challenge.
On the desktop, browsers use scrollbars to deal with content that is too large for a window. Users use scrollbars to zoom/pan large pages. Scrollbars can also be employed on small devices, but handset manufacturers will often insist that scrolling should be limited to one dimension; vertical scrolling is accepted, but horizontal is not. This limitation makes it impossible to use zooming and panning as a way of showing large pages on small screens.
One way to deal with this limitation is to reformat pages to fit small screens. Reformatting is more radical than zooming/panning as it involves breaking conventions on how to display content. Consider this screenshot:

On the left side of the figure is a Web page displayed by a normal desktop browser. On the right side, the same page has been reformatted for a small screen. Among the changes that the reformatting has caused are:
Note, however, that all content is still available to the user.
By using heuristics like these rules, most web pages can be reformatted for small screens. Not all pages will look great and some functionality will be missing, but the user experience can be very good.
Dividing devices into "small" and "large" is simplistic: current web devices come in many different form factors and browsers need to support a range of screen sizes. Further, some devices can be used both in "portrait" and "landscape" mode. This provides both an opportunity and a problem for browsers. It is an opportunity as bigger screens allow more content to be shown. It is a problem since users typically expect more of the original page layout to be preserved on a bigger screen. One way to deal with this issue is shown in the figure below where the width of columns is adjusted as screen width is constrained. Ultimately, the columns become too narrow and the number of columns must be reduced.

Starting around 2000, Cascading Style Sheets (CSS) has been interoperably supported in major browsers and designers have increasingly used CSS to express their designs rather than HTML tables. One of the perceived benefits of using CSS is that the content and structure of the document are separated from its presentation. As such, pages using CSS should be easier to reformat than pages using tables. There are several reasons why this is not always the case.
As designers gain more experience with different kinds of devices, these difficulties are likely to be overcome and CSS will prove to be an important tool for encoding scalable designs.
HTML4 and CSS2 currently support media-dependent style sheets tailored for different media types. Here is a simple example that sets different colors on different types of devices:
@media screen { body { color: green }}
@media handheld { body { color: blue }}
@media print { body { color: black }}
CSS3 Media Queries [MQ] extend the functionality of media types by allowing more precise labeling of style sheets. Here is an example:
@media screen and (min-width: 600px) { body { font-size: 14px }}
@media screen and (max-width: 600px) { body { font-size: 12px }}
In the example above, the font size of the body element will depend on the width of the window; if the width is more than 600 pixels, the font size will be 14 pixels, otherwise 12 pixels. The width of the window is one of several "media features" that can be queried. Among the others are: window height, height/width of the device, the aspect ratio of the device, and the color capabilities of the device.
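As a rough sketch of how some of the other media features might be used, the page below queries the window height, the device aspect ratio and the color capabilities. The feature names come from the Media Queries draft; the selector names, thresholds and URL are illustrative assumptions only.

```html
<!DOCTYPE html>
<html>
<head>
<title>Media features sketch</title>
<style>
  /* Short windows: hide a (hypothetical) decorative banner. */
  @media screen and (max-height: 400px) { #banner { display: none } }

  /* Devices whose screen has a 16:9 aspect ratio: add side margins. */
  @media screen and (device-aspect-ratio: 16/9) { body { margin: 0 10% } }

  /* Devices with at most 2 bits per color component (monochrome
     devices report 0): do not rely on color alone to mark links. */
  @media all and (max-color: 2) { a { text-decoration: underline } }
</style>
</head>
<body>
<div id="banner">Banner</div>
<p>Content with a <a href="http://www.example.com/">link</a>.</p>
</body>
</html>
```

Each rule applies only where its query matches, so the same markup serves all devices.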
Media queries can also be expressed inside HTML documents:
<link rel="stylesheet" media="screen and (device-height: 600px)" />
They can also be expressed in XML documents, using the xml-stylesheet processing instruction:
<?xml-stylesheet media="all and (min-color-index: 256)"
href="http://www.example.com/..." ?>
Media queries allow authors to target their content for many types of devices just by providing different style sheets.
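The column behavior discussed earlier, where columns narrow with the window until their number must be reduced, could be approximated by an author with media queries. The following is a minimal sketch rather than a recommended design; the class name "column" and the 600 and 320 pixel breakpoints are assumptions chosen for illustration.

```html
<!DOCTYPE html>
<html>
<head>
<title>Scalable columns sketch</title>
<style>
  /* Default: three floated columns that narrow as the window narrows. */
  .column { float: left; width: 33.3% }

  /* Narrower windows: reduce to two columns. */
  @media screen and (max-width: 600px) {
    .column { width: 50% }
  }

  /* Very narrow windows, and handheld devices: a single column. */
  @media screen and (max-width: 320px), handheld {
    .column { float: none; width: auto }
  }
</style>
</head>
<body>
<div class="column">First column</div>
<div class="column">Second column</div>
<div class="column">Third column</div>
</body>
</html>
```

The content is identical in all three layouts; only the style rules that match the device change.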
As described in the previous section, style sheets may describe different presentations for different devices without changing the content itself. However, the markup of many current web pages is clearly not optimal for presentation on a range of devices.
Over the years, several attempts have been made to change how documents are marked up. In 1997, the "Wireless Application Protocol" (WAP) was first presented. WAP consisted of a set of specifications to bring advanced applications and internet content to digital mobile phones. One of the specifications was the "Wireless Markup Language" (WML), which was proposed as a markup language for wireless devices. WML had only limited success, although it is still in use by some.
In 2000, W3C published the XHTML specification. XHTML is a reformulation of HTML using XML as the base syntax. XHTML represents an attempt to clean up the markup on the web. Authors must follow XML's draconian parsing rules; otherwise, browsers will refuse to process the documents. For many authors, XHTML creates more problems than it solves, and XHTML is not widely used on the web today.
These examples show how difficult it is to change how content is published on the web. It seems unlikely that a new markup language will revolutionize web publishing in the foreseeable future. Instead, HTML and how it is used will slowly evolve.
Two current trends in specification development are likely to influence HTML's evolution. First, an effort to maintain and evolve the HTML4 specification has been started. WebForms 2.0 [WF2], a specification developed by the "Web Hypertext Application Technology Working Group" (WHAT WG), builds on HTML4 forms. The specification encodes current practice, fixes some errors, and adds functionality that has been requested. WebForms can be used both with HTML and XHTML.
Second, the concept of "microformats" [MF] might lead to HTML being extended by several smaller languages. There is no strict definition of what a microformat is. Rather, the development of microformats is guided by a set of loosely defined principles. Microformats are small additions to HTML, rather than languages of their own. Microformats are designed for humans first, and machines second. Microformats are compatible with deployed applications and do not try to change everyone's behavior. Examples of microformats are XFN (which encodes information about personal relationships in HTML) and hCard (which is a representation of vCard in HTML).
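To give a sense of the kind of functionality WebForms 2.0 adds to HTML4 forms, here is a small sketch using a few of its additions: the "required" attribute, the "email" and "date" input types, and a "datalist" of suggestions. The form fields, the action URL and the option values are hypothetical.

```html
<!DOCTYPE html>
<html>
<head>
<title>WebForms 2.0 sketch</title>
</head>
<body>
<form action="http://www.example.com/subscribe" method="post">
  <!-- "required" lets the browser refuse to submit an empty field -->
  <p><label>Name <input type="text" name="name" required></label></p>

  <!-- typed controls let a device pick a suitable input method -->
  <p><label>Email <input type="email" name="email" required></label></p>
  <p><label>Start date <input type="date" name="start"></label></p>

  <!-- a text field with suggested values, useful where typing is slow -->
  <p><label>Browser <input type="text" name="browser" list="browsers"></label></p>
  <datalist id="browsers">
    <option value="Opera">
    <option value="Mosaic">
  </datalist>

  <p><input type="submit" value="Subscribe"></p>
</form>
</body>
</html>
```

Browsers that do not recognize the new input types treat them as plain text fields, so the form stays usable in deployed browsers.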
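As an illustration of how small these additions are, the sketch below marks up a contact with hCard class names and a link with XFN "rel" values. The person, organization and URLs are invented for the example.

```html
<!DOCTYPE html>
<html>
<head>
<title>Microformats sketch</title>
</head>
<body>
<!-- hCard: ordinary HTML whose class names map to vCard properties -->
<div class="vcard">
  <a class="fn url" href="http://www.example.com/">Jane Example</a>,
  <span class="org">Example Corp</span>,
  <a class="email" href="mailto:jane@example.com">jane@example.com</a>
</div>

<!-- XFN: the rel attribute describes the relationship to the linked person -->
<p><a href="http://friend.example.org/" rel="friend met">A friend's page</a></p>
</body>
</html>
```

A browser that knows nothing about microformats renders this as a normal name, organization and link, which is what makes the approach compatible with deployed applications.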
There should be room for a wide range of devices on the web. In order to make these devices first-class citizens of the web, two developments must happen. First, browser vendors must continue their efforts to reformat current web documents for small screens. Second, the web community must evolve the languages used for web authoring so that content is not tied to certain devices.
Håkon Wium Lie
CTO, Opera Software http://www.opera.com
Håkon Wium Lie is the CTO of Opera Software. His job is to make sure Opera remains a better, smaller and faster browser than the one you know. Before joining Opera in 1999, Håkon worked at W3C where he was responsible for the development of Cascading Style Sheets (CSS), a concept he first proposed while working with Tim Berners-Lee at CERN in 1994. Håkon holds an MS degree in Visual Studies from the MIT Media Lab.