Where does the XML data come from?

by Michael Champion
Senior R&D Advisor, Software AG


About the Author: Michael Champion is a member of the W3C's Document Object Model Working Group and co-editor of the core XML portion of the DOM Level 1 recommendation. Champion is currently a senior R&D advisor for new technologies at Software AG.

XML is currently generating a great deal of interest as the universal language of electronic business. Much effort and expense have been spent explaining the benefits of XML technology, but not much attention has been given to answering practical questions such as "How much data is currently available in XML and where does it come from?" Which XML data is interesting to you obviously depends on your particular requirements, but it is possible to identify some general answers and point you to some tools that produce or convert XML.

In brief, there's no shortage of XML data available on the Internet, and there are lots of ways to convert legacy data to XML relatively easily. The amount of data and the number of support tools have increased very noticeably in the past year, and both will surely keep growing rapidly in the years to come.

In fact, most enterprises will probably soon find themselves overwhelmed by XML data. Much of it may originate in all sorts of non-XML sources or be generated by "middleware" components and applications, yet it will have lasting value and will need to be persistently stored. As this scenario unfolds, many organizations will find it necessary to have a scalable, reliable database such as Software AG's Tamino, which uses XML and Internet standards to store, retrieve, and query all this data.

Note that the companies and products mentioned here are intended to be representative of what is possible today, and are by no means an exhaustive list of what is available.

XML on the Web or in messages

Over the next year or two, more and more data that you will come across in the normal course of your business will be in XML format.

  • XHTML. This dialect of HTML in well-formed XML syntax is becoming fairly common on the Internet. For example, presents much of its content in XHTML.
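The practical consequence of XHTML being well-formed XML is that an ordinary XML parser can read a page directly, with no HTML-specific tooling. The following is a minimal sketch using Python's standard library; the sample page and the helper names `extract_headings` and `is_well_formed` are assumptions made for illustration and do not come from any site or product mentioned here.

```python
import xml.etree.ElementTree as ET

# Namespace that XHTML documents declare on their root element.
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical XHTML page, invented for this illustration.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Quarterly report</title></head>
  <body>
    <h1>Quarterly report</h1>
    <p>Revenue rose.<br/>Costs fell.</p>
  </body>
</html>"""


def extract_headings(xhtml_text):
    """Return the text of every h1 element, using a plain XML parser."""
    root = ET.fromstring(xhtml_text)
    return [h.text for h in root.iter("{%s}h1" % XHTML_NS)]


def is_well_formed(text):
    """Report whether the text parses as XML at all."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(extract_headings(SAMPLE))
    # Typical "tag soup" HTML, with an unclosed <br>, is rejected.
    print(is_well_formed("<p>unclosed<br></p>"))
```

The same parser rejects ordinary HTML that leaves elements unclosed, which is why conventional Web pages have to be cleaned up before they can be handled as XML data.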

Creating XML

The sorts of tools that currently produce data in proprietary binary formats -- word processors, spreadsheets, data entry forms, and so on -- have already begun to be supplemented by equivalent products that produce XML. The biggest vendors, especially Microsoft, have shown a clear commitment to accelerating this trend by saving data in XML format. In the meantime, you can employ products such as:

  • XMetaL or other word-processor-like applications that can be used by ordinary office workers without XML expertise to produce documents in XML format.
  • Tools are available that produce XML data from online forms that ordinary users can easily fill out. See the offerings from icomXpress and JetForm.
  • eNumerate is developing a spreadsheet-like application that will produce XML data in a format that can be displayed in browsers via XSL and graphed, plotted, etc. by a free browser plug-in.

Exporting XML

As all the companies that have jumped on the XML bandwagon actually implement XML support in their products, it will be increasingly common to be able to simply export data from existing tools in XML format.

  • MS Office 2000 exports specialized markup data in XML "islands" inside an HTML data format that is almost well-formed XML.
  • ERP and other enterprise-level systems are increasingly supporting XML as an output format. See for one prominent example.
  • Software design tools such as Rational Rose are supporting the XMI XML format for exchange of UML diagrams, rules, etc.

Converting to XML

Finally, a number of specialized tools are being designed to easily convert data in conventional databases and flat files into XML syntax.

  • TextPipe is a Windows GUI stream editor that works in a similar manner to Unix sed, perl, grep, etc., converting the data to XML format and optionally generating a DTD to describe the result.
  • Dave Raggett's famous tidy program easily converts messy, non-standard HTML such as that found on the Web to well-formed XHTML.
  • upCast, from infinity loop, has both client-side and server-side tools which convert the RTF format supported by Microsoft and other word processor vendors into XML, using heuristics to recreate the logical structure from the layout.
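To give a feel for what such a conversion involves, here is a minimal sketch that turns comma-separated flat-file data into XML using only Python's standard library. It is not how any of the products above work internally. The function name `csv_to_xml`, the element names `records` and `record`, and the sample data are all assumptions made for illustration, and the sketch further assumes that every column header is already a valid XML element name.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="records", row_tag="record"):
    """Convert CSV text with a header row into an XML string.

    Each data row becomes one <record> element, and each column
    becomes a child element named after its header.
    Assumption: headers are valid XML names (no spaces, no leading digits).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    root = ET.Element(root_tag)
    for row in reader:
        record = ET.SubElement(root, row_tag)
        for field, value in row.items():
            child = ET.SubElement(record, field)
            child.text = value  # special characters are escaped on output
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical flat-file content.
    flat_file = "id,name\n1,Smith & Sons\n2,Acme\n"
    print(csv_to_xml(flat_file))
```

Building the result through an XML library rather than by string concatenation means that characters such as `&` and `<` in the legacy data are escaped automatically, so the output stays well-formed.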

