Let's Dissect
Let's dissect the pieces of my company roster XML document to see each piece's role and responsibility.
Header:
The header, formally called the XML declaration, tells the document's user that this is an XML document and, in this case, that it uses version 1.0 of the XML specification.
<?xml version="1.0"?>
<company name="Information Strategies">
  <employees>
    <employee id="1">Hank Aaron</employee>
    <employee id="2">Babe Ruth</employee>
  </employees>
</company>
Tags (brackets, greater than, less than):
Just like in HTML, you use less than ("<") and greater than (">") signs, also called angle brackets, to delimit the tags that mark the opening and closing of an element.
<?xml version="1.0"?>
<company name="Information Strategies">
  <employees>
    <employee id="1">Hank Aaron</employee>
    <employee id="2">Babe Ruth</employee>
  </employees>
</company>
Elements:
Elements are the basic building blocks of XML. They may contain text, comments, or other elements, and consist of a start tag and an end tag. Typically, XML elements are akin to nouns in the real world. They represent people, places, or things.
<?xml version="1.0"?>
<company name="Information Strategies">
  <employees>
    <employee id="1">Hank Aaron</employee>
    <employee id="2">Babe Ruth</employee>
  </employees>
</company>
Note that in XML, every opening tag (e.g. "<company>") must have a matching closing tag (e.g. "</company>"). The closing tag consists of the name of the element, prefixed with a slash ("/"). XML is also case-sensitive: while "<company></company>" is well-formed, "<COMPANY></company>" and "<Company></cOMPANY>" are not.
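One quick way to check these rules is to hand each snippet to an XML parser and see whether it is accepted. The sketch below assumes Python's standard xml.etree.ElementTree module; the helper name is_well_formed is hypothetical and exists only for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False if it raises ParseError."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Opening and closing tags match exactly, including case: accepted.
print(is_well_formed("<company></company>"))   # True

# Same letters, different case: the parser sees two different names.
print(is_well_formed("<COMPANY></company>"))   # False
print(is_well_formed("<Company></cOMPANY>"))   # False

# An opening tag with no closing tag is rejected as well.
print(is_well_formed("<company>"))             # False
```

A parser only confirms that a document is well-formed; it says nothing about whether the element names make sense for your data.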
Also, if the element does not contain text or other elements, you may abbreviate it to a single empty-element tag by adding a slash ("/") before the closing bracket (e.g. "<company></company>" can be abbreviated as "<company />").
In addition to the rules for opening and closing tags, a well-formed XML document must nest all of its elements properly: each element must close before its parent closes, and the whole document must sit inside a single root element. The previous document properly nests the "<employee>" elements within the "<employees>" element, but the following would not be acceptable as an XML document because the second "<employee>" element sits outside the "<employees>" element, leaving two top-level elements instead of one root:
<employees>
  <employee id="1">Hank Aaron</employee>
</employees>
<employee id="2">Babe Ruth</employee>
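The abbreviation and nesting rules can be tried out the same way. This is a minimal sketch, again assuming Python's standard xml.etree.ElementTree module; the helper name parses and the overlapping-tags snippet are additions for illustration and do not come from the roster document.

```python
import xml.etree.ElementTree as ET


def parses(xml_text):
    """Return True if xml_text is accepted by the parser as a whole document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# The long and abbreviated forms of an empty element parse to the same result.
long_form = ET.fromstring("<company></company>")
short_form = ET.fromstring("<company />")
print(long_form.tag == short_form.tag)    # True
print(len(long_form), len(short_form))    # 0 0  (no child elements in either)

# The fragment shown above: the second <employee> sits outside <employees>,
# so there are two top-level elements and the parse fails.
two_roots = (
    '<employees><employee id="1">Hank Aaron</employee></employees>'
    '<employee id="2">Babe Ruth</employee>'
)
print(parses(two_roots))                  # False

# Overlapping tags (illustrative only): <employees> closes before <employee>.
overlapping = "<employees><employee>Hank Aaron</employees></employee>"
print(parses(overlapping))                # False

# Wrapping the fragment in a single root makes it parse again, although the
# second employee is then a sibling of <employees> rather than a child.
print(parses("<company>" + two_roots + "</company>"))  # True
```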
Attributes:
Where elements represent the nouns contained in an XML document, attributes represent the adjectives that describe the elements. The following document tells me that Hank Aaron's id is "1" and that Babe Ruth's is "2". This helps to describe these two employees.
<?xml version="1.0"?>
<company name="Information Strategies">
  <employees>
    <employee id="1">Hank Aaron</employee>
    <employee id="2">Babe Ruth</employee>
  </employees>
</company>
Note that in order to be well-formed, all attribute values must be enclosed in quotation marks (single or double). id="1" is correct, while id=1 is not acceptable. This is a marked difference from standard HTML, which places much looser restrictions on what is acceptable.
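To see attributes from a program's point of view, the following sketch reads them back out of the roster document. It assumes Python's standard xml.etree.ElementTree module; the variable names are chosen for illustration only.

```python
import xml.etree.ElementTree as ET

roster = """<?xml version="1.0"?>
<company name="Information Strategies">
  <employees>
    <employee id="1">Hank Aaron</employee>
    <employee id="2">Babe Ruth</employee>
  </employees>
</company>"""

root = ET.fromstring(roster)

# Attributes are exposed as name/value pairs on each element.
print(root.get("name"))   # Information Strategies

# Attribute values always come back as strings, even when they look numeric.
employee_ids = [employee.get("id") for employee in root.iter("employee")]
print(employee_ids)       # ['1', '2']

# An unquoted attribute value is not well-formed, so the parser rejects it.
try:
    ET.fromstring("<employee id=1>Hank Aaron</employee>")
    unquoted_ok = True
except ET.ParseError:
    unquoted_ok = False
print(unquoted_ok)        # False
```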
Text/Content:
Elements contain content that carries the critical information about them. This content represents the entity itself in an XML document. In the following document, the text "Hank Aaron" is the first employee and "Babe Ruth" is the second.
<?xml version="1.0"?>
<company name="Information Strategies">
  <employees>
    <employee id="1">Hank Aaron</employee>
    <employee id="2">Babe Ruth</employee>
  </employees>
</company>
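The text content of an element can be read back just as easily as its attributes. This is a minimal sketch assuming Python's standard xml.etree.ElementTree module; the names_by_id dictionary is a hypothetical convenience and not part of the roster format.

```python
import xml.etree.ElementTree as ET

roster = """<?xml version="1.0"?>
<company name="Information Strategies">
  <employees>
    <employee id="1">Hank Aaron</employee>
    <employee id="2">Babe Ruth</employee>
  </employees>
</company>"""

root = ET.fromstring(roster)

# The text between an element's opening and closing tags is its content.
names = [employee.text for employee in root.iter("employee")]
print(names)         # ['Hank Aaron', 'Babe Ruth']

# Pairing each attribute (the adjective) with the content (the noun).
names_by_id = {employee.get("id"): employee.text
               for employee in root.iter("employee")}
print(names_by_id)   # {'1': 'Hank Aaron', '2': 'Babe Ruth'}
```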
As you can see, the syntax of XML and HTML is practically identical, with the exception that XML is far less lenient when it comes to case sensitivity, closing tags, quoting attribute values, and properly nesting parent/child elements. This is excellent news for Web developers everywhere, as it means that if you already write well-formed HTML, you'll find the transition to XML virtually seamless.
Jeff Jones
About the author:
Jeff Jones
For more information on XHTML, XML, and the W3C, check out the W3C website at http://www.w3c.org.