XML File Components

This page provides some general information pertaining to XML files and their creation.


Elements

<industry>
  <Company category="Mail">
    <name>Window Book</name>
    <founded>1988</founded>
    <CEO>Jeff Peoples</CEO>
    <location>Cambridge, MA</location>
  </Company>
  <products>
    <type category="Software">
      <product1>DAT-MAIL</product1>
      <product2>MailDrop Engine</product2>
    </type>
  </products>
</industry>
The term “Element” comes up frequently when discussing XML files or documents. Elements are the building blocks of XML documents and can contain one or more of the following:

  • Text (for example, <name>, <founded>, <CEO>, <location>, <product1>, and <product2> contain text content).

  • Attributes (for example, <Company> has the attribute category="Mail" and <type> has the attribute category="Software"). Attribute values must always be quoted.

  • Other elements (for example, <industry>, <Company>, and <products> contain child elements).

XML files or documents must have exactly one Root element, which is the parent of all other elements. In the example used here, <industry> is the Root element. Every element requires an opening tag and a closing tag (an empty element may instead use the self-closing form described under Empty Elements); see Tags.
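As a rough illustration of the parts described above, the sketch below loads the sample document with Python's standard xml.etree.ElementTree module and reads back the root element, an attribute, and some text content. The choice of Python is an assumption made for illustration only; it is not something this page prescribes.

```python
import xml.etree.ElementTree as ET

# The sample document from this page, with the <type> element closed.
SAMPLE = """
<industry>
  <Company category="Mail">
    <name>Window Book</name>
    <founded>1988</founded>
    <CEO>Jeff Peoples</CEO>
    <location>Cambridge, MA</location>
  </Company>
  <products>
    <type category="Software">
      <product1>DAT-MAIL</product1>
      <product2>MailDrop Engine</product2>
    </type>
  </products>
</industry>
""".strip()

root = ET.fromstring(SAMPLE)           # the Root element
company = root.find("Company")         # a child element of the root

print(root.tag)                        # industry
print(company.get("category"))         # Mail  (attribute value)
print(company.findtext("name"))        # Window Book  (text content)
print([child.tag for child in root])   # ['Company', 'products']
```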

Empty Elements

An element with no content is said to be ‘empty’. An empty element can be written in either of two ways (both forms are equivalent):

<element></element>

Or

<element />

In addition, empty elements can have attributes.
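A small check, again using Python's standard library as an assumed tool, suggests that a parser treats the two forms identically and that an empty element can still carry attributes. The category attribute is borrowed from the earlier sample purely for illustration.

```python
import xml.etree.ElementTree as ET

long_form = ET.fromstring("<element></element>")
short_form = ET.fromstring("<element />")
with_attr = ET.fromstring('<element category="Mail" />')

# Both forms yield an element with no text and no child elements.
print(long_form.text, len(long_form))    # None 0
print(short_form.text, len(short_form))  # None 0

# An empty element may still have attributes.
print(with_attr.attrib)                  # {'category': 'Mail'}
```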

Element Naming Rules

Choose descriptive names such as <title>, <firstname>, and <lastname>, and keep them short and simple: <book_title>, not <the_title_of_the_book>.

XML Elements must follow these naming rules:

  • Element names are case-sensitive.

  • Element names must start with a letter or underscore.

  • Element names should not start with the letters “xml” (in any case: “XML”, “Xml”, etc.); names beginning this way are reserved.

  • Element names can contain letters, digits, hyphens, underscores, and periods.

  • Element names cannot contain spaces.

Things to avoid:

  • Avoid “-”. If something is named “first-name”, some software may confuse this with a subtraction.

  • Avoid “.”. If something is named “first.name”, some software may think that “name” is a property of the object “first”.

  • Avoid “:”. Colons are reserved for Namespaces.
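The naming rules can be approximated with a short check. The sketch below is a simplified, ASCII-only reading of the rules listed on this page; the function name is_valid_element_name is hypothetical, and real XML names also permit many non-ASCII letters that this pattern ignores. It does not flag the characters listed under "Things to avoid", since those are recommendations rather than rules.

```python
import re

# ASCII-only approximation: starts with a letter or underscore, then
# letters, digits, hyphens, underscores, or periods. No spaces.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")

def is_valid_element_name(name: str) -> bool:
    """Return True if name follows the simplified naming rules."""
    if name.lower().startswith("xml"):
        return False  # names starting with "xml" (any case) are reserved
    return NAME_PATTERN.fullmatch(name) is not None

for candidate in ["book_title", "_id", "first-name", "1st", "xmlData", "first name"]:
    print(candidate, is_valid_element_name(candidate))
```

Under these assumptions, book_title, _id, and first-name are accepted, while 1st, xmlData, and "first name" are rejected.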

Nesting Elements

XML Elements must be nested properly.

Wrong: <b><i>This text is bold and italic</b></i>

Correct: <b><i>This text is bold and italic</i></b>

In the correct example, the <i> element is opened inside the <b> element, so it must also be closed inside the <b> element, that is, before the </b> tag.
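One way to see the nesting rule in practice is to hand both variants to a parser. This sketch assumes Python's standard xml.etree.ElementTree module; the helper name is_well_formed is hypothetical.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

print(is_well_formed("<b><i>This text is bold and italic</b></i>"))  # False
print(is_well_formed("<b><i>This text is bold and italic</i></b>"))  # True
```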

Tags

Tags are names used to identify the start (opening) or end (closing) of an element. A start or opening tag is a name enclosed within < and >. An end or closing tag is the same name as the corresponding start tag, enclosed within </ and >.

Start / Opening Tag: <input>

End / Closing Tag: </input>

Tags are case sensitive so the tag <Letter> is considered different from the tag <letter>. Opening and closing tags of the same element must be written using the same case.

<message>This is the correct use of an opening and closing tag</message>
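The case rule can be observed with a parser as well. In this sketch (Python standard library, chosen as an assumption; the <mail> wrapper element is hypothetical), <Letter> and <letter> are looked up as two different elements, and a closing tag whose case does not match its opening tag is rejected.

```python
import xml.etree.ElementTree as ET

# <Letter> and <letter> are distinct element names.
doc = ET.fromstring("<mail><Letter>A</Letter><letter>B</letter></mail>")
print(doc.findtext("Letter"))  # A
print(doc.findtext("letter"))  # B

# Opening and closing tags that differ only in case do not match.
try:
    ET.fromstring("<message>text</Message>")
except ET.ParseError as err:
    print("not well-formed:", err)
```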

Namespaces

XML Namespaces provide a method to avoid element or tag name conflicts. Because the same tag name may be used in different ways throughout an XML document, the tag can be associated with a namespace, which determines the context. A namespace declaration is written like an attribute, but it is not treated as an ordinary attribute.

Example:

<TrueAddress_Job xmlns="http://TrueAddress.net">

The namespace can be any string, but it is usually a URI (Uniform Resource Identifier) because a URI is uniquely tied to and controlled by its creator. In the example above, the URI is http://TrueAddress.net. Nested elements (elements contained within another element) belong to the same namespace unless they have their own declaration.
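To illustrate how a default namespace applies to nested elements, the sketch below parses a small document with Python's standard library. The child element <Job_Name> and the prefix "ta" are hypothetical names invented for this example; only the root element and the URI come from this page.

```python
import xml.etree.ElementTree as ET

# <Job_Name> is a hypothetical child element used only for illustration.
doc = ET.fromstring(
    '<TrueAddress_Job xmlns="http://TrueAddress.net">'
    "<Job_Name>Sample</Job_Name>"
    "</TrueAddress_Job>"
)

# ElementTree reports each tag together with its namespace URI.
print(doc.tag)      # {http://TrueAddress.net}TrueAddress_Job
print(doc[0].tag)   # {http://TrueAddress.net}Job_Name  (inherited)

# The xmlns declaration is not reported as an ordinary attribute.
print(doc.attrib)   # {}

# Lookups must name the namespace; "ta" is an arbitrary local prefix.
ns = {"ta": "http://TrueAddress.net"}
print(doc.findtext("ta:Job_Name", namespaces=ns))  # Sample
```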

Special Characters

Characters that have special meaning in XML need to be expressed differently when used within a value or text. The table below lists special characters and their equivalent expression.

Special Character    Equivalent Expression (XML)

"                    &quot;

'                    &apos;

<                    &lt;

>                    &gt;

&                    &amp;
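Escaping does not have to be done by hand. As a sketch, Python's standard xml.sax.saxutils.escape function replaces &, <, and > by default, and accepts an extra mapping for the two quote characters; the sample string here is made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

raw = 'AT&T <"quoted"> it\'s'

# escape() handles &, < and > itself; the quote characters are added
# through the optional mapping of extra replacements.
escaped = escape(raw, {'"': "&quot;", "'": "&apos;"})
print(escaped)  # AT&amp;T &lt;&quot;quoted&quot;&gt; it&apos;s

# A parser turns the expressions back into the original characters.
parsed = ET.fromstring(f"<name>{escaped}</name>").text
print(parsed == raw)  # True
```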

XML Schema Definition

An XML Schema Definition or XSD is used to describe and validate the structure and content of XML data. The schema defines the elements, attributes, and data types. The basic idea behind XML schemas is that they describe the valid structure that an XML document can take.

As stated above, Elements are the building blocks of XML documents. An XSD is itself an XML document, so it also uses elements; in XSD files these are conventionally written with the prefix “xs:”.

An element can be defined within an XSD as follows:

<xs:element name="x" type="y"/>

The following XML schema element types are used in the TrueAddress.xsd schema file:

  • Simple Type: a simple type element contains only text; it cannot contain other elements or attributes. Some of the predefined simple types are xs:integer, xs:boolean, xs:string, and xs:date.

    Example: <xs:element name="phone_number" type="xs:int"/>

  • Complex Type: a complex type element is a container for other element definitions. This allows the user to specify the child elements an element can contain and to provide some structure within XML documents.

    Example:

    <xs:element name="address">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="name" type="xs:string"/>
          <xs:element name="Company" type="xs:string"/>
          <xs:element name="phone" type="xs:int"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>

In the example above, the address element consists of child elements. It acts as a container for other <xs:element> definitions, which makes it possible to build a simple hierarchy of elements in the XML document.
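Because an XSD is itself an XML document, its structure can be inspected with an ordinary XML parser. The sketch below wraps the address definition in an <xs:schema> root (an assumption needed to make the fragment a complete document, using the standard XML Schema namespace URI) and lists the child elements it declares. It only reads the schema; Python's standard library does not validate documents against an XSD.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"  # standard XML Schema namespace

SCHEMA = f"""
<xs:schema xmlns:xs="{XS}">
  <xs:element name="address">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name" type="xs:string"/>
        <xs:element name="Company" type="xs:string"/>
        <xs:element name="phone" type="xs:int"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
""".strip()

ns = {"xs": XS}
schema = ET.fromstring(SCHEMA)
address = schema.find("xs:element", ns)  # the complex type element

# Walk the container to list the child element definitions.
children = address.findall("xs:complexType/xs:sequence/xs:element", ns)
declared = [(child.get("name"), child.get("type")) for child in children]
for name, type_name in declared:
    print(name, type_name)
```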

Attributes

Attributes in XSD provide extra information within an element. Attributes have a name property and a type property as shown below:

Example: <xs:attribute name="x" type="y"/>
