How to generate valid XML from Drupal CMS nodes
There have been many discussions and requests about how to generate XML from Drupal CMS content/nodes. This document reviews the reasoning and the steps involved in generating XML data from a Drupal CMS.
Well-formedness/validity
Every XML tree must be well formed; validity against a DTD is an additional check. A DTD declaration is optional, and in this situation we will not need one, as the tree we are generating is relatively simple. However, if we are to share the generated XML data with others, a DTD will be necessary so that they know the schema of the XML data.
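As a minimal sketch of what "well formed" means in practice, the following Python snippet feeds two strings to the standard library parser. The helper name is_well_formed and the sample strings are illustrative assumptions, not part of the Drupal setup described here. Note that this parser only checks well-formedness; validating against a DTD would need a separate validating tool.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Every element is closed and properly nested.
good = "<drupal_page><node><title>Hello</title></node></drupal_page>"
# The <title> element is never closed, so the tags do not nest correctly.
bad = "<drupal_page><node><title>Hello</node></drupal_page>"

print(is_well_formed(good))  # True
print(is_well_formed(bad))   # False
```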
CDATA, XHTML within XML - advantage of having valid HTML
Since the XML data will almost certainly contain other angle brackets, either the data must be well-formed HTML/XHTML (which prevents the parser from generating errors) or, where this cannot be assured, the contents must be wrapped in CDATA sections so that the HTML/XHTML is treated as plain character data and excluded from parsing and validation. When it is possible to control the output, I would always suggest producing valid XHTML, so that the application consuming the XML data can dig deeper into it to find specific elements (images, headings, forms, etc). Valid XHTML qualifies as XML, so subsets of your content become directly accessible to XSLT and backend applications.
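The difference between the two approaches can be sketched in Python with the standard library parser. The sample markup (an h2 heading and an img element) is an assumption made up for illustration. With embedded XHTML the consumer can query individual elements; with CDATA the consumer receives one opaque string, which is why even the unclosed img tag in the second sample does not cause a parse error.

```python
import xml.etree.ElementTree as ET

# Content embedded as well-formed XHTML: the markup becomes part of the XML tree.
embedded = (
    "<node><content>"
    "<h2>News</h2><p>See <img src='logo.png' alt='Logo'/></p>"
    "</content></node>"
)

# Similar content wrapped in a CDATA section: the parser sees only text,
# so the unclosed <img> tag is tolerated.
wrapped = (
    "<node><content><![CDATA["
    "<h2>News</h2><p>See <img src='logo.png' alt='Logo'></p>"
    "]]></content></node>"
)

embedded_root = ET.fromstring(embedded)
wrapped_root = ET.fromstring(wrapped)

# Embedded XHTML elements can be queried directly.
images = [img.get("src") for img in embedded_root.iter("img")]
headings = [h.text for h in embedded_root.iter("h2")]

# CDATA content has no child elements, only a text string.
cdata_children = list(wrapped_root.find("content"))
cdata_text = wrapped_root.find("content").text

print(images)          # ['logo.png']
print(headings)        # ['News']
print(cdata_children)  # []
print(cdata_text)
```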
Drupal Strategy to generate XML data from the CMS
The presentation/theming layer in Drupal is best suited for generating XML data from CMS nodes. This can be accomplished either by creating a fresh theme or by adding theme template files that generate XML. In most applications, the content within a Drupal site has to be presented both as browser-readable XHTML for human visitors and as XML. In the demonstration at hand, I have created template files that generate the node title, the link to the XHTML page, and the node content as the three XML data nodes (use your server-side logic to send the correct content MIME type, text/xml, for applications that need to detect the MIME type):
<drupal_page>
  <node>
    <link></link>
    <title></title>
    <content></content>
  </node>
</drupal_page>
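The actual implementation lives in Drupal theme template files, which are not reproduced here. Purely as a language-neutral illustration of the target structure, the Python sketch below builds the same tree and pairs it with a text/xml content type. The function name node_to_xml, the headers dictionary, and the sample values are all assumptions. In this sketch the content is stored as text, so the serializer escapes any angle brackets automatically.

```python
import xml.etree.ElementTree as ET


def node_to_xml(link: str, title: str, content: str) -> str:
    """Serialize one CMS node into the drupal_page structure (hypothetical helper)."""
    root = ET.Element("drupal_page")
    node = ET.SubElement(root, "node")
    ET.SubElement(node, "link").text = link
    ET.SubElement(node, "title").text = title
    # Stored as text: characters such as < and & are escaped on output,
    # so the result stays well formed whatever the content contains.
    ET.SubElement(node, "content").text = content
    return ET.tostring(root, encoding="unicode")


# Assumed response header; the MIME type is the one named in the text above.
headers = {"Content-Type": "text/xml; charset=utf-8"}

body = node_to_xml("http://cmsproducer.com/", "Home", "Tips & <b>tricks</b>")
print(headers["Content-Type"])
print(body)
```

A consumer can parse the result back and read the three child elements of node in the order link, title, content.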
Demonstration
Browse to any page on cmsproducer.com and add the query-string value format=xml, such as:
http://cmsproducer.com?format=xml
or
http://cmsproducer.com/XML-XSLT-XHTML-Transformation-PHP-ASP-NET?format=xml
It will generate the same page with just two nodes in the XML tree:


