Here is an essay I wrote for my Master's course. I think it could be an interesting read...
"Comparison between DTD and XML Schemas"
Introduction
Document Type Definition (DTD) and XML Schema are both XML schema languages: each expresses a set of rules to which an XML document must conform in order to pass validation. We need schema languages to check whether a given document conforms to all the rules a given business must enforce.
DTD is the schema language native to XML; it is defined in the XML specification itself. It has been around for a long time; however, it has some shortcomings, which are discussed later.
XML Schema was developed later with the same purpose as DTD in mind, but it tries to address DTD’s shortcomings. Furthermore, XML Schema was the first separate schema language for XML to be published as a W3C (World Wide Web Consortium) Recommendation. A schema written in the XML Schema language is called an XML Schema Definition (XSD). Because XSD was designed later, it is not as limited in functionality as DTD.
DTD versus XML Schema
DTD lacks a couple of constructs that are important when working with XML documents. The biggest concern is that DTD syntax is inherited from SGML and is not XML itself. From a technological standpoint this means two grammars have to be handled: one for the XML document and another for the DTD declarations. A DTD cannot be read or processed with ordinary XML tools. An XSD is simply an XML file; therefore a single XML parser can check that both the XSD and the target XML document are well-formed (structural integrity only, not conformance to the rules).
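As a rough illustration of the "one parser" point, the sketch below feeds an XSD, an XML document and a bare DTD declaration to the same XML parser from Python's standard library. The schema, the document and the helper name is_well_formed are made up for this example; the check covers well-formedness only, not validation against the rules.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema, document and DTD fragment, invented for illustration.
XSD_TEXT = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="amount" type="xs:double"/>
</xs:schema>"""
XML_TEXT = "<amount>12.50</amount>"
DTD_TEXT = "<!ELEMENT amount (#PCDATA)>"


def is_well_formed(text):
    """Return True if the text parses as an XML document (structure only)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(XSD_TEXT))  # the schema is ordinary XML
print(is_well_formed(XML_TEXT))  # so is the instance document
print(is_well_formed(DTD_TEXT))  # a DTD declaration on its own is not
```

The XSD goes through the same code path as the document it describes, while the DTD declaration is rejected because it is not an XML document in its own right.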
Since DTD was designed as a schema language, it is troublesome that it lacks a very important feature: data typing for element content. Text inside an element can only be declared as #PCDATA, and attributes get only a handful of types such as CDATA, ID and NMTOKEN. For example, it is impossible to enforce a rule that the amount of a transaction is of type double. One could still pass an arbitrary string and a validating parser would report the XML document as valid. In XSD, an element declaration can carry a type attribute that names a built-in type such as xs:double or a user-defined one, so this problem is largely solved. The attribute is optional; when it is omitted (and no inline type is given) the element defaults to ‘anyType’, which suits those situations where the type is not important or differs between documents.
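To make the typing argument concrete, here is a minimal hand-rolled sketch, not a real XSD validator. It reads the declared type from a hypothetical schema and checks the transaction amount with Python's float(), which only approximates the xs:double lexical rules. The element name amount, the xs prefix and all function names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical declaration: the transaction amount must be an xs:double.
XSD_TEXT = f"""<xs:schema xmlns:xs="{XS}">
  <xs:element name="amount" type="xs:double"/>
</xs:schema>"""


def declared_type(xsd_text, element_name):
    """Return the type attribute of a top-level element declaration, if any."""
    schema = ET.fromstring(xsd_text)
    for decl in schema.findall(f"{{{XS}}}element"):
        if decl.get("name") == element_name:
            return decl.get("type")
    return None


def looks_like_double(text):
    """Rough stand-in for xs:double; float() is close but not identical."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def amount_is_valid(xml_text):
    """Toy check: only the xs:double case is handled here."""
    doc = ET.fromstring(xml_text)
    # Assumes the schema uses the conventional "xs" prefix.
    if declared_type(XSD_TEXT, doc.tag) != "xs:double":
        return True  # nothing to check in this toy example
    return looks_like_double(doc.text or "")


print(amount_is_valid("<amount>12.50</amount>"))
print(amount_is_valid("<amount>twelve</amount>"))
```

A DTD would accept both documents, since all it can say about amount is that it contains character data. In practice a schema-aware validator does this work; the sketch only shows what kind of rule the type attribute expresses.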
Another important shortcoming of DTD is how coarsely it specifies the number of occurrences of a given element. It offers only the indicators ‘?’ (zero or one), ‘*’ (zero or more) and ‘+’ (one or more); a rule such as "between two and five" has to be spelled out element by element. Therefore, a lot of validation must still be done even after the parser says the document conforms to the DTD. XSD, on the other hand, has the attributes ‘minOccurs’ and ‘maxOccurs’, which are not perfect but are definitely a step in the right direction. Of course, there are situations where these attributes are not sufficient to express the required business rules, and again the work falls on the developer, who must check whether the file meets the standards.
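The following sketch shows what an occurrence rule looks like and how it could be checked by hand. It is an illustration only, not a validator: the order and item elements, the bounds of 2 and 5, and the function names are all invented for this example.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical rule: an order holds between 2 and 5 item elements.
# A DTD would have to spell this out as
#   <!ELEMENT order (item, item, (item, (item, item?)?)?)>
XSD_TEXT = f"""<xs:schema xmlns:xs="{XS}">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="item" type="xs:string" minOccurs="2" maxOccurs="5"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def occurrence_bounds(xsd_text, child_name):
    """Return (min, max) for a declared element; max is None if unbounded."""
    schema = ET.fromstring(xsd_text)
    for decl in schema.iter(f"{{{XS}}}element"):
        if decl.get("name") == child_name:
            low = int(decl.get("minOccurs", "1"))   # both default to 1
            high = decl.get("maxOccurs", "1")
            return low, (None if high == "unbounded" else int(high))
    raise KeyError(child_name)


def count_is_allowed(xml_text, child_name):
    """Check the number of direct children against the declared bounds."""
    low, high = occurrence_bounds(XSD_TEXT, child_name)
    count = len(ET.fromstring(xml_text).findall(child_name))
    return count >= low and (high is None or count <= high)


print(count_is_allowed("<order><item>a</item></order>", "item"))
print(count_is_allowed("<order><item>a</item><item>b</item></order>", "item"))
```

Rules that relate one element to another, for example "the number of items must match a declared total", are beyond minOccurs and maxOccurs and still need application code.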
Since the creation of DTD, XML has evolved and newer constructs have been created. Unfortunately, DTD has not kept up with them. An important feature of XML is namespaces. They were introduced because XML became a mainstream standard and documents increasingly combine markup from several sources. It is not uncommon for a document to use two different vocabularies. If those vocabularies contain elements with the same name, the names clash and the files become unmanageable. XML namespaces solve this problem. Unfortunately, DTD is not namespace-aware: it treats a prefixed name such as ‘book:title’ as a plain string, so a document validates only if it uses exactly the prefixes written in the DTD. This is a big limitation, since in most modern systems namespaces are simply a must. An XSD file, on the other hand, can declare the namespace it describes through the ‘targetNamespace’ attribute. This, together with the fact that an XSD is an XML file whose own elements live in the XML Schema namespace, shows that namespaces are deeply supported.
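The name clash described above can be sketched in a few lines. The two vocabularies, their example.com namespace URIs and the catalog document are hypothetical; the point is only that a namespace-aware parser keeps the two title elements apart.

```python
import xml.etree.ElementTree as ET

# Two hypothetical vocabularies that both define a <title> element.
XML_TEXT = """<catalog xmlns:book="http://example.com/book"
         xmlns:job="http://example.com/job">
  <book:title>XML in Practice</book:title>
  <job:title>Technical Editor</job:title>
</catalog>"""

root = ET.fromstring(XML_TEXT)

# ElementTree expands each prefix to its namespace URI, so the two
# elements have different qualified names despite sharing "title".
book_title = root.find("{http://example.com/book}title")
job_title = root.find("{http://example.com/job}title")

print(book_title.tag, "->", book_title.text)
print(job_title.tag, "->", job_title.text)
```

Because the lookup uses the namespace URI rather than the prefix, the same code keeps working if a document author picks different prefixes, which is exactly the flexibility a DTD cannot describe.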
Conclusion
DTD is an old technology (it is based on SGML, whose roots go back to the 1960s). XML Schema came into being when XML became very popular and it became clear that DTD could not meet the needs of businesses. XSD was developed with the shortcomings of DTD in mind and was meant to address them. XML Schema is widely regarded as the successor to Document Type Definition.
To sum up, there are a lot of differences between the two technologies, even though they attempt to solve the same problem. It is important to note that XML Schema seems far superior largely because it was developed much later than DTD. Many legacy systems still use DTD; however, new systems tend to lean toward XML Schema. To ease this transition, there are a lot of tools that translate DTD files into XSD format.