Using the DTDHandler and EntityResolver
In this section, we discuss the two remaining SAX event handlers: DTDHandler and EntityResolver. The DTDHandler is invoked when the parser encounters an unparsed entity or a notation declaration in the DTD. The EntityResolver comes into play when a URN (public ID) must be resolved to a URL (system ID).
The DTDHandler API
In Choosing Your Parser Implementation you saw a method for referencing a file that contains binary data, such as an image file, using MIME data types. That is the simplest, most extensible mechanism. For compatibility with older SGML-style data, though, it is also possible to define an unparsed entity.
The NDATA keyword defines an unparsed entity, in a declaration of this general form (the entity name and system ID shown here are placeholders):
<!ENTITY entityName SYSTEM "systemId" NDATA gif>
The NDATA keyword says that the data in this entity is not parsable XML data but instead is data that uses some other notation. In this case, the notation is named gif. The DTD must then include a declaration for that notation, which would look something like this:
<!NOTATION gif SYSTEM "systemId">
When the parser sees an unparsed entity or a notation declaration, it does nothing with the information except to pass it along to the application using the DTDHandler interface. That interface defines two methods:
notationDecl(String name, String publicId, String systemId)
unparsedEntityDecl(String name, String publicId, String systemId, String notationName)
The notationDecl method is passed the name of the notation and either the public or the system identifier, or both, depending on which is declared in the DTD. The unparsedEntityDecl method is passed the name of the entity, the appropriate identifiers, and the name of the notation it uses.
Note: The DTDHandler interface is implemented by the DefaultHandler class.
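As a minimal sketch of how these two callbacks might be used, the following example parses a small in-memory document and records each declaration the parser reports. The class name DtdHandlerSketch, the entity name logo, and the example.invalid URLs are assumptions made up for illustration; only the DTDHandler methods themselves come from the SAX API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class DtdHandlerSketch {

    // Hypothetical document: the entity name and both URLs are invented.
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE doc [\n"
        + "  <!ELEMENT doc ANY>\n"
        + "  <!NOTATION gif SYSTEM \"http://example.invalid/gif-viewer\">\n"
        + "  <!ENTITY logo SYSTEM \"http://example.invalid/logo.gif\" NDATA gif>\n"
        + "]>\n"
        + "<doc/>";

    /** Records every notation and unparsed-entity declaration it is told about. */
    static class RecordingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void notationDecl(String name, String publicId, String systemId) {
            events.add("notation " + name + " at " + systemId);
        }

        @Override
        public void unparsedEntityDecl(String name, String publicId,
                                       String systemId, String notationName) {
            events.add("unparsed entity " + name + " uses notation " + notationName);
        }
    }

    /** Parses the given XML text and returns the recorded DTD events. */
    static List<String> parse(String xml) throws Exception {
        RecordingHandler handler = new RecordingHandler();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setDTDHandler(handler); // DefaultHandler implements DTDHandler
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        parse(XML).forEach(System.out::println);
    }
}
```

Note that the parser never fetches the unparsed entity. It only reports the declaration, and the application decides what to do with the identifiers.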
Notations can also be used in attribute declarations. For example, a declaration of the following general form (the element name is a placeholder) requires notations for the GIF and PNG image-file formats:
<!ATTLIST elementName type NOTATION (gif | png) "gif">
Here, the type is declared as being either gif or png. The default, if neither is specified, is gif.
Whether the notation reference is used to describe an unparsed entity or an attribute, it is up to the application to do the appropriate processing. The parser knows nothing at all about the semantics of the notations. It only passes on the declarations.
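To illustrate what "appropriate processing" might look like for a notation-typed attribute, here is a rough sketch in which the application remembers each declared notation and then looks up the notation named by an element's type attribute. The element names gallery and image, the class name NotationAttributeSketch, and the choice to carry a MIME type in the notation's public ID are all assumptions for this example, not conventions defined by SAX.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class NotationAttributeSketch {

    // Hypothetical document: element names and public IDs are invented.
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE gallery [\n"
        + "  <!ELEMENT gallery (image*)>\n"
        + "  <!ELEMENT image (#PCDATA)>\n"
        + "  <!NOTATION gif PUBLIC \"image/gif\">\n"
        + "  <!NOTATION png PUBLIC \"image/png\">\n"
        + "  <!ATTLIST image type NOTATION (gif | png) \"gif\">\n"
        + "]>\n"
        + "<gallery><image type=\"png\">a</image><image>b</image></gallery>";

    static class ImageHandler extends DefaultHandler {
        // Notation name -> public ID, filled in as the DTD is read.
        final Map<String, String> notations = new HashMap<>();
        final List<String> handled = new ArrayList<>();

        @Override
        public void notationDecl(String name, String publicId, String systemId) {
            notations.put(name, publicId);
        }

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            if ("image".equals(qName)) {
                // When the attribute is omitted, the parser supplies the DTD default.
                String type = atts.getValue("type");
                handled.add(type + " -> " + notations.get(type));
            }
        }
    }

    /** Parses the XML text and returns one line per image element. */
    static List<String> parse(String xml) throws Exception {
        ImageHandler handler = new ImageHandler();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setDTDHandler(handler);
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.handled;
    }

    public static void main(String[] args) throws Exception {
        parse(XML).forEach(System.out::println);
    }
}
```

The parser contributes only the declarations and the attribute values. Deciding that a notation's public ID names a MIME type is purely the application's interpretation.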
The EntityResolver API
The EntityResolver API lets you convert a public ID (URN) into a system ID (URL). Your application may need to do that, for example, to convert something like href="urn:/someName" into "http://someURL".
The EntityResolver interface defines a single method:
resolveEntity(String publicId, String systemId)
This method returns an InputSource object, which can be used to access the entity's contents. Converting a URL into an InputSource is easy enough. But the URL that is passed as the system ID will be the location of the original document which is, as likely as not, somewhere out on the web. To access a local copy, if there is one, you must maintain a catalog somewhere on the system that maps names (public IDs) into local URLs.
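The catalog idea can be sketched roughly as follows. This example assumes a hypothetical in-memory catalog keyed by public ID. A real catalog would more likely map to local file URLs, but in-memory text keeps the sketch self-contained. The class name CatalogResolverSketch, the public ID, and the example.invalid URL are invented for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

class CatalogResolverSketch {

    // Hypothetical catalog: public ID -> contents of the local copy of the DTD.
    static final Map<String, String> CATALOG = Map.of(
        "-//Example//DTD Greeting//EN",
        "<!ENTITY greeting \"hello from the local copy\">");

    // The system ID points at a host that does not exist, so the parse can
    // only succeed if the resolver supplies the local copy.
    static final String XML =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE doc PUBLIC \"-//Example//DTD Greeting//EN\"\n"
        + "    \"http://example.invalid/greeting.dtd\">\n"
        + "<doc>&greeting;</doc>";

    static class CatalogHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public InputSource resolveEntity(String publicId, String systemId)
                throws IOException, SAXException {
            String local = (publicId == null) ? null : CATALOG.get(publicId);
            if (local == null) {
                return null; // not in the catalog: the parser opens systemId itself
            }
            InputSource source = new InputSource(new StringReader(local));
            source.setPublicId(publicId);
            source.setSystemId(systemId);
            return source;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Parses the XML text and returns the character data of the document. */
    static String parse(String xml) throws Exception {
        CatalogHandler handler = new CatalogHandler();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setEntityResolver(handler); // DefaultHandler implements EntityResolver
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.text.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parse(XML));
    }
}
```

Returning null from resolveEntity tells the parser to fall back to opening the system ID on its own, so the catalog only needs entries for the documents you actually keep locally.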