Extensible Stylesheets
The browser has a peculiar dimensionality that natively supports moving around
complexity between different data and language domains. One of those
domains is XML (Extensible Markup Language) and it comes practically in the
form of RSS (Really Simple Syndication) Feeds, Atom Feeds, Site Maps, OPML
(Outline Processor Markup Language) Outlines and various other XML representations.
XSLT (Extensible Stylesheet Language Transformations) is a language for
transforming XML into different output formats. You can expose and transform big
or tiny blobs of atom.xml, rss.xml, sitemap.xml, and opml.xml files into PDF
(Portable Document Format), but that’s outside the scope of this article.
In some respects, XSLT is considered a “dead” technology, but take the word dead
with a grain of salt. “X is dead, and Y killed it” is a common trope on the
Internet. You can find articles and comments for any X of your choosing.
DTD
(Document Type Definition) for XHTML (EXtensible HyperText
Markup Language) in its source generated from XML — you’d be surprised.
XSLT operates as an XML templating language and a rather verbose one at
that. If you go deep enough, the verbosity gets unwieldy and, like all
programming shenanigans, it’s a perpetual rabbit hole. Here are my practical notes
for working with XML and XSLT in a web context for adding style and
presentation while maintaining a bit of sanity.
Formats, File Extensions, and MIME types
The XSLT transformations discussed here will be limited to
XHTML output.
Raw XML in the browser has no styles associated with it, so styles are added
with XSLT.
The MIME (Multipurpose Internet Mail Extensions)
type definition for XSLT
is application/xslt+xml. A file extension ending in .xsl or .xslt is the
commonly accepted and used form.
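If whatever serves your files does not already know these extensions, the mapping can be registered by hand. The following is only a rough sketch using Python’s standard mimetypes module. The file names rss.xsl and feed.xslt and the helper content_type_for are placeholders, and real web servers each have their own configuration syntax for the same idea.

```python
import mimetypes

# Assumption: both common stylesheet extensions should map to the XSLT
# MIME type. Some systems map .xsl to a generic XML type by default.
XSLT_MIME = "application/xslt+xml"

for ext in (".xsl", ".xslt"):
    mimetypes.add_type(XSLT_MIME, ext)


def content_type_for(filename: str) -> str:
    """Return the MIME type guessed from the file extension."""
    guessed, _encoding = mimetypes.guess_type(filename)
    # Fall back to a generic binary type when the extension is unknown.
    return guessed or "application/octet-stream"


print(content_type_for("rss.xsl"))
print(content_type_for("feed.xslt"))
```

Registering the type explicitly keeps the result independent of whatever mime.types file the host system ships with.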
The MIME type definition for XHTML is application/xhtml+xml, but it’s usually
served using the text/html content type so that browsers use HTML parsing
instead of XML parsing. XHTML has some differences from HTML, which you can take
a look at in the “XHTML in a nutshell” article.
HTML vs XHTML is an epic and historic flame war. Think tabs vs. spaces,
self-closing tags vs. non-self-closing tags, or any other versus trope you can
imagine.
XML Validation and Formatting
You can validate an XML document and check it for well-formedness using xmllint
from the libxml2 library. The W3C (World Wide Web Consortium) offers an online
feed validation service, but an offline validator sets up a better feedback loop
and is a lot more robust.
XML has multiple validation grammars in the form of schemas.
RELAX NG (REgular LAnguage for XML Next Generation)
is one of those schema language formats. Schema examples can be found in
RFCs (Requests for Comments) or in niche places around the web, for example an
RSS rng file, an Atom rnc file, and an Atom rng file.
The catch is that these validation schema files may have differing use cases or
may be out of spec due to time, but they’re still worth looking at.
RELAX NG has both a standard XML syntax (xml.rng) and a compact syntax
(xml.rnc). Offline validation with xmllint does not support the rnc compact
schema syntax, but rng works. According to the xmllint manual, it supports
RELAX NG, WXS (W3C XML Schema), and Schematron. Schemas can be used to validate
Atom feeds locally. Conversion between rnc and rng can be achieved with the Java
program trang (usually goes by the name jing-trang in package repositories).
- Trang
- Trang converts between different schema languages for XML: RELAX NG (XML
  syntax), RELAX NG compact syntax, XML 1.0 DTDs, and W3C XML Schema (WXS).
In my case, and maybe yours, it’s easier to run trang on an already well
specified and well formed XML document. This produces a basic rng schema
file for validation and adding more rules.
shell
trang rss.xml rss.rng
trang atom.xml atom.rng
trang opml.xml opml.rng
Validate XML using the rng file with xmllint and the --relaxng flag. The
--noout flag disables printing the output to the command line.
shell
$ xmllint --noout --relaxng rss.rng rss.xml
rss.xml validates
If it fails to validate, it will return the error message defined by the schema’s grammar.
shell
$ xmllint --noout --relaxng rss.rng rss.xml
rss.xml:25: element description: Relax-NG validity error : Did not expect element description there
rss.xml fails to validate
Pretty print XML with --pretty 1 for basic formatting or --pretty 2 for “one
attribute per line” white space formatting.
shell
xmllint --pretty 1 rss.xml
xmllint --pretty 2 rss.xml
Stylesheet Processing and Validation
The command line XSLT processor xsltproc can be used to process stylesheets
offline and works only on stylesheets up to version 1.1. If using xsltproc as a
validation tool for xsl files, you’ll have to downgrade the version declaration
from version 3.0 to version 1.1. XSLT 1.0 is the version that most browsers
support.
shell
xsltproc rss.xsl
shell
xsltproc rss.xsl rss.xml
Other processors include Xalan-Java, which supports XSLT up to version 1.0, and
Saxon, which supports up to version 3.0.
Stylesheet Boilerplate and Transformations
Below is one variation of a stylesheet that transforms XML to XHTML. A
typical XHTML document skeleton is embedded within along with XSLT elements
for processing and transformation.
xsl
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  version="1.1"
>
  <xsl:output method="html" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:template match="/">
    <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
        <title>XHTML Document</title>
        <meta
          http-equiv="Content-Type"
          content="text/html; charset=utf-8"
        />
        <meta
          name="viewport"
          content="width=device-width, initial-scale=1, maximum-scale=1"
        />
      </head>
      <body>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
In the above, namespace attributes in the form xmlns:itunes extend the document.
You could think of them as imports for extending features and avoiding naming
conflicts. The URL points to the “allowed” vocabulary for that namespace.
For example, the
Atom Activity Streams namespace
could be added under xmlns:activity and extend the stylesheet with an
understanding of Activity Streams
related vocabulary. Namespaces can also be used to extend processing
instructions like xmlns:xsl for XSLT processors that support them.
xsl
<xsl:stylesheet
  xmlns:activity="http://activitystrea.ms/specs/atom/1.0/"
  version="1.1"
>
xml
<activity:verb>post</activity:verb>
Link the xsl stylesheet from an XML document with the xml-stylesheet
processing instruction and the browser handles the rest.
xml
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<?xml-stylesheet href="/rss.xsl" type="text/xsl"?>
XSLT works in conjunction with XPath (the XML Path Language), which is
somewhat similar to CSS (Cascading Style Sheets) selectors. Command
line programs like xmlstarlet
make use of XPath expressions for selecting data from parts of an XML
document.
xsl
<xsl:value-of select="/rss/channel/atom:link[@rel='previous']/@href"/>
The XPath expression from the select attribute above gets the href value
from the <link> tag in the atom namespace which is equal to
https://example.com/page/2/rss.xml.
xml
<atom:link rel="previous" href="https://example.com/page/2/rss.xml" />
If you’re familiar with CSS, then you’re in luck. Cheat sheets for XPath are
everywhere across the Internet in a “CSS to XPath” format. Test expressions
locally with xsltproc, xmlstarlet, or an online XPath expression test bed.
css
rss > channel > link[rel="previous"][href] {
  display: inline;
}
Value selections, for loops, and switch statements are the more commonly used
XSLT elements.
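Before getting to those elements, the XPath and namespace mechanics described earlier can be poked at outside the browser. Here is a rough sketch using Python’s standard xml.etree.ElementTree. The sample feed and the helper previous_href are made up for illustration, and ElementTree only implements a subset of XPath, so the trailing /@href step is replaced with a .get("href") call. Tools like xsltproc or xmlstarlet evaluate the full expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed used only for illustration.
RSS = """<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Example</title>
    <atom:link rel="self" href="https://example.com/rss.xml" />
    <atom:link rel="previous" href="https://example.com/page/2/rss.xml" />
  </channel>
</rss>
"""

# The prefix is only a local alias; the namespace URI is what must match.
NS = {"atom": "http://www.w3.org/2005/Atom"}


def previous_href(rss_root):
    """Rough equivalent of /rss/channel/atom:link[@rel='previous']/@href."""
    # rss_root is the <rss> element, so the path starts at <channel>.
    link = rss_root.find("channel/atom:link[@rel='previous']", NS)
    return link.get("href") if link is not None else None


root = ET.fromstring(RSS)
print(previous_href(root))

# Internally the element name is expanded to {namespace-uri}local-name.
print(root.find("channel/atom:link", NS).tag)
```

The second print shows why the prefix itself is arbitrary: the parser stores the element under its namespace URI, so any prefix bound to the same URI selects the same nodes.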
Attributes and Value Selection
Create attributes with the xsl:attribute element. The attributes are added to
the parent tag. Select values with the xsl:value-of element.
xsl
<a>
  <xsl:attribute name="href">
    <xsl:value-of select="/rss/channel/atom:link[@rel='next']/@href"/>
  </xsl:attribute>
</a>
<!-- Output: <a href="https://example.com/page/2/rss.xml"></a> -->
xsl
<img>
  <xsl:attribute name="alt"><xsl:value-of select="/rss/channel/category"/></xsl:attribute>
  <xsl:attribute name="title"><xsl:value-of select="/rss/channel/category"/></xsl:attribute>
  <xsl:attribute name="src"><xsl:value-of select="/rss/channel/image/url"/></xsl:attribute>
</img>
<!-- Output: <img alt="image" title="image" src="/image"></img> -->
For Each
A typical for each construction executes over a range of XML tags with the
xsl:for-each element.
xsl
<xsl:for-each select="/rss/channel/item">
  <h2>
    <xsl:value-of select="title" />
  </h2>
</xsl:for-each>
Switch Statements
A switch statement construction is executed with a combination of the
xsl:choose, xsl:otherwise, and xsl:when elements. The test attribute
on xsl:when contains the condition.
xsl
<xsl:choose>
  <xsl:when test="/rss/channel/atom:link[@rel='previous']/@href">
    <xsl:attribute name="href">
      <xsl:value-of select="/rss/channel/atom:link[@rel='previous']/@href"/>
    </xsl:attribute>
  </xsl:when>
  <xsl:otherwise>
    <xsl:attribute name="href">/</xsl:attribute>
  </xsl:otherwise>
</xsl:choose>
View the XSLT elements and function reference for the complete list of
instructions.
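When a transformed page looks wrong, it can help to compute the expected values some other way and compare. The sketch below mirrors the xsl:for-each and xsl:choose examples with Python’s standard xml.etree.ElementTree. The feed content and the helper names item_titles and previous_or_home are invented for illustration, and this is a cross-check under those assumptions, not a substitute for running the stylesheet.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed without a "previous" link, so the fallback applies.
RSS = """<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <item><title>First post</title></item>
    <item><title>Second post</title></item>
  </channel>
</rss>
"""

NS = {"atom": "http://www.w3.org/2005/Atom"}


def item_titles(root):
    """Mirror xsl:for-each over /rss/channel/item, reading each title."""
    return [item.findtext("title", default="") for item in root.findall("channel/item")]


def previous_or_home(root):
    """Mirror xsl:choose: the previous link's href if present, else the site root."""
    link = root.find("channel/atom:link[@rel='previous']", NS)
    if link is not None and link.get("href") is not None:
        return link.get("href")  # corresponds to the xsl:when branch
    return "/"  # corresponds to the xsl:otherwise branch


root = ET.fromstring(RSS)
print(item_titles(root))
print(previous_or_home(root))
```

Each item title here should correspond to one h2 in the transformed page, and the second value to the href the switch statement would emit.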
Conclusion
There you have it — a basic overview and approach to working with XML,
XSLT, and XHTML.