+++
date = "2022-11-30T19:57:00+00:00"
publishdate = "2023-12-29T07:08:55+00:00"
title = "Extensible Stylesheets"
slug = "extensible-stylesheets"
author = "Thedro"
tags = ["xml"]
type = "posts"
summary =  "The browser has a peculiar dimensionality that natively supports moving around complexity between different data and language domains."
draft =  ""
syntax =  "1"
toc =  ""
updated =  ""
+++

![XML Logo](/images/extensible-stylesheets.png "
  `XML` Logo"
)

The browser has a peculiar dimensionality that natively supports moving around
complexity between different data and language domains. One of those
domains is `XML` (Extensible Markup Language), which in practice shows up as
`RSS` (Really Simple Syndication) Feeds, `Atom` Feeds, Site Maps, `OPML`
(Outline Processor Markup Language) Outlines, and various other `XML` representations.

[`XSLT`](https://developer.mozilla.org/en-US/docs/Web/XSLT) (Extensible Stylesheet
Language Transformations) is a
{{< sidenote mark="language" set="left" >}}
See the [XSL Transformations Version 3.0](https://www.w3.org/TR/xslt-30/)
specification.
{{< /sidenote >}}
that transforms `XML` into different output
formats. You can expose and transform big or tiny blobs of `atom.xml`,
`rss.xml`, `sitemap.xml`, and `opml.xml` files into
{{< sidenote mark="different" set="right" >}}
Think plain text or `PDF` (Portable Document Format) but
that's outside the scope of this article.
{{< /sidenote >}}
presentation formats.

In some respects, [`XSLT`](https://www.w3.org/TR/xslt-30/) is considered
["dead"](https://lists.w3.org/Archives/Public/public-forms/2013Oct/0013.html)
technology, but take the word _dead_ with a
{{< sidenote mark="grain" set="left" >}}
"`X` is _dead_, and `Y` killed it" is a common trope on the Internet. You can
find articles and comments for any `X` of your choosing.
{{< /sidenote >}}
of salt. It's not uncommon to right click a site, and lo and behold see a `DTD`
(Document Type Definition) for `XHTML` (EXtensible HyperText
Markup Language) in its source generated from `XML` --- you'd be surprised.

`XSLT` operates as an `XML` templating language and a rather verbose one at
that. If you go deep enough, the verbosity gets unwieldy and like all
programming shenanigans it's a perpetual rabbit hole. Here are my practical notes
for working with `XML` and `XSLT` in a web context, adding style and
presentation while maintaining a bit of sanity.

## Formats, File Extensions, and MIME types

The `XSLT` transformations discussed here will be limited to
[`XHTML`](https://developer.mozilla.org/en-US/docs/Web/Guide/HTML/XHTML) output.
Raw `XML` in the browser has no styles associated with it
([example](/post/rss.xml)) so styles are added with `XSLT`
([example](https://micro.thedroneely.com/m/tdro/rss.xml)).

The `MIME` (Multipurpose Internet Mail Extensions)
[type definition](https://www.w3.org/TR/xslt20/#xslt-mime-definition) for `XSLT`
is `application/xslt+xml`. A file extension ending in `.xsl` or `.xslt` is the
commonly accepted and used form.
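
If your server or static site tooling doesn't know about these extensions, the mapping has to be registered somewhere. Here's a minimal sketch using Python's standard `mimetypes` module as a stand-in for whatever your server uses. The `mime_table` and `content_type` names are made up for illustration, and the `.xhtml` entry is my assumption about how you'd name such files.

```python
import mimetypes

# A private lookup table, so the interpreter-wide defaults stay untouched.
mime_table = mimetypes.MimeTypes()
mime_table.add_type("application/xslt+xml", ".xsl")
mime_table.add_type("application/xslt+xml", ".xslt")
mime_table.add_type("application/xhtml+xml", ".xhtml")


def content_type(filename):
    """Return the Content-Type a file server might send for this file."""
    guessed, _encoding = mime_table.guess_type(filename)
    return guessed or "application/octet-stream"


print(content_type("rss.xsl"))
print(content_type("feed.xslt"))
```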

The `MIME` type definition for `XHTML` is `application/xhtml+xml`, but it's usually
served using the `text/html` content type so that browsers use `HTML` instead of `XML`
parsing. `XHTML` has
{{< sidenote mark="differences" set="left" >}}
`HTML` vs `XHTML` is an epic and historic flame war. Think tabs vs. spaces,
self--closing tags vs. non self--closing tags or any other versus trope you can
imagine.
{{< /sidenote >}}
from `HTML` which you can take a look at in this
[`XHTML` in a nutshell article](https://blog.whatwg.org/xhtml5-in-a-nutshell).

## XML Validation and Formatting

You can _validate_ and check an `XML` document for well formedness using
[`xmllint`](https://man.archlinux.org/man/xmllint.1) from the
[`libxml2`](https://repology.org/project/libxml2/versions)
{{< sidenote mark="package." set="right" >}}
Check your
[Linux distribution repositories](https://repology.org/repositories/statistics).
{{< /sidenote >}}
`W3C` (The World Wide Web Consortium) offers an online
[feed validation service](https://validator.w3.org/feed/), but an offline
validator sets up a better feedback loop and is a lot more robust and
{{< sidenote mark="efficient." set="left" >}}
If you're behind a [CGNAT](https://en.wikipedia.org/wiki/Carrier-grade_NAT) like
me, the Internet is effectively a captcha game. CaptchaNET™.
{{< /sidenote >}}
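
For a quick offline check without `xmllint`, well-formedness alone can be tested with most `XML` parsers. Below is a rough sketch using Python's standard library. The `is_well_formed` helper is a hypothetical name, and it only checks that the document parses. It does not validate against any schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) if the text parses as XML, else (False, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as error:
        return False, str(error)
    return True, None


good = "<rss version='2.0'><channel><title>Example</title></channel></rss>"
# The <title> tag below is never closed, so parsing should fail.
bad = "<rss version='2.0'><channel><title>Example</channel></rss>"

print(is_well_formed(good))
print(is_well_formed(bad))
```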

`XML` has multiple validation grammars in the form of schemas.
[`RELAX NG`](https://relaxng.org/) (REgular LAnguage for `XML` Next Generation)
is one of those schema language formats. Schema examples can be found in
[`RFCs`](https://datatracker.ietf.org/doc/html/rfc4287#appendix-B) (Request for
Comments) or in niche places around the web --- for example, here's an
[`RSS` rng file](https://www.w3.org/2002/09/rss-rng/rss.rng), an
[`ATOM` rnc file](https://gist.github.com/tommorris/3725394#file-atom-rnc), and
an [`ATOM` rng file](https://gist.github.com/tommorris/3725394#file-atom-rng).
The catch is that these validation schema files may have differing use cases or
may be out of spec due to time, but they're still worth looking at.

`RELAX NG` has both a standard
`xml.rng` syntax and a [compact](https://relaxng.org/compact-tutorial-20030326.html)
`xml.rnc` syntax. Offline validation with `xmllint` does not
{{< sidenote mark="support" set="right" >}}
According to the `xmllint` manual it supports [`RELAX NG`](https://relaxng.org/),
[`WXS`](https://www.w3.org/XML/Schema) (`W3C` `XML` Schema), and
[Schematron](https://www.schematron.com/).
{{< /sidenote >}}
`rnc` compact schema syntax --- but `rng` works. Schema
{{< sidenote mark="conversion" set="left" >}}
As seen in this [blog post](https://cweiske.de/tagebuch/atom-validation.htm) on
validating `ATOM` feeds locally.
{{< /sidenote >}}
between `rnc` and `rng` can be achieved with the
[Java](<https://en.wikipedia.org/wiki/Java_(programming_language)>) program
[`trang`](https://relaxng.org/jclark/trang.html) (usually goes by the name
[`jing-trang`](https://repology.org/project/jing-trang/versions) in package
repositories).

**Trang**
: Trang converts between different schema languages for `XML`:
`RELAX NG` (`XML` syntax), `RELAX NG` compact syntax, `XML` `1.0` `DTDs`,
and `W3C` `XML` Schema (`WXS`).

In my case, and maybe yours, it's easier to run `trang` on an already well
specified and well formed `XML` document. This produces a basic `rng` schema
file for validation and adding more rules.

```shell
trang rss.xml rss.rng
trang atom.xml atom.rng
trang opml.xml opml.rng
```

Validate `XML` using the `rng` file with `xmllint` and the `--relaxng` flag. The
`--noout` flag disables printing the output to the command line.

```shell
$ xmllint --noout --relaxng rss.rng rss.xml
rss.xml validates
```

If the document fails to validate, `xmllint` prints the validity error along
with the file name and line number where the schema's grammar was violated.

```shell
$ xmllint --noout --relaxng rss.rng rss.xml
rss.xml:25: element description: Relax-NG validity error : Did not expect element description there
rss.xml fails to validate
```

Pretty print `XML` with `--pretty 1` for basic formatting or `--pretty 2` for "one
attribute per line" white space formatting.

```shell
xmllint --pretty 1 rss.xml
xmllint --pretty 2 rss.xml
```

## Stylesheet Processing and Validation

The command line `XSLT` processor
[`xsltproc`](https://man.archlinux.org/man/xsltproc.1.en) can be used to process
stylesheets offline and works only on stylesheets up to version `1.1`. If using
`xsltproc` as a validation tool for `xsl` files, you'll have to downgrade the
version declaration from [version `3.0`](https://www.w3.org/TR/xslt-30/) to
version `1.1` and
{{< sidenote mark="sacrifice" set="right" >}}
Not that it matters much --- you'll find that [version `1.0`](https://www.w3.org/TR/xslt-10/) is the version
that most browsers support.
{{< /sidenote >}}
a few
features.

```shell {caption="If nothing returns, the xsl file is validated"}
xsltproc rss.xsl
```

```shell {caption="Transform rss.xml using rss.xsl. The data transforms from XML &rarr; XHTML"}
xsltproc rss.xsl rss.xml
```

Other processors have different limits: [Xalan--Java](https://xml.apache.org/xalan-j/) supports
`XSLT` up to version `1.0` and
[Saxon](https://www.saxonica.com/documentation11/index.html#!using-xsl/xslt30)
up to version `3.0`.

## Stylesheet Boilerplate and Transformations

Below is one variation of a stylesheet that transforms `XML` to `XHTML`. A
typical `XHTML` document skeleton is embedded within along with `XSLT` elements
for processing and transformation.

```xsl {options="hl_lines=5 7",caption="A basic template for transforming RSS, OPML, ATOM &rarr; XHTML"}
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:atom="http://www.w3.org/2005/Atom"
  xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd"
  xmlns:dc="http://purl.org/dc/elements/1.1/"
  version="1.1"
>
  <xsl:output method="html" version="1.0" encoding="UTF-8" indent="yes"/>
  <xsl:template match="/">
    <html xmlns="http://www.w3.org/1999/xhtml">
      <head>
        <title>XHTML Document</title>
        <meta
          http-equiv="Content-Type"
          content="text/html; charset=utf-8"
        />
        <meta
          name="viewport"
          content="width=device-width, initial-scale=1, maximum-scale=1"
        />
      </head>
      <body>
      </body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

In the above,
[namespace attributes](https://en.wikipedia.org/wiki/XML_namespace) in the form
`xmlns:itunes` extend the document. You _could_ think of them as imports for
extending features and avoiding naming conflicts. The `URL` points to the
"allowed"
{{< sidenote mark="vocabulary" set="right" >}}
Specs are meant to be broken after all.
{{< /sidenote >}}
specified by the namespace.

For example, the
[Atom Activity Streams](https://activitystrea.ms/specs/atom/1.0/) namespace
could be added under `xmlns:activity` and extend the stylesheet with an
understanding of [Activity Streams](https://www.w3.org/TR/activitystreams-core/)
related vocabulary. Namespaces can also be used to extend processing
instructions like `xmlns:xsl` for `XSLT` processors that support them.

```xsl {options="hl_lines=2",caption="Namespace in the XSLT stylesheet"}
<xsl:stylesheet
  xmlns:activity="http://activitystrea.ms/specs/atom/1.0/"
  version="1.1"
>
```

```xml {caption="Somewhere in an XML document"}
<activity:verb>post</activity:verb>
```

Drop the `xsl` stylesheet inside an `XML` document with the `xml-stylesheet`
declaration and the browser handles the rest.

```xml {options="hl_lines=2"}
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<?xml-stylesheet href="/rss.xsl" type="text/xsl"?>
```
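
To confirm that a generated feed actually carries the declaration, you can look for the processing instruction before the root element. This is a small sketch using Python's standard `xml.dom.minidom`. The `FEED` sample and the `stylesheet_instructions` helper are invented for illustration.

```python
from xml.dom import minidom

# A made-up feed that carries the xml-stylesheet declaration.
FEED = b"""<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<?xml-stylesheet href="/rss.xsl" type="text/xsl"?>
<rss version="2.0"><channel><title>Example</title></channel></rss>"""


def stylesheet_instructions(xml_bytes):
    """Return the data of every top-level xml-stylesheet instruction."""
    document = minidom.parseString(xml_bytes)
    return [
        node.data
        for node in document.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        and node.target == "xml-stylesheet"
    ]


instructions = stylesheet_instructions(FEED)
print(instructions)
```

An empty list means the browser has nothing to apply and will fall back to showing the raw `XML` tree.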

`XSLT` works in conjunction with
[`XPath`](https://www.w3.org/TR/1999/REC-xpath-19991116/) (the `XML` Path Language),
which is somewhat similar to `CSS` (Cascading Style Sheets) selectors. Command
line programs like [`xmlstarlet`](https://man.archlinux.org/man/xmlstarlet.1.en)
make use of `XPath` expressions for selecting data from parts of an `XML`
document.

```xsl
<xsl:value-of select="/rss/channel/atom:link[@rel='previous']/@href"/>
```

The `XPath` expression from the select attribute above gets the `href` value
from the `<link>` tag in the `atom` namespace, which is equal to
`https://example.com/page/2/rss.xml`.

```xml
<atom:link rel="previous" href="https://example.com/page/2/rss.xml" />
```
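
The same kind of lookup can be tried outside a stylesheet. Here's a rough sketch with Python's standard `xml.etree.ElementTree`, which only understands a subset of `XPath`, so the attribute is read with `.get()` instead of a trailing `/@href` step. The `FEED` sample and its `page/4` link are made up for the example.

```python
import xml.etree.ElementTree as ET

# A made-up feed with both a previous and a next link.
FEED = """<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Example</title>
    <atom:link rel="previous" href="https://example.com/page/2/rss.xml" />
    <atom:link rel="next" href="https://example.com/page/4/rss.xml" />
  </channel>
</rss>"""

# Prefix to namespace URI, mirroring xmlns:atom in the stylesheet.
NAMESPACES = {"atom": "http://www.w3.org/2005/Atom"}

root = ET.fromstring(FEED)  # root is the <rss> element
link = root.find("channel/atom:link[@rel='previous']", NAMESPACES)
previous_href = link.get("href") if link is not None else None

print(previous_href)
```

Note that the path starts at `channel` because the parsed root already is the `<rss>` element.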

If you're familiar with `CSS`, then you're in luck.
[Cheat sheets](https://devhints.io/xpath) for `XPath` are everywhere across the
Internet in a "`CSS` to `XPath`" format. Test expressions locally with
`xsltproc`, [`xmlstarlet`](https://man.archlinux.org/man/xmlstarlet.1) or an
[online `XPath` expression test bed](http://www.whitebeam.org/library/guide/TechNotes/xpathtestbed.rhtm).

```css {caption="The CSS equivalent to [my XPath selector](#code-block-60ec880) above"}
rss > channel > link[rel="previous"][href] {
  display: inline;
}
```

Value selections, for loops, and switch statements are the more commonly used
`XSLT` elements.

## Attributes and Value Selection

Create attributes with the `xsl:attribute` element. The attributes are added to
the parent tag. Select values with the `xsl:value-of` element.

```xsl {caption="Anchor attribute selection"}
<a>
  <xsl:attribute name="href">
    <xsl:value-of select="/rss/channel/atom:link[@rel='next']/@href"/>
  </xsl:attribute>
</a>

<!--  Output: <a href="https://example.com/page/2/rss.xml"></a> -->
```

```xsl {caption="Image attribute selection"}
<img>
  <xsl:attribute name="alt"><xsl:value-of select="/rss/channel/category"/></xsl:attribute>
  <xsl:attribute name="title"><xsl:value-of select="/rss/channel/category"/></xsl:attribute>
  <xsl:attribute name="src"><xsl:value-of select="/rss/channel/image/url"/></xsl:attribute>
</img>

<!--  Output: <img alt="image" title="image" src="/image"></img> -->
```
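
If the `xsl:attribute` dance feels abstract, it may help to see the same idea in a general purpose language. This is a loose analogy in Python's standard library, not what an `XSLT` processor does internally: select a value, then attach it to the parent element as an attribute. The `FEED` sample and the link text are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# A made-up feed with a single next link.
FEED = """<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <atom:link rel="next" href="https://example.com/page/2/rss.xml" />
  </channel>
</rss>"""

NAMESPACES = {"atom": "http://www.w3.org/2005/Atom"}

root = ET.fromstring(FEED)
next_link = root.find("channel/atom:link[@rel='next']", NAMESPACES)

# Roughly <a><xsl:attribute name="href">...</xsl:attribute></a>
anchor = ET.Element("a")
anchor.set("href", next_link.get("href"))
anchor.text = "Next page"

html = ET.tostring(anchor, encoding="unicode", method="html")
print(html)
```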

## For Each

A typical `for each` construction executes over a range of `XML` tags with the
`xsl:for-each` element.

```xsl
<xsl:for-each select="/rss/channel/item">
  <h2>
    <xsl:value-of select="title" />
  </h2>
</xsl:for-each>
```
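
For comparison, here's roughly what that loop does, sketched in Python's standard library: visit every `<item>` under `<channel>` and emit a heading with its title. The two sample items are made up, and the escaping step is my addition so that the generated markup stays valid.

```python
import xml.etree.ElementTree as ET
from html import escape

# A made-up feed with two items.
FEED = """<rss version="2.0">
  <channel>
    <item><title>First post</title></item>
    <item><title>Tabs &amp; spaces</title></item>
  </channel>
</rss>"""

root = ET.fromstring(FEED)

# One <h2> per <item>, like the body of xsl:for-each.
headings = [
    "<h2>" + escape(item.findtext("title", default="")) + "</h2>"
    for item in root.iterfind("channel/item")
]

print("\n".join(headings))
```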

## Switch Statements

A `switch` statement construction is executed with a combination of the
`xsl:choose`, `xsl:otherwise`, and `xsl:when` elements. The `test` attribute
on `xsl:when` contains the condition.

```xsl {options="hl_lines=1 3 9"}
<xsl:choose>

  <xsl:when test="/rss/channel/atom:link[@rel='previous']/@href">
    <xsl:attribute name="href">
      <xsl:value-of select="/rss/channel/atom:link[@rel='previous']/@href"/>
    </xsl:attribute>
  </xsl:when>

  <xsl:otherwise>
    <xsl:attribute name="href">/</xsl:attribute>
  </xsl:otherwise>

</xsl:choose>
```
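
The branching above reads as "use the previous link if the feed has one, otherwise fall back to the site root". A small sketch of the same decision in Python's standard library follows. The `previous_href` helper and both sample feeds are hypothetical.

```python
import xml.etree.ElementTree as ET

NAMESPACES = {"atom": "http://www.w3.org/2005/Atom"}

# Two made-up feeds: one with a previous link, one without.
WITH_PREVIOUS = """<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <atom:link rel="previous" href="https://example.com/page/2/rss.xml" />
  </channel>
</rss>"""

WITHOUT_PREVIOUS = """<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel><title>Example</title></channel>
</rss>"""


def previous_href(feed_xml):
    """Mirror xsl:choose: the previous link's href if present, else '/'."""
    root = ET.fromstring(feed_xml)
    link = root.find("channel/atom:link[@rel='previous']", NAMESPACES)
    if link is not None and link.get("href") is not None:  # xsl:when
        return link.get("href")
    return "/"  # xsl:otherwise


print(previous_href(WITH_PREVIOUS))
print(previous_href(WITHOUT_PREVIOUS))
```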

View the `XSLT`
[elements and function](https://developer.mozilla.org/en-US/docs/Web/XSLT/Element)
reference for the complete list of instructions.

## Conclusion

There you have it --- a basic overview and approach to working with `XML`,
`XSLT`, and `XHTML`.