+++
date = "2022-09-29T08:26:21+00:00"
publishdate = "2023-12-29T07:08:55+00:00"
title = "Making Web Pages"
slug = "making-web-pages"
author = "Thedro"
tags = ["html"]
type = "posts"
summary =  "A recent look at my web server configuration revealed that my error pages were set up in an odd way."
draft =  ""
syntax =  "1"
toc =  ""
updated =  "2022-10-01"
+++

![404 Page not found error](/images/making-web-pages.png "A 404 error page")

## Introduction

A recent look at my web server configuration revealed that my error pages were
set up in an odd way. From the base route of a domain, it's possible to iterate
over each error page starting with [`/400`](/400), [`/401`](/401),
[`/402`](/402), [`/403`](/403), [`/404`](/404) and so on. I guess that my
{{< sidenote mark="past" set="left" >}} 
My initial personal [`nginx`](https://nginx.org/en/docs/beginners_guide.html)
configuration is from years ago.
{{< /sidenote>}} 
self wanted to verify an error page's content and existence.

It did seem like more was going on, but on closer inspection these error
pages were just statically generated `HTML` (HyperText Markup Language). They
were made with `HTML` and `XML` (Extensible Markup Language) utilities many
moons ago.

Just as there are too many books to read in a lifetime, there are
[too many programs](https://staticsitegenerators.net/) to try out in a lifetime. If you look
online, it's easy to get the impression that making a website is rocket
science. The reason is simple: almost all discussions online are
oriented towards engineers working in the industry. The reality is that a
website or even an "application" can be as simple as a single `HTML` text file.

The blessed ways of crafting _and_ delivering a website have become complicated
beyond the pale, but composing `HTML` is still as simple as it always was, and
perhaps even easier thanks to some improvements in the
[`HTML` specification](https://html.spec.whatwg.org/dev/).

## HTML and XML Utilities

`HTML` and `XML` utilities or `html-xml-utils` are a simple set of programs for
manipulating `HTML` and `XML` files. Here's the master list that describes the
purpose of each utility taken right from the
[readme](https://www.w3.org/Tools/HTML-XML-utils/README).

```text
asc2xml      -  convert from UTF-8 to &#nnn; entities
hxaddid      -  add IDs to selected elements
hxcite       -  replace bibliographic references by hyperlinks
hxcite-mkbib -  expand references and create bibliography
hxclean      -  apply heuristics to correct an HTML file
hxcopy       -  copy an HTML file while preserving relative links
hxcount      -  count elements and attributes in HTML or XML files
hxextract    -  extract selected elements
hxincl       -  expand included HTML or XML files
hxindex      -  create an alphabetically sorted index
hxmkbib      -  create bibliography from a template
hxmultitoc   -  create a table of contents for a set of HTML files
hxname2id    -  move some ID= or NAME= from A elements to their parents
hxnormalize  -  pretty-print an HTML file
hxnsxml      -  convert output of hxxmlns back to normal XML
hxnum        -  number section headings in an HTML file
hxpipe       -  convert XML to a format easier to parse with Perl or AWK
hxprintlinks -  number links & add table of URLs at end of an HTML file
hxprune      -  remove marked elements from an HTML file
hxref        -  generate cross-references
hxselect     -  extract elements that match a (CSS) selector
hxtoc        -  insert a table of contents in an HTML file
hxuncdata    -  replace CDATA sections by character entities
hxunent      -  replace HTML predefined character entities to UTF-8
hxunpipe     -  convert output of pipe back to XML format
hxunxmlns    -  replace "global names" by XML Namespace prefixes
hxwls        -  list links in an HTML file
xml2asc      -  convert from &#nnn; entities to UTF-8
```

These `28` programs (primitives) read and write plain text, so they can be
chained together to do a lot of magic with `HTML` (and `XML`). If you're on a
Linux distribution, the package is available on
[Debian](https://packages.debian.org/sid/html-xml-utils),
[Arch](https://archlinux.org/packages/community/x86_64/html-xml-utils/),
[Alpine](https://pkgs.alpinelinux.org/package/edge/community/x86/html-xml-utils)
and [others](https://repology.org/project/html-xml-utils/versions) as
`html-xml-utils`.
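As a small taste of how these primitives chain together, here's a rough sketch. It assumes `html-xml-utils` is installed, and the `page.html` file it creates is a made-up sample, not one of my real pages.

```shell
# Work in a throwaway directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A made-up sample page to experiment on.
cat > page.html <<'EOF'
<!DOCTYPE html>
<html lang="en">
  <head><title>The Document</title></head>
  <body>
    <p>See the <a href="/404">404 page</a> or the <a href="/">home page</a>.</p>
  </body>
</html>
EOF

if command -v hxselect >/dev/null 2>&1; then
  # Normalize to well-formed XML first, then print only the title's content.
  hxnormalize -x page.html | hxselect -c 'title'
  echo
  # List the link targets found in the page.
  hxwls page.html
else
  echo "html-xml-utils is not installed" >&2
fi
```

Passing the page through `hxnormalize -x` first is a common precaution, since `hxselect` expects well-formed input.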

## Basic Templating

The web was initially designed with the purpose of passing documents around.
Below is a modern skeleton of a basic
{{< sidenote mark="`index.html`" set="right" >}}
The file name `index.html` is a convention that pretty much all web servers
recognize. If it exists, the web server serves it automatically when the
directory itself is requested.
{{< /sidenote>}}
document.

```html {caption="index.html"}
<!DOCTYPE html>
<html lang="en">
  <head>
    <title>The Document</title>
    <meta charset="utf-8" />
    <meta
      name="viewport"
      content="width=device-width, initial-scale=1"
    />
  </head>
  <body>
  </body>
</html>
```

The `<!DOCTYPE html>` declaration marks the document as `HTML`, and the `lang`
attribute on `<html>` sets the language to English. The `<head>` holds the
title, character set and viewport, while everything between `<body>` and the
closing `</body>` is the visible content of the document.

The package `html-xml-utils` has a program
[`hxincl`](https://man.archlinux.org/man/hxincl.1) that allows augmenting that
base `index.html` document with the contents of other files. Here's what a
rough template of those error pages looks like using the inclusion directives
of `hxincl`.

```html {options="hl_lines=9 18 21 36",caption="template.html"}
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta
      name="viewport"
      content="width=device-width, initial-scale=1"
    />
    <style>
      <!-- include "index.css" -->
    </style>
  </head>
  <body>
    <article>
      <section>
        <h1>
          <code>
            <b>
              <!-- include "error.code.html" -->
            </b>
          </code>
          <!-- include "error.title.html" -->
        </h1>
      </section>
      <p>You can try:</p>
      <ul id="action">
        <li>
          Going back to the
          <a onclick="window.history.go(-1); return false;" href="/">
            previous page.
          </a>
        </li>
        <li>Returning <a href="/">to the home page.</a></li>
      </ul>
    </article>
    <script>
      <!-- include "index.js" -->
    </script>
  </body>
</html>
```

Running `hxincl` on this template replaces each include directive, which is
conveniently just an `HTML` comment, with the contents of the named file. The
final artifact contains the error code and title, while the sources keep a
traditional separation of concerns: `CSS` (Cascading Style Sheets) in
`index.css` and `JavaScript` in `index.js`. The source code of an error page
shows the full expansion.

```shell
hxincl -f -x template.html
```

The `-f` flag removes the include comments once the files have been expanded,
and the `-x` flag enforces `XML` conventions, such as the trailing slash in
`<meta />`, when generating
[void elements](https://developer.mozilla.org/en-US/docs/Glossary/Empty_element).
The program [`hxnormalize`](https://man.archlinux.org/man/hxnormalize.1) can be
used to format and pretty print the final `HTML` document.
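To get one page per status code, the expansion can be wrapped in a loop. This is only a sketch under assumptions: the cut-down `template.html` stands in for the full template above, and the list of codes and titles is made up for illustration.

```shell
# Throwaway build directory; a real build would run next to the site sources.
build=$(mktemp -d)
cd "$build" || exit 1

# Cut-down stand-in for template.html with the same two include directives.
cat > template.html <<'EOF'
<!DOCTYPE html>
<html lang="en">
  <head><title>Error</title></head>
  <body>
    <h1>
      <code><b><!-- include "error.code.html" --></b></code>
      <!-- include "error.title.html" -->
    </h1>
  </body>
</html>
EOF

# Each input line is "code:title"; IFS=: splits at the first colon.
while IFS=: read -r code title; do
  # Rewrite the two fragments that the template includes.
  printf '%s\n' "$code"  > error.code.html
  printf '%s\n' "$title" > error.title.html

  if command -v hxincl >/dev/null 2>&1; then
    # Expand the includes and save the result as 403.html, 404.html, ...
    hxincl -f -x template.html > "$code.html"
  fi
done <<'EOF'
403:Forbidden
404:Not Found
500:Internal Server Error
EOF
```

The fragments are overwritten on every pass, so the template itself never changes. Only the two small files it pulls in do.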

## Conclusion

This was a very simple demonstration, and `hxincl` can be used much more cleverly
with substitutions. There are no props, slots, or template inheritance, but I
thought it'd be rather nice to demonstrate that you don't need complex tools to
start generating `HTML`.
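For the curious, here's a rough sketch of the substitution idea. As I read the `hxincl` manual, the `-s name=substitution` option swaps a `%name%` placeholder inside an include path for the given value. The file naming scheme below (`error.404.title.html` and friends) is hypothetical.

```shell
# Throwaway directory with hypothetical per-code fragments.
site=$(mktemp -d)
cd "$site" || exit 1

printf '%s\n' 'Forbidden' > error.403.title.html
printf '%s\n' 'Not Found' > error.404.title.html

# One template whose include path carries a %code% placeholder.
cat > page.html <<'EOF'
<!DOCTYPE html>
<html lang="en">
  <head><title>Error</title></head>
  <body>
    <h1><!-- include "error.%code%.title.html" --></h1>
  </body>
</html>
EOF

for code in 403 404; do
  if command -v hxincl >/dev/null 2>&1; then
    # With -s code=404, the directive resolves to error.404.title.html.
    hxincl -f -x -s code="$code" page.html > "$code.html"
  fi
done
```

Compared with rewriting shared fragment files on every run, this keeps one fragment per page on disk and lets the command line pick which one gets included.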