+++ date = "2022-09-29T08:26:21+00:00" publishdate = "2023-12-29T07:08:55+00:00" title = "Making Web Pages" slug = "making-web-pages" author = "Thedro" tags = ["html"] type = "posts" summary = "A recent look at my web server configuration revealed that my error pages were set up in an odd way." draft = "" syntax = "1" toc = "" updated = "2022-10-01" +++ ![404 Page not found error](/images/making-web-pages.png " A 404 error page" ) ## Introduction A recent look at my web server configuration revealed that my error pages were set up in an odd way. From the base route of a domain, it's possible to iterate over each error page starting with [`/400`](/400), [`/401`](/401), [`/402`](/402), [`/403`](/403), [`/404`](/404) and so on. I guess that my {{< sidenote mark="past" set="left" >}} My initial personal [`nginx`](https://nginx.org/en/docs/beginners_guide.html) configuration is from years ago. {{< /sidenote>}} self wanted to verify an error page's content and existence. It did seem like more was going on but upon further investigation, these error pages were just statically generated `HTML` (HyperText Markup Language). They were made with `HTML` and `XML` (Extensible Markup Language) utilities many moons ago. Just as there are too many books to read in a lifetime, there are [too many programs](https://staticsitegenerators.net/) to try out in a lifetime. If you look online it's easy to get the impression that making a website is rocket science. The reason for this is obvious --- almost all discussions online are oriented towards engineers working in the industry. The reality is that a website or even an "application" can be as simple as a single `HTML` text file. The blessed ways of crafting _and_ delivering a website have become beyond the pale complicated, but composing `HTML` is still as simple as it always was, and perhaps even easier thanks to some improvements in the [`HTML` specification](https://html.spec.whatwg.org/dev/). ## HTML and XML Utilities `HTML` and `XML` utilities or `html-xml-utils` are a simple set of programs for manipulating `HTML` and `XML` files. Here's the master list that describes the purpose of each utility taken right from the [readme](https://www.w3.org/Tools/HTML-XML-utils/README). ```text asc2xml - convert from UTF-8 to &#nnn; entities hxaddid - add IDs to selected elements hxcite - replace bibliographic references by hyperlinks hxcite-mkbib - expand references and create bibliography hxclean - apply heuristics to correct an HTML file hxcopy - copy an HTML file while preserving relative links hxcount - count elements and attributes in HTML or XML files hxextract - extract selected elements hxincl - expand included HTML or XML files hxindex - create an alphabetically sorted index hxmkbib - create bibliography from a template hxmultitoc - create a table of contents for a set of HTML files hxname2id - move some ID= or NAME= from A elements to their parents hxnormalize - pretty-print an HTML file hxnsxml - convert output of hxxmlns back to normal XML hxnum - number section headings in an HTML file hxpipe - convert XML to a format easier to parse with Perl or AWK hxprintlinks - number links & add table of URLs at end of an HTML file hxprune - remove marked elements from an HTML file hxref - generate cross-references hxselect - extract elements that match a (CSS) selector hxtoc - insert a table of contents in an HTML file hxuncdata - replace CDATA sections by character entities hxunent - replace HTML predefined character entities to UTF-8 hxunpipe - convert output of pipe back to XML format hxunxmlns - replace "global names" by XML Namespace prefixes hxwls - list links in an HTML file xml2asc - convert from &#nnn; entities to UTF-8 ``` These `28` programs (primitives) allow you to do a lot of magic with `HTML` (and `XML`). If you're on a Linux distribution this package exists on [Debian](https://packages.debian.org/sid/html-xml-utils), [Arch](https://archlinux.org/packages/community/x86_64/html-xml-utils/), [Alpine](https://pkgs.alpinelinux.org/package/edge/community/x86/html-xml-utils) and [others](https://repology.org/project/html-xml-utils/versions) as `html-xml-utils`. ## Basic Templating The web was initially designed with the purpose of passing documents around. Below is a modern skeleton of a basic {{< sidenote mark="`index.html`" set="right" >}} The file name `index.html` is a convention that pretty much all web servers recognize. If it exists, the web server will process it automatically. {{< /sidenote>}} document. ```html {caption="index.html"} The Document ``` The document type is `HTML` with a language attribute set to English, the `` includes the title, character set and viewport while the `` begins the rest of the document and ends it with a closing ``. The package `html-xml-utils` has a program [`hxincl`](https://man.archlinux.org/man/hxincl.1) that allows augmenting that base `index.html` document. Here's what a rough template of those error pages look like using the inclusion directives of `hxincl`. ```html {options="hl_lines=9 18 21 36",caption="template.html"}

You can try:

``` Running `hxincl` on this template expands the contents of the included file directives which are conveniently just `HTML` comments. It expands into a final artifact: the error code value and title, along with a traditional separation of concerns for `CSS` (Cascading Style Sheets) as `index.css` and `JavaScript` as `index.js`. The source code of an error page shows the full expansion. ```shell hxincl -f -x template.html ``` The `-f` flag removes all comments after the expansion and the `-x` flag enforces `XML` conventions when generating [void elements](https://developer.mozilla.org/en-US/docs/Glossary/Empty_element). The program [`hxnormalize`](https://man.archlinux.org/man/hxnormalize.1) can be used to format and pretty print the final `HTML` document. ## Conclusion This was a very simple demonstration and `hxincl` can be used much more cleverly with substitutions. There's no props, slots, or template inheritance but I thought it'd be rather nice to demonstrate that you don't need complex tools to start generating `HTML`.