This Site - Matt Whipple

Content

This site is largely going to evolve into being somewhat of a journal of random non-private stuff, some of which may be of interest to others but much of which is unlikely to be interesting to anyone other than my future self. It will effectively evolve as a combination of a collection of literate programming, annotated bookmarks, and a sandbox within which I may practice writing and processing of the referenced sources. As this work is done in background time and the referenced sources will be consumed, the content is likely to grow fairly slowly.

This site also acts as somewhat of a floodgate against information overload. I've generally considered myself fairly unaffected by information overload due to both mentality and habits, but I do find that the modern world, and particularly the practice of software engineering, lends itself to being inundated by an enormous breadth of information. This very often does not allow for appropriate depth of understanding, so a significant amount of time is invested in acquiring shallow knowledge, which can produce the effect of being bombarded with trees at the expense of seeing the forest. The end result is hopefully a framework for more deliberate distillation of information which should lead to more careful curation of durable knowledge at a sustainable pace.

Technology

This site (the latest in a long line for me of varying technologies) is created using Org mode[emacs-org-mode-info] in Emacs. The site is hosted using GitLab Pages[git-lab-pages-docs].

Structure

This site started as one long page, but as it is now expected to include longer records associated with specific tasks it seems more appropriate to break each such activity out onto its own page.

This shift in organizational approach also introduces the need to pay attention to some concerns which were previously deferred, such as navigation and citation source management.

Navigation Menus

Navigation menus consist of content which is shared across pages and which should ideally be nearly identical across those pages to provide a consistent navigation experience. The entire structure is likely to be a hierarchy of pages and the active perspective on that tree should primarily reflect at least the siblings and ancestry for the current page, often in addition to any type of global navigation.

Implementation Alternatives

One typical approach to managing the navigation would be for each page to define its relative location, with the entire structure then collected from the information within each page. This approach is particularly nice for making sure that each page is represented in the structure, it keeps the relevant meta-content close to the content, and it also lends itself naturally to alternative projections of content grouping such as tags. This is therefore a very attractive alternative if there is a readily available means to act on the metadata contained within each page. Starting more or less from scratch, however, this would introduce a significant amount of overhead, so a simpler approach will be adopted for the first pass.

An alternative is manually managed navigation structures. Manually managed navigation requires minimal logic beyond including the navigation in the relevant places. The main disadvantage of having the separate definition is that it can effectively bifurcate some of the metadata from its content, and it therefore shifts the complexity from the generation of the structure to the coordination between the structure and the content which it is intended to reflect.

Such an approach could further be divided into approaches of either having a single tree which contains everything and which can be traversed and partially presented as necessary, or having individual roots/files for specific purposes.

At the outset I'll be starting with a top level root and progressing from there. Adopting a single tree from which different subtrees could be retrieved as needed seems as though it should be straightforward, and it is likely to be the adopted solution if the path to generating the structure from page local metadata doesn't seem worthwhile. Whether a single structure or separate roots are managed initially will largely depend on whether I incidentally stumble across a clear path to doing the subtree selection I'd want prior to needing it.
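The single-tree option can be sketched in Emacs Lisp. This is a minimal illustration under assumptions, not the site's actual navigation code: the tree contents, the `my-nav-` names, and the `(ID . CHILDREN)` node shape are all hypothetical. It suggests that selecting the ancestry and siblings for a page needs little more than the path from the root to that page.

```elisp
;; Hypothetical navigation tree: each node is (ID . CHILDREN).
(defvar my-nav-tree
  '("index"
    ("this-site")
    ("notes"
     ("emacs")
     ("gitlab")))
  "Illustrative site structure; the page ids are made up.")

(defun my-nav-path (tree id)
  "Return the nodes from the root of TREE down to the node named ID, or nil."
  (if (equal (car tree) id)
      (list tree)
    (let (found)
      ;; Depth-first search; stop descending once a path has been found.
      (dolist (child (cdr tree))
        (unless found
          (let ((path (my-nav-path child id)))
            (when path
              (setq found (cons tree path))))))
      found)))

(defun my-nav-context (tree id)
  "Return a plist of the ancestor ids and sibling ids of ID in TREE.
The siblings include ID itself.  Return nil when ID is not in TREE."
  (let* ((path (my-nav-path tree id))
         (ancestors (butlast path))
         (parent (car (last ancestors))))
    (when path
      (list :ancestors (mapcar #'car ancestors)
            :siblings (mapcar #'car (cdr parent))))))

(my-nav-context my-nav-tree "emacs")
;; => (:ancestors ("index" "notes") :siblings ("emacs" "gitlab"))
```

Separate roots per purpose would amount to several such trees, which trades the selection logic for the coordination cost described above.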

Solution

Basic inclusion of a shared structure can be done using the Org mode INCLUDE directive. Starting with the top global nav, this can be done with:

#+INCLUDE: "./nav_main.org"

The included file will contain a snippet which is used to generate the menu and supporting content. Any surrounding content should be avoided if the structure is intended to capture semantics only. The contents of the includes and call sites along with the representation of the navigation structure and how it is consumed are therefore subject to change if niftier navigation management is introduced.

This approach did lead to an issue where org agenda hangs while processing these files.

Citation Sources

org-mode footnotes provide a simple, readily available way to manage sources, but coordinating or duplicating those sources across a growing range of separate pages feels cumbersome. Having a single included file of footnotes is a potential simple solution, but I've grown partial to BibTeX and so this consolidation of sources has bumped my desire to integrate BibTeX with org-mode up my list of things to noodle around with. A seemingly pretty solid starting point for this integration is org-ref[emacs-org-ref-github].

This package is certainly a lot bigger than most of the other ones I've pulled in thus far in my latest journey into emacs. It seems to be fairly aligned with other packages I plan on pulling in at some point or another, but otherwise I may just cannibalize the key bits of functionality that I use.

A wrinkle introduced along with org-ref is that it is not bundled with emacs. While this is not normally an issue it does slightly complicate the generation and publishing of the site through GitLab CI which was previously just calling a standard emacs Docker image. This could be addressed by packaging a more suitable image, though I'm more likely to first introduce Cask to manage the dependencies and then see what is needed from there. For starters I'm going to move the generation so that it's done pre-commit and the generated files are checked in to the repository where they can be published. Checking in generated files is a bit of a smell but it frees me up to come back to a better solution later.

Semantic Section IDs

One of the unfortunate but understandable realizations when using exported org documents is that the generated ids have no semantic meaning. A naive hope would be that the ids were generated based on section headings or equivalent contents, or that a strategy to do so existed. The CUSTOM_ID property was noticed in passing while reading the Org manual, but something more automagic seemed tempting. Poking around from `ox-html.el` seems to indicate that ids are generated either from that property or from the relatively simple org-export-get-reference function.

This lack of provided functionality is eminently reasonable, as deriving a desirable id is relatively complex and fragile. Knowing how to transform variations on possible content into a desirable and acceptable convention which is also unique within some potentially uncertain context introduces a fair amount of complexity. Further, it is likely to produce suboptimal results in the event of collisions, text which is not suitable for such transformations (and constraining the text for suitability and immutability introduces a new limitation which could easily be avoided), or any content whose meaning is relative to its enclosing context.
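The fragility is easy to see with a naive derivation. The following sketch is hypothetical (the `my-naive-slug` name and its rules are assumptions, and nothing like it is part of Org's export): it downcases a heading and collapses everything outside `a-z0-9` into hyphens.

```elisp
(defun my-naive-slug (heading)
  "Derive an id from HEADING by downcasing and hyphenating it.
Illustrative only; this is not how Org generates ids."
  (replace-regexp-in-string
   ;; Trim any hyphens left at either end.
   "^-+\\|-+$" ""
   ;; Collapse each run of other characters into a single hyphen.
   (replace-regexp-in-string "[^a-z0-9]+" "-" (downcase heading))))

(my-naive-slug "Implementation Alternatives") ; => "implementation-alternatives"
(my-naive-slug "C")                           ; => "c"
(my-naive-slug "C++")                         ; => "c", a collision
(my-naive-slug "日本語")                       ; => "", nothing usable survives
```

A heading such as "Solution" repeated under different parents would collide in the same way, since the sketch knows nothing about enclosing context.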

Ultimately therefore the simplicity and control that `CUSTOM_ID` provides more than offsets the trivial toil of explicitly specifying obvious values.

Any given heading will be qualified enough so that its meaning is clear without presuming its current context. This should allow the content to be referenced in a way which is unambiguous without tying its identity to its current location (therefore allowing any navigable structure to evolve independently of the content and aligning with the concept that Cool URIs Don't Change).
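With ids specified by hand, one hedge against forgetting is a small check for headings that still lack a CUSTOM_ID. This is a minimal sketch, assuming Org is loaded: `my-org-headings-missing-custom-id` is a hypothetical helper name, while `org-map-entries`, `org-entry-get`, and `org-get-heading` are standard Org functions.

```elisp
(require 'org)

(defun my-org-headings-missing-custom-id ()
  "Return the headings in the current Org buffer that lack a CUSTOM_ID."
  (delq nil
        (org-map-entries
         (lambda ()
           ;; Called with point on each heading in turn.
           (unless (org-entry-get (point) "CUSTOM_ID")
             ;; Heading text without tags, TODO keyword, priority, or comment.
             (org-get-heading t t t t))))))
```

Evaluating `(my-org-headings-missing-custom-id)` in a page's buffer returns a list of heading strings, which is empty when every heading has an explicit id.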

TODO Cool URIs

Generation

For the purposes of generating through CI the site generation process should be reproducible independently of my current environment…but since I'm pre-generating the files at the moment I'll just configure my local emacs to take care of it.

As I want to publish the site of which this page is a part, I can configure the project here using the enclosing directory and publish to the public directory expected by GitLab.

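;; Register this site as an Org publishing project.
(add-to-list
  'org-publish-project-alist
  (list
    "com-mattwhipple-www"
    ;; Source: every .org file under the directory this is evaluated from.
    :base-directory default-directory
    :base-extension "org"
    ;; Target: the public directory that GitLab Pages expects.
    :publishing-directory (format "%spublic" default-directory)
    :publishing-function '(org-html-publish-to-html)
    ;; Omit the table of contents and section numbers from each page.
    :with-toc nil
    :recursive t
    :section-numbers nil))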

Since I have some code blocks littered about that I'd like to evaluate without needing to confirm them, that check can be disabled:

(setq org-confirm-babel-evaluate nil)

With the project defined it can then be generated using:

(org-publish "com-mattwhipple-www")

Publishing

The site is published automatically using the CI/CD functionality provided by GitLab. As mentioned earlier, the build is currently reliant on additional emacs packages, which introduces minor complications to automating the build that I'm deferring dealing with for now, so the publishing will first consist of publishing the pre-generated files which are checked in. A perk to this approach is that publishing happens nearly instantly after a push, but that doesn't seem like enough of a benefit to compensate for having to commit generated files (and potentially introducing irreproducible builds).

My uninformed hope was that with a static HTML site the pages configuration could just specify the path to the files. The pipeline, however, led to the error:


jobs pages config should implement a script: or a trigger: keyword

I am therefore borrowing from the provided Plain HTML example[git-lab-pages-plain-html]. The pipeline must understandably be complete, which means that the simplest case still needs to explicitly define the resources and actions which will effectively do nothing.

A minimal image needs to be specified, presumably since some image or other would always be required to execute in the containerized CI/CD execution environment. Normally it would be wise to pin to a more specific tag but as the required functionality is so incredibly simple I'll just copy from the example. The entire resulting .gitlab-ci.yml is just a copy from the example so this content doesn't add anything other than additional commentary (and the file is likely to evolve).

image: alpine:latest

pages:
  stage: deploy

As the error encountered suggests, a script or trigger is required so one is defined which is effectively a no-op.

  script:
    - echo 'Nothing to do...'

Finally comes the significant bit of publishing the pre-generated files from the master branch.

  artifacts:
    paths:
      - public
  only:
    - master

TODO Re-automate generation

Customizing the Domain

Mapping my personal domain to point to the GitLab pages site was relatively straightforward using the instructions in the GitLab documentation. A minor bump was encountered due to the inconvenient and potentially incorrect information for the TXT verification record.

The inconvenience was due to the GitLab UI providing copying of the complete record whereas the management UI for the zone file splits out the individual fields; this was fairly easily resolved by more selective copying and pasting.

The more interesting issue and one that is potentially incorrect was due to the host part of the record seemingly being a fully qualified domain name but not including a trailing `.`. When added to the record as is it created a relative host thereby including a spurious domain, resulting in something like verification.mattwhipple.com.mattwhipple.com rather than the expected verification.mattwhipple.com. To resolve this I removed that part of the domain from the host field, though I think a trailing period would have also done the trick.
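The relative-versus-absolute behaviour matches the usual zone file convention, where a name without a trailing dot has the zone origin appended. Below is a minimal sketch of that rule. The `my-dns-qualify` name is hypothetical, the host names mirror the illustrative ones above rather than the actual GitLab record, and whether a particular management UI accepts a trailing dot is an assumption that would need checking.

```elisp
(defun my-dns-qualify (host origin)
  "Expand HOST the way a zone file would, relative to ORIGIN.
A HOST ending in \".\" is already absolute and is returned unchanged."
  (if (string-suffix-p "." host)
      host
    (concat host "." origin)))

(my-dns-qualify "verification.mattwhipple.com" "mattwhipple.com.")
;; => "verification.mattwhipple.com.mattwhipple.com."
(my-dns-qualify "verification" "mattwhipple.com.")
;; => "verification.mattwhipple.com."
(my-dns-qualify "verification.mattwhipple.com." "mattwhipple.com.")
;; => "verification.mattwhipple.com."
```

The first call reproduces the spurious doubled domain, while the second and third correspond to the two fixes mentioned: dropping the domain from the host field, or terminating it with a period.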

These potentially warrant updates to the GitLab documentation, but at the moment I'm unaware how much of it may be covered in existing documentation around DNS and also uncertain how much the minor hassles I encountered would apply to other users. The likelihood of my spending time to gain clarity around those concerns is pretty low, and therefore so is the corresponding likelihood that I'd submit an update or issue for the GitLab documentation.

Footer

Bibliography

  • [emacs-org-mode-info] The Org Manual (Info). URL: https://orgmode.org/org.html. Status: wip.
  • [git-lab-pages-docs] GitLab Pages. URL: https://docs.gitlab.com/ee/user/project/pages/index.html. Status: done. Notes: The docs can be a bit thin but the combination of docs and tracking down relevant examples and general knowledge seems to normally get things moving.
  • [emacs-org-ref-github] org-ref (GitHub). URL: https://github.com/jkitchin/org-ref. Status: wip.
  • [git-lab-pages-plain-html] Example plain HTML site using GitLab Pages: https://pages.gitlab.io/plain-html. URL: https://gitlab.com/pages/plain-html. Status: done. Notes: This example provides a good bare minimal GitLab Pages setup.

TODO Adjust bibliography styles

Author: mwhipple

Created: 2020-10-26 Mon 08:46
