Tangle - Matt Whipple

Tangle

Overview

I’m an advocate of Literate Programming, so one of the first tools I’ll look for is something to tangle source. In the spirit of starting from scratch, I’ll begin with something thrown together in C.

I’ll use Markdown (“Markdown - Wikipedia” 2021) as the source input and rely on additional attributes attached to fenced code blocks to indicate tangle targets.

The simplest initial solution is to look for fenced blocks and output only those tagged with a specific file, so in the interest of bootstrapping I’ll start with that. This could be executed with a command similar to < lpfile tangle filename > tangledsource.

To actually bootstrap this I can copy the file and quickly comment out the non-code portions.

This approach is quite limited and is likely to be replaced to allow for more expressive structuring of programs. If it addresses short-term needs, it may be kept long enough for me to adopt an existing tool or a more accessible platform.

With this simple initial tangling, the code is emitted in the order it is written, which means C prototypes are needed for any forward references.
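As a minimal sketch of why the prototypes matter: when a file is emitted strictly top to bottom, a function that is called before its definition has to be declared first. The names `is_fence`, `count_fences`, and `forward_reference_demo` below are hypothetical and are not part of the tangle program itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Prototype: declares is_fence before its definition appears further down. */
int is_fence(const char *line);

/* Defined first in the file, yet it calls is_fence, which is defined later. */
int count_fences(const char **lines, size_t n) {
  int count = 0;
  for (size_t i = 0; i < n; i++) {
    if (is_fence(lines[i])) count++;
  }
  return count;
}

/* The definition can now come after its first use. */
int is_fence(const char *line) {
  return strncmp(line, "~~~", 3) == 0;
}

/* Small self-check: one opening and one closing fence are counted. */
void forward_reference_demo(void) {
  const char *lines[] = { "~~~ {.c}\n", "int x;\n", "~~~\n" };
  assert(count_fences(lines, 3) == 2);
}
```

Without the prototype, a C99 or later compiler would typically reject or warn about the call inside `count_fences`, since `is_fence` has not been declared at that point.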

Imports

I’ll start with some standard imports to help with I/O, strings, and other typical behavior.

This includes string.h (IEEE/The Open Group 2017), along with stdio.h and stdlib.h.

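/* getline is POSIX.1-2008; request it before including any headers. */
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <string.h>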

Prototypes

This should be small enough that most of the logic can be left in main, but checking a line for the file attribute seems worth extracting into its own function. Comparing against the expected attribute, rather than extracting the file name from the line, avoids needing the heap for each line and can likely allow for more reuse of behavior.

int has_file(const char* line, const char* name, const size_t name_len);
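For contrast, here is a rough sketch of the extraction approach that the comparison avoids. `extract_file`, `line_targets`, and `extraction_demo` are hypothetical names introduced only for illustration, and the set of characters assumed to end the value (space, closing brace, newline) is likewise an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical alternative: return a heap-allocated copy of the value that
 * follows "file=" on the line, or NULL if there is none (or if allocation
 * fails). The caller must free() the result. */
char *extract_file(const char *line) {
  const char *start = strstr(line, "file=");
  if (start == NULL) return NULL;
  start += strlen("file=");
  /* The value runs until a space, closing brace, newline, or end of string. */
  size_t len = strcspn(start, " }\n");
  char *copy = malloc(len + 1);
  if (copy == NULL) return NULL;
  memcpy(copy, start, len);
  copy[len] = '\0';
  return copy;
}

/* Each check now costs an allocation, a comparison, and a free. */
int line_targets(const char *line, const char *filename) {
  char *found = extract_file(line);
  if (found == NULL) return 0;
  int same = strcmp(found, filename) == 0;
  free(found);
  return same;
}

/* Small self-check of the sketch. */
void extraction_demo(void) {
  assert(line_targets("~~~ {.c file=tangle.c}\n", "tangle.c") == 1);
  assert(line_targets("~~~ {.c}\n", "tangle.c") == 0);
}
```

The comparison-based `has_file` gets the same answer without the per-line allocation, at the cost of building the `file=...` attribute string once up front.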

Main

Main will expect a single filename as an argument and for now perform virtually no argument validation. It will then loop over lines and track the state, entering output mode when a fenced block is seen with the desired file and exiting output mode when the end of such a fence is seen.

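int
main(int argc, char **argv) {
  /* Expect exactly one argument: the file name to tangle. */
  if (argc != 2) return EXIT_FAILURE;
  const char *filename = argv[1];

  /* Build the attribute to search for: "file=" (5 characters) plus the name. */
  const size_t file_attribute_len = strlen(filename) + 5;
  char *file_attribute = malloc(file_attribute_len + 1);
  if (file_attribute == NULL) return EXIT_FAILURE;
  sprintf(file_attribute, "file=%s", filename);

  int in_output_mode = 0;
  char *line = NULL;  /* getline allocates and grows this buffer */
  size_t len = 0;

  while (getline(&line, &len, stdin) != -1) {
    if (in_output_mode) {
      /* A fence line closes the block; anything else is emitted as-is. */
      if (!strncmp(line, "~~~", 3)) in_output_mode = 0;
      else printf("%s", line);
      continue;
    }

    /* An opening fence carrying the desired file attribute starts output. */
    if (!strncmp(line, "~~~", 3) &&
        has_file(line, file_attribute, file_attribute_len)) in_output_mode = 1;
  }

  free(line);
  free(file_attribute);
  return EXIT_SUCCESS;
}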

has_file

The implementation of has_file can largely just rely on strstr with an additional check that the substring is followed by a word boundary to avoid possible prefix collisions.

int
has_file(const char *line, const char *name, const size_t name_len) {
  const char *found = strstr(line, name);
  if (found == NULL) return 0;
  /* Accept the match only if it ends at a boundary, so that a longer
     name sharing this prefix is rejected. */
  switch(*(found + name_len)) {
    case '\0':
    case ' ':
    case '}': return 1;
    default: return 0;
  }
}
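To make the boundary check concrete, the sketch below runs has_file against a few sample fence lines. The function body is repeated so the block compiles on its own; `matches` and `has_file_demo` are hypothetical helpers, and the sample lines assume attributes are written inside braces, as in `~~~ {.c file=tangle.c}`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same logic as has_file above, repeated so this block stands alone. */
int has_file(const char *line, const char *name, const size_t name_len) {
  const char *found = strstr(line, name);
  if (found == NULL) return 0;
  switch (*(found + name_len)) {
    case '\0':
    case ' ':
    case '}': return 1;
    default: return 0;
  }
}

/* Hypothetical helper: check a line against a ready-made attribute string. */
int matches(const char *line, const char *attr) {
  return has_file(line, attr, strlen(attr));
}

void has_file_demo(void) {
  const char *attr = "file=tangle.c";
  /* Attribute followed by the closing brace. */
  assert(matches("~~~ {.c file=tangle.c}\n", attr) == 1);
  /* Attribute followed by a space and another attribute. */
  assert(matches("~~~ {file=tangle.c .c}\n", attr) == 1);
  /* Prefix collision: a longer file name must not match. */
  assert(matches("~~~ {.c file=tangle.c.bak}\n", attr) == 0);
  /* No file attribute at all. */
  assert(matches("~~~ {.c}\n", attr) == 0);
}
```

Note that a match followed directly by a newline is not accepted by this check, so an attribute written outside braces at the very end of a line would be skipped.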
References

IEEE/The Open Group. 2017. “String.h(0P) - POSIX Programmer’s Manual.”
“Markdown - Wikipedia.” 2021. https://en.wikipedia.org/wiki/Markdown.