agenda

As part of my most recent tool rebooting I’ve shifted away from org-mode and have instead adopted Pandoc enhanced markdown. One of the features available in org-mode which I’m now looking to replace is the generation of an agenda based on todo items.

This will ultimately likely be handled by existing tools such as awk, but as part of larger ideological ambitions I’m starting with using C.

includes

The agenda objects will make use of date/time values and so time.h will be included(1).

_X_OPEN_SOURCE is defined to allow use of strptime and the value ensures getline remains available.

#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

types

Attribute

Each item may have zero or many attributes which represent the additional information in the markdown. The struct itself will be designed to capture the text in the markdwon directly along with any additional calculated fields which may or may not be valid/present. There is an assumption that the name of the attribute implies the semantics of its data.

struct Attribute {
  char           *name;
  char           *value;
  time_t         time;
};

AgendaItem

An item itself keeps track of its location within the source files, its title text, and its attributes.

struct AgendaItem {
  char              *location;
  char              *title;
  int               attribute_count;
  struct Attribute  *attributes;
};

AgendaItemNode

The AgendaItems will be collected into a binary tree to facilitate sorting, this type captures the vertices within that tree.

This will also capture one or more weights which will be used to sort the nodes.

struct AgendaItemNode {
  struct AgendaItem     item;
  struct AgendaItemNode *left;
  struct AgendaItemNode *right;
  long                  *weights;
};

Forward References

The purpose of defined functions will be explained here. When I get around to making my tangle program slightly better or switch to using an established solution the forward references can be defined alongside the implementation, but currently everything is kept in order and C requires things to be declared before reference to reduce required passes.

collect_from_filename

Process each file while providing appropriate resource management. Any parsed AgendaItems will be collected into the passed tree. The new root of the tree will be returned.

struct AgendaItemNode *collect_from_filename(struct AgendaItemNode *collection, char *filename);

has_item

Check with the current line has an agenda item and should be parsed.

const int has_item(const char *line);

parsing

Parsing is broken up where each field is populated in separate methods and then the complete parsed item is returned. The original implementation of this advanced the pointer through the line as it parsed, but that seems likely to be an attempt to be too clever and so is avoided for now. Support for tracking parsed vs. unparsed data seems potentially useful but needs more substance before pursuing.

const struct AgendaItem parse_item(char *line, const char *location);
void populate_attributes(struct AgendaItem *ai, char *line);
void populate_title(struct AgendaItem *ai, const char *line);

Attribute Retrieval

A small helper will be provided to return the Attribute with the requested name if present (NULL if not).

struct Attribute *attribute_with_name(struct AgendaItem ai, char *name);

rendering

The output of this program is the AgendaItems in some form, these function takes care of rendering an item to stdout.

void print_item(struct AgendaItem* ai);

sorting

Sorting will consist of determining the weights for AgendaItemNodes as they are added, and comparison of weights to determine placement within the tree.

The algorithm for weighing items will be hardcoded for now.

long *weigh_item(struct AgendaItem ai);
long compare_weights(long *x, long *y);

tree functions

The function will consist of populating a tree of items and then traversing the tree with some handler. collect returns the new root of the tree.

struct AgendaItemNode *collect(struct AgendaItemNode *collection, struct AgendaItem item);
void visit(struct AgendaItemNode *collection, void (*handle)(struct AgendaItem *ai));

main

As tracking file is one of the purposes, each source file should be passed as an argument and then it will be processed by name.

The collected collection will then be output.

For the moment I’m leaving memory cleanup to the process exit, but more explicit cleanup is likely to be added afterward.

int
main(int argc, char **argv) {
  struct AgendaItemNode *collection = NULL;

  for (argv++; *argv; argv++) {
    collection = collect_from_filename(collection, *argv);
  }

  visit(collection, print_item);

  return EXIT_SUCCESS;
}

collect_from_filename

Collecting from files handles the management of the file resource and walks through each line of file parsing items and adding them to the collection when relevant. The root of the collection is returned.

struct AgendaItemNode
*collect_from_filename(struct AgendaItemNode *collection, char *filename) {
  FILE *fp = fopen(filename, "r");
  if (!fp) {
    fprintf(stderr, "Error opening %s\n", filename);
    return collection;
  }

  char *line = NULL;
  size_t len = 0;

  for(; getline(&line, &len, fp) != -1;) {
    if (!has_item(line)) continue;
    struct AgendaItem ai = parse_item(line, filename);
    collection = collect(collection, ai);
  } 

  fclose(fp);
  return collection;
}

has_item

The predicate currently used is a dumb string check.

const int
has_item(const char *line) {
  return strstr(line, "TODO") != NULL;
}

parse_item

Most of the actual parsing logic is done in the populate methods so this function creates the struct and passes it through to those.

const struct AgendaItem
parse_item(char *line, const char *location) {
  struct AgendaItem ai = {.location = strdup(location) };
  populate_title(&ai, line);
  populate_attributes(&ai, line);
  return ai;
}

populate_attributes

Currently attributes are expected to be enclosed within curly braces and in \(name=\)value format. Currently this stays as dumb as possible so the end brace isn’t validated and there’s nothing supported in the way of escaping or quoting or spaces in values. This will be made more robust as needed (though likely replaced by something else).

void populate_attributes(struct AgendaItem *ai, char *line) {

Find Metadata

This will start by advancing to the brace enclosed metadata block and returning if there is none.

  char *metadata = strstr(line, "{");
  if (!metadata) return;

Count Attributes

To consolidate allocation the number of attributes will be counted. With the simple parsing that is assumed to be the number of =s before the end of the metadata block or the end of the line.

  int equals = 0;
  for (char *e=metadata; e && *e != '}'; e++)
    if (*e == '=') equals++;
  if ((ai->attributes = calloc(equals, sizeof(struct Attribute))) == NULL) {
    fprintf(stderr, "Error allocating memory for attributes!");
    return;
  }
  ai->attribute_count = equals;

Parse Attributes

The parsing of attributes (like their counting) will be driven by =s and will make use of two pointers: split will track where the = indicates the split between the name and the value and mark will be used to bound and collect the values. The name will be retrieved from the text between split and the trailing mark which is left at the start of the name after which mark will be advance past split to the end of the value which will then be collected.

Those pointers are defined here and each previously allocated attribute is selected for population. The initialization value of mark is normally irrelevant but seems likely to help with the edge case of an initial =.

  char *split = line;
  char *mark = line;

  for (int att=0; att < equals; att++) {
    struct Attribute *attribute = &ai->attributes[att];

Name Extraction

Names are extracted by leaving mark as a trailing pointer. split looks for the next = and any time it passes over an attribute boundary (a space) it sets mark to point to the character after that boundary such that when split lands on an =, mark points to the beginning of the text after the preceeding boundary. This trusts some of the conditions that should have been validated by the previously run attribute counting.

    for(; *split != '='; split++) {
      if (*split == ' ') mark = split + 1;
    }
    attribute->name = strndup(mark, split-mark);

Value Extraction

Extracting the value drops mark after the split and advances to the next boundary and then copies the intervening text.

This also blindly tries to set additional representations of the value which currently consists of time; any valid values will be populated and any invalid values should just leave a zero value. A better solution would likely be to rely on additional syntax which can clearly indicate intent/richer types. Such an approach is likely to be adopted as options are evaluated but this bit of DWIMmery addresses immediate needs.

    for(mark=++split; *mark != ' ' && *mark != '}'; mark++) {
      attribute->value = strndup(split, (mark-split)+1);
      struct tm parsed;
      memset(&parsed, 0, sizeof(struct tm));
      if (strptime(attribute->value, "%Y-%m-%d", &parsed) == NULL) continue;
      time_t t = mktime(&parsed);
      if (t > 0) attribute->time = t;
    }
  }
}

populate_title

Title population is currently relatively straightforward: assume the first { is the start of metadata and everything up to that character is the title (or the full line with no newline if no brace is present).

void
populate_title(struct AgendaItem* ai, const char* line) {
  char *metadata = strchr(line, '{');
  if (metadata) ai->title = strndup(line, metadata-line);
  else ai->title = strndup(line, strlen(line)-1);
}

attribute_with_name

Iterate over the Attributes and return the first with the matching name (there could be multiple).

struct Attribute
*attribute_with_name(struct AgendaItem ai, char *name) {
  for (int i = 0; i < ai.attribute_count; i++) {
    if (!strcmp(ai.attributes[i].name, name)) return &ai.attributes[i];
  }
  return NULL;
}

print_item

An item will be printed with its location and title. For now some harcoded special treatment of the scheduled attribute will also print that if it is present (specifically the first one found since there’s no enforcement of unique names).

void
print_item(struct AgendaItem *ai) {
  printf("%s: %s", ai->location, ai->title);

  struct Attribute *scheduled = attribute_with_name(*ai, "scheduled");
  if (scheduled) printf(" [%s]", scheduled->value);

  printf("\n");
}

populate_weight

Weight will currently be based on any priority and scheduled date. This will be done in strict order of priority first and then date.

The priority will be similar to niceness values in that negative numbers can be used and lower the number the higher the priority. Anything without a provided priority defaults to 0.

The current algorithm will sort lower weights earlier and those without dates should be at the end and therefore if there is value then the weight will default to the present.

long
*weigh_item(struct AgendaItem ai) {
  long *weights = calloc(2, sizeof(long));
  if (weights == NULL) return NULL;

  struct Attribute *priority = attribute_with_name(ai, "priority");
  weights[0] = priority ? atoi(priority->value) : 0;    

  struct Attribute *scheduled = attribute_with_name(ai, "scheduled");
  if (scheduled && scheduled->time) weights[1] = scheduled->time;
  else weights[1] = time(NULL);

  return weights;
}

compare_weights

Multiple weights are compared to allow for levels of sorting, so this will return based on the first difference detected or indicate equality if there are no differences. Any negative value indicates x is less than y, positive indicates y is less than x, and zero indicates they are equal or unable to be compared.

long
compare_weights(long *x, long *y) {
  for (;x && y; x++, y++) {
    long diff = *x - *y;
    if (diff) return diff;
  }
  return 0;
}

collect

Collection involves creating an AgendaItemNode to contain the passed item and then adding that node to the existing binary tree. The tree will be sorted such that lower weights are to the left. Currently the tree is not balanced in any way so worst case performance is linear. The current expected size of data makes this a non-issue but it should be addressed later.

struct AgendaItemNode
*collect(struct AgendaItemNode *collection, struct AgendaItem ai) {
  struct AgendaItemNode *node = malloc(sizeof(struct AgendaItemNode));
  if (node == NULL) {
    fprintf(stderr, "Allocation failure");
    return NULL;
  }
  node->left = node->right = NULL;
  node->item = ai;
  node->weights = weigh_item(ai);

  // Become root if collection is empty.
  if (!collection) return node;

  for(struct AgendaItemNode *possibleParent = collection;;) {
    if (compare_weights(node->weights, possibleParent->weights) < 0) {
      if (!possibleParent->left) {
        possibleParent->left = node;
        return collection;
      }
      possibleParent = possibleParent->left;
    } else {
      if (!possibleParent->right) {
        possibleParent->right = node;
        return collection;
      }
      possibleParent = possibleParent->right;
    }
  }
}

visit

Visit traverses the tree and calls the handler on each element. This may match the Visitor pattern but I lose track of the specifics of patterns and therefore want to avoid inaccurate implications.

This consists of a depth first in-order traversal. As the tree should remain fairly shallow this will just make use of recursion with the execution stack.

void
visit(struct AgendaItemNode *collection, void (*handle)(struct AgendaItem* ai)) {
  if (collection->left) visit(collection->left, handle);
  handle(&collection->item);
  if (collection->right) visit(collection->right, handle);
}
1.
GROUP, IEEE/The Open. Time.h(0P) - POSIX programmer’s manual. 2017.