As part of my most recent tool rebooting I’ve shifted away from org-mode and have instead adopted Pandoc enhanced markdown. One of the features available in org-mode which I’m now looking to replace is the generation of an agenda based on todo items.
This will ultimately likely be handled by existing tools such as awk, but as part of larger ideological ambitions I’m starting with using C.
The agenda objects will make use of date/time values and so
time.h
will be included(1).
_X_OPEN_SOURCE
is defined to allow use of
strptime
and the value ensures getline
remains
available.
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
Each item may have zero or many attributes which represent the additional information in the markdown. The struct itself will be designed to capture the text in the markdwon directly along with any additional calculated fields which may or may not be valid/present. There is an assumption that the name of the attribute implies the semantics of its data.
struct Attribute {
char *name;
char *value;
time_t time;
};
An item itself keeps track of its location within the source files, its title text, and its attributes.
struct AgendaItem {
char *location;
char *title;
int attribute_count;
struct Attribute *attributes;
};
The AgendaItems will be collected into a binary tree to facilitate sorting, this type captures the vertices within that tree.
This will also capture one or more weights which will be used to sort the nodes.
struct AgendaItemNode {
struct AgendaItem item;
struct AgendaItemNode *left;
struct AgendaItemNode *right;
long *weights;
};
The purpose of defined functions will be explained here. When I get around to making my tangle program slightly better or switch to using an established solution the forward references can be defined alongside the implementation, but currently everything is kept in order and C requires things to be declared before reference to reduce required passes.
Process each file while providing appropriate resource management. Any parsed AgendaItems will be collected into the passed tree. The new root of the tree will be returned.
struct AgendaItemNode *collect_from_filename(struct AgendaItemNode *collection, char *filename);
Check with the current line has an agenda item and should be parsed.
const int has_item(const char *line);
Parsing is broken up where each field is populated in separate methods and then the complete parsed item is returned. The original implementation of this advanced the pointer through the line as it parsed, but that seems likely to be an attempt to be too clever and so is avoided for now. Support for tracking parsed vs. unparsed data seems potentially useful but needs more substance before pursuing.
const struct AgendaItem parse_item(char *line, const char *location);
void populate_attributes(struct AgendaItem *ai, char *line);
void populate_title(struct AgendaItem *ai, const char *line);
A small helper will be provided to return the Attribute with the requested name if present (NULL if not).
struct Attribute *attribute_with_name(struct AgendaItem ai, char *name);
The output of this program is the AgendaItems in some form, these function takes care of rendering an item to stdout.
void print_item(struct AgendaItem* ai);
Sorting will consist of determining the weights for AgendaItemNodes as they are added, and comparison of weights to determine placement within the tree.
The algorithm for weighing items will be hardcoded for now.
long *weigh_item(struct AgendaItem ai);
long compare_weights(long *x, long *y);
The function will consist of populating a tree of items and then traversing the tree with some handler. collect returns the new root of the tree.
struct AgendaItemNode *collect(struct AgendaItemNode *collection, struct AgendaItem item);
void visit(struct AgendaItemNode *collection, void (*handle)(struct AgendaItem *ai));
As tracking file is one of the purposes, each source file should be passed as an argument and then it will be processed by name.
The collected collection will then be output.
For the moment I’m leaving memory cleanup to the process exit, but more explicit cleanup is likely to be added afterward.
int
(int argc, char **argv) {
mainstruct AgendaItemNode *collection = NULL;
for (argv++; *argv; argv++) {
= collect_from_filename(collection, *argv);
collection }
(collection, print_item);
visit
return EXIT_SUCCESS;
}
Collecting from files handles the management of the file resource and walks through each line of file parsing items and adding them to the collection when relevant. The root of the collection is returned.
struct AgendaItemNode
*collect_from_filename(struct AgendaItemNode *collection, char *filename) {
FILE *fp = fopen(filename, "r");
if (!fp) {
(stderr, "Error opening %s\n", filename);
fprintfreturn collection;
}
char *line = NULL;
size_t len = 0;
for(; getline(&line, &len, fp) != -1;) {
if (!has_item(line)) continue;
struct AgendaItem ai = parse_item(line, filename);
= collect(collection, ai);
collection }
(fp);
fclosereturn collection;
}
The predicate currently used is a dumb string check.
const int
(const char *line) {
has_itemreturn strstr(line, "TODO") != NULL;
}
Most of the actual parsing logic is done in the populate methods so this function creates the struct and passes it through to those.
const struct AgendaItem
(char *line, const char *location) {
parse_itemstruct AgendaItem ai = {.location = strdup(location) };
(&ai, line);
populate_title(&ai, line);
populate_attributesreturn ai;
}
Currently attributes are expected to be enclosed within curly braces and in \(name=\)value format. Currently this stays as dumb as possible so the end brace isn’t validated and there’s nothing supported in the way of escaping or quoting or spaces in values. This will be made more robust as needed (though likely replaced by something else).
void populate_attributes(struct AgendaItem *ai, char *line) {
This will start by advancing to the brace enclosed metadata block and returning if there is none.
char *metadata = strstr(line, "{");
if (!metadata) return;
To consolidate allocation the number of attributes will be counted.
With the simple parsing that is assumed to be the number of
=
s before the end of the metadata block or the end of the
line.
int equals = 0;
for (char *e=metadata; e && *e != '}'; e++)
if (*e == '=') equals++;
if ((ai->attributes = calloc(equals, sizeof(struct Attribute))) == NULL) {
(stderr, "Error allocating memory for attributes!");
fprintfreturn;
}
->attribute_count = equals; ai
The parsing of attributes (like their counting) will be driven by
=
s and will make use of two pointers: split
will track where the =
indicates the split between the name
and the value and mark
will be used to bound and collect
the values. The name will be retrieved from the text between
split
and the trailing mark
which is left at
the start of the name after which mark
will be advance past
split to the end of the value which will then be collected.
Those pointers are defined here and each previously allocated
attribute is selected for population. The initialization value of mark
is normally irrelevant but seems likely to help with the edge case of an
initial =
.
char *split = line;
char *mark = line;
for (int att=0; att < equals; att++) {
struct Attribute *attribute = &ai->attributes[att];
Names are extracted by leaving mark as a trailing pointer.
split
looks for the next =
and any time it
passes over an attribute boundary (a space) it sets mark
to
point to the character after that boundary such that when
split
lands on an =
, mark
points
to the beginning of the text after the preceeding boundary. This trusts
some of the conditions that should have been validated by the previously
run attribute counting.
for(; *split != '='; split++) {
if (*split == ' ') mark = split + 1;
}
->name = strndup(mark, split-mark); attribute
Extracting the value drops mark
after the
split
and advances to the next boundary and then copies the
intervening text.
This also blindly tries to set additional representations of the
value which currently consists of time
; any valid values
will be populated and any invalid values should just leave a zero value.
A better solution would likely be to rely on additional syntax which can
clearly indicate intent/richer types. Such an approach is likely to be
adopted as options are evaluated but this bit of DWIMmery addresses
immediate needs.
for(mark=++split; *mark != ' ' && *mark != '}'; mark++) {
->value = strndup(split, (mark-split)+1);
attributestruct tm parsed;
(&parsed, 0, sizeof(struct tm));
memsetif (strptime(attribute->value, "%Y-%m-%d", &parsed) == NULL) continue;
time_t t = mktime(&parsed);
if (t > 0) attribute->time = t;
}
}
}
Title population is currently relatively straightforward: assume the
first {
is the start of metadata and everything up to that
character is the title (or the full line with no newline if no brace is
present).
void
(struct AgendaItem* ai, const char* line) {
populate_titlechar *metadata = strchr(line, '{');
if (metadata) ai->title = strndup(line, metadata-line);
else ai->title = strndup(line, strlen(line)-1);
}
Iterate over the Attributes and return the first with the matching name (there could be multiple).
struct Attribute
*attribute_with_name(struct AgendaItem ai, char *name) {
for (int i = 0; i < ai.attribute_count; i++) {
if (!strcmp(ai.attributes[i].name, name)) return &ai.attributes[i];
}
return NULL;
}
An item will be printed with its location and title. For now some
harcoded special treatment of the scheduled
attribute will
also print that if it is present (specifically the first one found since
there’s no enforcement of unique names).
void
(struct AgendaItem *ai) {
print_item("%s: %s", ai->location, ai->title);
printf
struct Attribute *scheduled = attribute_with_name(*ai, "scheduled");
if (scheduled) printf(" [%s]", scheduled->value);
("\n");
printf}
Weight will currently be based on any priority and scheduled date. This will be done in strict order of priority first and then date.
The priority will be similar to niceness values in that negative
numbers can be used and lower the number the higher the priority.
Anything without a provided priority defaults to 0
.
The current algorithm will sort lower weights earlier and those without dates should be at the end and therefore if there is value then the weight will default to the present.
long
*weigh_item(struct AgendaItem ai) {
long *weights = calloc(2, sizeof(long));
if (weights == NULL) return NULL;
struct Attribute *priority = attribute_with_name(ai, "priority");
[0] = priority ? atoi(priority->value) : 0;
weights
struct Attribute *scheduled = attribute_with_name(ai, "scheduled");
if (scheduled && scheduled->time) weights[1] = scheduled->time;
else weights[1] = time(NULL);
return weights;
}
Multiple weights are compared to allow for levels of sorting, so this will return based on the first difference detected or indicate equality if there are no differences. Any negative value indicates x is less than y, positive indicates y is less than x, and zero indicates they are equal or unable to be compared.
long
(long *x, long *y) {
compare_weightsfor (;x && y; x++, y++) {
long diff = *x - *y;
if (diff) return diff;
}
return 0;
}
Collection involves creating an AgendaItemNode to contain the passed item and then adding that node to the existing binary tree. The tree will be sorted such that lower weights are to the left. Currently the tree is not balanced in any way so worst case performance is linear. The current expected size of data makes this a non-issue but it should be addressed later.
struct AgendaItemNode
*collect(struct AgendaItemNode *collection, struct AgendaItem ai) {
struct AgendaItemNode *node = malloc(sizeof(struct AgendaItemNode));
if (node == NULL) {
(stderr, "Allocation failure");
fprintfreturn NULL;
}
->left = node->right = NULL;
node->item = ai;
node->weights = weigh_item(ai);
node
// Become root if collection is empty.
if (!collection) return node;
for(struct AgendaItemNode *possibleParent = collection;;) {
if (compare_weights(node->weights, possibleParent->weights) < 0) {
if (!possibleParent->left) {
->left = node;
possibleParentreturn collection;
}
= possibleParent->left;
possibleParent } else {
if (!possibleParent->right) {
->right = node;
possibleParentreturn collection;
}
= possibleParent->right;
possibleParent }
}
}
Visit traverses the tree and calls the handler on each element. This may match the Visitor pattern but I lose track of the specifics of patterns and therefore want to avoid inaccurate implications.
This consists of a depth first in-order traversal. As the tree should remain fairly shallow this will just make use of recursion with the execution stack.
void
(struct AgendaItemNode *collection, void (*handle)(struct AgendaItem* ai)) {
visitif (collection->left) visit(collection->left, handle);
(&collection->item);
handleif (collection->right) visit(collection->right, handle);
}