Personal Software Practices

I work on a lot of projects. Over the years, I’ve developed some simple, straightforward conventions that apply across languages and project types. Following these conventions helps lower the “context switching” cost. Many of these practices are essentially arbitrary in their details, but I try to follow them because simply having some convention saves time and increases readability, even if the convention itself isn’t perfect.

Even projects I haven’t worked on in a while feel familiar if I know I can always open docs/Development.md to quickly see how to run the tests, or that I can easily find the tests related to the place I need to make a change by looking at the “test ID” TID:YYMMDD comments in the code.

If these conventions seem useful to you, I encourage you to adapt them to your own needs. Many of the conventions described here are adaptations of existing conventions I myself merely came across one day and thought were useful.

Source Code

Git Commits

Below are some of the commit prefix tags, inspired by Conventional Commits, that I use to give some semantic structure to my commit messages.

feat — Adds new user-facing functionality

Case: f6edaf9 feature: Support `--diff-context` option to `wolfram-cli paclet test`

Case: 5970650 feature: `#[init]` macro to automatically define a safe library initialization function (#13)

fix — Fixes a user-facing behavior bug

cleanup — Refactoring or reorganization that does not change user-facing behavior

polish — Improves the behavior of existing user-facing functionality

chore — Change not related to the main functionality of the project, typically administrative or build system related.

docs — Changes to documentation, e.g. adding new content

test — Adding or updating tests; should avoid changing library behavior

release — Actions related to performing a versioned release, e.g. updating version numbers in build files or updating the CHANGELOG.md.

fixup — Fix a minor typo-like mistake that hasn’t been part of a release yet (otherwise this would be a bugfix or featfix).
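As a rough illustration of how these prefixes could be checked mechanically, the sketch below parses a commit subject line. The `tag: summary` layout, the `parse_subject` name, and the choice to accept `feature` alongside `feat` (the spelling used in the sample commits above) are assumptions made for this example, not part of any existing tool.

```python
import re

# Prefix tags from the list above. "feature" is accepted as well because
# the sample commits spell the tag out in full (an assumption of this sketch).
COMMIT_TAGS = (
    "feat", "feature", "fix", "cleanup", "polish",
    "chore", "docs", "test", "release", "fixup",
)

# Assumed layout: lowercase tag, a colon, one space, then a non-empty summary.
SUBJECT_RE = re.compile(r"^(?P<tag>[a-z]+): (?P<summary>\S.*)$")


def parse_subject(subject):
    """Return (tag, summary) if the subject starts with a known tag, else None."""
    match = SUBJECT_RE.match(subject)
    if match is None or match.group("tag") not in COMMIT_TAGS:
        return None
    return match.group("tag"), match.group("summary")
```

A check like this could run from a commit hook, but whether that is worth the setup depends on the project; the tags are mainly there for human readers.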

Managing History

My approach to sizing git commits is based on two goals:

At every commit, a repository should compile, build, and run tests successfully.

Each commit should contain the minimum number of lines that still form a self-contained change.

One advantage of this approach is that it nudges the history toward an easy-to-understand total ordering of changes.

Making small commits does not mean committing as quickly as possible. It is not unheard of for me to have large diffs of uncommitted changes that number in the hundreds of lines. However, that most often happens because I’m exploring possible approaches or tradeoffs to implementing the change.

Often, the exploratory process reveals possible implementation choices that seem plausible at first, but are not ideal for subtle reasons. So, when I’ve built up a large amount of uncommitted changes and feel confident in the direction, I look for some subset of those lines that don’t depend on the rest, and I commit those changes. And I repeat that process until there are no more uncommitted changes.

This ensures that even though the exploration process may have been messy, it’s easy for a reader to later understand exactly what changed, and why.

Test ID Tags

Often when writing or reading code, it’s useful to know both whether it’s tested at all, and if so, where those tests are so you can review them. It’s useful to be able to check your understanding of the logic against a concrete example that exercises it, and for lots of code, particularly gnarly internal logic, tests are the best examples available.

Test ID Tags (TIDs) address that discovery problem by being a simple, lightweight way to provide a searchable string that connects logic and associated tests. By placing a TID both in your library code and in the corresponding test, a simple code search or grep for the TID will find the corresponding test(s).

TIDs have the format: TID:YYMMDD/N: description, where:

TID — fixed string that indicates this is a test identifier

YY — last two digits of the current year

MM, DD — two digit number of the current month and day

N — incrementing counter for pseudo uniqueness

description — optional (but recommended) one-line description of the behavior being tested

Every TID must appear in at least two places (within the code and within at least one test). It is recommended but not required that the description line at both locations be kept in sync.
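The format and the two-places rule above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the description is deliberately not parsed (it runs up to a comment terminator that differs between languages), and "appears in at least two files" is used as a simple stand-in for "appears in the code and in at least one test".

```python
import re
from collections import defaultdict
from datetime import date, datetime

# Matches the "TID:YYMMDD/N" part of a tag; the optional description is ignored.
TID_RE = re.compile(r"TID:(?P<date>\d{6})/(?P<n>\d+)")


def parse_tid(tid):
    """Split a TID such as "TID:240721/5" into its date and counter."""
    match = TID_RE.fullmatch(tid)
    if match is None:
        raise ValueError(f"not a TID: {tid!r}")
    day = datetime.strptime(match.group("date"), "%y%m%d").date()
    return day, int(match.group("n"))


def unpaired_tids(files):
    """Given a {path: text} mapping, return TIDs found in fewer than two files."""
    locations = defaultdict(set)
    for path, text in files.items():
        for match in TID_RE.finditer(text):
            locations[match.group(0)].add(path)
    return sorted(tid for tid, paths in locations.items() if len(paths) < 2)
```

A TID reported by `unpaired_tids` is either missing its test or missing its anchor in the library code, which is usually worth a quick look.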

As an example, here’s a typical use of a TID in Diagrams, where some code that raises an error links to the test that exercises that condition:


Code from BlockStack.wl:

	If[KeyExistsQ[$regions, id],
		(* TID:240721/5: Duplicate IDs in BlockStackDiagram *)
		Raise[
			DiagramError,
			"Region value already defined for ID: ``, value: ``",
			InputForm[id],
			InputForm[value]
		];
	];

Code from TestBlockStackDiagram.wlt:

(* TID:240721/5: Duplicate IDs in BlockStackDiagram *)
VerificationTest[
	BlockStackDiagram[{
		{1, {
			DiaID["A"] @ "A",
			DiaID["A"] @ "B"
		}}
	}, "Regions"]
	,
	Failure[DiagramError, <|
		"CausedBy" -> Failure[DiagramError, <|
			"MessageTemplate" -> "Region value already defined for ID: ``, value: ``",
			"MessageParameters" -> {
				InputForm[DiaID["A"]],
				InputForm[Rectangle[{1, 0}, {2, 1}]]
			}
		|>],
		"MessageTemplate" -> "Error creating BlockStackDiagram",
		"MessageParameters" -> {}
	|>]
]

You can use a grep tool (here ripgrep) to search for TIDs with context to see where they appear.

$ rg TID:240721/5 -A 5
Diagrams/Tests/TestBlockStackDiagram.wlt
22:(* TID:240721/5: Duplicate IDs in BlockStackDiagram *)
23-VerificationTest[
24-	BlockStackDiagram[{
25-		{1, {
26-			DiaID["A"] @ "A",
27-			DiaID["A"] @ "B"

Diagrams/paclets/Diagrams/Package/Kinds/BlockStack.wl
47:		(* TID:240721/5: Duplicate IDs in BlockStackDiagram *)
48-		Raise[
49-			DiagramError,
50-			"Region value already defined for ID: ``, value: ``",
51-			InputForm[id],
52-			InputForm[value]

The TID format is meant to be simple to write and search for. The included date can help give a sense of the age of the code, and acts as an easy-to-remember bit of entropy that avoids collisions between TIDs. However, collisions are still possible, particularly in multi-developer projects.

Additionally, TIDs are not globally unique. They’re optimized for being easy and quick to write (so that developers actually write them!). TIDs are primarily intended for humans, not computer processing.

When adding a TID, be sure to:

Choose a TID value based on the current date and an N value starting at 1, and check that the combination is unique by first doing a search for existing uses.

Include the TID in a comment just above the test itself.

Include the TID in a comment above the function or statement in your main program logic that is intended to be tested.
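The first step of that checklist, picking an unused value for today, could be automated roughly as follows. The `next_tid` helper is hypothetical, and it assumes the caller has already gathered the text to search (for example, by concatenating the project's source and test files).

```python
import re
from datetime import date


def next_tid(existing_text, today):
    """Return the first unused TID for `today`, with N starting at 1."""
    prefix = today.strftime("%y%m%d")
    # Collect every counter already used with today's date.
    used = {int(n) for n in re.findall(rf"TID:{prefix}/(\d+)", existing_text)}
    return f"TID:{prefix}/{max(used, default=0) + 1}"
```

Taking the highest existing counter plus one keeps N incrementing within a day, though a manual search for today's date works just as well for occasional use.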

A TID on a piece of code should indicate which test ought to fail if some aspect of that line of code is changed. This increases confidence when changing subtle code: just look up the corresponding tests by TID and see what input they pass.

Even when changing code doesn’t cause any related tests to fail (perhaps a sign of missing test coverage), finding the associated tests via TIDs still helps the reader. The tests provide a concrete example of expected inputs, which clarifies the purpose and behavior of the code as the programmer “simulates” its execution in their head.

Projects I’ve used this convention in include:

WolframResearch/codeparser

ConnorGray/Diagrams

ConnorGray/Markdown

Documentation

Standard Layout

docs/Development.md — shows concrete examples of common commands to develop the project, including building artifacts and running tests

docs/Maintenance.md — describes steps that need to be taken to maintain the project, e.g. places in the code or documentation to update when making a new release.

docs/CHANGELOG.md — standard user-visible description of behavior changes; I follow the Keep a Changelog convention.

docs/CommandLineHelp.md

Generated by my clap-markdown crate

Case: wolfram-app-discovery – CommandLineHelp.md

Case: wolfram-cli – CommandLineHelp.md

docs/FeatureOverview.md
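A quick way to see how closely a project follows this layout is to list which of the files are absent. The sketch below is an assumption-laden illustration: `missing_docs` is a hypothetical helper, and not every project needs every file (a library without a command-line interface has no use for docs/CommandLineHelp.md).

```python
from pathlib import Path

# File names taken from the layout described above.
STANDARD_DOCS = (
    "docs/Development.md",
    "docs/Maintenance.md",
    "docs/CHANGELOG.md",
    "docs/CommandLineHelp.md",
    "docs/FeatureOverview.md",
)


def missing_docs(repo_root):
    """Return the standard documentation files not present under repo_root."""
    root = Path(repo_root)
    return [name for name in STANDARD_DOCS if not (root / name).is_file()]
```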

Feature Overview Documents

A ‘feature overview’ document is a type of documentation that attempts to be a non-narrative enumeration of all of the functionality of a project.

I find these documents useful because they are a low-overhead, unstructured way to quickly make note of new features I’m adding to a project undergoing rapid development, without getting bogged down in the more careful thinking that is necessary to write long-form narrative/tutorial type documentation. Getting something on the page is better than waiting to write something ‘perfect’.

Having a document that describes all the facets of a project early on helps me by making it clear what features exist and what level of polish they’re at, and is intended to help potential early adopters by giving a “flavour” of what is possible with the project.

Some examples of where I’ve applied this convention include:

NotebookWebsiteTools 'Feature Overview' example


md2nb 'everything but the kitchen sink' example