Posts tagged with "BlogMore"

It was such a simple request

5 min read

As mentioned a couple of times in the last couple of days, aside from one particular issue I found and fixed, I'm in more of a "let's review some of the code and tidy things up" phase with the codebase. This process is at times me hand-making changes, and also in part me directing the agent to make a very specific improvement that I want.

Yesterday evening I did a little experiment of getting Gemini CLI to look for code that really needed some cleaning up, and then I had it write the issue text which I fed directly to Copilot/Claude and had it do the work. Finally, when that was done, I had Gemini review the work that Copilot had done (it was "happy" with the changes).

So, this morning, I thought I'd tackle another little thing I'd noticed in the code that rubbed me up the wrong way. Early on in the development lifecycle of BlogMore I added the optional minification of CSS and JS files (HTML too eventually, but that's not involved here). Because it's often been a convention I also prompted Copilot to ensure that if a file called whatever.css was minified, it be called whatever.min.css.

The resulting code did something that made sense, but which I wouldn't ever have done. The constants that held the filenames looked like this:

CSS_FILENAME = "style.css"
CSS_MINIFIED_FILENAME = "styles.min.css"
SEARCH_CSS_FILENAME = "search.css"
SEARCH_CSS_MINIFIED_FILENAME = "search.min.css"
STATS_CSS_FILENAME = "stats.css"
STATS_CSS_MINIFIED_FILENAME = "stats.min.css"
ARCHIVE_CSS_FILENAME = "archive.css"
ARCHIVE_CSS_MINIFIED_FILENAME = "archive.min.css"
CALENDAR_CSS_FILENAME = "calendar.css"
CALENDAR_CSS_MINIFIED_FILENAME = "calendar.min.css"
GRAPH_CSS_FILENAME = "graph.css"
GRAPH_CSS_MINIFIED_FILENAME = "graph.min.css"
TAG_CLOUD_CSS_FILENAME = "tag-cloud.css"
TAG_CLOUD_CSS_MINIFIED_FILENAME = "tag-cloud.min.css"
GRAPH_JS_FILENAME = "graph.js"
GRAPH_JS_MINIFIED_FILENAME = "graph.min.js"
CODE_CSS_FILENAME = "code.css"
CODE_CSS_MINIFIED_FILENAME = "code.min.css"
THEME_JS_FILENAME = "theme.js"
THEME_JS_MINIFIED_FILENAME = "theme.min.js"
SEARCH_JS_FILENAME = "search.js"
SEARCH_JS_MINIFIED_FILENAME = "search.min.js"
CODEBLOCKS_JS_FILENAME = "codeblocks.js"
CODEBLOCKS_JS_MINIFIED_FILENAME = "codeblocks.min.js"

Like... sure, 10/10 for not hard-coding these all throughout the codebase as magic strings[1], but this feels a little redundant. Personally I think I'd have just mentioned the non-minified name and then I'd have a function that generates the minified name from it. While it would technically add the smallest amount of runtime overhead to the code, I think the single-source-of-truth pay-off is worth it.

For a good while though I left this alone. I was having fun playing with other things in the application, and adding all sorts of other amusing toys. But now that I'm more into a "how can this code be improved and what issues does the code have" mode, it felt like time to tackle this.

Given that a change here would touch so much of the code, and given I wasn't massively keen on spending ages walking through all the code and making the changes related to this, I decided to prompt Copilot to get on with this. It felt like something it couldn't get that wrong.

While it didn't get it wrong, as such, it made some questionable choices along the way. It did do the main thing I would have done: make a function to turn a filename into a minified filename. The initial version looked like this:

def minified_filename(source: str) -> str:
    """Compute the minified output filename for a given source filename.

    Transforms the file extension: ``.css`` becomes ``.min.css`` and
    ``.js`` becomes ``.min.js``.  For example, ``theme.js`` becomes
    ``theme.min.js`` and ``style.css`` becomes ``style.min.css``.

    Args:
        source: Source filename ending in ``.css`` or ``.js``.

    Returns:
        The corresponding minified filename.

    Raises:
        ValueError: If *source* does not end with ``.css`` or ``.js``.
    """
    if source.endswith(".css"):
        return source[: -len(".css")] + ".min.css"
    if source.endswith(".js"):
        return source[: -len(".js")] + ".min.js"
    raise ValueError(f"Unsupported file extension for minification: {source!r}")

That string-slicing with len and so on is nails on a chalkboard to me. When something like removesuffix exists, why on earth would "you" elect to do this? Of course the answer is obvious, but still... ugh.

Now, I will have to give credit to the process though. So the above was the initial version of the code. Once the PR had been created by Copilot, and I'd pulled it down for review and testing, it kicked off a review of its own. Reviewing its own code, it pushed back on itself:

In src/blogmore/generator.py, lines 90-93: The slice syntax source[: -len(".css")] is less readable than using source.removesuffix(".css"), which is available in Python 3.9+. Since this codebase targets Python 3.12+, consider using removesuffix() for clarity.

It then went on to do a further commit to tidy this up. I approve. Bonus point to Copilot here.

So now we have this:

def minified_filename(source: str) -> str:
    """Compute the minified output filename for a given source filename.

    Transforms the file extension: ``.css`` becomes ``.min.css`` and
    ``.js`` becomes ``.min.js``.  For example, ``theme.js`` becomes
    ``theme.min.js`` and ``style.css`` becomes ``style.min.css``.

    Args:
        source: Source filename ending in ``.css`` or ``.js``.

    Returns:
        The corresponding minified filename.

    Raises:
        ValueError: If *source* does not end with ``.css`` or ``.js``.
    """
    if source.endswith(".css"):
        return source.removesuffix(".css") + ".min.css"
    if source.endswith(".js"):
        return source.removesuffix(".js") + ".min.js"
    raise ValueError(f"Unsupported file extension for minification: {source!r}")

At this point the code is less worse. I don't think it's great, but it's less worse. Honestly, I think I'd be more inclined to do something with PurePath.suffixes and PurePath.suffix, leaning into the fact that we're dealing with filenames here, and so making it less about pure string slicing.

I also have other issues with the code, which I might still fix by hand:

  • The fact that it makes a point of only handling .css and .js files, and throws an error otherwise, is an odd choice. I mean, in context, that's what it's here to serve, but it seems oddly-specific and an attention to detail that wasn't really necessary.
  • The hard-coding of .min a couple of times grates a little.
  • The hard-coding of both .css and .js a couple of times, with the doubled-up if feels unnecessary.

It's a small function. It works in context. It does the job. But it also could be more elegant in the way it does it.
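
As a rough sketch of the PurePath idea mentioned above (this is not code from BlogMore; the MINIFIED_MARKER name is made up for illustration), the function could lean on the path's own suffix, which removes the doubled-up if, mentions .min only once, and drops the .css/.js special-casing:

```python
from pathlib import PurePath

# Hypothetical constant: the one place the ".min" marker is spelled out.
MINIFIED_MARKER = ".min"


def minified_filename(source: str) -> str:
    """Compute the minified output filename for a given source filename.

    Inserts `.min` before the final extension, so `theme.js` becomes
    `theme.min.js` and `style.css` becomes `style.min.css`.

    Args:
        source: The source filename.

    Returns:
        The corresponding minified filename.
    """
    path = PurePath(source)
    # with_suffix replaces only the last suffix; here it is replaced with
    # the marker followed by that same suffix.
    return str(path.with_suffix(f"{MINIFIED_MARKER}{path.suffix}"))


print(minified_filename("tag-cloud.css"))  # tag-cloud.min.css
```

This is only one way to do it, and it makes different trade-offs: a name with no extension simply gains .min, and a name that is already minified gains a second .min, so whether either of those should be an error is still a decision for the caller or for the tests.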

I'd also like to go on a small aside for a moment, because there's something else in the above that bothers me: yesterday evening I spent some time directing Copilot to tidy up all the docstrings in the code. While any agent I've thrown at it does seem to have taken note of the AGENTS.md file, and the instructions on how to write the docstrings (Google style please), it seems to have decided it was aiming more at Sphinx when it came to the content. That's fine, I hadn't been explicit.

So last night I made it clear that I wanted something more like I use in all my Python code, that aims to work with mkdocstrings. It should use the inline code and cross-reference styles that are more common when using that tool. I even made a point of telling Copilot to update AGENTS.md to make it clear that this is the preference:

- All inline code and cross-references in docstrings **must** use mkdocstrings-compatible Markdown style:
    - Inline code: use single backticks (\`like_this\`).
    - Cross-references: use mkdocstrings reference-style Markdown links (e.g., [`ClassName`][module.ClassName] or [module.ClassName][]).
    - Do **not** use Sphinx roles (e.g., :class:`ClassName`) or double-backtick code (``ClassName``).

Now go back and look at the docstring for minified_filename. So much for agents making a point of following the instructions from AGENTS.md.

Anyway, back to the main flow here: given that I was thinking that I might rewrite minified_filename by hand so that it works "just so", I made a point of checking that it had written tests for this; something I couldn't take for granted.

Again, to the credit of the agent, it had written some tests:

class TestMinifiedFilename:
    """Test the minified_filename utility function."""

    def test_css_extension_becomes_min_css(self) -> None:
        """Test that a .css extension is replaced with .min.css."""
        assert minified_filename("style.css") == "style.min.css"

    def test_js_extension_becomes_min_js(self) -> None:
        """Test that a .js extension is replaced with .min.js."""
        assert minified_filename("theme.js") == "theme.min.js"

    def test_hyphenated_css_filename(self) -> None:
        """Test that a hyphenated CSS filename is handled correctly."""
        assert minified_filename("tag-cloud.css") == "tag-cloud.min.css"

    def test_hyphenated_js_filename(self) -> None:
        """Test that a hyphenated JS filename is handled correctly."""
        assert minified_filename("search.js") == "search.min.js"

    def test_unsupported_extension_raises(self) -> None:
        """Test that an unsupported extension raises ValueError."""
        with pytest.raises(ValueError, match="Unsupported file extension"):
            minified_filename("style.txt")

It's a start, but I think it could be done better. There's the test of the intended outcomes, and the test of the ValueError for passing something that isn't a .js or a .css file. Meanwhile, that business of testing "hyphenated" seems oddly specific for no good reason. But it's even worse: the test for a "hyphenated" JS file doesn't use a hyphenated file name.

Hilarious.

That's not all. What about the more obvious things like testing what happens if you pass a filename that has no extension, or a filename that already has two extensions, or a filename that already ends in .min.js, or a filename that has .min.css somewhere in its path that isn't at the end of the name, or an empty string, or...

Also why aren't most of these tests done using pytest.mark.parametrize?
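
For illustration, here is a minimal sketch of how those tests might look with pytest.mark.parametrize. The function body is a stand-in copied from the removesuffix version earlier in this post so the example runs on its own, and the extra cases are picked from the list of gaps above; the expected values simply pin down what that version currently does, not necessarily what it should do.

```python
import pytest


# Stand-in copy of the function under test, so the sketch is self-contained.
def minified_filename(source: str) -> str:
    if source.endswith(".css"):
        return source.removesuffix(".css") + ".min.css"
    if source.endswith(".js"):
        return source.removesuffix(".js") + ".min.js"
    raise ValueError(f"Unsupported file extension for minification: {source!r}")


@pytest.mark.parametrize(
    "source, expected",
    [
        ("style.css", "style.min.css"),
        ("theme.js", "theme.min.js"),
        ("tag-cloud.css", "tag-cloud.min.css"),
        ("tag-cloud.js", "tag-cloud.min.js"),  # actually hyphenated this time
        ("archive.2024.css", "archive.2024.min.css"),  # two extensions
        ("theme.min.js", "theme.min.min.js"),  # already minified
    ],
)
def test_minified_filename(source: str, expected: str) -> None:
    """Each supported source name maps to the expected minified name."""
    assert minified_filename(source) == expected


@pytest.mark.parametrize("source", ["", "README", "style.txt", "style.min.css.bak"])
def test_unsupported_filename_raises(source: str) -> None:
    """Anything not ending in .css or .js is rejected."""
    with pytest.raises(ValueError, match="Unsupported file extension"):
        minified_filename(source)
```

Laid out like this, the already-minified case stands out straight away: the current function happily produces theme.min.min.js, which is exactly the sort of thing a table of cases makes visible.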

As I said a few days ago: the code is mostly fine. It gets the job done. I've seen worse. I've reviewed worse. I've inherited worse. I think the thing that concerns me the most is that there has to be a lot of code like this being uncritically accepted after generation[2], which in turn is surely going to be feeding back into future training. So while I can't deny that something has improved in the last six or so months, when it comes to agent-generated code, might it be that we are at peak quality right now? Might it be that from this point on we start to decline as "eh, it's... fine" code starts to overwhelm the most popular forge we have?

[Image: This is fine]

I suppose the main benefit still is that this approach is nice and cheap. Right?


  1. Actually, I think it did hard-code the filenames throughout the codebase, initially, until I asked it not to. Perhaps I'm misremembering, but agents do seem to love magic strings and numbers for some reason (I think we know the reason). 

  2. As I have been doing with BlogMore, on purpose. 

At least there are tests

3 min read

In a post yesterday I finished off by saying:

At least I have, as of the time of writing, 1,380 tests to check that I've not broken anything when I do hand-clean the code. But, hmm, there's a question: can I actually trust those tests? It's not like I wrote them.

This was, of course, slightly tongue-in-cheek, because I did anticipate that the coverage might not be as useful as you'd hope an agent would deliver, and especially not at the level you'd personally aim for. On the other hand, I did expect it to have covered some of the fundamentals.

Being serious about wanting to hand-tidy some of the code as a way to start to get myself into the codebase[1], I set out to look at validate_path_template in content_path.py. My plan for how to tidy the code overlapped with how both Claude and Gemini had approached it, but with a slightly different take. Nothing too radical, with the main difference being that I didn't want a baked-in default for which variables were required (to recap: both the agents saw the need to make this configurable rather than hard-coded into the body of the function, but both still kept a "backward-compatible" default that had a "mixing of concerns" code smell about it).

A function such as validate_path_template, which has a core use, is intended to be of fairly general utility, has a very obvious set of outputs given certain inputs, and has zero side effects and no dependencies, seems like a really obvious candidate for a good set of unit tests. This in turn should have meant that I could modify the code with confidence, and experiment with confidence, knowing that said tests would let me know when I've screwed up.

I went looking for those tests so I could run them and them alone as I did this work.

Keep in mind, at this point, there are 1,380 tests that Copilot/Claude has written. That's a lot of tests. Of course there will be some direct tests of validate_path_template!

Spoiler: there weren't. No tests. At all. 1,380 tests inside the tests/ directory and not one that directly tested this utility function.

Now, sure, the function did have coverage. Before making any changes, the codebase itself had 94% coverage and content_path.py itself had 93% coverage. In fact, the only thing that wasn't covered was the code that raised an exception if a template looked broken.

[Image: Coverage in main]

This, for me anyway, is a good example of how and where coverage doesn't help me. Sure, other code that is being tested is calling this and if I change this code in ways that break that other code, I'll (probably) get to know about it. But if I want to properly understand the code (remember, I didn't write it, this is like getting to know someone else's[2] code) it's really helpful to see a set of dedicated tests for that specific function.

There were none.

For a moment, I'm going to give Copilot/Claude an out. When I started BlogMore, right at the very start, just as I was messing about to see what would happen, I gave no thought to tests. It was only after a short while that I a) asked it to create a set of tests for the current behaviour and b) made it clear that all new code had to have tests. It is possible, just possible, that the content of content_path.py fell through that crack. I don't know for sure without going back and looking through the PR history. I'm not that curious right now.

What is interesting though is that, in setting both Copilot/Claude and Gemini on the same problem with the same prompt, and having them both identify the same area for improvement, neither seemed to arrive at the conclusion that adding dedicated tests was something worth doing.

So the point here -- which isn't a revelation at all, but I think has been nicely illustrated by what I've seen happen -- is that an agent might indeed create a lot of tests, and perhaps even achieve pretty good coverage too, but it's no guarantee that they're going to be useful tests when you want to get your hands dirty in the codebase.

Turns out that some of those tests might still need writing by hand, like I did for this tidy-up of content_path.py. Well, I say, "by hand", I did take this as an opportunity to test being pretty lazy about typing out the tests I wanted.

PS: While looking through the tests and tidying some code related to the above, I came across this:

from blogmore.pagination_path import (
    DEFAULT_PAGE_1_PATH,
    DEFAULT_PAGE_N_PATH,
    # ...other imports removed for brevity...
)

class TestDefaults:
    """Tests for the default constant values."""

    def test_default_page_1_path(self) -> None:
        """The default page_1_path should be 'index.html'."""
        assert DEFAULT_PAGE_1_PATH == "index.html"

    def test_default_page_n_path(self) -> None:
        """The default page_n_path should be 'page/{page}.html'."""
        assert DEFAULT_PAGE_N_PATH == "page/{page}.html"

Brilliant. I guess line goes up has come to agent-written tests. But look! 1,380 tests guys!


  1. Remember: up until this point this has mostly been an experiment in uncritically letting Copilot do its thing. 

  2. Arguably this is someone else's code, with extra steps. 

A different approach

4 min read

As mentioned in the previous post, I've been having a play around with Copilot/Claude vs Gemini when it comes to getting the agents to seek out "bad" code and improve it. In that first post on the subject, I highlighted how both tools noticed some real duplication of effort, both addressed it in more or less the same way, and neither of them took the clean-up to its logical conclusion (or, at the very least, neither cleaned it up in a way that I feel is acceptable).

The comparison of the two PRs (Gemini vs Claude via Copilot) is going to be a slow and occasional read, and if I notice something that catches my interest, I'll note it on this blog.

Initially, I was looking at which files were touched by both. With Gemini it was:

And with Copilot/Claude:

On the surface, it looks like Claude might have done a better job of finding untidy issues in the code. Of course a proper read/assessment of the outcome is needed to decide which is "better"; not to mention the application of a lot of personal taste.

So, with the initial/surface impression that "Claude went deeper", I took a look at the first file they had in common: content_path.py. This is documented as a module related to:

Shared path-resolution utilities for content output paths.

This module provides the generic building blocks used by page_path and post_path. Each content type supplies its own allowed-variable set and variable dict; this module handles the common validation, substitution, and safety checks.

There are three functions in there:

  • validate_path_template -- for validating a format string used in building a path.
  • resolve_path -- given a template and some values to populate variables in the template, create a path.
  • safe_output_path -- helper function for joining paths and ensuring they don't escape the output directory.

These seem like sensible functions to have in here, and I can imagine me writing a similar set in terms of the problem they seek to solve.

Both agents seemed to agree on what needed some work: validate_path_template. Both also seem to agree that building knowledge of which variable is required into the function itself isn't terribly flexible; I feel this is a reasonable review of the situation. However, the two agents seem to disagree on how this should be resolved.

Claude's take on this is that the function should grow an optional keyword argument called required_variable, which defaults to slug. It also adds an assert to test if the required variable exists in the allowed_variables (okay, I could quibble about this but given this is a code-check rather than a user-input check, eh, I can go with it). Finally it does the check using the new variable and also makes the error reporting a touch more generic too.

--- /Users/davep/content_path.py        2026-04-30 13:20:00.737955197 +0100
+++ src/blogmore/content_path.py        2026-04-30 13:20:04.560178727 +0100
@@ -17,13 +17,15 @@
     template: str,
     config_key: str,
     allowed_variables: frozenset[str],
-    item_name: str,
+    item_name: str = "",
+    *,
+    required_variable: str | None = "slug",
 ) -> None:
     """Validate a path format string for a content type.

     Checks that *template* is non-empty, well-formed, references only
-    variables from *allowed_variables*, and includes the mandatory
-    ``{slug}`` placeholder.
+    variables from *allowed_variables*, and (when *required_variable* is
+    not ``None``) includes the mandatory placeholder.

     Args:
         template: The path format string to validate.
@@ -33,11 +35,19 @@
             template.
         item_name: The human-readable name of the content type used in
             the uniqueness error message (e.g. ``"page"`` or ``"post"``).
+            Ignored when *required_variable* is ``None``.
+        required_variable: The variable name that must appear in the
+            template, or ``None`` if no variable is mandatory.  Defaults
+            to ``"slug"`` for backward compatibility.

     Raises:
         ValueError: If the template is empty, malformed, references an
-            unknown variable, or omits the ``{slug}`` placeholder.
+            unknown variable, or omits the required placeholder.
     """
+    assert required_variable is None or required_variable in allowed_variables, (
+        f"required_variable {required_variable!r} is not in allowed_variables"
+    )
+
     if not template:
         raise ValueError(f"{config_key} must not be empty")

@@ -61,9 +71,9 @@
             + f". Allowed variables are: {', '.join(sorted(allowed_variables))}"
         )

-    if "slug" not in field_names:
+    if required_variable is not None and required_variable not in field_names:
         raise ValueError(
-            f"{config_key} '{template}' must contain the {{slug}} variable so that "
+            f"{config_key} '{template}' must contain the {{{required_variable}}} variable so that "
             f"each {item_name} can be uniquely identified"
         )

Gemini, on the other hand, has a very similar idea but allows for the fact that the caller might want to specify multiple required variables. So in this case it adds required_variables (as a positional/keyword argument rather than a pure-keyword argument) and defaults it to a frozenset that contains "slug". The rest of the change is also about making the test for the required variables, and the reporting of the error, generic. It doesn't do anything about checking that the required variables are within the allowed variables.

--- /Users/davep/content_path.py        2026-04-30 13:20:00.737955197 +0100
+++ src/blogmore/content_path.py        2026-04-30 14:47:41.607748447 +0100
@@ -18,12 +18,13 @@
     config_key: str,
     allowed_variables: frozenset[str],
     item_name: str,
+    required_variables: frozenset[str] = frozenset({"slug"}),
 ) -> None:
     """Validate a path format string for a content type.

     Checks that *template* is non-empty, well-formed, references only
-    variables from *allowed_variables*, and includes the mandatory
-    ``{slug}`` placeholder.
+    variables from *allowed_variables*, and includes the
+    *required_variables*.

     Args:
         template: The path format string to validate.
@@ -33,10 +34,13 @@
             template.
         item_name: The human-readable name of the content type used in
             the uniqueness error message (e.g. ``"page"`` or ``"post"``).
+        required_variables: The set of variable names that MUST appear
+            in the template to ensure uniqueness. Defaults to
+            ``{"slug"}``.

     Raises:
         ValueError: If the template is empty, malformed, references an
-            unknown variable, or omits the ``{slug}`` placeholder.
+            unknown variable, or omits a required variable.
     """
     if not template:
         raise ValueError(f"{config_key} must not be empty")
@@ -61,10 +65,12 @@
             + f". Allowed variables are: {', '.join(sorted(allowed_variables))}"
         )

-    if "slug" not in field_names:
+    missing = required_variables - set(field_names)
+    if missing:
         raise ValueError(
-            f"{config_key} '{template}' must contain the {{slug}} variable so that "
-            f"each {item_name} can be uniquely identified"
+            f"{config_key} '{template}' must contain the "
+            + ", ".join(f"{{{v}}}" for v in sorted(missing))
+            + f" variable(s) so that each {item_name} can be uniquely identified"
         )

For the most part I think I prefer what Gemini is trying to do, although Claude's sanity check that the required variable is one of the possible variables makes sense. I kind of feel like both of them missed the point when it came to handling the fact that "slug" is required: given that validate_path_template is otherwise built to be pretty generic, I think I would have defaulted to nothing and simply left it up to the caller to be explicit that "slug" is required, because that matters in the context of the caller. This feels like a pretty obvious case of a "business logic" vs "generic utility code" separation of concerns scenario.
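
To make that "default to nothing" idea concrete, here is a minimal sketch. It is not the code from either PR or from BlogMore: the use of string.Formatter to find the field names, the wording of the error messages, the dropped item_name parameter and the POST_VARIABLES set are all assumptions made so that the example runs on its own.

```python
from string import Formatter


def validate_path_template(
    template: str,
    config_key: str,
    allowed_variables: frozenset[str],
    required_variables: frozenset[str] = frozenset(),
) -> None:
    """Validate a path format string (illustrative sketch only)."""
    # A code-level sanity check, in the spirit of the assert in Claude's diff.
    assert required_variables <= allowed_variables

    if not template:
        raise ValueError(f"{config_key} must not be empty")

    # Assumption: field names are extracted with string.Formatter; the
    # parsing code in the real module is not shown in the diffs above.
    try:
        field_names = {
            name for _, name, _, _ in Formatter().parse(template) if name is not None
        }
    except ValueError as error:
        raise ValueError(f"{config_key} '{template}' is malformed: {error}") from error

    if unknown := field_names - allowed_variables:
        raise ValueError(
            f"{config_key} '{template}' uses unknown variable(s): "
            + ", ".join(sorted(unknown))
        )

    if missing := required_variables - field_names:
        raise ValueError(
            f"{config_key} '{template}' must contain the "
            + ", ".join(f"{{{name}}}" for name in sorted(missing))
            + " variable(s)"
        )


# The caller, not the utility, owns the "slug is required" business rule.
POST_VARIABLES = frozenset({"year", "month", "slug"})  # hypothetical set
validate_path_template(
    "{year}/{month}/{slug}.html", "post_path", POST_VARIABLES, frozenset({"slug"})
)
```

With no default baked in, a call that forgets to pass the required set validates less than it should, so this design does lean on each caller being explicit about its own rules.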

As mentioned in passing in another post, it's interesting to see that neither of them noticed the opportunity to turn this:

unknown = set(field_names) - allowed_variables
if unknown:
    ...

into this:

if unknown := (set(field_names) - allowed_variables):
    ...

I know at least one person who would be happy about this fact.

So where does this leave me? At the moment I'm not inclined to merge either PR, but that's mainly because I want to carry on reading them and perhaps writing some more notes about what I encounter. What this does illustrate for me is something we know well enough anyway, but which I wanted to experiment with and see for myself: the initial implementation of any working code written by an agent seems optimised for that particular function or method, perhaps class if you're lucky. It will happily repeat the same code to solve similar problems, or perhaps even use very different approaches to solve the same problem. What it won't do well is recognise that this problem is solved elsewhere and so either use that other code by calling it, or perhaps modify it slightly to make it more generic and more applicable in more situations.

On the other hand, it has shown that with a bit of prompting (and keep in mind that the prompt that arrived at this comparison was really quite vague) it is possible to get an agent to "consider" the problem of duplication and boilerplate and to try and address that.

Having seen the two solutions on offer here, it's hard not to conclude that the best solution would be for me to take the PRs as flags marking places in the code that could be cleaned up, and do the tidy myself.

At least I have, as of the time of writing, 1,380 tests to check that I've not broken anything when I do hand-clean the code. But, hmm, there's a question: can I actually trust those tests? It's not like I wrote them.

Guess that's a whole other thing to worry about at some point...

Duplication of effort

3 min read

While I don't, for a moment, think that the work on BlogMore is complete, I think it's fair to say that the rate of new feature additions has slowed down. Which is fine, there's only so much I need from a self-designed/directed static site generator; at a certain point there's a danger of adding features for the sake of it.

Around this point I think I want to start to pay proper attention to the code quality and maintainability of the ongoing experiment.

As I mentioned the other day, while working through this, I had noticed plenty of bad habits that Copilot (and in this case pretty much always Claude Sonnet 4.6) has. All were very human (obviously), but also the sort of thing you'd expect a human developer to educate themselves out of.

Yesterday evening, out of idle curiosity, I installed Gemini CLI because I wanted to see what would happen if I pointed it at the v2.18.0 codebase and asked it to look for things to clean up, and then what would happen if I did the same with Copilot CLI.

I've saved the results as a PR for what Gemini came up with and what Copilot came up with[1]. I've not given them a proper read over yet, but while having a quick glance at them something leapt out at me: in the code before the request, there was this in utils.py:

def count_words(content: str) -> int:
    """Count the number of words in the given content.

    Strips common Markdown and HTML formatting before counting so that only
    prose words are included.  The same normalisation rules as
    :func:`calculate_reading_time` are applied.

    Args:
        content: The text content to analyse (may include Markdown/HTML).

    Returns:
        The number of words in the content.

    Examples:
        >>> count_words("Hello world")
        2
        >>> count_words("word " * 10)
        10
    """
    # Remove code blocks
    content = re.sub(r"```[\s\S]*?```", "", content)
    content = re.sub(r"`[^`]+`", "", content)

    # Remove markdown links but keep the text: [text](url) -> text
    content = re.sub(r"\[([^\]]+)\]\([^\)]+\)", r"\1", content)

    # Remove markdown images: ![alt](url) -> ""
    content = re.sub(r"!\[([^\]]*)\]\([^\)]+\)", "", content)

    # Remove HTML tags
    content = re.sub(r"<[^>]+>", "", content)

    # Remove markdown formatting characters
    content = re.sub(r"[*_~`#-]", " ", content)

    return len([word for word in content.split() if word])


def calculate_reading_time(content: str, words_per_minute: int = 200) -> int:
    """Calculate the estimated reading time for content in whole minutes.

    Uses the standard reading speed of 200 words per minute. Strips markdown
    formatting and counts only actual words to provide an accurate estimate.

    Args:
        content: The text content to analyze (can include markdown)
        words_per_minute: Average reading speed (default: 200 WPM)

    Returns:
        Estimated reading time in whole minutes (minimum 1 minute)

    Examples:
        >>> calculate_reading_time("Hello world")
        1
        >>> calculate_reading_time("word " * 400)
        2
    """
    # Remove code blocks (they typically take longer to read/understand)
    content = re.sub(r"```[\s\S]*?```", "", content)
    content = re.sub(r"`[^`]+`", "", content)

    # Remove markdown links but keep the text: [text](url) -> text
    content = re.sub(r"\[([^\]]+)\]\([^\)]+\)", r"\1", content)

    # Remove markdown images: ![alt](url) -> ""
    content = re.sub(r"!\[([^\]]*)\]\([^\)]+\)", "", content)

    # Remove HTML tags
    content = re.sub(r"<[^>]+>", "", content)

    # Remove markdown formatting characters
    content = re.sub(r"[*_~`#-]", " ", content)

    # Count words (split by whitespace and filter out empty strings)
    words = [word for word in content.split() if word]
    word_count = len(words)

    # Calculate minutes, rounding to the nearest minute with a minimum of 1
    minutes = max(1, round(word_count / words_per_minute))

    return minutes

I think this right here is a great example of why the code that these tools produce is generally kind of... meh. Let's just really appreciate for a moment the duplication of effort going on there. But it's even more fun. Look at the docstring[2] for count_words: it says right there that the "same normalisation rules as calculate_reading_time are applied". It "knows" it copied the work that went into calculate_reading_time too, but never once did it then "think" to pull the common code out and have both of the functions call on that helper function.

Back to the parallel invitations to refactor, having asked:

please do a review of this codebase and see if there is any scope for refactoring so there's less duplication

Both Gemini and Claude noticed this and did something about it. Gemini came up with a:

def _strip_formatting(content: str) -> str:

with all the regex-based-markdown-stripping code in there and then rewrote count_words and calculate_reading_time to call on that. The Copilot/Claude cleanup did something very similar:

def _strip_markdown_formatting(content: str) -> str:

So it's a good thing that both of them "noticed" this duplication of effort and cleaned it up. What I do find interesting though is what the result was. Stripping docstrings and comments for a moment, here's what I was left with, by Gemini, for count_words and calculate_reading_time:

def count_words(content: str) -> int:
    content = _strip_formatting(content)
    return len([word for word in content.split() if word])

def calculate_reading_time(content: str, words_per_minute: int = 200) -> int:
    content = _strip_formatting(content)
    words = [word for word in content.split() if word]
    word_count = len(words)
    minutes = max(1, round(word_count / words_per_minute))
    return minutes

and here's what Copilot/Claude came up with:

def count_words(content: str) -> int:
    return len([word for word in _strip_markdown_formatting(content).split() if word])

def calculate_reading_time(content: str, words_per_minute: int = 200) -> int:
    words = [word for word in _strip_markdown_formatting(content).split() if word]
    return max(1, round(len(words) / words_per_minute))

In both cases calculate_reading_time is still doing the work of counting words when count_words is right there to be called! Don't even get me started on how the Gemini version of calculate_reading_time is so obsessed with assigning values to variables that only get used once in the next line[3]. Were I reviewing these PRs (oh, wait, I am reviewing these PRs!), I'd request the latter function be turned into:

def calculate_reading_time(content: str, words_per_minute: int = 200) -> int:
    return max(1, round(count_words(content) / words_per_minute))
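Pulling the pieces together, the end state being asked for might look roughly like the sketch below. The helper name is borrowed from the Gemini version; its body only includes the stripping steps visible in the excerpt above, so the real helper may well do more. Two things here are assumptions of this sketch rather than anything from either PR: the image pattern runs before the link pattern (in the order quoted above, the link pattern would turn ![alt](url) into !alt before the image pattern ever saw it), and the `if word` filter is dropped because `str.split()` with no arguments never returns empty strings.

```python
import re


def _strip_formatting(content: str) -> str:
    """Shared helper (sketch): only the stripping steps shown in the excerpt."""
    # Remove markdown images first: ![alt](url) -> ""
    content = re.sub(r"!\[([^\]]*)\]\([^\)]+\)", "", content)
    # Replace markdown links with their text: [text](url) -> text
    content = re.sub(r"\[([^\]]+)\]\([^\)]+\)", r"\1", content)
    # Remove HTML tags
    content = re.sub(r"<[^>]+>", "", content)
    # Turn markdown formatting characters into spaces
    return re.sub(r"[*_~`#-]", " ", content)


def count_words(content: str) -> int:
    # split() with no arguments already discards empty strings
    return len(_strip_formatting(content).split())


def calculate_reading_time(content: str, words_per_minute: int = 200) -> int:
    # Round to the nearest minute, with a minimum of 1
    return max(1, round(count_words(content) / words_per_minute))
```

With this shape the normalisation rules live in exactly one place, and the reading time can never disagree with the word count it is derived from.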

I would imagine that there's a lot more of this going on in the code, and under ideal conditions this sort of thing would not have made its way into the codebase in the first place. Part of the point of this experiment was to mostly get the agent to do its own thing, without me doing full-on reviews of every PR. Were I to use this sort of tool in a workplace, or even on a FOSS project that wasn't intended to be this exact experiment, I'd be far more inclined to carefully review the result and request changes.

Or, perhaps, hear me out... I have a third agent that I teach to be just like me and I get it to do the work of reviewing the PRs for me. What could possibly go wrong?


  1. Again, I guess I should stop referring to Copilot in this case and instead refer to Claude Sonnet. 

  2. Note to self: I need to educate the agents about how I prefer, and always use, the mkdocstrings style of cross-references.

  3. Yes, I know, this is a favoured clean code kind of thing in some circles, but it can be taken to an unnecessary extreme. 

BlogMore v2.18.0

1 min read

After releasing the graph view yesterday I got to thinking that it might be nice if the "tooltips" for the nodes in the graph were a little richer. Since we already know how many posts are within a category, or have a specific tag, it makes sense that those counts should be shown; posts themselves have descriptions available and some even have cover images that could be turned into thumbnails. Why not make use of all of that?

So I've made use of all of that. As mentioned, categories and tags simply show the count of posts related to them:

A tag tooltip in the graph

Posts will show the title, date, description and the cover image if available:

A post showing its tooltip

I'll admit that the transparency is a little distracting -- this comes from the library being used for the graph -- but I kind of like it. I'm going to roll with it now and see how I feel about it as time goes on. It's not like I expect a reader to read the post in the tooltip, it's an invitation to click through and read the actual post.

Another small change is something I've been meaning to address for a while. While BlogMore supports a modified time for a post it never shows it or uses it in any meaningful way. So now I've updated the way the time of a post is displayed so that, if there is a modified time, it's also shown:

Showing when a post was last modified

The final change came in as a request over on Mastodon. The wish was for an easy way of changing the title of the backlinks section on a post from "References & Mentions" to something else, without the user having to spin out their own copy of a template just to do it. That seemed fair so I've introduced backlinks_title.

blogmore.el v4.3.0

1 min read

After adding the email comment invite facility to BlogMore it only made sense that I add some commands to blogmore.el to make it easier to edit the front matter that can help drive that feature.

So... I've released v4.3.0 of blogmore.el that adds two new commands:

  • blogmore-toggle-invite-comments -- toggles the comment invitation property
  • blogmore-invite-comments-to -- makes it easy to set, edit or remove the email address to use when making the invite

I've also added the two commands to the transient menu, using C-t for the former and C-a for the latter.

BlogMore v2.17.0

4 min read

I did some more tinkering with BlogMore yesterday, adding two new features. The first is one I've been considering adding for a wee while now.

For a large part of the lifetime of this blog I used Disqus to provide a comments section on every post. It was, as you'd imagine for a small personal blog, a pretty quiet thing; I'd get the odd comment from time to time but it wasn't significant. This worked well for the longest time, until Disqus decided that they were going to force adverts into your pages if you were using the free tier. Now, I'm fine with paying for tools I use, but I wasn't using Disqus enough to make the cost worth it. I'm also not opposed to a bit of subtle advertising to help cover costs either.

What Disqus did wasn't subtle. It was far from subtle. It was a horror show of the worst kind of sleazy advertising you can imagine.

So I removed it and called it a day on comments.

After the work on BlogMore was well under way I did start thinking about this problem again. Given how BlogMore is constructed, anyone using it could override a template and include whatever they want; with this in mind I looked at static-site-friendly comment options but nothing really stood out. Every solution seemed to involve either relying heavily on a third-party service (see above for possible problems), self-hosting such a service (spinning up hosts and web servers and databases and stuff is the antithesis of using a static site generator to get stuff done easily), or some hacky use of a social media platform or other discussion venue that would require the reader to jump through hoops, which really looks like "go away, I don't want to hear from you".

So I concluded that it just wasn't worth the effort and I've done nothing with it.

Meanwhile: on occasion I have had people just email me about a post. Good old email, like in the good old days of the Internet. I kind of liked that. In fact I really liked that. So over the weekend, after receiving just such an email the other day, I decided I'd add a feature to BlogMore that provided just that: an invitation to send an email at the end of every post.

The configuration file now has two new properties that support this. The first is invite_comments. This is a boolean value that simply turns on or off the feature. The second is invite_comments_to. This should be set to an email address that the reader will be invited to direct their comment or question or whatever.

I've made the latter a little smart, in that it's actually a template, so that you can control the email address used per-post. This could be great for filtering, etc. Examples could be:

  • blog-comment@example.com
  • blog-comment-{year}{month}{day}@example.com
  • {author}+comment@example.com

And so on. You get the idea.
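To make the template idea concrete, here is a minimal sketch of how such an address could be expanded for a given post. The function name and signature are hypothetical, and the zero-padded date parts and the author being used as-is are assumptions; BlogMore's actual substitution rules may differ.

```python
from datetime import date


def expand_invite_address(template: str, author: str, posted: date) -> str:
    """Fill in the {author}, {year}, {month} and {day} placeholders."""
    # str.format ignores keyword arguments the template doesn't use,
    # so a plain address with no placeholders passes through untouched.
    return template.format(
        author=author,
        year=f"{posted.year:04d}",
        month=f"{posted.month:02d}",  # assumption: zero-padded
        day=f"{posted.day:02d}",  # assumption: zero-padded
    )
```

For example, a post dated 7 March 2026 with the second template above would get blog-comment-20260307@example.com under these assumptions.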

Further to this there are also post frontmatter properties of the same names. In this case the frontmatter setting always overrides the configuration file setting, for that single post. Also the invite_comments_to frontmatter setting isn't a template -- it's being set for a single post so that didn't seem necessary. The point of the frontmatter is it gives the flexibility to turn the invite off for an individual post (or indeed turn it on if the global setting is for it to be off).
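That precedence rule can be sketched in a few lines. The function below is hypothetical and treats both the configuration and the frontmatter as plain mappings, which is an assumption about how the values are held. The detail worth noticing is the membership test: using `or` to pick between the two values would stop a frontmatter `False` from switching the invite off.

```python
from typing import Any, Mapping


def effective_setting(
    name: str, config: Mapping[str, Any], front_matter: Mapping[str, Any]
) -> Any:
    """Return the frontmatter value for name if the post sets one, else the config value."""
    # Test for presence rather than truthiness, so that an explicit
    # False in the frontmatter still overrides a True in the config.
    if name in front_matter:
        return front_matter[name]
    return config.get(name)
```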

The effect of all of this is that, if the invitation setting is on and if there is an email address available, this little box will appear at the bottom of a post:

An invitation to send me an email

When the reader clicks on the link it should open their MUA of choice and pre-fill the to address, and should also pre-fill the subject with the title of the post they're emailing from.
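A link like that is an ordinary mailto: URL with a subject parameter. As a rough illustration (the function name is made up, and this is not necessarily how BlogMore's template builds it), the important part is percent-encoding the title so that spaces and characters such as & don't break the link:

```python
from urllib.parse import quote


def mailto_link(address: str, post_title: str) -> str:
    """Build a mailto: URL that pre-fills the recipient and the subject."""
    # Keep "@" and "+" readable in the address; percent-encode the title
    # so spaces, "&", "?" and friends survive as part of the subject.
    return f"mailto:{quote(address, safe='@+')}?subject={quote(post_title)}"
```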

The second addition is prompted by the final paragraph in the post announcing the previous release of BlogMore:

At some point in the future it might be interesting to take this even further and produce a map of interconnected posts; for now though I think this is enough.

Apparently "some time in the future" was the following day, because that also got added while I was hacking on the sofa. There's a new --with-graph command line option, and with_graph configuration file setting, that adds a Graph page to the top "menu" of the blog. The result looks something like this:

Initial graph view

Given the nature of the graph and that the viewer is naturally going to want to explore, it can be toggled into a "full screen" (well, "mostly most of the page") mode too:

In full screen mode

The graph itself (built using force-graph) can be explored in the ways you'd reasonably expect, allowing zooming, panning around, dragging nodes around to get a better view of things, and so on.

Zoomed in on the graph

If you click on any of the nodes the graph will show you everything that's linked to it:

Highlighted links

and if you click the node again it will take you to the post, tag archive or category archive, depending on what it is you are clicking on.

So far I'm finding this is working really well as yet another method of discovering posts and themes, etc; it's already helped me find some "under-used" tags that deserved to be added to posts to better connect things. I suspect the feature will need refining over time, especially from a cosmetic point of view, but the result feels very usable as it stands.
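For a sense of what feeds a graph like this: force-graph takes a structure with a list of nodes (each with an id) and a list of links (each with a source and a target). The sketch below builds that shape from a list of posts. The post dictionaries, the "kind:name" id scheme and the extra "kind" and "label" fields are all assumptions for illustration, and it only connects posts to their tags and categories; it is not BlogMore's actual graph builder.

```python
def build_graph_data(posts: list[dict]) -> dict:
    """Build {"nodes": [...], "links": [...]} from posts, tags and categories."""
    nodes: dict[str, dict] = {}
    links: list[dict] = []

    def add_node(kind: str, name: str) -> str:
        # Node ids are prefixed with their kind so that a tag and a
        # category with the same name stay distinct.
        node_id = f"{kind}:{name}"
        nodes.setdefault(node_id, {"id": node_id, "kind": kind, "label": name})
        return node_id

    for post in posts:
        post_id = add_node("post", post["title"])
        if post.get("category"):
            links.append({"source": post_id, "target": add_node("category", post["category"])})
        for tag in post.get("tags", []):
            links.append({"source": post_id, "target": add_node("tag", tag)})

    return {"nodes": list(nodes.values()), "links": links}
```

Because tags and categories are shared nodes, two posts with a tag in common end up two hops apart, which is what makes "under-used" tags easy to spot.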

BlogMore v2.16.0

1 min read

BlogMore has had a new release, bumping the version to v2.16.0. There are two main changes in this update, both coming from a single idea: internal back-links.

Where it makes sense, I always try and link posts in this blog to other related posts, but I've never really had a sense of how interconnected things are. So, the first new thing I added was a with_backlinks configuration option. This is off by default, but when turned on, will add a list of any referring posts to the bottom of a post.

A list of references to a post

Like some of the work I did in the stats page, this feels like another interesting method of discovering posts and related subjects within a blog.

Once this work was done, it seemed to make sense to use the link-gathering code to then get a sense of which posts are most often linked to within a blog, and so a table of most-linked posts has been added to the stats page.

Internal link stats

This particular table will only appear in the stats if with_backlinks is set to true.
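Both features fall out of the same data: once you know which posts each post links to, inverting that mapping gives the backlinks, and counting the inverted entries gives the most-linked table. A minimal sketch follows, assuming the link-gathering step has already produced a mapping of post to the set of posts it links to. The function names and that input shape are hypothetical.

```python
from collections import Counter, defaultdict


def gather_backlinks(links: dict[str, set[str]]) -> dict[str, set[str]]:
    """Invert 'post -> posts it links to' into 'post -> posts that link to it'."""
    backlinks: dict[str, set[str]] = defaultdict(set)
    for source, targets in links.items():
        for target in targets:
            if target != source:  # assumption: a post linking to itself doesn't count
                backlinks[target].add(source)
    return dict(backlinks)


def most_linked(backlinks: dict[str, set[str]], limit: int = 10) -> list[tuple[str, int]]:
    """Return (post, number of referring posts) pairs, most-linked first."""
    counts = Counter({post: len(sources) for post, sources in backlinks.items()})
    return counts.most_common(limit)
```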

At some point in the future it might be interesting to take this even further and produce a map of interconnected posts; for now though I think this is enough.

BlogMore v2.15.0

1 min read

I've just made a small update to BlogMore. This fixes a minor cosmetic issue that's been bugging me for a while, but one that I kept forgetting to address. I noticed it again on a recent post. The issue is that if there are enough tags on a post that the collection of tags runs to a second line, there was no space between those lines.

Before

Now, as of v2.15.0, there's a little bit of breathing room between those lines.

After

Much better.

blogmore.el v4.2

2 min read

Another wee update to blogmore.el, with a bump to v4.2.

After adding the webp helper command the other day, something about it has been bothering me. While the command is there as a simple helper if I want to change an individual image to webp -- so it's not intended to be a general-purpose tool -- it felt "wrong" that it did this one specific thing.

So I've changed it up and now, rather than being a command that changes an image's filename so that it has a webp extension, it now cycles through a small range of different image formats. Specifically it goes jpeg to png to gif to webp.

With this change in place I can position point on an image in the Markdown of a post and keep running the command to cycle the extension through the different options. I suppose at some point it might make sense to turn this into something that actually converts the image itself, but this is about going back and editing key posts when I change their image formats.
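The command itself is Emacs Lisp, but the cycling logic is small enough to sketch in Python for illustration. The wrap-around from webp back to jpeg and the handling of an unrecognised extension (start at jpeg) are assumptions, as is the function name.

```python
from pathlib import PurePosixPath

# The order described above: jpeg -> png -> gif -> webp
FORMATS = ("jpeg", "png", "gif", "webp")


def next_image_format(filename: str) -> str:
    """Return filename with its extension moved on to the next format in the cycle."""
    path = PurePosixPath(filename)
    current = path.suffix.lstrip(".").lower()
    # An extension outside the cycle starts from the first format (assumption).
    position = FORMATS.index(current) if current in FORMATS else -1
    return str(path.with_suffix("." + FORMATS[(position + 1) % len(FORMATS)]))
```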

Another change is to the code that slugs the title of a post to make the Markdown file name. I ran into the motivating issue yesterday when posting some images on my photoblog. I had a title with an apostrophe in it, which meant that it went from something like Dave's Test (as the title) to dave-s-test (as the slug). While the slug doesn't really matter, this felt sort of messy; I would prefer that it came out as daves-test.

Given that wish, I modified blogmore-slug so that it strips ' and " before doing the conversion of non-alphanumeric characters to -. While doing this, for the sake of completeness, I did a simple attempt at removing accents from some characters too. So now the slugs come out a little tidier still.

(blogmore-slug "That's Café Ëmacs")
"thats-cafe-emacs"

The slug function has been the perfect use for an Emacs Lisp function I've never used before: thread-last. It's not like I've been avoiding it, it's just more a case of I've never quite felt it was worthwhile using until now. Thanks to it the body of blogmore-slug looks like this:

(thread-last
  title
  downcase
  ucs-normalize-NFKD-string
  (seq-filter (lambda (char) (or (< char #x300) (> char #x36F))))
  concat
  (replace-regexp-in-string (rx (+ (any "'\""))) "")
  (replace-regexp-in-string (rx (+ (not (any "0-9a-z")))) "-")
  (replace-regexp-in-string (rx (or (seq bol "-") (seq "-" eol))) ""))

rather than something like this:

(replace-regexp-in-string
 (rx (or (seq bol "-") (seq "-" eol))) ""
 (replace-regexp-in-string
  (rx (+ (not (any "0-9a-z")))) "-"
  (replace-regexp-in-string
   (rx (+ (any "'\""))) ""
   (concat
    (seq-filter
     (lambda (char)
       (or (< char #x300) (> char #x36F)))
     (ucs-normalize-NFKD-string
      (downcase title)))))))

Given that making the slug is very much a "pipeline" of functions, the former looks far more readable and feels more maintainable than the latter.
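For comparison, the same pipeline translates almost step for step into Python, where the usual idiom is to rebind one variable per stage. This is only an illustrative port of the Emacs Lisp above, not part of blogmore.el:

```python
import re
import unicodedata


def slug(title: str) -> str:
    """Python port of the blogmore-slug pipeline shown above."""
    # Lowercase, then decompose accented characters into base + combining mark
    text = unicodedata.normalize("NFKD", title.lower())
    # Drop the combining diacritical marks (U+0300 to U+036F)
    text = "".join(char for char in text if not 0x300 <= ord(char) <= 0x36F)
    # Strip apostrophes and double quotes outright
    text = re.sub(r"['\"]+", "", text)
    # Collapse every run of anything else non-alphanumeric into a single "-"
    text = re.sub(r"[^0-9a-z]+", "-", text)
    # Trim a leading or trailing "-"
    return re.sub(r"^-|-$", "", text)
```

Read top to bottom, each line corresponds to one form in the thread-last version.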