Posts tagged with "Blogging"

Animated blogging

2026-06-15 18:09:21 UTC+01:00

1 min read; 15 GFI

Having generated an animation of the development of BlogMore, I thought it might be interesting to visualise the evolution of the repository for this blog. So here it is:

I do like how it's really obvious when I changed from Jekyll to Pelican (a little after 3:11), and again from Pelican to BlogMore (a little after 6:06).

Blogging Video writing YouTube

Recreating my blog stats

2026-06-02 17:01:42 UTC+01:00

4 min read; 13 GFI

Tech

Introduction

Having recently added the dump command to BlogMore I've been thinking I should try and learn a little more about jq. It's one of those tools that's been on my radar for ages, which I've used on very rare occasions to get something done quickly, but which I've never really used in anger.

So I thought it might be fun to see about recreating some of the stats from the stats page using jq alone. Well, I say "alone", I mean "from the JSON data that is produced by the BlogMore dump command", and of course that makes it easier given it dumps some of the key calculated values. In other words I won't be using jq to calculate the word count, or reading time, or GFI, etc.

Post count

To start with, working out the number of posts in my blog is simple enough:

jq '. | length'

Category count

Getting the list of categories would be:

jq '[.[] .safe_category] | unique'

[
  "ai",
  "coding",
  "creative",
  "emacs",
  "gaming",
  "life",
  "meta",
  "music",
  "python",
  "tech",
  "til"
]

and so getting the count of them is simple enough:

jq '[.[] .safe_category] | unique | length'

Tags count

Getting the count of tags takes a little more work, as safe_tags is a list too, so I start out with a list of lists, which I need to flatten first.

jq '[.[] .safe_tags] | flatten | unique | length'

This, right away, is an interesting finding. In my stats page, as of the time of writing, the number of tags is reported as 243, but here I'm getting 224. Given I'm using the safe_tags property, which ensures all similar tags end up with the same value (so Hello World, hello world, and all variations, become hello-world), that would suggest the stats page isn't taking that into account. That's an issue to address.
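As a rough illustration of why the two counts can differ, here is a minimal Python sketch. The safe_tag function is a hypothetical stand-in for whatever BlogMore actually does to produce safe_tags (its real rules may differ), and the sample tags are made up.

```python
import re


def safe_tag(tag: str) -> str:
    """Hypothetical normaliser: lower-case, runs of non-alphanumerics become '-'."""
    return re.sub(r"[^a-z0-9]+", "-", tag.lower()).strip("-")


# Made-up sample data: tags as they might be typed across several posts.
raw_tags = ["Hello World", "hello world", "Hello-World", "jq", "JQ"]

# Counting the tags as typed treats every variation as a distinct tag...
raw_count = len(set(raw_tags))
# ...while counting the normalised form collapses the variations.
safe_count = len({safe_tag(tag) for tag in raw_tags})

print(raw_count, safe_count)
```

Counting the raw spellings always gives a number at least as large as counting the normalised ones, which is the direction of the mismatch described above.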

A date/time interlude

Here's where things get a little interesting for a moment. In the output of the dump command from BlogMore, the dates of the posts are given in ISO 8601 format; specifically the date and time with offset format. From what I can tell, while jq does have some date/time parsing support, it can't handle that format specifically.

This means that if I try:

jq '.[0] .date | fromdate'

I just get:

jq: error (at <stdin>:27293): date "2015-06-18T14:53:00+01:00" does not match format "%Y-%m-%dT%H:%M:%SZ"

After some searching around it seems the only approach I can really take is to drop the timezone offset and pretend every time is a Z time:

jq '.[0] .date[:19] + "Z" | fromdate'

1434639180

From here I can then get a fully-parsed list of date/time values using gmtime:

jq '.[0] .date[:19] + "Z" | fromdate | gmtime'

This isn't ideal for what I'd like to do, as it's going to skew some of the time-related values, but it's close enough for experimenting.
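To put a number on that skew, here is a small Python sketch that compares the "pretend it's Z" value with one that honours the offset. It assumes Python 3.7 or later, where datetime.fromisoformat understands the "+01:00" form.

```python
from datetime import datetime, timezone

stamp = "2015-06-18T14:53:00+01:00"

# Honour the offset: fromisoformat parses the "+01:00" suffix.
true_epoch = int(datetime.fromisoformat(stamp).timestamp())

# Mimic the jq workaround: keep the first 19 characters and treat them as UTC.
naive = datetime.strptime(stamp[:19], "%Y-%m-%dT%H:%M:%S")
pretend_epoch = int(naive.replace(tzinfo=timezone.utc).timestamp())

print(pretend_epoch)               # the same value jq produced above
print(pretend_epoch - true_epoch)  # the skew, in seconds
```

For this timestamp the skew is exactly the size of the dropped offset, so anything bucketed by hour or by day near midnight could land in the wrong bucket.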

Posts per year

Now that I have a way of breaking the posting time into a workable array of values, getting the number of posts per year becomes:

jq -r '[.[] .date[:19] + "Z" | fromdate | gmtime[0]] | group_by(.) | .[] | "\(.[0]): \(length)"'

Although, to be fair to jq, that's kind of long-winded when I could just pull the year itself out of the posting time:

jq -r '[.[] .date[:4]] | group_by(.) | .[] | "\(.[0]): \(length)"'

Posts by month

At this point getting the posts by month of year seems obvious too:

jq -r '[.[] .date[5:7]] | group_by(.) | .[] | "\(.[0]): \(length)"'

Posts by weekday

For this, I need to go back to the more involved version of the posting date handling query, where I use gmtime to break down the time. It turns out that the penultimate value is the day of the week as a number. So, while it's not quite as readable in that I don't have day names, I can get the values:

jq -r '[.[] .date[:19] + "Z" | fromdate | gmtime[-2]] | group_by(.) | .[] | "\(.[0]): \(length)"'

In this case Sunday is the first day (the 0 day here).
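For comparison, a Python sketch of the same grouping with day names might look like this. The list of dates is a small made-up sample in the same format as the dump, and note the different numbering: Python's weekday() treats Monday as day 0, whereas the jq output above treats Sunday as day 0.

```python
from collections import Counter
from datetime import datetime

# Python's weekday() numbers the days from Monday (0) to Sunday (6).
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Made-up sample of post dates, in the ISO 8601 with offset format.
dates = [
    "2015-06-18T14:53:00+01:00",
    "2026-05-31T15:26:46+01:00",
    "2026-06-02T17:01:42+01:00",
]

# fromisoformat keeps the offset, so the weekday is the one local to the post.
by_day = Counter(DAY_NAMES[datetime.fromisoformat(d).weekday()] for d in dates)

for name in DAY_NAMES:
    if by_day[name]:
        print(f"{name}: {by_day[name]}")
```

Because the offset is kept rather than dropped, this version uses the weekday as it was where the post was written.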

Posts by hour

Getting the posts by the hour is really just a variation on the date-chopping query used for the posts by year and the posts by month; it's all there in the string version of the date.

jq -r '[.[] .date[11:13]] | group_by(.) | .[] | "\(.[0]): \(length)"'
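The year, month and hour queries all share one pattern: slice the date string, group, then count. A minimal Python sketch of that pattern follows; count_by is a hypothetical helper name and the posts are made-up sample data in the shape the queries above assume.

```python
from collections import Counter

# Made-up sample: a list of posts, each with an ISO 8601 "date" string.
posts = [
    {"date": "2015-06-18T14:53:00+01:00"},
    {"date": "2015-11-02T09:10:00+00:00"},
    {"date": "2026-06-01T08:00:00+01:00"},
]


def count_by(posts, start, end):
    """Count posts by a slice of the date string, like .date[start:end] in jq."""
    counts = Counter(post["date"][start:end] for post in posts)
    return dict(sorted(counts.items()))


per_year = count_by(posts, 0, 4)
per_month = count_by(posts, 5, 7)
per_hour = count_by(posts, 11, 13)

for year, count in per_year.items():
    print(f"{year}: {count}")
```

Since the slices are taken from the string as written, the month and hour here are the ones local to each post, just as in the jq versions.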

First and last posting dates

Getting the date of the first and latest post seems nice and easy:

jq -r '[.[] .date[0:10]] | {first: min, last: max}'

{
  "first": "2015-06-18",
  "last": "2026-06-01"
}

Although, from what I can tell, jq doesn't have anything that makes date arithmetic easy so working out the elapsed time between the two isn't so straightforward. It can be done, but it's not as easy as it might be with a bit of Python code, for example. The best I could come up with was:

jq '[ .[] | .date[:19] + "Z" | fromdate ] | ((max - min) / (365.25 * 24 * 60 * 60))'

10.95438841990519

For an approximate value of "year", of course.
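For comparison, this is roughly what the same calculation looks like in Python, using the two dates from the output above. It works on whole days only, so it ignores the time of day that the jq version includes.

```python
from datetime import date

first = date.fromisoformat("2015-06-18")
last = date.fromisoformat("2026-06-01")

# Subtracting two dates gives a timedelta, which knows its length in days.
span = last - first

# The same approximate "year" of 365.25 days as the jq query.
years = span.days / 365.25

print(span.days, round(years, 2))
```

The difference here is that the subtraction is done on real date values rather than on epoch seconds.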

Word counts

From here on in, many of the stats that can be pulled out of the JSON with jq become easier to handle. Each post has a word_count property, so I only need to do this:

jq -r '[.[] .word_count] | {least: min, most: max, average: (add / length)}'

{
  "least": 24,
  "most": 2792,
  "average": 475.0700808625337
}

Reading times

A post's reading time can be accessed by reading_time, so it's as easy to handle as the word counts:

jq '[.[] .reading_time] | {least: min, most: max, average: (add / length)}'

{
  "least": 1,
  "most": 11,
  "average": 1.8921832884097034
}

Gunning fog index

The Gunning fog index is available as the gfi property so there's no work to do to figure it out. It is, however, a floating point value and I want counts in each integer "bucket". That can be done with round.

jq -r '[.[] .gfi | round] | group_by(.) | .[] | "\(.[0]): \(length)"'

As for working out the mean, median and mode... while I worked out the above queries by reading the docs, experimenting, and using Gemini on occasion to either help me understand an error message or to explain why an approach works the way it does, I'm going to have to leave this one 100% to Gemini. Here's its approach to using jq to work out those averages:

jq '
  [ .[] | .gfi | select(. != null) ] as $raw_gfi
  | [ $raw_gfi[] | round ] as $rounded_gfi
  | ($raw_gfi | length) as $count

  # 1. Mean Calculation
  | (($raw_gfi | add) / $count) as $mean

  # 2. Median Calculation
  | ($raw_gfi | sort) as $sorted_gfi
  | (if $count % 2 == 1 then
       $sorted_gfi[($count - 1) / 2]
     else
       ($sorted_gfi[($count / 2) - 1] + $sorted_gfi[$count / 2]) / 2
     end) as $median

  # 3. Mode Calculation (using the rounded values)
  | [ $rounded_gfi
      | group_by(.)
      | map({gfi: .[0], frequency: length})
      | sort_by(.frequency)
      | reverse
      | .[]
    ] as $frequencies
  | [ $frequencies[] | select(.frequency == $frequencies[0].frequency) | .gfi ] as $modes

  # Final Object Assembly
  | {
      count: $count,
      mean: $mean,
      median: $median,
      mode: $modes
    }
'

{
  "count": 371,
  "mean": 9.908842231503396,
  "median": 9.979198312236287,
  "mode": [
    11
  ]
}

As of the time of writing: that's bang on what I get in the stats. Honestly though, by this point, I think I'd be reaching for Python or something similar to do this sort of work. For sure, I can't say if this is a good jq query, if it's in any way idiomatic, or even if it's error-free. The numbers match what BlogMore says though.
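For a sense of what the Python route might look like, here is a minimal sketch using the standard statistics module (multimode needs Python 3.8 or later). The gfi values are made up, not taken from the blog, and round_half_up is a hypothetical helper: as far as I know jq's round rounds halves away from zero, while Python's built-in round() rounds halves to the nearest even number, so the helper is there to keep the buckets comparable for non-negative values.

```python
import math
import statistics

# Made-up GFI values standing in for the .gfi property of each post.
gfi = [8.2, 9.5, 10.5, 11.4, 11.0, 12.7]


def round_half_up(value: float) -> int:
    """Round halves upwards for non-negative values (round() rounds them to even)."""
    return math.floor(value + 0.5)


# As in the jq query, the mode is taken from the rounded values.
rounded = [round_half_up(value) for value in gfi]

summary = {
    "count": len(gfi),
    "mean": statistics.mean(gfi),
    "median": statistics.median(gfi),
    "mode": statistics.multimode(rounded),
}

print(summary)
```

Like the jq version, this reports every value that ties for most frequent, which is why the mode is a list.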

Conclusion

This has been a useful exercise in getting to know a little more about jq, and I can see myself reaching for it to do quick little jobs now that I've finally taken some time to dive into it. As it turns out, it's also been a useful little audit of the content of the stats page because I've even found a bug that needs addressing; so that's a bonus.

Blogging BlogMore jq JSON

A full month of blogging

2026-05-31 15:26:46 UTC+01:00

1 min read; 11 GFI

Meta

I've just realised that, somehow, I've managed to post something on this blog every single day this month.

May 2026

A large part of this is, of course, because I've been doing a lot of stuff on BlogMore, but there's no getting away from the fact that BlogMore exists and I feel compelled to make use of it, and also that blogmore.el helps make it a lot easier to kick off and edit a post.

Thinking back to other blogs I've maintained over the past couple of decades, I don't think I've ever come close to this. Last month did come close, broken only by the couple of days I was chilling in Whitby. The month before also came close, minus 2 days where I just didn't have something to write.

I don't imagine this will last; in fact, I know for sure that there will be fewer posts next month (I have a trip coming up). What I do know is that I feel more compelled to jot something down when an idea turns up, and I'm enjoying the habit of blogging more frequently. While I expect this run to calm down, I hope I don't fall back to leaving it months at a time before opening a fresh Emacs buffer and kicking off some new Markdown.

Blogging BlogMore blogmore.el writing

Tidying and spelling

2026-05-30 11:29:42 UTC+01:00

2 min read; 11 GFI

Meta

Since kicking off the work on BlogMore and blogmore.el, I've absolutely found that I've reduced the friction involved when it comes to writing a quick (or not so quick) blog post. I've also found that I want to go back and tidy up lots of my old posts. Over the past few weeks I've gone and cleaned up the size and positioning of images; converted most images to WebP format; cleaned and consolidated the tags used; hunted down and fixed broken internal links; and a few other things besides.

Another thing I want to do is go back and hunt down, and clean up, typos and spelling mistakes, and the like. While I'm careful to try and not make any errors when typing out a post, and while I've always made a point of reading my posts back to try and catch problems, I've not always been successful. Sometimes I'm just blind to the errors, sometimes I'm just rushing. There's over a decade of mistakes on this blog.

So, with this in mind, I've added a couple of little tools to the build environment for this blog to help me go back and catch problems that might need addressing.

The main tool is a script for running aspell over all the Markdown and building a list of errors. This shows the names of the Markdown files that have errors, and lists the unknown words for them. For example:

=== content/posts/2019/2019-11-04-my-pylint-shame.md ===
flycheck
prepending
whitespace

=== content/posts/2020/2020-08-23-the-pep-8-hill-i-will-die-on.md ===
parens
whitespace

=== content/posts/2020/2020-06-22-swift-til-1.md ===
backticks

=== content/posts/2020/2020-06-14-my-journey-to-the-dark-side-is-complete.md ===
Macbook
Macbooks
scrollbars

=== content/posts/2020/2020-01-19-dnote-el.md ===
dnote
Dnote

=== content/posts/2020/2020-01-11-where-i-live-and-work.md ===
adwaita
eshell
powerline

This alone makes it nice and easy to go back and clean up some obvious issues. A problem I ran into though was that I was getting a lot of false reports for things in the front matter of the files (especially parts of the cover: file name) and also in the end-of-file comments I like to use. So, with a little help from Gemini (because it's been a moment since I last wrote any awk in anger), I wrote a filter to "clean" the Markdown content before running it through aspell.
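The actual filter is written in awk and isn't shown here, but the idea can be sketched in Python. Everything in this sketch is an assumption rather than the real filter: that the front matter sits between two "---" lines at the top of the file, that the end-of-file comments are HTML comments, and the function name and sample post are made up.

```python
import re


def clean_markdown(text: str) -> str:
    """Drop leading front matter and HTML comments before spell-checking."""
    lines = text.splitlines()
    # Assumption: front matter is the block between two '---' lines at the top.
    if lines and lines[0].strip() == "---":
        for index, line in enumerate(lines[1:], start=1):
            if line.strip() == "---":
                lines = lines[index + 1:]
                break
    body = "\n".join(lines)
    # Assumption: the comments to skip are HTML comments, possibly multi-line.
    return re.sub(r"<!--.*?-->", "", body, flags=re.DOTALL).strip()


# Made-up sample post; the cover path and the comment are illustrative only.
sample = """---
title: My Pylint shame
cover: /attachments/2019/11/04/flycheck-pylint.webp
---
Some text with a typoo.

<!-- Local Variables: eval: (flyspell-mode) -->
"""

cleaned = clean_markdown(sample)
print(cleaned)
```

Only the body text survives, so the spell checker never sees the file name in cover: or the words in the trailing comment.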

Already, using this setup, I've caught a few things that deserved cleaning up, and because there will be a lot of words that are correct but particular to this blog and what I write about, I'm also building up a local ignore list.

While this setup isn't going to make the content of this blog error-free, it should give me everything I need to go back and slowly improve some of the older text, and to harmonise some of the spellings of some technical terms.

aspell awk Blogging writing

Converted to WebP

2026-05-15 19:27:00 UTC+01:00

1 min read; 11 GFI

Meta

The job is finally done. After considering moving all the images in the blog over to WebP, and then finally getting the migration under way, I'm all done.

As I mentioned before: I've done this by hand, one post at a time, also adding missing covers as I go. The process went faster than I anticipated and I found that adding linting support to BlogMore really helped with this process. Each time I made a batch of changes I could run the linter to make sure I'd not broken any image links.

As for the result: I've brought the total size of images on the blog down from around 56MB to about 32MB, give or take (keep in mind the latter figure also includes all the WebP images I've added while blogging since I started this process). While I don't really have to worry so much about the storage costs of these images (I'm using GitHub Pages after all), overall, over time, there should be savings in the time it takes for readers to load any given page.

Blogging web webp

The linter helped already

2026-05-11 08:29:57 UTC+01:00

1 min read; 10 GFI

Meta

The new linting tool I've added to BlogMore has paid off already. While it did help me find a couple of broken links and one or two other things to tidy as I was working on the feature, by the time I released it my blog was lint-free.

But last night I did a little more work on the slow migration of images over to WebP. As I've mentioned before: this is a process I'm doing by hand, one post at a time, for a couple of different reasons. The thing is, I'm in a part of my blog now where I was often posting about updates to projects I was working on (Tinboard being a good example), and the cover for all of the posts would be the same. To save having multiple copies of the cover image, all subsequent posts would point back to the first cover image¹.

So what was happening was, I'd have a cover image that got transitioned from PNG to WebP, and then the covers of a number of posts, later in time, would be broken. While I would get to them eventually, if I'd called it a day there and rebuilt my blog, those would have been published broken.

Using blogmore lint while making those changes yesterday evening alerted me to this right away.

¹ It's worth noting that I break down the post attachments by day.

Blogging BlogMore web webp

The webp migration is under way

2026-05-06 19:45:43 UTC+01:00

2 min read; 11 GFI

Meta

I've finally made a proper start on the planned migration to webp for images. I did consider writing a tool that would go through and migrate the files, and update the Markdown, all in one go, but something about that makes me kind of nervous. While it wouldn't be a destructive approach (the whole blog is under version control after all), I just have this niggling feeling that I'd miss something and it would sit broken, unnoticed, for ages.

So instead I've decided to take a one-post-at-a-time approach, making the migration by hand. As well as having the benefit of letting me go slowly and check my work as I go, I can also do some tidying up of old posts. So while I do this I'm also going to tidy up obviously broken links when I notice them, and also remove embedded tweets (swapping to the simple blockquote version).

Another thing I'm doing is adding cover images where possible. I'd been running this blog for a long time before I started to use cover (it might be that I didn't start until I moved to Pelican). Since then I've tried to use it any time there's an appropriate image in a post. More recently, I added cover images to the graph view so they're even more useful now. Back-adding a cover to older posts will make them more appealing to discover in the graph because those older notes will acquire attention-grabbing thumbnails too.

One thing I wanted to do was have an easy way to keep track of where I'm up to in the migration. It's going to be a steady process that's going to take a few days, doing a few posts at a time. So to aid this I've added this to the Makefile of the blog:

cd content/extras/attachments
find -E ./ -iregex '.*\.(png|jpg|jpeg)$' | cut -d'/' -f2,3,4 | sort -u

With this I get a handy list of dates of posts that still have unconverted PNG or JPEG files.
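The same check could be sketched in Python with pathlib. This assumes the attachments are laid out as year/month/day directories (consistent with the three path fields the cut command keeps, but an assumption all the same); unconverted_dates is a hypothetical name and the demonstration tree is made up.

```python
import tempfile
from pathlib import Path


def unconverted_dates(root: Path) -> list[str]:
    """List the date directories that still hold PNG or JPEG files."""
    suffixes = {".png", ".jpg", ".jpeg"}
    return sorted(
        {
            # Keep the first three path parts: assumed to be year/month/day.
            "/".join(path.relative_to(root).parts[:3])
            for path in root.rglob("*")
            if path.is_file() and path.suffix.lower() in suffixes
        }
    )


# Build a throwaway tree to demonstrate; the file names are made up.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in (
        "2026/04/15/year-chart.png",
        "2026/04/15/sl-overview.webp",
        "2026/04/16/shot.JPG",
    ):
        target = root / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    remaining = unconverted_dates(root)

print(remaining)
```

Lower-casing the suffix plays the same role as -iregex in the find command, and the set plays the role of sort -u.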

Of course, for a wee while, this will not get to an empty list because I want to make sure some of the more recent posts still have their older images available as they might be in feeds out there. More recently I've only been using webp for images, so once the webp-using posts fill the main RSS and Atom feeds I can clean out the last of the bulkier images.

Blogging web webp

Considering a rescue

2026-04-28 20:16:50 UTC+01:00

4 min read; 10 GFI

Meta

Ever since I kicked off the work on BlogMore I've had a renewed interest in writing on this blog (as you can probably tell from the stats and the calendar). But not just writing: also tweaking it, tidying it up, thinking about maintaining it into the future, thinking about the links and the categories and so on.

In doing so, I've also been looking at other folk who persist in keeping a blog, and especially those who maintain blogs built with static site generation tools, and in some cases I'm mildly envious of how far back some of them stretch.

When it comes to the world of blogging I was kind of late to the party. The first version was just a section of my self-developed website, hosted on www.davep.org. Don't go looking there for it now, it was long ago removed. In fact my personal website is mostly just a placeholder for what once was. The Wayback Machine still has a copy though, so I can see that the first blog post I wrote for my site was dated 2003-03-31.

My first blog post

I maintained this for a while, the engine for it all being some self-written PHP that could best be described as a dynamic static site (in other words it generated everything on request from underlying text files and HTML snippets because I had no wish to be faffing around with databases on a web host). Eventually though the blog side of this got to be too much trouble and I jumped over to Blogger.

I maintained that blog for quite a few years, with the first post being made in 2006 and the last in 2011. Sadly it's all quite broken now. I used to include a lot of images and, while some of them are embedded in the site itself, most were hosted on the older version of my website, as part of the photo gallery I also had there.

This all fell apart when I finally killed off the PHP version of my site and all the images were removed. Now the blog is a wasteland of broken image icons (not to mention a wasteland of broken external links -- so many of the sites I referred to back then have fallen off the net).

I hate this. I hate that thirty-something me was fired up enough to want to write stuff down and communicate to other people (and to future me) and it's all decayed. I especially dislike that the original version of my blog, now only stored on the Wayback Machine (and perhaps on a hard drive that I think is in a box somewhere in storage, perhaps also on some burnt-as-a-backup DVDs) is otherwise inaccessible. Much like I did with my original photoblog, I want to rescue this. I want to rescue all of this.

The technical challenges of teasing out the original posts from the Internet Archive and from Blogger aren't too great. Turning a bunch of HTML into Markdown isn't impossible either -- the library that I use in OldNews should do the job fine there. All that sort of work feels like a fun little challenge that will keep me amused for a few evenings.

There are two main things that cause me to pause when thinking about doing this.

The first is that some of those very old posts, as I mentioned above, link to places that don't exist and haven't existed for a long time. It raises the question: do I even care to preserve things that have no context any more?

The second is that many of the posts in the Blogger blog, as I mentioned, relied on images hosted on my old site. Right now I'm not actually sure where those photos are! While I took a backup of all the code and other data for www.davep.org when I did the big reboot (storing it all up on GitHub), I seem to have stripped out all of the photos. This makes sense as there was a lot of data there. Making sure I had a backup of those files feels like something I would do -- I hang on to all sorts of data -- but at the moment I can't locate them¹.

To make this work, for this to stand any chance of working, I need to pull them all back out from somewhere.

Will I do this? I don't know yet. The seed is there, the itch is waiting to be scratched. I look at the age span of this blog, and the calendar page, and think it could be really cool to really back-fill it from my older blogs. The graph might end up looking really funky.

On the other hand: am I just trying to preserve irrelevant things as a way to make work for myself (albeit "work" that is fun; after all coding is a hobby as well as a living)?

On the gripping hand: if I can get the images back, a wasteland of links to sites that don't exist any more does, at the very least, provide a history of what was and is no longer.

¹ I should point out that I have the original photos all backed up any number of ways and in multiple locations, but it's the specific jpeg files with their specific names as appeared in the photo library on my site that I need to make this work.

Blogging Coding history photography

I should use webp

2026-04-16 21:31:32 UTC+01:00

2 min read; 10 GFI

Meta

For a good while now I've been pretty happy with the PageSpeed measurements of this blog, which in turn means I've been happy with the state of what's generated by BlogMore. I have pretty much everything that can be minified nice and minimal. At this point, the main thing that causes the speed measurement to fluctuate is image sizes.

I use a lot of PNGs on this blog. When I'm using images, they're almost always in posts that include screenshots, which in turn pretty much demand that I use a lossless format. When I take these screenshots I don't worry too much about the dimensions (within reason), and of course I don't really do anything to optimise how they'll work and appear on different display sizes. If I was to get too into that, it would add friction to writing something, and the whole point of this is to feel less friction when it comes to sitting at the keyboard.

So I've been living with the fact that some images can be pretty big. While I do make a point of using pngcrush on every image, it generally doesn't make a huge saving.

Then yesterday I read this post on Andy's blog and I suddenly realised what I had to do!

I should use webp

Borrowing from what Andy did, I used mogrify too, setting up this Fish abbr in my Fish configuration:

if type -q mogrify
    abbr -g mkwebp "mogrify -format webp -define webp:lossless=true -quality 100"
end

In my case, at least in the initial experiment, I decided to keep it all lossless. So far the results have been really good, cutting the image sizes down by a significant amount. For example, if I look at the images for yesterday's posts:

 90581 15 Apr 18:14 sl-overview.png
 33446 16 Apr 20:23 sl-overview.webp
392661 15 Apr 18:14 slstats-region-info.png
225392 16 Apr 20:23 slstats-region-info.webp
 36049 15 Apr 19:39 year-chart.png
 15590 16 Apr 20:23 year-chart.webp

That's a pretty reasonable saving!
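To turn that listing into percentages, a few lines of Python will do. The sizes are copied from the listing above; the dictionary keys are just the file names without their extensions.

```python
# Sizes in bytes, copied from the listing above: (png, webp).
sizes = {
    "sl-overview": (90581, 33446),
    "slstats-region-info": (392661, 225392),
    "year-chart": (36049, 15590),
}

# Percentage saved per image, relative to the original PNG size.
savings = {name: 100 * (1 - webp / png) for name, (png, webp) in sizes.items()}

total_png = sum(png for png, _ in sizes.values())
total_webp = sum(webp for _, webp in sizes.values())
overall = 100 * (1 - total_webp / total_png)

for name, saved in savings.items():
    print(f"{name}: {saved:.1f}% smaller")
print(f"overall: {overall:.1f}% smaller")
```

The overall figure is weighted by file size, so the largest image dominates it.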

So far all I've done is convert the few latest posts that make up the front page of my blog, just so I can see what impact it has. I'm getting improved load times on mobile, for sure.

There are a couple of downsides to this, of course.

1. Now I want to do the whole blog, so while I can easily go through and convert all the png files to webp, converting all the image markup in the Markdown files isn't quite so simple, and even if I do write something to automate it, I'm then going to want to review it to make 100% sure nothing has broken.
2. I can't just then remove all the png files to cut back on the space used by the generated site. The front page of the site has a feed, and all the categories have a feed each too. This means that there could be HTML out there from some of my oldest posts, referring to the png files, and just removing them will result in broken images.

Overall though, it might be worth doing at some point soon. Meanwhile, from now on, I think I'm going to replace my pngcrush step with a mkwebp step and just use webp instead of png now.

I guess I'm all modern now!

Blogging BlogMore fish ImageMagick pngcrush web webp

Discovering powRSS

2026-04-11 09:50:34 UTC+01:00

1 min read; 12 GFI

Tech

This was a nice find yesterday: I think I came across it when someone I follow on Mastodon boosted a post from the account related to the site; it's a site called powRSS. The concept is pretty simple: collect links to all sorts of small blogs on all sorts of topics, and then provide a honking great discovery feed/pool. You can read more about the idea on their about page.

For sure, this sort of thing isn't exactly novel: those of us of a certain age will fondly remember the fun of webrings and other similar initiatives, not to mention feed aggregation sites where you could discover trending blogs or see what your friends were reading, and all that. But, to some degree, that fell out of favour and/or the limelight when social media got really popular.

So with this in mind it's good to see people still providing such sites. I've added this blog to it and I'll be diving in there now and again to see if there's anything new I should be following.

It'll be fun to populate OldNews with more things to read.

atom Blogging RSS writing