Posts tagged with "Python"

But is the code that bad?

5 min read

There is, obviously and understandably, a lot of conversation online about AI and coding and agents and all that stuff. Much of it I get, much of it I agree with, and I share the vast majority of the concerns. The impact on people, the impact on society, the impact on the environment, the impact on security... there's a good list of things to worry us there.

The one that crops up a lot though, that I don't quite get, is the constant claim I see that at best AI tools produce bad code, and at worst they produce unworkable code. That really isn't my recent experience.

Sure, going back to 2023 or 2024, when I first started toying with these new chatbot things some folks were raving about, the output was laughable. I can remember spending some fun times trying to coax whatever version of ChatGPT was on the go at the time into writing workable code and being amused by just how bad it was.

Even back in October last year, when I first tried out the free Copilot Pro that GitHub had given me to play with, I tried to get it to build a Textual application for me and it was terrible. The code was bad, it didn't really know how to use Textual properly, the application I was trying to get it to write as a test barely worked. It was a disaster.

A month later, in November of last year, I had a second go and better success. That time the (still not released, perhaps one day) application I was building was Swift-based and worked really well, but I can't really comment on the quality of the code or how idiomatically correct the code is in respect to the type of application it is (it's a wee game that runs on iOS, iPadOS, macOS).

By the time I tried my first serious experiment things seemed to be a little different. The code actually wasn't bad. It wasn't good, it was far from good, but it wasn't bad. Also, because it was Python, I was in a good place to judge the code.

Since I've started working on BlogMore I've noticed issues such as:

  • Lots of repetitive boilerplate code.
  • Lots of magic numbers.
  • Lots of magic strings.
  • Functions with redundant and unused parameters.
  • A default state of just adding more and more code to one file.
  • A habit of writing least-effort-possible type hints.
  • A habit of sometimes taking a hacky shortcut to solve a problem.
  • A habit of sometimes over-engineering a solution to a problem.
  • A weird obsession with importing inside functions.
  • An occasional weird obsession with guarding some imports with TYPE_CHECKING to work around non-existent circular imports.
  • An unwillingness to use newer Python capabilities (I've yet to see it make use of := without being prompted, for example).
  • A tendency to write what I would consider less-elegant code over more-elegant code.
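As a made-up illustration of a few of those points (this is not code from BlogMore, and every name in it is hypothetical), here is the flavour of thing I often get, next to the sort of thing I'd nudge it towards:

```python
import re

# Hypothetical names throughout; this is an illustration, not BlogMore code.
DEFAULT_TITLE = "Untitled"
TITLE_PATTERN = re.compile(r"^title:[ \t]*(.+)$", re.MULTILINE)


def title_as_generated(source: str, strict: bool = False) -> str:
    """The flavour I often get: local import, magic strings, unused parameter."""
    import re  # imported inside the function for no obvious reason

    match = re.search(r"^title:[ \t]*(.+)$", source, re.MULTILINE)
    if match is not None:
        return match.group(1).strip()
    return "Untitled"


def title_tidied(source: str) -> str:
    """Same behaviour, with named constants and := doing the match-and-test."""
    if (match := TITLE_PATTERN.search(source)) is not None:
        return match.group(1).strip()
    return DEFAULT_TITLE
```

Both functions do the same job and both work; the difference is only in how pleasant the second one is to come back to later.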

The list isn't exhaustive, of course. The point here is that, as I've reviewed the PRs[1], and read the code, I've seen things I wouldn't personally do. I've seen things I wouldn't personally write, I've seen things I've felt the need to push back on, I've seen things I've fully rejected and started over. Ultimately BlogMore isn't the code I would have written, but at the moment it is the application I would have written[2].

So, here's the thing: every time I see someone writing a negative toot or post or article or whatever, and they talk about how the code it produces is unworkable, I find myself wondering about how they formed this opinion. Are they just writing the piece for the audience they want? Are they writing the piece based on their experience from months to years back, when these tools did seem to still be laughably bad? Are they simply cynically generating the piece using an LLM to bait for engagement? When I see this particular aspect of such a post it's a bit of a red flag about where they're coming from, kind of like how you suddenly realise that someone who seems to speak with authority might be full of shit when they start to spout questionable "facts" on a subject you understand well.

But wait! What about that list of dodgy stuff I've seen while building BlogMore with Copilot? What about all the reading and reviewing I've had to do, and what about the other crimes against Python coding I can probably still find in the codebase? Surely that is evidence that these tools produce terrible, unworkable, unusable code?

I mean, okay, I suppose I could reach that conclusion if I'd had a massively atypical experience in the software development industry and had never had to review anyone else's code, or had never needed to work on someone else's legacy code. Is what I'm seeing out of Copilot something I'd consider ideal code? Of course not. Is it worse than some of the worst code I've had to deal with since I started coding for a living in 1989? Hell no!

From what I'm seeing right now I'm getting code whose quality is... fine. Mostly it does the job fine. Often it needs a bit of coaxing in the right direction. Sometimes it gets totally confused and goes down a rabbit hole which needs to just be blocked off and we start again. Occasionally it needs rewriting to do the same thing but in a more maintainable way.

All of which sounds very familiar. I've had times where that describes my code (and I would massively distrust anyone who says they've never had the same outcomes in their time writing code). For sure it describes code I've had to take over, maintain or review.

It's almost like it was trained on lots of code written by humans.

Meanwhile... not every instance of using these tools to get code done needs to be about writing actual code. More and more I'm finding Google Gemini (for example) to be a really handy coding buddy and faster "Google this shit 'cos I can't remember this exact thing I want to achieve". I'll ask, I'll almost always get a pretty good answer, and then I can generally take that snippet of code and implement it how I want.

I've seldom had to walk away from that sort of interaction because it was getting me nowhere.

All of which is to say: I remain concerned about a great many things in the AI space at the moment, but I'm equally suspicious of someone who just flatly says "and the code it produces just doesn't work". If that's part of an article or post I'm left with the feeling that the author put zero actual effort into forming their opinion, let alone actually writing it.


  1. To varying degrees. Sometimes I have plenty of time to kill and I read the PR carefully; other times I glance over it, am happy there's nothing horrific there, and then decide to push back or merge based on the results of hand-testing and automated testing.

  2. To be fair, it's the application I would still be writing and would be some time off finishing; there's no way it would be as feature-complete as it is now had I been 100% hand-coding it. 

BlogMore v2.16.0

1 min read

BlogMore has had a new release, bumping the version to v2.16.0. There are two main changes in this update, both coming from a single idea: internal back-links.

Where it makes sense, I always try and link posts in this blog to other related posts, but I've never really had a sense of how interconnected things are. So, the first new thing I added was a with_backlinks configuration option. This is off by default, but when turned on, will add a list of any referring posts to the bottom of a post.

A list of references to a post

Like some of the work I did in the stats page, this feels like another interesting method of discovering posts and related subjects within a blog.

Once this work was done, it seemed to make sense to use the link-gathering code to then get a sense of which posts are most often linked to within a blog, and so a table of most-linked posts has been added to the stats page.

Internal link stats

This particular table will only appear in the stats if with_backlinks is set to true.
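The general idea behind both features can be sketched in a few lines of Python. This is a minimal sketch rather than BlogMore's actual code: it assumes the internal links have already been gathered into a mapping of post URL to the URLs that post links to, and the function names are made up for the example.

```python
from collections import Counter, defaultdict


def build_backlinks(links_by_post: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert 'post -> links it contains' into 'post -> posts that refer to it'."""
    known_posts = set(links_by_post)
    backlinks: defaultdict[str, set[str]] = defaultdict(set)
    for source, targets in links_by_post.items():
        for target in targets:
            # Only count links to other posts in this blog; skip self-links.
            if target in known_posts and target != source:
                backlinks[target].add(source)
    return {target: sorted(sources) for target, sources in backlinks.items()}


def most_linked(backlinks: dict[str, list[str]], top: int = 10) -> list[tuple[str, int]]:
    """Rank posts by how many other posts link to them."""
    counts = Counter({target: len(sources) for target, sources in backlinks.items()})
    return counts.most_common(top)
```

The same inverted mapping feeds both the list at the bottom of a post and the most-linked table, which is why the second feature fell out of the first so easily.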

At some point in the future it might be interesting to take this even further and produce a map of interconnected posts; for now though I think this is enough.

BlogMore v2.15.0

1 min read

I've just made a small update to BlogMore. This fixes a minor cosmetic issue that's been bugging me for a while, but one that I kept forgetting to address. I noticed it again on a recent post. The issue is that if there are enough tags on a post that the collection of tags runs to a second line, there was no space between those lines.

Before

Now, as of v2.15.0, there's a little bit of breathing room between those lines.

After

Much better.

BlogMore v2.14.0

1 min read

Quick little update for BlogMore, with a bump up to v2.14.0. This release comes from another feature request from Andy[1], where he asked if it would be possible to have a year-based bar chart in the stats page.

Funnily enough I'd been thinking about the same thing just yesterday. I'd been wondering if it was worth adding, or if it would be overkill given the numbers can be seen in the archive. Having been asked by someone else... that was all the prompting I needed to kick that off.

Posts per year for my blog

Now I'm glad I did this. I like the result, it's a different way to visualise the values, and it's yet another way for people to discover past posts on the blog.
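The counting behind a chart like this is simple enough. Here's a rough sketch, assuming all you have is a list of post dates; the function names and the text-based bars are mine for illustration, not how BlogMore renders the chart.

```python
from collections import Counter
from datetime import date


def posts_per_year(post_dates: list[date]) -> dict[int, int]:
    """Count posts per year, ordered from the earliest year to the latest."""
    return dict(sorted(Counter(day.year for day in post_dates).items()))


def text_bars(counts: dict[int, int], width: int = 20) -> list[str]:
    """Render the counts as text bars, scaled so the busiest year fills the width."""
    peak = max(counts.values(), default=0)
    lines = []
    for year, count in counts.items():
        bar = "#" * round(width * count / peak) if peak else ""
        lines.append(f"{year} {bar} {count}")
    return lines
```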

For sure BlogMore is now feature complete.


  1. Who recently wrote an interesting article about his experience of migrating his blog from Hugo to BlogMore 

OldNews v1.4.0

1 min read

OldNews

Yesterday evening I released v1.4.0 of OldNews, my terminal-based client for TheOldReader.

The change in this release is pretty straightforward, but something I kept finding myself wanting. I've added three new commands to the application:

  • JumpToSubscriptions - Jump to the subscriptions panel
  • JumpToArticles - Jump to the articles panel
  • JumpToArticle - Jump to the article panel

By default they're bound to 1, 2 and 3. So now skipping around the UI and navigating to a different article or blog is just a bit quicker.

If you're a user of TheOldReader and fancy interacting with it from the terminal: OldNews is licensed GPL-3.0 and available via GitHub and also via PyPI. It can also be installed using uv:

uv tool install oldnews

If you don't have uv installed you can use uvx.sh to perform the installation. For GNU/Linux or macOS or similar:

curl -LsSf uvx.sh/oldnews/install.sh | sh

or on Windows:

powershell -ExecutionPolicy ByPass -c "irm https://uvx.sh/oldnews/install.ps1 | iex"

If uv isn't your thing then it can also be installed with pipx:

pipx install oldnews

Once installed, run the oldnews command.

BlogMore v2.13.0

1 min read

Following on from yesterday's release of BlogMore, I've been looking at some more information in the Google Search Console, which helped me uncover a couple more bugs in relation to URL generation.

This time I noticed a couple of issues, both related to the clean_urls setting. The first was that, in the recently added calendar page, all of the URLs for the links into the date-based archive weren't taking clean_urls into account. That's now fixed.

The second problem was the canonical <link> tag in the headers of the various archive pages (categories, tags, date-based): none of the URLs used in the tag were being cleaned up if clean_urls was true. That's now also fixed.

The main "problem" those two issues were causing was Google was seeing the sitemap for my blog declare one URL, but discovering different versions of the URL elsewhere; the main offending part here being the canonical URL declaration that disagreed with the sitemap.

To the best of my understanding the above fixes should clean a lot of that up.
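The underlying lesson is that every place that emits a URL needs to go through the same clean-up. As a sketch only: I'm assuming here, for the sake of the example, that a clean URL is one with a trailing index.html dropped, and both function names are hypothetical rather than taken from BlogMore.

```python
from html import escape


def clean_url(url: str, clean_urls: bool) -> str:
    """Drop a trailing index.html when clean_urls is on (assumed behaviour)."""
    if clean_urls and url.endswith("/index.html"):
        return url[: -len("index.html")]
    return url


def canonical_link(url: str, clean_urls: bool) -> str:
    """Build the canonical <link> tag from the same cleaned URL the sitemap uses."""
    return f'<link rel="canonical" href="{escape(clean_url(url, clean_urls), quote=True)}">'
```

If the sitemap, the calendar links and the canonical tag all call the one helper, they can't disagree with each other.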

Also in this new release is a small new feature. After cleaning up the sitemap generation in v2.12.0 I got to thinking that, perhaps, there would be occasions where a user would want to be able to add extra items to the sitemap. With this in mind I've added the sitemap_extras configuration property. With this you can declare extra URLs to drop into the sitemap, if one is being generated.

sitemap_extras:
  - /some/path/
  - /some/file.html

I don't think I have a use for this right now, I'm not sure I'll ever have a use for it, but it feels like a low-cost feature to add that could be useful to someone at some point.

obs2nlm v1.2.0

1 min read

Three months back I released obs2nlm, a tool that takes an Obsidian vault and turns it into a single Markdown file so it can be used as a source for NotebookLM.

Since then I've been using it a lot and it's working out really well.

Meanwhile, one of my vaults has started to creep up towards the documented word limit for a single source in NotebookLM (500,000 words). Right now it's sitting at around 75% and is steadily creeping up.

So, with this in mind, I've made a change I've been planning from the start and have added a --split option. If it's used, and the generated file looks like it's going to hit the word limit, a second file (or more) will be created. The naming scheme is simple enough: if you ask obs2nlm to create an output file called dirt.md and it needs to run over, it'll then create dirt-2.md, dirt-3.md, and so on. The idea then is that, rather than upload that single Markdown file as a source, you upload all of the generated Markdown files.
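As a rough sketch of the approach (not obs2nlm's actual code), the splitting and the naming could look something like this. I'm assuming for the example that the split happens on whole-note boundaries; split_by_words and output_name are hypothetical names.

```python
from pathlib import Path

WORD_LIMIT = 500_000  # NotebookLM's documented per-source word limit


def split_by_words(notes: list[str], limit: int = WORD_LIMIT) -> list[list[str]]:
    """Group whole notes into chunks that each stay within the word limit.

    A single note that is over the limit on its own still gets its own chunk.
    """
    chunks: list[list[str]] = [[]]
    used = 0
    for note in notes:
        words = len(note.split())
        if chunks[-1] and used + words > limit:
            chunks.append([])
            used = 0
        chunks[-1].append(note)
        used += words
    return chunks


def output_name(target: Path, index: int) -> Path:
    """Name the files dirt.md, dirt-2.md, dirt-3.md and so on (index starts at 1)."""
    if index == 1:
        return target
    return target.with_name(f"{target.stem}-{index}{target.suffix}")
```

Each chunk would then be written to output_name(target, n) for n counting up from 1.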

Given you get up to 50 sources per notebook, this should see me right for any reasonable vault. As for whether it will affect the quality of the results I get when I query the notebook... that's hard to say until I find myself in that situation. If Google are to be believed it shouldn't be an issue, and the alternative is to fall foul of the limit, so this seems like the only sensible solution.

I've also added a --dry-run command line switch; this should be handy for checking how big a vault is when compared to the word limit, without actually generating any files.

BlogMore v2.12.0

1 min read

Since kicking off building BlogMore and swapping this blog over to using it I've been playing with the Google Search Console. It's something I've not used in decades, but felt it was time to dip back in again and understand how it works these days.

There are two motivations for this: the first is that, when it comes to my day job, I have cause to interact with people who do use the search console a lot, and so it's worth understanding what they work with and why it matters to them. The second reason is it's a reasonable measure of how good a site BlogMore generates.

Page index inclusion progress

So far the results have been pretty good, and the console has helped me find oddities and things that need tidying up.

So this release of BlogMore includes a couple of changes that stem from looking at the latest updates in the console.

The first is that I've cleaned up how the sitemap.xml gets generated. I noticed that if I had any HTML inside my extras directory it was turning up in the sitemap; something I didn't intend and didn't want. So that's now fixed: only pages generated by BlogMore will appear in sitemap.xml[1].

The second is that the stats page, despite being in the sitemap, had a noindex header for some reason. That's now been fixed. The only generated page I've intentionally set up so that it isn't indexed is the search page.

Finally, there's one change unrelated to the above: I realised that if you have with_read_time set to false, the reading time stats still appeared on the stats page; that seems unnecessary and unwanted on a site that doesn't show reading times. So, as of v2.12.0, that section of the stats won't show if reading times are turned off.


  1. Now I think about it, I suppose there might be occasions where someone wants extra HTML to appear in the sitemap. I might consider the idea of allowing extra entries to be declared via the configuration file.

BlogMore v2.11.0

2 min read

After adding the streak display to the stats a couple of days back, I got a little more obsessed with knowing what sort of runs of days of posting to the blog I had. I even said in that post:

It almost makes me want to do a whole-blog-lifetime version of it, or perhaps some sort of more calendar-oriented version of the archive.

Despite saying that I fancied the idea of that calendar-type view, first off I got to thinking it would be interesting to see a table of my 10 longest streaks. So that got added and can now be found in the stats page.

A table of my 10 longest streaks
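Working out streaks from a list of post dates is a nice little problem. Here's one way it could be done; this is a sketch with function names of my own choosing, not necessarily how BlogMore does it.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def streaks(post_dates: list[date]) -> list[tuple[date, date, int]]:
    """Return (start, end, length in days) for each run of consecutive posting days."""
    runs: list[tuple[date, date]] = []
    for day in sorted(set(post_dates)):  # several posts on one day count once
        if runs and day - runs[-1][1] == ONE_DAY:
            runs[-1] = (runs[-1][0], day)  # extend the current run
        else:
            runs.append((day, day))  # start a new run
    return [(start, end, (end - start).days + 1) for start, end in runs]


def longest_streaks(post_dates: list[date], top: int = 10) -> list[tuple[date, date, int]]:
    """The longest runs first; ties go to the earlier streak."""
    return sorted(streaks(post_dates), key=lambda run: (-run[2], run[0]))[:top]
```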

Having added that, I kept thinking about the whole-blog visual view of "here's the whole time of the blog, and here are the days you posted". I did think it might be interesting to use the same style and layout as the streak display -- perhaps something that would look like my whole contribution history on GitHub that I wrote about back in 2023 -- but the problem with that is it's tricky to make it work well on all display types. I needed something that would collapse better on smaller displays.

So I decided that a more conventional calendar display might work better. While it took a bit of work to get it to really land, it turned out pretty much how I wanted.

So now there is a with_calendar configuration option that, if set to true, will add a calendar link at the top of the site. By default it looks like this:

The default calendar view

If it looks a little unconventional at first glance, that's because it is. I wanted something that started with the most recent month in which there's a post, and which then worked backwards. This way I can see things as a proper history. But I can also see that this might seem odd to some people. Given this, I've also added a forward_calendar configuration option that can be used (when set to true), to flip the calendar into a more normal flow.

The alternative calendar view

As you might expect, the calendar links to other parts of the site: clicking on a day with a post takes you to the archive for that day, clicking on a month name where there are posts in a month takes you to the archive for that month, and the same again for a year title.

I'm pretty pleased with the result. In testing it seems nicely responsive to different display types and I'm also finding it to be yet another interesting way to discover older posts (and get a sense of when I was encouraged to post going back over the last 11 years of this particular blog[1]).

One final little feature I've added is a small enhancement to the read time that can appear on each post. While it's long since been possible to decide if you want it there or not, the calculation itself has been hard-wired to the assumption that 200 wpm is the reading speed of the reader. I've now added read_time_wpm as a configuration option so you can set it to suit your own taste.
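The calculation itself is about as simple as these things get. A minimal sketch follows; rounding up and never showing less than a minute are assumptions I've made for the example, and read_time_minutes is a hypothetical name rather than BlogMore's own function.

```python
import math
import re

DEFAULT_WPM = 200  # the previously hard-wired reading speed


def read_time_minutes(text: str, read_time_wpm: int = DEFAULT_WPM) -> int:
    """Estimate reading time in whole minutes for the given words-per-minute."""
    if read_time_wpm <= 0:
        raise ValueError("read_time_wpm must be a positive number")
    words = len(re.findall(r"\S+", text))
    # Assumption: round up, and never report less than one minute.
    return max(1, math.ceil(words / read_time_wpm))
```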


  1. I have other, much older, blogs out there on the net. One day I might merge them with this one and back-fill the whole thing. 

BlogMore v2.10.0

1 min read

I've released an update to BlogMore, with another little straightforward addition. This time I'm revisiting the statistics page and adding a streak tracker, of sorts.

My blog streak

Modelled after the GitHub contribution tracker, or indeed any number of other streak trackers, it shows which days in the recent past I've blogged on, and also an indication of how many posts I've made that day.

Of course, it's not quite a full streak tracker. It's only going to show the days up to the day the site was last generated; so when a reader visits and looks, if you've not generated the site for a month, it's not going to show that you've not blogged for a month[1]. The point is that if you last blogged in January, come March or so the reader isn't going to see 2 months of empty days, until you regenerate the site.

So, not perfect, but good enough I think. Also it gives the reader another method of discovering posts (each cell will take them to the archive for that day, so they can read the post or posts for that day).

I've also tried to make it vaguely responsive. There are narrower date ranges as the display gets narrower. We start out at 10 months (as you can see above), then drop to 9 months:

Last nine months

and then dropping to 5 months once we get to mobile-type screens:

Just five months

For all its flaws, I feel it's kind of fun and I like it as a new discovery tool. It almost makes me want to do a whole-blog-lifetime version of it, or perhaps some sort of more calendar-oriented version of the archive. For now though I'm going to settle with this and see if it encourages me to keep up a blogging streak.

While it isn't my intention to write posts for the sake of it, I am enjoying writing something more frequently, so this might just help keep me doing that.


  1. I could solve this problem by having the whole thing generated on the fly with some JavaScript, but that felt like it wasn't in the spirit of a static site generator.