Posts tagged with "Google"

Duplication of effort

3 min read; 11 GFI

While I don't, for a moment, think that the work on BlogMore is complete, I think it's fair to say that the rate of new feature additions has slowed down. Which is fine, there's only so much I need from a self-designed/directed static site generator; at a certain point there's a danger of adding features for the sake of it.

Around this point I think I want to start to pay proper attention to the code quality and maintainability of the ongoing experiment.

As I mentioned the other day, while working through this, I had noticed plenty of bad habits that Copilot (and in this case pretty much always Claude Sonnet 4.6) has. All were very human (obviously), but also the sort of thing you'd expect a human developer to educate themselves out of.

Yesterday evening, out of idle curiosity, I installed Gemini CLI because I wanted to see what would happen if I pointed it at the v2.18.0 codebase and asked it to look for things to clean up, and then what would happen if I did the same with Copilot CLI.

I've saved the results as a PR for what Gemini came up with and what Copilot came up with[1]. I've not given them a proper read over yet, but while having a quick glance at them something leapt out at me: in the code before the request, there was this in utils.py:

def count_words(content: str) -> int:
    """Count the number of words in the given content.

    Strips common Markdown and HTML formatting before counting so that only
    prose words are included.  The same normalisation rules as
    :func:`calculate_reading_time` are applied.

    Args:
        content: The text content to analyse (may include Markdown/HTML).

    Returns:
        The number of words in the content.

    Examples:
        >>> count_words("Hello world")
        2
        >>> count_words("word " * 10)
        10
    """
    # Remove code blocks
    content = re.sub(r"```[\s\S]*?```", "", content)
    content = re.sub(r"`[^`]+`", "", content)

    # Remove markdown links but keep the text: [text](url) -> text
    content = re.sub(r"\[([^\]]+)\]\([^\)]+\)", r"\1", content)

    # Remove markdown images: ![alt](url) -> ""
    content = re.sub(r"!\[([^\]]*)\]\([^\)]+\)", "", content)

    # Remove HTML tags
    content = re.sub(r"<[^>]+>", "", content)

    # Remove markdown formatting characters
    content = re.sub(r"[*_~`#-]", " ", content)

    return len([word for word in content.split() if word])


def calculate_reading_time(content: str, words_per_minute: int = 200) -> int:
    """Calculate the estimated reading time for content in whole minutes.

    Uses the standard reading speed of 200 words per minute. Strips markdown
    formatting and counts only actual words to provide an accurate estimate.

    Args:
        content: The text content to analyze (can include markdown)
        words_per_minute: Average reading speed (default: 200 WPM)

    Returns:
        Estimated reading time in whole minutes (minimum 1 minute)

    Examples:
        >>> calculate_reading_time("Hello world")
        1
        >>> calculate_reading_time("word " * 400)
        2
    """
    # Remove code blocks (they typically take longer to read/understand)
    content = re.sub(r"```[\s\S]*?```", "", content)
    content = re.sub(r"`[^`]+`", "", content)

    # Remove markdown links but keep the text: [text](url) -> text
    content = re.sub(r"\[([^\]]+)\]\([^\)]+\)", r"\1", content)

    # Remove markdown images: ![alt](url) -> ""
    content = re.sub(r"!\[([^\]]*)\]\([^\)]+\)", "", content)

    # Remove HTML tags
    content = re.sub(r"<[^>]+>", "", content)

    # Remove markdown formatting characters
    content = re.sub(r"[*_~`#-]", " ", content)

    # Count words (split by whitespace and filter out empty strings)
    words = [word for word in content.split() if word]
    word_count = len(words)

    # Calculate minutes, rounding to the nearest minute with a minimum of 1
    minutes = max(1, round(word_count / words_per_minute))

    return minutes

I think this right here is a great example of why the code that these tools produce is generally kind of... meh. Let's just really appreciate for a moment the duplication of effort going on there. But it's even more fun. Look at the docstring[2] for count_words: it says right there that the "same normalisation rules as calculate_reading_time are applied". It "knows" it copied the work that went into calculate_reading_time too, but never once did it then "think" to pull the common code out and have both of the functions call on that helper function.

Back to the parallel invitations to refactor, having asked:

please do a review of this codebase and see if there is any scope for refactoring so there's less duplication

Both Gemini and Claude noticed this and did something about it. Gemini came up with a:

def _strip_formatting(content: str) -> str:

with all the regex-based-markdown-stripping code in there and then rewrote count_words and calculate_reading_time to call on that. The Copilot/Claude cleanup did something very similar:

def _strip_markdown_formatting(content: str) -> str:

So it's a good thing that both of them "noticed" this duplication of effort and cleaned it up. What I do find interesting though is what the result was. Stripping docstrings and comments for a moment, here's what I was left with, by Gemini, for count_words and calculate_reading_time:

def count_words(content: str) -> int:
    content = _strip_formatting(content)
    return len([word for word in content.split() if word])

def calculate_reading_time(content: str, words_per_minute: int = 200) -> int:
    content = _strip_formatting(content)
    words = [word for word in content.split() if word]
    word_count = len(words)
    minutes = max(1, round(word_count / words_per_minute))
    return minutes

and here's what Copilot/Claude came up with:

def count_words(content: str) -> int:
    return len([word for word in _strip_markdown_formatting(content).split() if word])

def calculate_reading_time(content: str, words_per_minute: int = 200) -> int:
    words = [word for word in _strip_markdown_formatting(content).split() if word]
    return max(1, round(len(words) / words_per_minute))

In both cases calculate_reading_time is still doing the work of counting words when count_words is right there to be called! Don't even get me started on how the Gemini version of calculate_reading_time is so obsessed with assigning values to variables that only get used once in the next line[3]. Were I reviewing these PRs (oh, wait, I am reviewing these PRs!), I'd request the latter function be turned into:

def calculate_reading_time(content: str, words_per_minute: int = 200) -> int:
    return max(1, round(count_words(content) / words_per_minute))
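Putting those pieces together, here is a minimal sketch of how the fully de-duplicated version might look. It is an assumption-laden illustration rather than what either PR contains: the helper name `_strip_formatting` is borrowed from the Gemini version, its body is assembled from the regexes in the original listing, and the code-block pattern is spelled with `{3}` quantifiers (which should match the same text as the original pattern). It also drops the `if word` filter, since `str.split()` with no arguments never returns empty strings.

```python
import re


def _strip_formatting(content: str) -> str:
    """Strip common Markdown and HTML formatting from the content."""
    # Remove fenced code blocks, then inline code.
    # (`{3}` is used here in place of three literal backticks.)
    content = re.sub(r"`{3}[\s\S]*?`{3}", "", content)
    content = re.sub(r"`[^`]+`", "", content)
    # Remove markdown links but keep the text: [text](url) -> text
    content = re.sub(r"\[([^\]]+)\]\([^\)]+\)", r"\1", content)
    # Remove markdown images (same order of operations as the original).
    content = re.sub(r"!\[([^\]]*)\]\([^\)]+\)", "", content)
    # Remove HTML tags.
    content = re.sub(r"<[^>]+>", "", content)
    # Replace markdown formatting characters with spaces.
    return re.sub(r"[*_~`#-]", " ", content)


def count_words(content: str) -> int:
    """Count the prose words in the content."""
    # split() with no arguments already discards empty strings.
    return len(_strip_formatting(content).split())


def calculate_reading_time(content: str, words_per_minute: int = 200) -> int:
    """Estimate the reading time in whole minutes (minimum 1)."""
    return max(1, round(count_words(content) / words_per_minute))
```

With this shape the normalisation rules live in exactly one place, and calculate_reading_time no longer needs to know how words are counted.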

I would imagine that there's a lot more of this going on in the code, and under ideal conditions this sort of thing would not have made its way into the codebase in the first place. Part of the point of this experiment was to mostly get the agent to do its own thing, without me doing full-on reviews of every PR. Were I to use this sort of tool in a workplace, or even on a FOSS project that wasn't intended to be this exact experiment, I'd be far more inclined to carefully review the result and request changes.

Or, perhaps, hear me out... I have a third agent that I teach to be just like me and I get it to do the work of reviewing the PRs for me. What could possibly go wrong?


  1. Again, I guess I should stop referring to Copilot in this case and instead refer to Claude Sonnet. 

  2. Note to self: I need to educate the agents in how I prefer and always use the mkdocstrings style of cross-references.

  3. Yes, I know, this is a favoured clean code kind of thing in some circles, but it can be taken to an unnecessary extreme. 

But is the code that bad?

5 min read; 10 GFI

There is, obviously and understandably, a lot of conversation online about AI and coding and agents and all that stuff. Much of it I get, much of it I agree with, I share the vast majority of the concerns. The impact on people, the impact on society, the impact on the environment, the impact on security... there's a good list of things to worry us there.

The one that crops up a lot though, that I don't quite get, is the constant claim I see that at best AI tools produce bad code, and at worst they produce unworkable code. That really isn't my recent experience.

Sure, going back to 2023 or 2024, when I first started toying with these new chatbot things some folks were raving about, the output was laughable. I can remember spending some fun times trying to coax whatever version of ChatGPT was on the go at the time into writing workable code and being amused by just how bad it was.

Even back in October last year, when I first tried out the free Copilot Pro that GitHub had given me to play with, I tried to get it to build a Textual application for me and it was terrible. The code was bad, it didn't really know how to use Textual properly, the application I was trying to get it to write as a test barely worked. It was a disaster.

A month later, in November of last year, I had a second go and better success. That time the (still not released, perhaps one day) application I was building was Swift-based and worked really well, but I can't really comment on the quality of the code or how idiomatically correct the code is with respect to the type of application it is (it's a wee game that runs on iOS, iPadOS, macOS).

By the time I tried my first serious experiment things seemed to be a little different. The code actually wasn't bad. It wasn't good, it was far from good, but it wasn't bad. Also, because it was Python, I was in a good place to judge the code.

Since I've started working on BlogMore I've noticed issues such as:

  • Lots of repetitive boilerplate code.
  • Lots of magic numbers.
  • Lots of magic strings.
  • Functions with redundant and unused parameters.
  • A default state of just adding more and more code to one file.
  • A habit of writing least-effort-possible type hints.
  • A habit of sometimes taking a hacky shortcut to solve a problem.
  • A habit of sometimes over-engineering a solution to a problem.
  • A weird obsession with importing inside functions.
  • An occasional weird obsession with guarding some imports with TYPE_CHECKING to work around non-existent circular imports.
  • An unwillingness to use newer Python capabilities (I've yet to see it make use of := without being prompted, for example).
  • A tendency to write what I would consider less-elegant code over more-elegant code.
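To make one of those items concrete, here is a small sketch of the sort of place where := tends to fit. Everything in it is hypothetical and made up for illustration (the pattern, the function names and the front-matter format are not taken from BlogMore); it simply shows the same lookup written without and with an assignment expression.

```python
import re
from typing import Optional

# Hypothetical pattern, purely for illustration.
TITLE_PATTERN = re.compile(r"^title:\s*(?P<title>.+)$", re.MULTILINE)


def title_without_walrus(front_matter: str) -> Optional[str]:
    """Find a title line, assigning the match on its own line first."""
    match = TITLE_PATTERN.search(front_matter)
    if match:
        return match["title"].strip()
    return None


def title_with_walrus(front_matter: str) -> Optional[str]:
    """Find a title line, binding the match inside the condition."""
    if match := TITLE_PATTERN.search(front_matter):
        return match["title"].strip()
    return None
```

Both functions should behave the same; the second just keeps the match scoped to the test that uses it.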

The list isn't exhaustive, of course. The point here is that, as I've reviewed the PRs[1], and read the code, I've seen things I wouldn't personally do. I've seen things I wouldn't personally write, I've seen things I've felt the need to push back on, I've seen things I've fully rejected and started over. Ultimately BlogMore isn't the code I would have written, but at the moment it is the application I would have written[2].

So, here's the thing: every time I see someone writing a negative toot or post or article or whatever, and they talk about how the code it produces is unworkable, I find myself wondering about how they formed this opinion. Are they just writing the piece for the audience they want? Are they writing the piece based on their experience from months to years back, when these tools did seem to still be laughably bad? Are they simply cynically generating the piece using an LLM to bait for engagement? When I see this particular aspect of such a post it's a bit of a red flag about where they're coming from, kind of like how you suddenly realise that someone who seems to speak with authority might be full of shit when they start to spout questionable "facts" on a subject you understand well.

But wait! What about that list of dodgy stuff I've seen while building BlogMore with Copilot? What about all the reading and reviewing I've had to do, and what about the other crimes against Python coding I can probably still find in the codebase? Surely that is evidence that these tools produce terrible, unworkable, unusable code?

I mean, okay, I suppose I could reach that conclusion if I'd had a massively atypical experience in the software development industry and had never had to review anyone else's code, or had never needed to work on someone else's legacy code. Is what I'm seeing out of Copilot something I'd consider ideal code? Of course not. Is it worse than some of the worst code I've had to deal with since I started coding for a living in 1989? Hell no!

From what I'm seeing right now I'm getting code whose quality is... fine. Mostly it does the job fine. Often it needs a bit of coaxing in the right direction. Sometimes it gets totally confused and goes down a rabbit hole which needs to just be blocked off and we start again. Occasionally it needs rewriting to do the same thing but in a more maintainable way.

All of which sounds very familiar. I've had times where that describes my code (and I would massively distrust anyone who says they've never had the same outcomes in their time writing code). For sure it describes code I've had to take over, maintain or review.

It's almost like it was trained on lots of code written by humans.

Meanwhile... not every instance of using these tools to get code done needs to be about writing actual code. More and more I'm finding Google Gemini (for example) to be a really handy coding buddy and faster "Google this shit 'cos I can't remember this exact thing I want to achieve". I'll ask, I'll almost always get a pretty good answer, and then I can generally take that snippet of code and implement it how I want.

I've seldom had to walk away from that sort of interaction because it was getting me nowhere.

All of which is to say: I remain concerned about a great many things in the AI space at the moment, but I'm equally suspicious of someone who just flatly says "and the code it produces just doesn't work". If that's part of an article or post I'm left with the feeling that the author put zero actual effort into forming their opinion, let alone actually writing it.


  1. To varying degrees. Sometimes I have plenty of time to kill and I read the PR carefully, other times I glance over it, am happy there's nothing horrific there, and then decide to push back or merge based on the results of hand-testing and automated testing.

  2. To be fair, it's the application I would still be writing and would be some time off finishing; there's no way it would be as feature-complete as it is now had I been 100% hand-coding it. 

I want to like Gboard

3 min read; 8 GFI

I want to like Gboard. On paper it looks really rather good. It's a keyboard from Google, it ties in with your account, it syncs things, it has clever searching for emoji and GIFs and the like... what's not to like?

Problem is, I've been a user of SwiftKey since around 2011 (I think it was). I'm very used to how SwiftKey works and it also contains a lot of handy things. I like that it has smart completion, that it learns how I type a bit skewed and that it takes this into account, that I can turn off the fancy swipe typing and instead make use of handy gestures like swipe-left to delete a word. I like some of the themes a lot.

Into the mix comes my iPad, which I use on occasion. The standard Apple keyboard is horrible and, sadly, I find SwiftKey on iOS just as frustrating. It seems to lack enough key features there (especially the word deletion gesture, as far as I can tell) that it's also a bit annoying. My dream of a consistent typing experience across all devices just wasn't happening -- until I found Gboard on iOS.

That felt almost right. And from what I could tell it worked almost exactly the same on iOS and Android. So it felt like a good time to try and force myself to use Gboard on my Google Pixel and Nexus 7.

Sadly, though, I'm just not getting on with it. It's okay. It's not bad. It's just... not good. I'm finding that it lacks enough useful things that it's a frustrating experience. Little things like: when I enter Google Search, there's no word completion in the keyboard (SwiftKey has that); the word deletion gesture (swipe left from the backspace key) seems very hit-and-miss; the most obvious completion for a word sometimes appears in the middle slot but, other times, in the left slot. And so on.

Nothing huge. Nothing that's a show-stopper. But a handful of little things that make me miss the comfortable home that is SwiftKey.

Don't get me wrong, it does have some very handy and clever features too. The searching for emoji -- including showing them up as word completions -- is rather clever. The GIF-search thing is all kinds of fun too (mostly used to annoy the hell out of my son on Twitter).

None of those quite make up for the bits I miss from SwiftKey though.

All that said, I've been making a point of pushing on with Gboard, thinking that most of my issues might just be because I'm too used to my "old home". Mostly this was working well, until I noticed something this morning. While reading the description for Gboard I noticed this handy thing in the "Pro Tips" section:

Sync your learned words across devices to improve suggestions (enable in Gboard Settings → Dictionary → Sync learned words).

Useful! I'd assumed that this was the case anyway -- it's Google after all -- but it's good to know I can ensure it's turned on. So I went to turn it on. This is what I found:

Gboard WTF

What the hell Google? Sure, I do have a Gsuite account on my phone -- as in various apps have access to a Gsuite account (Gmail, Drive, etc...) -- but it's not the primary account on my phone and it's not the account I'd really want to be doing the dictionary sync with anyway. If I've got dictionary sync I want it tied to the keyboard no matter the app I'm in, and no matter the account I'm using in that app. I want the keyboard to be tied to a specific account when it comes to sync (just like SwiftKey does it).

This, I think, is a show-stopper for me.

I can overlook the other niggles, I can learn to cope with it not being quite so perfect in some situations; but the blanket inability to do something as simple as cloud-sync the predictions and learn from how I type -- things that are, these days, central to what Google's about -- it's frankly stupid.

I guess I'm going to have to keep Gboard as a backup keyboard for those times when I need to find the perfect GIF.

Google WTF

Google Now Achievements?

1 min read; 7 GFI

Over the past couple of weeks or so I've been having some issues with Google Now. It first seemed to start on my Nexus 7, then appeared on my Nexus 6. More recently, even as of today, I've seen it on my Google Pixel. The problem is that, in the Google Now launcher (or on the Pixel, in the Pixel launcher), the Google Now page (that you swipe to the left for) sits empty for ages. All I see is the little animated waiting circle and nothing else. Once or twice I've had the Google app die and restart or, more often than not, after quite some time it finally loads up.

The latter happened a little earlier and I noticed something I'd not seen before:

Blank Google Now

What's with that "Achievements" menu option? You'll notice that the whole of the menu is blank -- no profile picture or anything and none of the menu options seemed to work.

Eventually, after I'd left it for a while, it ended up working.

Google Now finally working

And, once this happened, no "Achievements" option.

Presumably this is some back-end server issue: I'm being served up something I'm not supposed to be seeing and it's confusing the client app. Okay, I don't know that's the case, but it has that sort of feel.

So now I need to go looking for what this Achievements thing is all about.

Using Google, obviously.

Hello Google Pixel

2 min read; 9 GFI

For the past two years I've, mostly, been happily using a Google Nexus 6 as my phone. In the past six months or so I've started to notice that it hasn't been quite as good as it was. The main problem, for me, was that the camera was starting to play up. The issues were the ones that I've seen reported elsewhere: use of the camera would quickly make the phone laggy, very slow response times on pressing the shutter, occasional failure to save an image, etc. This was generally frustrating and, even more so, because I'd got back into photoblogging.

Meanwhile... I've been lusting over the Google Pixel ever since it was originally shown off. I was some way off my phone contract renewal and the price of a new Pixel was something I just couldn't justify. Last week though an offer cropped up that meant I could renew early and get a Pixel (including a free Daydream headset thrown in).

Fast forward to Monday just gone and...

My new Pixel

So far I'm liking it rather a lot. It is odd that it's smaller in my hand than the Nexus 6 was (the XL wasn't an available option and I was also starting to think it was time to drop down in size a little again) but I'm also finding it a little easier to work with; it's also nice that it fits in trouser pockets as well as jacket pockets.

It feels very fast (although every Android phone and tablet I've ever had has felt fast to start with) and smooth to use. I especially like the default feedback vibration -- it's a lot smoother yet also more reassuring than any I've felt before.

The Google Assistant is proving to be very handy. I'm sort of used to it anyway thanks to having owned an Android Wear watch for a couple of years but having it on the phone like this seems like a natural next step.

Another thing I'm getting very used to very quickly, and really liking a lot, is fingerprint recognition. I didn't think I needed it but now I'm wondering how I ever managed without it. Combined with the notification pull-down gesture that the recognition area supports it seems like a perfect way to open the phone and get going.

There's a couple of niggles with it, of course. The main one for me is the lack of wireless charging. That was something I really liked about the Nexus 6: I could be sat at my desk and have the phone sat on top of a charging pad, staying topped up. No such handy setup with the Pixel. The other thing is the lack of water resistance. To be fair: it's not something I've ever really felt I needed with other phones and I'm not in the habit of sticking them under water; but knowing that it doesn't matter too much if it gets exposed to rain would be nice.

Other than that... there's not much else to say right now. It works and works well, the move from the N6 to it was pretty smooth and the Pixel has fallen perfectly into my normal routine.

Until next alarm is back

1 min read; 8 GFI

Now and again Google seem to actually listen. While they do generally have a bad habit of removing things from things and saying it's for everyone's good (because options are bad and they can't maintain them, apparently) it seems they can do the odd turnaround now and again.

One thing they removed from Android recently was the "until next alarm" option when putting a device in "do not disturb" mode.

Seems they've added that back in 6.0.1:

Google sees sense

It's a small thing, but it makes so much more sense and makes things so much easier (even if it's a trivial thing).

Nice one Google. More of this please.

I miss "Until next alarm"

1 min read; 7 GFI

I actually can't remember when the change was now, it was either Android 5.0 or one of the 5.x point releases, but I can recall the frustration of Google having changed how you make an Android device silent, or not. The idea seemed clever enough but it was a real pain to switch to and use. Previously there'd simply been this neat system of setting the volume to either be some non-off value, vibrate or totally silent. I even had a neat little widget on the home screen of my phone to allow me to toggle between these 3 states.

It was simple, and worked well.

The new system though.... ugh. It was confusing and so much more long-winded to work with.

At some point though they added one big redeeming feature: "Until next alarm". When I got into bed I could tell my tablet to go totally silent until my alarm went off in the morning, and then it would all work as normal. That was an utterly brilliant idea.

So it made sense that if they changed anything about this in Marshmallow they'd keep that in and make it even more awesome, right? Right?!?

Nope

Well fuck!

Why? Just..... why?!? I actually prefer how the new one works. They've more or less solved the problem of how it was more faff to deal with, they've solved the problem of having to cock about with the volume rocker to get at the settings and then set the settings. I like all that.

But taking "Until next alarm" away? That's just nuts.

Sometimes I really get the impression that the Android developers are like the Chrome OS developers: they're having a ton of fun improving and onward developing the system but they have little connection to how people actually use this stuff.

Voice search failing on Nexus 6

1 min read; 10 GFI

It's been quite a while since I used voice search on my Nexus 6. Ever since I got the Moto 360 I've not really had a need to say "OK Google" to my phone because I could simply say it to my wrist. Today though, because I wanted to quickly look something up and my phone was to hand, I spoke to it and got this:

Voice search fail

Brilliant.

I've been here before. I had exactly this sort of problem with my Xperia Z at one point. The problem appeared to go away eventually (actually, it sort of came and went a few times over a matter of weeks, if I recall correctly), although I never really got to the bottom of the cause.

I've tried rebooting the phone and that hasn't helped at all. While it's more of a vague annoyance than anything else (like I say above, my Android Wear device is my go-to tool for talking to Google these days) it does frustrate a little when fairly expensive tools don't "just work".

Unknown promo

1 min read; 11 GFI

Ahh, Google, knower of all things that can be known about me, tracker of all things that can be tracked about me, controller of my phone and even my watch, able to use Now to suggest stuff I need to know even before I need to know it.

Tell me again what promo that 10GB was from...

Unknown Promo

Wear timer issue fixed, sort of

1 min read; 7 GFI

Following on from yesterday's problem with the Android Wear timer I think I now have a solution. It came up while chatting with Mike McLoughlin about the issue (on Google+).

I got to thinking that this problem felt like one that I've seen a number of times before with Google stuff. One thing that's rather common (in many cases for very obvious reasons -- you can't cover the whole world in one go) with Google is how they struggle to get languages and localisation right. This felt like it was a similar issue. Mike had reported that his watch appeared to be unaffected by the issue (I'm guessing he's on the latest version of Wear -- the conversation headed off in a different direction before that became necessary) so I checked what his language was on his phone. Turns out he was the same as me: British English.

So much for that idea.

But then he suggested switching to US English and back again.

Happy enough to apply a very Microsoft "turn it off and on again" approach to a Google device (really, all big tech companies really are the same and really do suffer the same issues) I switched to en-US on the phone and tried setting a timer in voice on the watch.

It worked!

So then I switched back to en-GB on the phone and...

I appear to have timers working again

...it still worked!

I've tried setting timers in voice on the watch a few times since and it's yet to fail.

It would appear, as odd as it is, that this is the fix. Well, a fix.