Brute force and ignorance

My experiment with coding with an agent continues, and from the point of view of the usability of the results, I'd say it's going well. From the point of view of learning more about this sort of tool and the pitfalls: I'd also say it's going well.

The one thing I'm really starting to get an appreciation for is just how much it employs a sort of brute force and ignorance approach to getting to a solution for some sort of problem. I noticed this on and off as I watched the agent's chat, as it worked away.

One example I enjoyed was when it was trying to add a new dependency to BlogMore. For reasons I should probably try and understand¹ it knew it should be using uv to add a dependency and it knew how to use it to update the lock file, but it didn't have uv to hand:

The lock file (uv.lock) needs to be updated to include rcssmin. Since uv isn't available, I'll need to note this in the PR. The lock file should be generated by uv when the pyproject.toml is updated.

So it decided the best solution was probably to... "hand-edit" the lock file.

The lock file needs to be updated with rcssmin. Since uv is not available in this environment, I need to update it manually or find another way.
I need to update the uv.lock file to include rcssmin. The uv.lock file needs to include:
The rcssmin package entry with hashes
The rcssmin dependency in the blogmore package section
Let me look at the PyPI page for rcssmin 1.2.2 to get the hash.

It then grabs rcssmin from PyPI and checks out the information it wants; after which it goes:

I need to update the uv.lock file with the rcssmin package information. The uv.lock file uses SHA256 hashes. Let me look at the existing format in more detail to understand what I need to add, and then I'll query PyPI for the hashes.
Actually, let me think about this differently. The uv.lock file is a generated file, and updating it manually is error-prone. The proper way to update it is to run uv sync or uv lock. Since uv is not available in this environment, I have two options:
Let the CI update the lock file (risky if CI fails on the locked version)

So... around this point it has some sort of plan for hand-patching the lock file, but also recognises that this is a dodgy approach and it should be using uv to update the lock file.

I also really like the part where it says it has two options, then goes on to provide an exhaustive list of... one.

After this it runs this code:

import json, urllib.request
url = 'https://pypi.org/pypi/rcssmin/json'
try:
    with urllib.request.urlopen(url, timeout=10) as f:
        data = json.load(f)
    v = '1.2.2'
    releases = data['releases'].get(v, [])
    for r in releases:
        print(f'filename: {r[\"filename\"]}')
        print(f'  url: {r[\"url\"]}')
        print(f'  sha256: {r[\"digests\"][\"sha256\"]}')
        print(f'  size: {r[\"size\"]}')
        print(f'  upload_time: {r[\"upload_time_iso_8601\"]}')
except Exception as e:
    print(f'Error: {e}')

after which it concludes:

Now I have all the information I need to update the uv.lock file. I need to add:
The rcssmin package entry in the lock file
Add rcssmin to the blogmore package dependencies

I mean... okay, I guess. Again, I'll admit that I'm still getting used to setting this sort of thing up and working with it (and keep in mind none of this was done locally; this is all via GitHub Copilot, up on GitHub), but it's kind of funny and wild that the approach taken was to recognise that using uv to remake the lock file was the best approach so... let's hand-edit the lock file.

But that's not the best part. This is where it gets really fun. It had decided that the lock file must be in alphabetical order. Okay, sure, why not:

The lock file has packages sorted alphabetically, so rcssmin should go between pygments and python-dateutil (since "r" comes before "py" actually... no, wait - "r" comes after "p" in the alphabet. Let me check the order again).

Since "r" comes before "py."

Since "r" comes before "py."

o_O

O_o

Extra credit here for giving me a good giggle. I really appreciate the way that it catches itself mid-sentence and remembers how the alphabet actually works.

As to the outcome of all of this? Sure, the feature I wanted to add got added; it worked away and got to a working solution in the end. But the route it took was, I think it's fair to say, a "brute force and ignorance" approach.

I've not been spending too much time reading the agent's own chatter, but when I have I've found myself amused by the dead ends it'll wander down and then work its way back out. There is, without question, a recognisable process here: I believe it would be a dishonest developer who says they've never had times in the past, or just one of those off days, where they've fallen down a rabbit hole of a solution, only to realise it's the right solution implemented in the worst possible way. There's also a resemblance here to how more junior developers work a problem until they really develop their skills.

I think I'm going to keep an eye on the agent chat a bit more from now on. While I imagine things will only improve as these tools improve, for the moment it's a good source of coding comedy.

Presumably there's things I can be doing to make its life easier. ↩