git2gantt -- Simple tool to visualise coding runs

Posted on 2019-12-08 13:44 +0000 in Python • Tagged with Python, documentation • 3 min read

At the start of this year, as part of a much bigger process to review the work that had taken place over the previous 12 months, I was asked (at work) to provide some information about how much time I'd spent on various projects. Now, for me, there's really only one project, but there's lots of different tools and libraries that I've written to support the main work I do. All of these are split into different repositories in the company-internal instance of GitLab. This meant that getting a rough idea of what I was working on and when would be easy enough -- it's all there in the commit history.

Given that this information would make up a couple of slides at most during a far bigger presentation, I wanted something that would be snappy and easy for non-developers to follow and understand. I spent a bit of time pondering some options and decided that (ab)using a gantt chart layout would make sense.

That choice was made all the more easier given that GitLab supports the use of mermaid charts within its Markdown. This meant I could quickly write some code that took the git log of each repository, turned it into mermaid code, and then render it (by hand, this was all about getting things done quickly) via GitLab.

This sounded like it could be a fun personal project. The result was some Python code called git2gantt.

As mentioned above, the output isn't anything too clever, it's just code that can be used to create a plot via mermaid. For example, running git2gannt over itself:

  title git2gantt output
  dateFormat YYYY-MM-DD

  section git2gantt
  Development: devgit2gantt20190208, 2019-02-08, 2019-02-13
  Development: devgit2gantt20190214, 2019-02-14, 2019-02-15
  Development: devgit2gantt20190303, 2019-03-03, 2019-03-04
  Development: devgit2gantt20191203, 2019-12-03, 2019-12-04

Usage is pretty straightforward: Screenshot 2019-12-08 at
13.18.12.png As you can see, it can be run over multiple repos at once, and there's also an option to have it consider every branch within each repository. Another handy option is the ability to limit the output to just one author -- perhaps you just want to document what you've done on a repo, not the contributions of other people.

Also especially handy, if you don't want to bore people with too much detail, is the "fuzz" option. This lets you tell git2gannt how relaxed you want it to be when it tries to decide how long a run of work on a repo lasted. So, perhaps, you're working on and off on a library that supports some other system you're documenting, but you might only be making changes every other day or so. With the correct fuzz value you can make it clear you were working on the library for a couple of weeks, despite there only being a commit every other day.

An example of running the output over a handful of projects would look something like this:

Screenshot 2019-12-08 at 13.34.41.png

This is one of those tools I knocked up quickly to get a job done, and haven't quite got round to finishing off fully. One thing I'd really like to do is add mermaid support directly within it, so that it actually has the option to emit plots, not just mermaid code (or, perhaps, drop the mermaid approach and use something else entirely).

Meanwhile though, if you're looking for something quick and dirty that will help you visualise what you've been working on and when for a good period of time... perhaps this will help.

pydscheck -- A quick hack that keeps slowly growing

Posted on 2019-10-26 13:19 +0100 in Python • Tagged with Python, documentation • 3 min read

Something I always try to do when I'm coding is be consistent. I feel this is important. While people's coding standards may differ, I think different approaches are easier to handle if someone has been consistent with their style across all of their code.

This also stands for documentation too.

In my current position, I do a lot of Python coding, and one of the things I like about Python (there are things I don't like too, but that's not for now) is that it has doc-strings (just like my favourite language). I use them extensively, ensuring every function and method has some form of documentation, and generally I use Sphinx to generate documentation from those doc-strings.

Early on I was bothered by the fact that, just by the simple act of making typos, I wasn't keeping the form of the doc-strings consistent. And in this case it was a really simple thing that was bugging me. Normally, if I'm writing a single-line doc-string, I'll write like this:

def one_liner():
    """Here is a one-line doc-string."""

So far, so good. But, if the doc-string is a multi-liner, I prefer the ending quotes to be on a line of their own, like this:

def multi_liner():
    """Here is the first line.
    Here is another line.
    Here is the final line.

But, sometimes, by accident, I'd leave a doc-string like this:

def multi_liner():
    """Here is the first line.
    Here is another line.
    Here is the final line.""""

While it's really not a big deal, it would bug me and every time I found one like this I'd "fix" it.

Eventually, it bugged me enough that I decided I was going to write a little tool to find all such instances in my code and report them. My first approach was to think "I could just do this with some regexp magic", which was really a bad idea. Then I though, I know, I should use this as an excuse to to play with Python's ast library.

That worked really well! I had the first version of the code up and running in no time. It was simple but did the job. It ran through Python code I threw at it and alerted me to both missing doc-strings, and doc-strings with the ending I didn't like.

That served me for a while, until one day I realised that it wasn't quite doing the job correctly; it was only really looking at top-level functions and top-level methods in classes. Sometimes, not often, but sometimes, I'll define functions within functions, and I feel they deserve documentation too. So then I modified the code to ensure it walked every part of the AST.

Since then, when I've run into new things and had new ideas, pydscheck has grown and grown. I've added checks that all mentioned parameters have a type; I've added checks that any function/method that returns something actually documents the return value; I've added checks that any documentation of a returned value includes its type; I've added checks that any function or method that yields a value documents that fact; I've added checks that ensure that every parameter is documented in some way.

Each time I've done this it's helped uncover issues in my code's documentation that could be cleaner, and it's also given me a pet project to slowly better understand Python's AST.

It could be that there are better tools out there, I'd have thought that a good doc-string linting tool would be something someone had already written. But this time around I was happy to NIH it because I needed a fun learning exercise that would also have some benefits for my day-to-day work.

I'll caveat this with the fact that it's very particular to how I work and how I like my documentation to look, but if it sounds useful, here it is:

There's still lots I could do with it. First off I should really properly package it up so it can be installed as a command line tool via pip. Other things that would be handy would be to allow some form of customisation of how it works. I'm sure there's other fun things I can do with it too.

That's part of the fun of having a pet project: you can tinker when you like and also get benefits from it as you use it.