(setf (point) some-location-or-other)
(setf (buffer-string) "")
There’s a whole background to why I’ve tended to code like that, that stems from enjoying Common Lisp, my days reading (and sometimes posting to) comp.lang.lisp, and I think some of the stuff Erik Naggum wrote back in the day. I won’t get into it all now; I’m not sure I can even remember a lot of how I got there given how far back it was.
Wanting to quickly get to the bottom of why the above was suddenly an issue, I dived into the NEWS file and found the following:
** Many seldom-used generalized variables have been made obsolete.
Emacs has a number of rather obscure generalized variables defined, that, for instance, allowed you to say things like:

    (setf (point-min) 4)

These never caught on and have been made obsolete. The form above, for instance, is the same as saying

    (narrow-to-region 4 (point-max))

The following generalized variables have been made obsolete: 'buffer-file-name', 'buffer-local-value', 'buffer-modified-p', 'buffer-name', 'buffer-string', 'buffer-substring', 'current-buffer', 'current-column', 'current-global-map', 'current-input-mode', 'current-local-map', 'current-window-configuration', 'default-file-modes', 'documentation-property', 'frame-height', 'frame-visible-p', 'global-key-binding', 'local-key-binding', 'mark', 'mark-marker', 'marker-position', 'mouse-position', 'point', 'point-marker', 'point-max', 'point-min', 'read-mouse-position', 'screen-height', 'screen-width', 'selected-frame', 'selected-screen', 'selected-window', 'standard-case-table', 'syntax-table', 'visited-file-modtime', 'window-height', 'window-width', and 'x-get-secondary-selection'.
As suggested above… this is my thing, this is how I coded some Elisp stuff. Look through much of my Emacs Lisp code and you’ll find me setf-ing stuff all over the place.
Apparently my style is “obscure”. Actually, I’m kinda okay with that if I’m honest.
This is going to be a bit of a pain in the arse; I’m going to have to go through a whole bunch of code and make it “less obscure”, at some point.
This isn’t the part that had me thinking I must be getting old. Oh no. The NEWS file had another little surprise in store:
** The quickurl.el library is now obsolete. Use 'abbrev', 'skeleton' or 'tempo' instead.
That… that’s me that is. Well, it’s one of the me things. If you run about-emacs, dive into Authors, and search for my name, in any copy of GNU Emacs from the last decade or two, you’ll find this:
Dave Pearson: wrote 5x5.el quickurl.el
quickurl.el was a package I wrote back in the late 1990s, back when I was a very heavy user of Usenet, and often found myself posting the same URLs in posts again and again; especially in comp.lang.clipper. As a fairly quick hack I wrote the code so that I could very quickly insert often-used URLs.
Some time later, I got an email from the FSF (I actually think it was from RMS – but that’s an mbox I’ve long ago lost – or a backup of it might be in storage back in England, on a DVD), asking if I wanted to contribute it to Emacs proper. This seemed like an odd thing to add to Emacs but, sure, why the hell not?
And so I had my second contribution to a body of code I used a lot (the first being 5x5.el – which itself was my first ever attempt at writing some non-trivial Elisp code).
So… yeah… here we are. I’m now old enough to have written some Emacs Lisp code, had it requested by the FSF for inclusion in Emacs, had it live in there for something like two decades, and then become obsolete!
A couple of days back (for vague values of “couple”, of course), first of the month, having my morning coffee, I go and open my bank’s mobile app to move a bit of money about and pay a couple of things. This happens every month. This is so routine I do it almost on autopilot.
Yeah, yeah, I know, it’s banking, pay attention! But still… morning, coffee, routine.
I get to the final movement/payment and then notice something:
That…. that text! WTF? So then I look back at my payment history and notice that all but one payment hadn’t gone through! O_o
This alone is fine. Stuff happens. Things fail. I’m okay with that. It’s an inconvenience for sure but doubtless whatever the problem is will be fixed and I can make the payments again later. But…
That result. There’s a tick. A GREEN tick. And a “Thank you”. It’s natural to see that image, know that it’s always meant “shit worked” and just carry on.
In one of my systems at work there’s a tool I wrote for checking a repository of code to make sure it conforms to a certain standard. When folk use it they get a nice big, bold and bright green thumb-up above the text that says everything is cool. If there’s a problem, any sort of problem at all, then the display is red and there’s no jolly icon and it’s obvious that things are different and you likely want to pay attention to the explanation of what isn’t right.
This isn’t news, of course. This isn’t some revelation about UI design or anything. We know this stuff. I think what boggles my mind a little bit about this is that something as important – and hopefully by this point as mature – as a mobile banking app should get something as obvious as this right.
But here we are, with a nice friendly green icon showing a tick and a friendly big “Thank you” followed by smaller text going “aye shit didn’t work pal”.
For well over a year now I’ve been recording my VR gameplay and uploading it to YouTube. Less as a “content creation” thing, more as a nice record of games I’ve played and, on occasion, as a little bit of help to others; in the past I’ve watched other folk play games I like to get ideas for approaches to them, and I’ve also received the odd comment now and again where my play-through has helped someone else.
A question I’ve had a couple of times is what I use to do the recording, so I thought I’d make an effort to write it all down here.
First up, a couple of things to note: I started recording PCVR around April 2021 and the initial setup was a bit trial-and-error and Google searching and blog reading. As such, not all of the details of how to set up will be here, and I may even miss off some stuff I changed that is worthy of note; at the same time I might mention stuff that’s just an obvious default.
Consider this blog post as being a written version of one of my videos: it’s for my own fun and benefit and might also help me in the future should I want to apply some of this again, and if it helps someone else that’s a lovely bonus.
While it’s not exactly the point of this post, I guess it’s worth mentioning the hardware I use as of the time of writing. Given this is about PCVR, I of course have a PC which is running Windows. The machine information within Windows says it’s a:
Intel(R) Core(TM) i5-10400F CPU @ 2.90GHz
Warning: I don’t do hardware. I buy it from time to time, but hardware leaves me bored. It runs VR on a PC. This is fine.
The machine itself has 16 GB of memory, is running Windows 10 Home and has a GeForce RTX 3060 for handling the graphics.
The headset I’m using is an Oculus (now Meta) Quest 2. I’ve had this since around November 2020, playing Quest-native games for the first few months, until I cracked and got the PC mentioned here to get into PCVR.
The headset is connected to the PC with a USB cable.
Finally, for recording voice, I use a USB lapel microphone with a really long cable.
It should be said that, yes, sometimes, I do get a little caught up in things with two cables hanging off me. If I could give one tip here it would be that running the microphone cable up your trousers and shirt makes life a ton easier. As a bonus I have the USB cable for the headset running around the headset’s strap and connected to it at the back and then running down my back.
The core software used is OBS Studio. This has got to be one of the best bits of free software I’ve ever used, in terms of interface and what it delivers.
Years back my son used to record and upload gameplay to YouTube and I can remember him having no end of issues using different recording software; some working with one game but not another, some other working with a different set of games, video and sync issues, etc… Lots of pain quite often. With OBS Studio the only issues I’ve ever had have been my own mistakes.
At this point I have to confess that when I set it up I didn’t make a point of keeping a recording of what I changed – I was experimenting and not expecting much to come of it. So what I note here are the things that feel like they’re important, and only the things that relate to recording PCVR, not streaming it (that might end up being a different blog post).
That said, here are things I seem to remember as being key:
The items in the output pane in settings that I have and which might be important are:
Recording Quality: High Quality, Medium File Size
Recording format: mkv
Encoder: Hardware (NVENC)
I do remember the recording format being set to mkv as something that’s really important. I think it’s mp4 by default, or was when I first installed, and if your machine crashes or OBS were to crash or something, you could end up with footage that can’t be used. Using mkv means you can still use the footage (as I understand it). It does mean that once you’re finished you have to use the “remux” option under the File menu, but that’s a small price to pay.
I can say that at least once I’ve had to hard-reboot my machine when a game and SteamVR and the like all got upset. I likely saved 45 minutes or more of footage thanks to mkv.
Nothing really special in here, I simply have both the base and output resolutions set to the desktop resolution. This might be something for me to tinker with in the future, but so far I’ve not run into any problems.
Now, of course, all of the above is great and fine and all but there’s the issue of how you capture the VR gameplay. I approach this a couple of different ways. The first is I use the OpenVR Capture plugin for OBS. This makes capturing footage from SteamVR really easy. The only downside I found is that out of the box there’s no default crop setting for using a Quest 2 (or I guess the Rift, as the Quest 2 sort of appears as a Rift to SteamVR games). As such I remember playing trial and error with that until I was happy I was getting as much footage as possible without having black bars and the like.
Something I also like about the OpenVR Capture plugin is you can say if you want to capture the left or right eye. Normally not that big a deal for some things, but if you’re playing a shooter and want people to see exactly what your dominant eye is seeing, that matters.
Sadly, of course, not every game can be captured with that plugin. So far I’ve found that any game that can’t be has its own mirror window on the desktop. In that case I use a Game Capture source and set it to capture that specific window. I could of course just get it to capture the focused window or something like that but I prefer to know that it’s only grabbing what I want it to grab.
That’s pretty much it I think. There’s not a lot to it, although on occasion a lot can go wrong. Mostly it’s a wonder any of it works. I mean, think about it, I have a computer with two screens strapped to my face, with two controllers in my hands talking to it; it’s then connected via the Oculus Link to the Oculus Home; from which I start up SteamVR; and from the SteamVR home I start up the game and then “live” inside the game. It’s a virtual world inside a virtual world inside a virtual world inside a real world; with lots of software along the way, all talking at once.
That is then being recorded.
Sometimes, on occasion, it takes a reboot or five to make it all work together.
Really, it’s a wonder it ever works. ;-)
I’m back! Almost. More or less. In more ways than one. First off, as often happens with blogs (we’ve all been there, right?), I’ve been away from blogging for a while. I’ve still been online, still been waffling away on Twitter, and have also stumbled into Fosstodon as well. Doubtless plenty of other things.
A big distraction for me, and one that is ongoing, is mucking about on YouTube. Since the last time I wrote anything on the blog I got myself a VR setup, and then a PCVR setup, and then finally fibre came to the village and I could stream, and… well, you can see how that would go.
So, in short, that’s where I’ve been and that’s what’s been keeping me busy. Now that I’m paying some attention to blogging again (hopefully!) I imagine some of that will end up on here – I’d quite like to write about VR and gaming amongst other things.
Now, I said I’d been away in more ways than one. Another way is explained by this post from back in 2019, where I said I was going to head over to Hashnode and carry on blogging there, obviously with an emphasis on development and just development.
That kept me busy for a while and worked out well, mostly. But… well, see above in part; I sort of ran out of steam when it came to purely-development topics. But I still wanted to write, a bit, and wanted to write about more than just development.
Also, something else was bothering me about being over on Hashnode. In the past year, in terms of what they promote themselves, especially blogs and posts they promote on their Twitter feed, they seem to have started to lean really hard into crypto and web3 and NFTs and all that stuff. This left me feeling like that was all a bit icky and it was time to put some distance between that platform and myself.
So over the past couple of weeks, low-level and as a background task, I’ve been back-porting posts from over there back into this blog. Starting with this post, all new blog content, be it about software development or anything else, will be on here. If I’m really sensible and don’t get distracted by new shiny… this should be how it remains now.
Expect some changes over the next few weeks. While I’m aiming to stick with the core tech (GitHub Pages, Markdown and Jekyll, Emacs to edit, etc), I’d like to tinker with the look and layout of the blog. The content will remain the same though.
So, yeah, anyway, if you’re reading this… hey, it’s good to be back. :-)
This post will cover the most important content of a 2bit file: the actual sequence data itself. In the first post I wrote about the format of the file’s header, and in the second post I wrote about the content of the file’s index.
At this point that’s enough information to know what’s in the file and where to find it. In other words we know the list of sequences that live in the file, and we know where each one is positioned within the file. So, assuming we have our index in memory (ideally some sort of key/value store of sequence names and their offsets in the file), given the name of a sequence we can know where to go in the file to load up the data.
So the next obvious question is, what will we find when we get there? Actual sequence data is stored like this:
| Content | Type | Size | Comments |
|---|---|---|---|
| DNA size | Integer | 4 bytes | Count of bases in the sequence |
| N block count | Integer | 4 bytes | Count of N blocks in the sequence |
| N block starts | Integer Array | 4*count bytes | Positions are zero-based |
| N block sizes | Integer Array | 4*count bytes | |
| Mask block count | Integer | 4 bytes | Count of mask blocks in the sequence |
| Mask block starts | Integer Array | 4*count bytes | Positions are zero-based |
| Mask block sizes | Integer Array | 4*count bytes | |
| Reserved | Integer | 4 bytes | Should always be 0 |
| DNA data | Byte Array | See below | |
Breaking the above down:
As mentioned in passing in the first post: technically it’s necessary to encode 5 different characters for the bases in the sequences. As well as the usual T, C, A and G, there also needs to be an N, which means the base is unknown. Now, of course, you can’t pack 5 states into two bits, so the 2bit file format solves this by having an array of block positions and sizes where any data in the actual DNA itself should be ignored and an N used in its place.
This is where my ignorance of bioinformatics shows, and where it’s made very obvious that I’m a software developer who likes to muck about with data and data structures, but who doesn’t always understand why they’re used. I’m actually not sure what purpose mask blocks serve in a 2bit file, but they do affect the output. If a base falls within a mask block the value that is output should be a lower-case letter, rather than upper-case.
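To make the layout in the table a little more concrete, here’s a minimal Python sketch that reads everything in a sequence record up to the start of the DNA data. It’s only an illustration: the function name `read_sequence_record`, the dictionary keys and the `byte_order` parameter are all made up for this example, and the little-endian default is an assumption (a real reader would need to work out the byte order from the file’s header).

```python
import io
import struct


def read_sequence_record(stream, byte_order="<"):
    """Read the fields that sit in front of the DNA data.

    `stream` is assumed to be positioned at the start of the record.
    `byte_order` is a struct-style prefix; "<" (little-endian) is only
    an assumed default for this sketch.
    """

    def read_u32(count=1):
        # Read `count` unsigned 32-bit integers as a tuple.
        return struct.unpack(f"{byte_order}{count}I", stream.read(4 * count))

    (dna_size,) = read_u32()
    (n_count,) = read_u32()
    n_starts = read_u32(n_count)
    n_sizes = read_u32(n_count)
    (mask_count,) = read_u32()
    mask_starts = read_u32(mask_count)
    mask_sizes = read_u32(mask_count)
    (reserved,) = read_u32()

    return {
        "dna_size": dna_size,
        # Pair each zero-based start with its size.
        "n_blocks": list(zip(n_starts, n_sizes)),
        "mask_blocks": list(zip(mask_starts, mask_sizes)),
        "reserved": reserved,
        # The packed DNA data begins wherever the stream now sits.
        "dna_pos": stream.tell(),
    }
```

Keeping the starts and sizes together as `(start, size)` pairs is just a convenience for later lookups; the file itself stores them as two separate arrays.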
The DNA data
So this is the fun bit, where the real data is stored. This should be viewed as a sequence of bytes, each of which contains 4 bases (except for the last byte, of course, which might contain 1, 2 or 3 depending on the size of the sequence).
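The arithmetic behind that is small enough to sketch in Python. The two function names here are hypothetical helpers for illustration, not part of any 2bit library.

```python
def dna_byte_count(dna_size):
    """Bytes needed to hold dna_size bases at 4 bases per byte."""
    return (dna_size + 3) // 4


def bases_in_last_byte(dna_size):
    """How many bases the final byte actually holds."""
    if dna_size == 0:
        return 0
    # A remainder of 0 means the last byte is completely full.
    return dna_size % 4 or 4
```

So a sequence of 9 bases needs 3 bytes, and only the first pair of bits in the third byte means anything.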
Each byte should be viewed as an array of 2 bit values, with the values mapping like this:
| Binary | Decimal | Base |
|---|---|---|
| 00 | 0 | T |
| 01 | 1 | C |
| 10 | 2 | A |
| 11 | 3 | G |
So, given a byte whose value is 27, you’re looking at the sequence TCAG. This is because 27 in binary is 00011011, which breaks down as:

00 01 10 11
T  C  A  G
How you pull that data out of the byte will depend on the language and what it makes available for bit-twiddling; those that don’t have some form of bit field will probably provide the ability to bit shift and do a bitwise and (it’s also likely that doing bitwise operations is better than using bit fields anyway). In the version I wrote in Emacs Lisp, it’s simply a case of shifting the two bits I am interested in over to the right of the byte and then performing a bitwise and to get just its value. So, given an array called 2bit-bases whose content is this:
(defconst 2bit-bases ["T" "C" "A" "G"]
  "Vector of the bases.

Note that the positions of each base in the vector map to the 2bit decoding
for them.")
I use this bit of code to pull out the individual bases:
(aref 2bit-bases (logand (ash byte (- shift)) #b11))
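For comparison, the same shift-and-mask idea can be sketched in Python. `BASES` and `decode_byte` are names invented for this example; the shifts of 6, 4, 2 and 0 pull out the four pairs of bits from left to right.

```python
# Index into this string with the 2-bit value: 0=T, 1=C, 2=A, 3=G.
BASES = "TCAG"


def decode_byte(byte):
    """Decode all four bases packed into a single byte."""
    return "".join(BASES[(byte >> shift) & 0b11] for shift in (6, 4, 2, 0))


print(decode_byte(27))  # 27 is 0b00011011, so this prints TCAG
```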
Given code to unpack an individual byte, extracting all of the bases in a sequence then becomes the act of having two loops, the outer loop being over each byte in the file, the inner loop being over the positions within each individual byte.
In pseudo-code, assuming that start and end hold the base locations we’re interested in and dna_pos is the location in the file where the DNA starts, the main loop for unpacking the data looks something like this:
# The bases.
bases = [ "T", "C", "A", "G" ]

# Calculate the first and last byte to pull data from.
start_byte = dna_pos + floor( start / 4 )
end_byte = dna_pos + floor( ( end - 1 ) / 4 )

# Work out the starting position.
position = ( start_byte - dna_pos ) * 4

# Load up the bytes that contain the DNA.
buffer = read_n_bytes_from( start_byte, ( end_byte - start_byte ) + 1 )

# Get all the N blocks that intersect this sub-sequence.
n_blocks = relevant_n_blocks( start, end )

# Get all the mask blocks that intersect this sub-sequence.
mask_blocks = relevant_mask_blocks( start, end )

# Start with an empty sequence.
sequence = ""

# Loop over every byte in the buffer.
for byte in buffer

  # Stepping down each pair of bits in the byte.
  for shift from 6 downto 0 by 2

    # If we're interested in this location.
    if ( position >= start ) and ( position < end )

      # If this position is in an N block, just collect an N.
      if within( position, n_blocks )
        sequence = sequence + "N"
      else
        # Not a N, so we should decode the base.
        base = bases[ ( byte >> shift ) & 0b11 ]

        # If we're in a mask block, go lower case.
        if within( position, mask_blocks )
          sequence = sequence + lower( base )
        else
          sequence = sequence + base
        end
      end
    end

    # Move along.
    position = position + 1
  end
end
Note that some of the detail is left out in the above, especially the business of loading up the relevant blocks; how that would be done will depend on language and the approach to writing the code. The Emacs Lisp code I’ve written has what I think is a fairly straightforward approach to it. There’s a similar approach in the Common Lisp code I’ve written.
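As a rough, runnable take on that pseudo-code, here’s a Python sketch. To keep it self-contained it makes a few assumptions that aren’t in the pseudo-code: the packed DNA bytes for one sequence are already in memory as `dna` (so there’s no `dna_pos` or file reading), the blocks are passed in as `(start, size)` pairs, and `within` is a simple linear scan rather than anything clever. The names `extract` and `within` are just for illustration.

```python
BASES = "TCAG"


def within(position, blocks):
    """True if position falls inside any (start, size) block."""
    return any(start <= position < start + size for start, size in blocks)


def extract(dna, start, end, n_blocks=(), mask_blocks=()):
    """Decode bases start..end-1 from the packed bytes in `dna`."""
    # Work out the first and last byte that hold bases we care about.
    first_byte = start // 4
    last_byte = (end - 1) // 4
    # The base position of the first pair of bits in the first byte.
    position = first_byte * 4
    sequence = []
    for byte in dna[first_byte:last_byte + 1]:
        # Step down each pair of bits in the byte.
        for shift in (6, 4, 2, 0):
            if start <= position < end:
                if within(position, n_blocks):
                    # In an N block the packed data is ignored.
                    sequence.append("N")
                else:
                    base = BASES[(byte >> shift) & 0b11]
                    # Mask blocks turn the base lower-case.
                    if within(position, mask_blocks):
                        base = base.lower()
                    sequence.append(base)
            position += 1
    return "".join(sequence)
```

For example, `extract(bytes([27, 27]), 1, 6)` gives `CAGTC`. Scanning every block for every base is fine for a sketch, but a real reader would probably want to narrow the blocks down to those that intersect the requested range first, as the pseudo-code does.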
And that’s pretty much it. There are a few other details that differ depending on how this is approached, the language used, and other considerations; one body of 2bit reader code that I’ve written attempts to optimise how it does things as much as possible because it’s capable of reading the data locally or via ranged HTTP GETs from a web server; the Common Lisp version I wrote still needs some work because I was having fun getting back into Common Lisp; the Emacs Lisp version needs to try and keep data as small as possible because it’s working with buffers, not direct file access.
Having got to know the format of 2bit files a fair bit, I’m adding this to my list of “fun to do a version of” problems when getting to know a new language, or even dabbling in a language I know.