Miskatonic University Press

LibGuides

code4lib libraries

There’s that great old quote from Jamie Zawinski (though there’s more behind it):

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.

I paraphrase:

Some librarians, when confronted with a problem, think “I know, I’ll make a LibGuide.” Now they have two problems.

Combining multiple GPX tracks into one file

code geo

As I mentioned, I use OsmAnd when I’m out on my walks. I record my “tracks”—the path I take as I walk—and the data is stored in GPX files, one per day. I like to see them all at once on the map, so I can see everywhere I’ve walked, but now I have 53 GPX files and turning them all on or off is a pain. I needed some way to combine them all into one.

It turns out GPSBabel does the trick. I wrote this shell script to do the merging. It depends on all the GPX files being in a gpx/ directory.

#!/bin/bash

# Build a "-f file" argument for every GPX file, for gpsbabel's batch mode.
(for FILE in gpx/*
 do
     echo -n " -f $FILE"
 done
) > files-to-combine.txt

# Merge all the tracks into one GPX file.
gpsbabel -i gpx -b files-to-combine.txt -o gpx -F covid-walks.gpx

rm files-to-combine.txt

# Give the combined track a sensible name.
sed -i "s#<name>.*</name>#<name>Covid-19 Walks</name>#" covid-walks.gpx

The -b is for batch processing and makes it easy to specify all the input files. The sed line gives the new file a nice name, instead of being a concatenation of all of the old ones. There may be some way to do this in GPSBabel (it has a very long list of possible filters) but a GPX file is just XML, which is just text, and good old sed will always do you right in such a case.

Copying covid-walks.gpx into /Android/data/net.osmand.plus/files/tracks/rec on my phone (here’s how I mount it) makes it visible to OsmAnd, so I just have the one file that needs visibility turned on or off.
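(For the record, the copy itself is just a plain cp once the phone’s storage is mounted; the path below is only a stand-in for wherever the phone shows up on your machine.)

# The mount point is a stand-in; use wherever the phone actually appears.
PHONE=/path/to/phone
cp covid-walks.gpx "$PHONE/Android/data/net.osmand.plus/files/tracks/rec/"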

A Visitor for Bear read by Donnie Yen

kady.macdonald.denton

Donnie Yen read A Visitor for Bear, written by Bonny Becker and illustrated by my mother Kady MacDonald Denton, for Save the Children’s Save with Stories campaign: watch it here. It’s a great reading, and wonderful to hear it in a different language (Cantonese, I think). I just wish they’d shown the illustrations a couple more times!

Screenshot from the video, with Donnie Yen

Emacs refactoring

emacs

I spent a while updating my Emacs configuration, and this time nothing went wrong! I’m pleased with the refreshed and refactored setup. Everything looks the same as it did before, except for some colours in the dark Solarized theme, because I’m using a different package for that. Behind the scenes it’s all a lot tidier.

The main change is that everything now depends on John Wiegley’s use-package.

;; If it's not installed, get it.
(unless (package-installed-p 'use-package)
  (package-refresh-contents)
  (package-install 'use-package))

(eval-when-compile
  (require 'use-package))

;; Make sure that if I want a package, it gets installed automatically.
(setq use-package-always-ensure t)

Screenshot of Emacs while I'm editing this post

With that in place, for every package I want to use, I use use-package to magically fetch, install and configure it. A typical example is this one, for the Emacs mode for RuboCop, the Ruby linter that tells you, as you’re writing, when you’ve made a mistake in your Ruby code.

;; Rubocop for pointing out errors (https://github.com/bbatsov/rubocop)
(use-package rubocop
  :diminish rubocop-mode
  :config
  (add-hook 'ruby-mode-hook 'rubocop-mode))

The diminish line keeps away some mode-line cruft that says “RuboCop mode is on,” which I don’t need reminding of; and the add-hook means that whenever the editor is in Ruby mode (i.e., editing a Ruby script), rubocop-mode is turned on automatically.

I’m now caught up to where a lot of other people were years ago. Next: moving it all into one big Org file!

One thing worth noting is that I fixed some problems I was having where M-x package-autoremove kept removing packages I actually wanted. It turned out the problem was with the variable package-selected-packages, which was introduced in Emacs 25.1 and whose value is saved in custom.el. It had a huge list of packages, many of which I don’t want any more, but some I do want weren’t in it. Just one of those things.

I fixed it by brute force. I deleted the line from custom.el, quit Emacs, restarted, and ran M-x package-autoremove. Emacs said, Do you really want to remove all the 57 packages you have installed? I said yes. They got wiped out. I quit again, restarted, and this time use-package installed everything I wanted, updated package-selected-packages, and now everything is working correctly.

Backups and encrypting disks in Linux

unix

I did some maintenance on my backup system this week, and for posterity I’ll document how it works, including how to set up an encrypted hard drive in Linux (in my case, Ubuntu).

Hard drive docks

Backup drive in a dock

I do backups to hard drives sitting in a dock, which I attach when needed with USB. The dock I have right now is made by Vantec and takes two hard drives. I got it, and the drives, at Canada Computers (which these days, by the way, has a great curbside pickup service).

These docks make it easy to have lots of cheap storage on hand. You can leave drives sitting in them, take them out, or swap them around (carefully) as needed; plug the dock in and there are your terabytes of disk. For backups, where speed doesn’t matter, you can buy slower and cheaper drives, and the larger 3.5” drives are cheaper than the 2.5” drives that go inside laptops, or, of course, than solid state drives.

Backup scripts

There’s a primary backup drive, to which I copy everything, and a mirror, which is an exact copy of the primary. Every week I run my backup scripts to refresh everything on the main drive, and then I refresh the mirror. The primary right now is CRYPT_THREE, and backup scripts look like this one, which backs up my web site from its host:

#!/bin/sh -x

BACKUP_DRIVE=CRYPT_THREE

PAIRBACKUPS=/media/wtd/${BACKUP_DRIVE}/backups/pair

rsync -avz --rsh="ssh -q" --delete --times --progress pair:public_html/miskatonic.org/ "$PAIRBACKUPS/miskatonic.org/"

(Now that I look at it, those rsync options need tidying: --times is redundant, and I like to use GNU-style long option names, so it should be --archive --verbose --compress --delete --progress --rsh="ssh -q". I don’t think I need the rsh option either, but I’ll fiddle with that later. It works as it is.)
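Spelled out that way, the tidied-up line would be something like this (same transfer, just with the long option names listed above; I haven’t actually swapped it in yet):

rsync --archive --verbose --compress --delete --progress --rsh="ssh -q" \
      pair:public_html/miskatonic.org/ "$PAIRBACKUPS/miskatonic.org/"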

Backing up my laptop looks like this:

#!/bin/sh -x

BACKUP_DRIVE=CRYPT_THREE

dpkg --get-selections > ~/backups/marcus-packages.txt

dump-library.sh

rdiff-backup --verbosity 5         \
             --include /home/wtd/             \
             --include /var/www/              \
             --include /etc/apache2/          \
             --include /etc/hosts             \
             --exclude '**'                   \
             / /media/wtd/${BACKUP_DRIVE}/backups/marcus/

The dpkg command makes a list of all of the packages currently installed, and dump-library.sh does a dump of the database behind my personal library catalogue, Mrs. Abbott.
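I’m fairly sure the usual counterpart, if I ever had to rebuild the machine from that list, would be something along these lines (untested by me):

sudo dpkg --set-selections < ~/backups/marcus-packages.txt
sudo apt-get dselect-upgrade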

rdiff-backup does incremental backups: it keeps a mirror of my files as they are now, plus diffs so I can go back and look at things as they were last month or a couple of years ago. All the other backups just mirror what’s on remote machines, but for the laptop where I do everything this means that if I need something I deleted years ago I can go back and find it. (Every now and then I wipe out a year’s worth with something like rdiff-backup --remove-older-than 2014-01-01 /media/wtd/CRYPT_THREE/backups/marcus/.)
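For the record, pulling an old version of something back out looks roughly like this with --restore-as-of (the file and date here are made up, and it’s worth checking the rdiff-backup documentation for the version you have installed):

rdiff-backup --restore-as-of 2018-06-01 \
             /media/wtd/CRYPT_THREE/backups/marcus/home/wtd/some-file.txt \
             /tmp/some-file.txt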

And now that I look at this I see there’s more in /etc/ I should be backing up, so I’ll tweak that.

All those scripts get everything onto my primary backup drive, currently CRYPT_THREE. This mirrors it to CRYPT_TWO (and again --times is redundant):

#!/bin/sh

BACKUP_DRIVE=CRYPT_THREE
MIRROR_DRIVE=CRYPT_TWO

rsync --archive --verbose --delete --times --exclude "/lost+found/" /media/wtd/${BACKUP_DRIVE}/ /media/wtd/${MIRROR_DRIVE}/

Making an encrypted drive

Making those encrypted drives takes some special commands. Here’s how I did it when I bought a new 4 TB drive because my 2 TB backups were running out of space. (Don’t ask me how dm-crypt actually works.)

First, I took out the other drives from the hard drive dock. I put the new drive in and turned it on. It was unformatted, so it didn’t get mounted, but the computer knew it was there and gparted saw it. It was identified as /dev/sdb. I set the partition table type to GPT (GUID Partition Table).

Always make sure you’re formatting and encrypting the new hard drive and not the drive in your laptop with all your files on it! That way madness lies. Scrutinize the /dev/sdb stuff very carefully. My main drive is /dev/sda. A, B, A, B … I always check multiple times before I do anything serious.
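One quick way to double-check (just a habit of mine, nothing fancy): list the block devices and make sure the sizes, models and mount points line up with what you expect before running anything destructive.

# The new, empty drive should be the one with no mount point.
lsblk -o NAME,SIZE,MODEL,MOUNTPOINT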

Then:

sudo cryptsetup luksFormat /dev/sdb

Here I confirmed I know what I’m doing and then entered the passphrase for the disk. I always type the passphrase in by hand (twice) to make sure nothing funny happens from copying and pasting, like a weird end-of-line character or a space sneaking in accidentally.

Then I opened it, put a file system on it and gave it the name CRYPT_THREE:

sudo cryptsetup open /dev/sdb CRYPT_THREE                # unlock it and map it as /dev/mapper/CRYPT_THREE
sudo mkfs.ext4 /dev/mapper/CRYPT_THREE                   # put an ext4 file system on the mapped device
sudo cryptsetup close CRYPT_THREE                        # close the mapping
sudo cryptsetup --type luks open /dev/sdb CRYPT_THREE    # open it again, explicitly as LUKS
sudo cryptsetup close CRYPT_THREE                        # and close it once more

I don’t know where I got that incantation, but it’s copied from somewhere. I don’t know if the close followed directly by an open could be collapsed, but it works and I’m not going to touch it. (On the other hand, there’s nothing to lose when you’re setting up a new hard drive because it’s empty and you can just reformat it, so maybe next time I’ll look into it.)

Now it was ready. I safely unmounted the drive and turned the dock off, then turned it on and checked what happened. The system saw the drive and asked for the passphrase, then mounted it. On Ubuntu, it shows up as /media/wtd/CRYPT_THREE. It works!
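If the desktop ever doesn’t do that unlocking and mounting automatically, the same thing can be done by hand with cryptsetup and mount; the mount point here is just an example:

sudo cryptsetup open /dev/sdb CRYPT_THREE       # prompts for the passphrase
sudo mkdir -p /mnt/backup                       # example mount point
sudo mount /dev/mapper/CRYPT_THREE /mnt/backup
# ... and to put it away again:
sudo umount /mnt/backup
sudo cryptsetup close CRYPT_THREE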

Moving drives around

Before I got the new 4 TB drive, I had two 2 TB drives. Call them A and B: A was the primary, B was the mirror. I wanted to make the new drive the primary, and keep the mirror. The steps were:

  • Format the new drive, C.
  • Mirror A to C.
  • Change my scripts so that they back up to C.
  • Run all my scripts to do fresh backups to C.
  • Mirror C to B.
  • Confirm everything works. (If it doesn’t, A is still good to be the primary.)

Now C is my primary backup and B still the mirror. That left me with A, which became my offsite backup. Normally, next I’d take B offsite and bring back A, then mirror C to A and make A my local mirror, and keep switching A and B regularly. However, by then I will have moved everything to 4 TB drives, so I’ll have D and E. I’ll set them up and mirror C to both of them, then use D as my local mirror and take E offsite. When I bring back A that will leave me with A and B as superfluous 2 TB drives. I have some ideas for how to use them, and will write them up if things work out.

Here is my offsite backup tucked into a special protective case before I took it somewhere (obeying all isolation directives) to sit on a quiet shelf.

Backup drive in a protective case

Monforte Dairy

covid19 wychwood.barns

Another regular stop at the Wychwood Barns farmer’s market (previously: ChocoSol, Clover Roads Organic Farm, Alchemy Pickle Company and Motherdough Mill and Bakery) was the Monforte Dairy table, where a friendly cheesemonger would have a few coolers of cheese and yogurt with samples for tasting. This was where I got pretty much all my cheese. I’d usually get one hunk and a bag of cheese curds. Their cheese curds are probably the best I’ve ever had: big, soft and thick, without any unnecessary colouring or flavouring.

I had resigned myself to getting cheese at the grocery store, but the ones I go to have a poor selection and nothing local. Then I discovered that Monforte does delivery! You can get a box of 1.5 kg of seven or so different types of cheese (they pick) for $60. They charge a flat rate of $50 per kilo for all their cheeses, so you’re getting $15 worth of free cheese delivered to your door.

This is my first delivery, last month.

Monforte Dairy Cheese

The blue was a real treat.

Monforte Dairy Cheese

A couple of weeks ago I got another box.

Monforte Dairy Cheese

That Golden Child was astounding: fresh as anything, a bit runny with a lactic tang. Made quick work of that, believe me.

And the Water Buffalo Fresco was delicious … and seemed familiar … I ate more … it’s water buffalo … it’s fresh … wait a minute, this is mozzarella di bufala! (Except that’s a protected name, so it can’t be called that, and I guess it’s got a different shape.) When the tomatoes are back this summer I’m definitely going to get more of this and have a fresh Caprese salad. I hope there’s more of it in my next box.

I got a good tip from the woman who was at the market over the winter. I’d been having a problem where when I unwrapped a soft cheese the rind would start to go moldy pretty fast. She suggested wrapping it in parchment paper and putting it in a sealed container, and this did the trick. It’s probably an indication that I need to really clean my fridge, but it’s a good cheese habit generally. (She also gave me a free piece of cheese to experiment on.)

Humans and dinosaurs

vagaries

The other day I remembered Stockwell Day, a bigoted ignoramus who was briefly the leader of one of the right-wing political parties in Canada during the Conservative schisms before the party reassembled and ultimately gained power (eventually losing to Trudeau Minor’s Liberals). He’s probably most famous for making a complete ass of himself trying to look cool by driving up to a shoreline news conference on a Sea-Doo.

Day was justly ridiculed for believing that humans and dinosaurs co-existed. He meant a few thousand years ago, like the Flintstones. It came up in a CBC documentary, as described in Day Lashes Out Against Liberal Attacks and the CBC (CBC, 15 November 2000):

Day also accused the CBC of practising what he called “yellow journalism.” Day’s anger with CBC Television is over a documentary aired Tuesday night on The National.

The report looked at the relationship between Day’s religious beliefs and his politics.

It included an interview with an academic who said he attended a Day speech at Red Deer College where the Alliance leader said there was as much evidence to support creationism as evolution. The man said Day told the audience that the earth was 6,000 years old and that humans and dinosaurs co-existed.

Here’s an op ed, Creationism and Stockwell Day (Globe and Mail, 17 November 2000):

Where faith and science become incompatible is in the rigid refusal to accept the signposts of science where they conflict with articles of faith. In a documentary aired Tuesday on CBC-TV’s The National, the head of natural science at Red Deer College in 1997 said he heard Mr. Day tell a crowd that the world is only several thousand years old and that men walked with dinosaurs. While that may be consistent with the literal word of Genesis, it is inconsistent with the evidence uncovered by geologists and others, and subjected to tests and challenges, that Earth is billions of years old and that, The Flintstones notwithstanding, dinosaurs died off tens of millions of years before humans first appeared.

But the thing is … dinosaurs and humans did co-exist and in fact co-exist today. I heard dinosaurs when I woke up this morning. Last week I ate roasted dinosaur (with roasted squash sprinkled with harissa, and green peas—it was delicious). For breakfast yesterday I had a dinosaur egg sunny side up.

The details of bird evolution were still unsettled twenty years ago, and Stockwell Day had most of his facts wrong, but it turns out he actually was (though not in the way he meant) correct.

No more FLAC on Listening to Art

listening.to.art

Tomorrow I begin the fourth year of Listening to Art with volume seven number one (Marcel Duchamp’s Bicycle Wheel). I dropped the FLAC podcast feed and now there’s just the MP3 feed, but at a higher bit rate. The recordings are not of such incredible audio quality that having the FLAC version matters, and having two files for each issue (tomorrow is the seventy-third) was adding up to a fair bit of disk space.

People who follow the FLAC feed may see volumes 01 to 06 pop up in their player again and will have to ignore them, but they don’t need to go to a new URL. They’ll just get the MP3 versions from now on.

Recamán's Sequence

mathematics

Thanks to the Neil Sloane episode of the Numberphile podcast, I learned about Recamán’s sequence (look for the “listen” link and fool around with that):

a(0) = 0;

for n > 0,

a(n) = a(n-1) - n if nonnegative and not already in the sequence,

otherwise a(n) = a(n-1) + n.

The first twenty numbers in the sequence are: 0, 1, 3, 6, 2, 7, 13, 20, 12, 21, 11, 22, 10, 23, 9, 24, 8, 25, 43, 62.
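Here’s a quick Bash sketch (mine, not from the podcast or the OEIS entry) that generates those first twenty terms:

#!/bin/bash
# Print the first N terms of Recamán's sequence (needs Bash 4+ for associative arrays).
N=20
declare -A seen
a=0
seen[$a]=1
terms=("$a")
for ((n = 1; n < N; n++)); do
    back=$((a - n))
    if ((back >= 0)) && [[ -z ${seen[$back]} ]]; then
        a=$back        # subtract n: nonnegative and not already in the sequence
    else
        a=$((a + n))   # otherwise add n
    fi
    seen[$a]=1
    terms+=("$a")
done
echo "${terms[@]}"   # 0 1 3 6 2 7 13 20 12 21 11 22 10 23 9 24 8 25 43 62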

In the comments, Neil Sloane said, “I conjecture that every number eventually appears.” But then in 2017 he added: “That was written in 1991. Today I’m not so sure that every number appears.”

As of 2018, 10^230 terms have been calculated, but 852655 has not yet shown up.

The Slightly Spooky Recamán Sequence video with Alex Bellos has some great visuals.

Ripping DVDs with ffmpeg

unix

For my future reference, here’s how I converted one video on a DVD into a better format. The video was family home video shot in 1994 on a camcorder and had later been digitized and put on DVD, but I wanted to get it off that and into a more modern, compressed format. I knew ffmpeg would do it, but I find video files confusing and didn’t know how.

When the DVD is mounted as data there’s just a VIDEO_TS directory in it:

drwxr-xr-x 2 wtd wtd 4096 May 10 18:46 VIDEO_TS/

Inside that there were a few files. The VOB files are the actual video and the marvellous VLC will of course play them.

-rw-r--r-- 1 wtd wtd      12288 Dec 31  2005 VIDEO_TS.BUP
-rw-r--r-- 1 wtd wtd      12288 Dec 31  2005 VIDEO_TS.IFO
-rw-r--r-- 1 wtd wtd      30720 Dec 31  2005 VIDEO_TS.VOB
-rw-r--r-- 1 wtd wtd      40960 Dec 31  2005 VTS_01_0.BUP
-rw-r--r-- 1 wtd wtd      40960 Dec 31  2005 VTS_01_0.IFO
-rw-r--r-- 1 wtd wtd 1073739776 Dec 31  2005 VTS_01_1.VOB
-rw-r--r-- 1 wtd wtd  549941248 Dec 31  2005 VTS_01_2.VOB

The video is split into two files because there’s a maximum file size of 1 GB, so anything larger is divided up into multiple files. Thanks to the inevitable helpful answer on Stack Overflow (and some other Stack Overflow stuff), I got this going to concatenate the files and compress them with H.265 into a smaller MP4 video file. (If there are more VOB files they can be added by extending the concat directive.)

ffmpeg -i "concat:VTS_01_1.VOB|VTS_01_2.VOB" -vcodec libx265 new-video-h265.mp4

This processed the video in more or less real time (about an hour), but it was on an old box. The result:

-rw-r--r-- 1 wtd wtd  304841184 May 10 19:34 new-video-h265.mp4

That’s 1.5 GB down to 290 MB, or about an 80% reduction in file size. It looks just like it did when we first watched it on the VCR.

Finally, whenever I think of old video, I always turn to The Gerry Todd Show.
