Miskatonic University Press

Using Llama 3 for scripting help with curl and long podcast URLs

llm unix

Someone, I think it was Ed Summers, mentioned a podcast episode I hadn’t heard of, Episode #202: Why we can’t think beyond capitalism. - Neoliberalism (Mark Fisher, Capitalist Realism). I didn’t know anything about the show, but there was an RSS feed (as there is for all true podcasts) so I had a look at that. A podcast’s RSS feed is a clear and easy way to see what the show is like, without a lot of cruft. I thought I’d download episode 202 and give it a listen.

This is the URL for episode 202:

https://pdst.fm/e/chrt.fm/track/47G541/pscrb.fm/rss/p/mgln.ai/e/433/claritaspod.com/measure/traffic.megaphone.fm/QCD6489207092.mp3?updated=1717391748

Wow! That looks very long and rather suspicious. The URL has a lot of domain names (such as chrt.fm) in it. What’s going on? The best tool to find out is curl. Here’s what I ran at the command line, with the output:

$ curl --head "https://pdst.fm/e/chrt.fm/track/47G541/pscrb.fm/rss/p/mgln.ai/e/433/claritaspod.com/measure/traffic.megaphone.fm/QCD6489207092.mp3?updated=1717391748"
HTTP/2 302
date: Wed, 10 Jul 2024 23:24:45 GMT
content-type: text/html; charset=utf-8
access-control-allow-origin: *
location: https://chrt.fm/track/47G541/pscrb.fm/rss/p/mgln.ai/e/433/claritaspod.com/measure/traffic.megaphone.fm/QCD6489207092.mp3?updated=1717391748
via: 1.1 google

The --head option makes curl send a HEAD request and print only the response headers (mostly metadata) instead of downloading anything. Notice the HTTP 302 status and the location header. Together they tell the client (browser or podcast program or whatever) that it should look to this other URL to get what it wants. But if we get that URL …

$ curl --head "https://chrt.fm/track/47G541/pscrb.fm/rss/p/mgln.ai/e/433/claritaspod.com/measure/traffic.megaphone.fm/QCD6489207092.mp3?updated=1717391748"
HTTP/2 302
content-length: 0
location: https://pscrb.fm/rss/p/mgln.ai/e/433/claritaspod.com/measure/traffic.megaphone.fm/QCD6489207092.mp3?updated=1717391748
date: Wed, 10 Jul 2024 23:28:24 GMT
server: uvicorn
set-cookie: _chtbl=a225a87f117649acb2171c1dd6a37ffb; Domain=.chrt.fm; Path=/; SameSite=lax
x-cache: Miss from cloudfront
via: 1.1 4c6036e1a9755ebb992fa03bf694150e.cloudfront.net (CloudFront)
x-amz-cf-pop: YUL62-P2
x-amz-cf-id: qy9MNGbTQ1PvhMtqgaNesEm86ukfgOpUfy7CJ1FmX8XKInKY9bxv3g==

Another location header, still full of things that look like domain names. It’s going to take a while to get to the bottom of this.
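
The things that look like domain names can be listed straight from the URL, before making any more requests. Here is a rough sketch in bash. The function name list_embedded_hosts and the pattern for what counts as a hostname are assumptions made for illustration, and it only guesses from the shape of each path segment.

```bash
# Hypothetical helper: print every path segment of a URL that is shaped
# like a hostname (labels joined by dots). A heuristic, not a URL parser.
list_embedded_hosts() {
  local url="${1%%\?*}"   # drop the query string
  url="${url#*://}"       # drop the scheme, e.g. https://
  url="${url%/*}"         # drop the last segment (the file name)

  local hostname_shape='^[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$'
  local -a segments
  local segment
  IFS='/' read -r -a segments <<< "$url"
  for segment in "${segments[@]}"; do
    if [[ "$segment" =~ $hostname_shape ]]; then
      printf '%s\n' "$segment"
    fi
  done
}
```

Run on the episode URL, this prints pdst.fm, chrt.fm, pscrb.fm, mgln.ai, claritaspod.com and traffic.megaphone.fm, one per line. It says nothing about what each host does with the request; for that the redirects still have to be followed.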

I wrote myself a little function (for bash, the shell I use), so I can supply a URL and get the location header back.

$ function get_location() {  curl --silent --head "$1" | grep location | sed 's/.* //'; }
$ get_location "https://pdst.fm/e/chrt.fm/track/47G541/pscrb.fm/rss/p/mgln.ai/e/433/claritaspod.com/measure/traffic.megaphone.fm/QCD6489207092.mp3?updated=1717391748"
https://chrt.fm/track/47G541/pscrb.fm/rss/p/mgln.ai/e/433/claritaspod.com/measure/traffic.megaphone.fm/QCD6489207092.mp3?updated=1717391748

Nice!
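
That one-liner works here because HTTP/2 sends header names in lowercase. An HTTP/1.1 server may well write Location with a capital L, and curl prints each header line with a trailing carriage return. A slightly more defensive sketch follows; extract_location is a made-up name, and splitting the parsing out from the curl call is just one way to arrange it.

```bash
# Read raw HTTP response headers on stdin and print the Location value, if any.
extract_location() {
  tr -d '\r' |                          # header lines end in CRLF
    grep --ignore-case '^location:' |   # match Location, location, LOCATION
    sed 's/^[^:]*:[[:space:]]*//'       # keep only what follows the header name
}

# Same job as the one-liner above, built on extract_location.
get_location() {
  curl --silent --head "$1" | extract_location
}
```

Because extract_location only reads standard input, it can be tried on headers saved in a file as easily as on live curl output.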

Next I wanted to write a short script that would work through all these URLs until it got to the bottom. I spent five or ten minutes on this but ran into some kind of problem with checking if a string was empty or something like that. I thought, “Wait a minute. Simon Willison would use an LLM to do it for him!” So that’s what I tried, with DuckDuckGo’s Llama 3.

I told it:

Write a bash shell function that uses curl to find if a given URL has a location header, and if so, return that location URL.

It gave me:

get_location_header() {
  local url="$1"
  local location=$(curl -s -I -o /dev/null -w "%{http_code} %{redirect_url}" "$url")
  local code=$(echo "$location" | cut -d' ' -f1)
  local redirect_url=$(echo "$location" | cut -d' ' -f2-)

  if [ "$code" -eq 301 -o "$code" -eq 302 ]; then
    echo "$redirect_url"
  else
    echo "No location header found"
  fi
}

I said:

It is possible that running that function on a URL returns a URL that itself returns a location header. Use that function to write a recursive shell script that works through multiple levels of location headers until there are no more.

It gave me:

follow_redirects() {
  local url="$1"
  local max_redirects=10  # prevent infinite loops

  while [ $max_redirects -gt 0 ]; do
    local location=$(get_location_header "$url")
    if [ "$location" != "No location header found" ]; then
      url="$location"
      max_redirects=$((max_redirects - 1))
    else
      echo "Final URL: $url"
      return
    fi
  done

  echo "Error: too many redirects"
}

I tried that and it worked, but not how I wanted. I said:

That works, but make it show each URL as it recurses.

It gave me:

follow_redirects() {
  local url="$1"
  local max_redirects=10  # prevent infinite loops

  while [ $max_redirects -gt 0 ]; do
    echo "Checking URL: $url"
    local location=$(get_location_header "$url")
    if [ "$location" != "No location header found" ]; then
      url="$location"
      max_redirects=$((max_redirects - 1))
    else
      echo "Final URL: $url"
      return
    fi
  done

  echo "Error: too many redirects"
}

I said:

That works nicely.

It replied:

I'm glad to hear that! The `follow_redirects` function should be helpful for tracing the chain of redirects for a given URL. If you have any more questions or need further assistance, feel free to ask!

Here’s how to run it all in a bash shell:

$ function get_location_header() {
  local url="$1"
  local location=$(curl -s -I -o /dev/null -w "%{http_code} %{redirect_url}" "$url")
  local code=$(echo "$location" | cut -d' ' -f1)
  local redirect_url=$(echo "$location" | cut -d' ' -f2-)

  if [ "$code" -eq 301 -o "$code" -eq 302 ]; then
    echo "$redirect_url"
  else
    echo "No location header found"
  fi
}
$ follow_redirects() {
  local url="$1"
  local max_redirects=10  # prevent infinite loops

  while [ $max_redirects -gt 0 ]; do
    echo "Checking URL: $url"
    local location=$(get_location_header "$url")
    if [ "$location" != "No location header found" ]; then
      url="$location"
      max_redirects=$((max_redirects - 1))
    else
      echo "Final URL: $url"
      return
    fi
  done

  echo "Error: too many redirects"
}
$ follow_redirects "https://pdst.fm/e/chrt.fm/track/47G541/pscrb.fm/rss/p/mgln.ai/e/433/claritaspod.com/measure/traffic.megaphone.fm/QCD6489207092.mp3?updated=1717391748"
Checking URL: https://pdst.fm/e/chrt.fm/track/47G541/pscrb.fm/rss/p/mgln.ai/e/433/claritaspod.com/measure/traffic.megaphone.fm/QCD6489207092.mp3?updated=1717391748
Checking URL: https://chrt.fm/track/47G541/pscrb.fm/rss/p/mgln.ai/e/433/claritaspod.com/measure/traffic.megaphone.fm/QCD6489207092.mp3?updated=1717391748
Checking URL: https://pscrb.fm/rss/p/mgln.ai/e/433/claritaspod.com/measure/traffic.megaphone.fm/QCD6489207092.mp3?updated=1717391748
Checking URL: https://mgln.ai/e/433/claritaspod.com/measure/traffic.megaphone.fm/QCD6489207092.mp3?updated=1717391748
Checking URL: https://claritaspod.com/measure/traffic.megaphone.fm/QCD6489207092.mp3?updated=1717391748
Checking URL: https://traffic.megaphone.fm/QCD6489207092.mp3?updated=1717391748
Checking URL: https://dcs.megaphone.fm/QCD6489207092.mp3?key=8cd61d704fe268b21da14bc81f17f7ba&request_event_id=9706319f-431f-465d-a85d-d55053cbb184&timetoken=1720741181_4922C6F958D58D4C0673A8DE2476B183
Final URL: https://dcs.megaphone.fm/QCD6489207092.mp3?key=8cd61d704fe268b21da14bc81f17f7ba&request_event_id=9706319f-431f-465d-a85d-d55053cbb184&timetoken=1720741181_4922C6F958D58D4C0673A8DE2476B183

I put function in front of the first definition, but bash doesn’t need the keyword: name() { … } is enough, as the second definition shows, so Llama’s output can be pasted right into the shell as it is. It does use short options for curl (such as -I) where I prefer long ones (such as --head) in anything scripted, because they help readability. I could have told it to change that.
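
curl can also follow the whole chain itself. Here is a minimal sketch with long options, assuming all that’s wanted is the last URL in the chain. The name final_url is made up, and --max-redirs 10 mirrors the limit in Llama’s loop.

```bash
# Hypothetical helper: let curl follow the redirects and print where it ends up.
final_url() {
  curl --silent --head --location --max-redirs 10 \
       --output /dev/null --write-out '%{url_effective}\n' "$1"
}
```

Calling final_url with the long episode URL prints a single line, the URL of the last response. curl still makes a HEAD request to every hop along the way, and this version doesn’t show the intermediate steps, which is what follow_redirects is for.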

It works very nicely, and what’s more, the scripts are good. Better than anything I could hack in a few minutes, and I’ve never used local in a shell script in my life, but it seems like a good practice.
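
The prompt asked for a recursive script, but follow_redirects is a loop. For comparison, a genuinely recursive version might look like the sketch below. The names next_hop and follow_recursively are made up, and unlike Llama’s function it treats any response with a redirect URL as a redirect, not only 301 and 302. The empty-string check is the [ -z … ] test.

```bash
# Hypothetical helper: print the redirect target for a URL, or nothing at all.
next_hop() {
  curl --silent --head --output /dev/null --write-out '%{redirect_url}' "$1"
}

# Print the URL, then call ourselves on wherever it redirects to.
follow_recursively() {
  local url="$1"
  local remaining="${2:-10}"   # hops left before giving up

  echo "Checking URL: $url"
  if [ "$remaining" -le 0 ]; then
    echo "Error: too many redirects" >&2
    return 1
  fi

  local target
  target=$(next_hop "$url")
  if [ -z "$target" ]; then
    echo "Final URL: $url"     # no redirect target: this is the bottom
  else
    follow_recursively "$target" $((remaining - 1))
  fi
}
```

The second argument is optional, so follow_recursively "https://…" starts with ten hops in hand. Whether recursion is better than the loop here is a matter of taste; the loop is arguably easier to read.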

After all that, the final URL is

https://dcs.megaphone.fm/QCD6489207092.mp3?key=8cd61d704fe268b21da14bc81f17f7ba&request_event_id=9706319f-431f-465d-a85d-d55053cbb184&timetoken=1720741181_4922C6F958D58D4C0673A8DE2476B183

If you strip off the tracking, it’s

https://dcs.megaphone.fm/QCD6489207092.mp3

Now I can feed that into my audio player and finally listen to episode 202 of Philosophize This! I see it’s thirty-seven minutes long, which is shorter than it took me to write this.
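
Stripping the tracking part doesn’t need an editor: bash parameter expansion can cut a URL at its first question mark. A small sketch (strip_query is a made-up name), which assumes everything after the ? is disposable, as it appears to be here:

```bash
# Hypothetical helper: print a URL without its query string.
strip_query() {
  # ${1%%\?*} removes the longest suffix that starts with a literal "?"
  printf '%s\n' "${1%%\?*}"
}
```

Given the final URL above, strip_query prints https://dcs.megaphone.fm/QCD6489207092.mp3. A URL with no question mark comes back unchanged.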

Making the Firefox scrollbar bigger

firefox

The scrollbar in Firefox disappears when the pointer isn’t moving. For example, the top right of this site looks like this.

Part of a web page; no scrollbar

If I touch the touchpad, it appears and looks like this. It’s very small.

Part of a web page; tiny scrollbar

Thanks to this note by @mgorny@treehouse.systems I learned how to make it better. Go to about:config (and agree to the danger), then set widget.gtk.overlay-scrollbars.enabled to false and widget.non-native-theme.scrollbar.size.override to 24 or so. The scrollbar becomes permanent and clearly visible. I like this for several reasons, a main one being that it’s always obvious how long a page is and where I am in it.

Part of a web page; nice big scrollbar
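
The same two settings can be kept in a user.js file in the Firefox profile directory, which Firefox reads each time it starts. Here is a sketch in bash that appends them. The function name set_scrollbar_prefs is made up, and the profile directory is an assumption you have to supply yourself (about:profiles shows where yours is).

```bash
# Hypothetical helper: append the two scrollbar preferences to user.js
# in the given Firefox profile directory. Size defaults to 24.
set_scrollbar_prefs() {
  local profile_dir="$1"
  local size="${2:-24}"
  cat >> "$profile_dir/user.js" <<EOF
user_pref("widget.gtk.overlay-scrollbars.enabled", false);
user_pref("widget.non-native-theme.scrollbar.size.override", $size);
EOF
}
```

Firefox needs a restart to pick the file up, and it reapplies whatever is in user.js at every start, so a later change made in about:config will be overridden until the lines are removed.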

That’s far more technical than it should be to tweak this, but Firefox has bigger problems to fix (such as not getting bogged down in AI nonsense).

The Fall of the Sparrow

quotes

From The Fall of the Sparrow (1955) by Nigel Balchin. This scene happens in 1935 in England; the narrator is talking to people going to protest the British Union of Fascists.

Leah sat up and said, “Listen …” and then paused and closed her eyes again. There was a moment’s silence. Then she said quietly, “It’s like this. The Fascists go down there every Sunday, and a good many week days as well. They choose the places where they know there are a lot of Jewish people. They get up and they preach hatred of the Jews, and all that crazy Hitler stuff about their being responsible for everything. If a Jewish person passes he’s insulted, and if he says a word in protest he’s manhandled. Or she, if it’s a woman. Of the people listening, ninety per cent don’t agree with what’s being said. But they’re not organized and the Fascists are. People aren’t going to risk being knocked about. So they keep quiet and let the Fascists talk.”

I said, “But if they don’t agree …”

“Wait a minute. By the time this has happened week after week and the people see that no one stands up to the Fascists, they begin to think that nobody can—that it’s all hopeless. And since everybody likes to be on the winning side they begin to wonder whether the safest thing isn’t to start wearing a black shirt themselves. See?”

Jason said, “It’s a bluff, you see, Henry. Of course Fascism always is. After all, Mussolini’s march on Rome …”

“It’s a bluff,” Leah said curtly. “But it’s a bluff that’s got to be called everywhere, every time it’s tried. Every time. Everywhere.”

QR codes in Filofax inserts

stationery

It’s early days, but right now all Personal-size week-on-two-pages inserts for 2025 at Filofax have a QR code on every single day:

Screenshot of a sample page

What the hell?

They work with the new Reminder App, which doesn’t look at all useful, even to people who want to mingle paper and digital scheduling. You’re locked into their app—there’s no integration with other calendar systems—and unless I misunderstood the video, after you scan in the QR code for a given day you still need to specify the date in the app.

Some highlights from the privacy policy for the app, which, remember, is scanning your appointments and your handwriting:

1.1 Account Information: To use the Filofax Diary app, you will need to create an account. We collect information such as your name, email address, and password.

1.2 Diary Page Information: When you use the app to create reminders associated with diary pages, you may provide images of diary pages and set reminders, times, and dates. These images and associated data will be stored on our servers.

2.3 Research and Analytics: We may use your information for research and analytics purposes to improve the functionality and features of the app.

3.1 Third-Party Service Providers: We may share your information with third-party service providers that help us with app functionality, hosting, and analytics. These providers are bound by confidentiality agreements.

3.2 Legal Requirements: We may disclose your information if required by law, regulation, or legal process.

3.3 Change of Ownership: In the event of a merger, acquisition, or sale of assets, your information may be transferred to the acquiring entity.

4.1 Security: We take reasonable measures to protect your information from unauthorized access, loss, misuse, or alteration. However, no method of transmission over the internet or electronic storage is 100% secure, and we cannot guarantee absolute security.

The privacy policy here doesn’t match what’s on Google Play, which says no data is collected or shared. (The CCPA clauses in the policy will make everyone outside California look on admiringly.)

Even were this app the most secure and private thing possible, I think there will be very, very little interest in it. Are there any Filofax users who have a need for this? I hope they have the regular inserts for sale by October.

GrapheneOS

android grapheneos

Last month I got a new phone and installed GrapheneOS on it. It’s working wonderfully and I’m happy I did it. Here are a few notes.

Background

GrapheneOS is a privacy-focused phone operating system based on Android. I’ve used Android-based phone OSes before. I bought a Samsung Galaxy S III in 2012, and by 2015 it was getting a little old (Samsung had stopped updating it) so I installed CyanogenMod on it. “It’s like having a new phone,” I wrote. About eighteen months later I realized, “but one that isn’t as good as my old phone,” so I installed LineageOS on it, and said, “It really is like having a new phone.” Both OSes let me keep more up to date with Android and they had no vendor applications installed. Lineage was nice.

By 2019 the phone was seven years old and barely functioning. I had to carry a USB battery with me so I could recharge it every few hours, and it would sometimes spontaneously reboot while I was in the middle of a phone call. I bought a Moto G7 Play, which was reasonably priced and didn’t do anything special. Getting a new phone after seven years was fantastic: everything worked, and fast!

The new phone had some Motorola junk on it, which I ignored, and of course had the full Google suite of everything. I removed some applications and took other steps to have as much privacy as I could manage, but this was an increasing concern for me. After five years the phone was showing its age and there were no more operating system updates for it. It was past time for a new phone.

I’d been watching GrapheneOS for a while and had decided my next phone would run it, which meant I’d need to get a Pixel from Google—the current release is a Pixel 8, which I bought at a store. (Happily I didn’t need a new SIM card.)

Why run GrapheneOS on it? Because I wanted as much privacy and control as possible. I was guided by Michael Bazzell’s Extreme Privacy: Mobile Devices, where he says:

I believe GrapheneOS is the ultimate solution for our needs. It is the only option which meets all of my requirements, including the following.

  • It is completely open-source software which converts a traditional Google Pixel device into a pure native Android environment, including many enhanced privacy and security features, without any Google network services or connections.
  • It has a large community testing and verifying any changes, and updates are much more frequent than other builds.
  • It provides only the basics and allows you to customize the software you need.
  • It has a locked bootloader and does not require root access.
  • It allows sandboxed Google push services if appropriate for your needs which can easily be disabled or removed completely if desired.
  • It does not require microG for notifications.

I recommend buying the entire set of Bazzell’s books. (I hope he restarts the Intel Techniques podcast one day. It was great.)

My biggest worry (aside from bricking the phone) was that I wouldn’t be able to run Cisco’s Duo authentication app, which I need for work.

Installing

To warm up, I watched GrapheneOS: first impressions, stumbling blocks, and opinions by Veronica Explains on YouTube, which is now a year old but still relevant, and does a great job of showing how installation works and how easy it is.

As GrapheneOS recommends, I started up the phone out of the box, skipped over everything about logging in to Google, and upgraded the system and enabled developer mode. Then I used the web-based installer, which works like magic. I had to use my work laptop because it has Edge on it, which I needed to make the USB stuff work, but basically I plugged my new phone into my laptop, pushed a few buttons on a web page, and in a few minutes I had a new operating system on my phone. Compared to what I did years ago this is an unbelievable dream!

After that I followed along with Bazzell’s book for some basic configuration. Some other points from my notes:

  • Setting up private DNS with NextDNS.
  • Configuring the phone to always use the same MAC address when on my home network (“This can be controlled per-network in Settings > Network & internet > Internet > NETWORK > Privacy”), and configuring my router first to always give that MAC address the same IP address, then to always route it through the VPN I use on the router.
  • Installing Google Play Services: I want to be able to receive phone calls and push notifications.
  • Installing F-Droid.
  • Installing Aurora Store through F-Droid, so I can anonymously install apps from the Play Store. This worked the first time for me, with no problems. I installed Duo Mobile and it worked: phew! With that done I knew everything would be okay now. I also installed Signal and Firefox here.
  • Installing more apps from F-Droid.
  • Setting up Termux.
  • Copying and restoring configurations and data from my old phone, including a backup of all my Signal chats.

I may write up how I backed up and restored my contacts and other settings with Termux, but I’ll stop here for now, with GrapheneOS installed and working.

So far

The phone is wonderful! Everything is working perfectly. I installed fewer applications on it, and I’m using it much less. When I do use it, I know it’s safe and secure. GrapheneOS is giving me a great user experience. Many thanks to everyone who works on it! I made a donation to the project and will do another soon.

One small feature that’s great to have back (Lineage had it) is scrambling the PIN input on the lock screen. Instead of having the standard numeric grid layout it mixes up the numbers each time, so anyone glancing over your shoulder will find it much harder to see the PIN.

If you’re thinking about installing GrapheneOS but a bit worried something might go wrong, don’t be. Go for it.

Verified

reviews wikipedia

I just read Verified: How to Think Straight, Get Duped Less, and Make Better Decisions About What to Believe Online by Mike Caulfield and Sam Wineburg (University of Chicago Press, 2023) and I really liked it. This is a very good introduction to information and digital literacy, well written with a lively tone. I recommend it to anyone interested in assessing online information—from librarians doing IL to people wondering about apparent misinformation or vaguely conspiracy-minded links they get sent.

Book cover

It uses the SIFT approach: Stop, Investigate the source, Find better coverage, and Trace back to the original context. (This is intentionally not a checklist like the PARCA test.) There are many examples, illustrated with lots of images of web pages and Google searches. Someone new to this would learn good practices, and someone experienced would see a well-presented approach that gives insightful coverage of familiar issues. It’s good on the inherent “compellingness” of some web sites and sources, and how to do a vibe check. The chapter on Wikipedia is very good, and I like how they depend on Wikipedia as the best place for quick checks on the reliability of sources such as journals and think tanks. It is.

The book is grounded in an assumption of good faith—that someone wants to determine the truth and validity of some information they see online. Many do. But many don’t. If they don’t, they’re not going to read a solid, practical introduction like this. On the other hand, if you’re helping or teaching high school students or undergraduates, you have an ideal audience for this book.

There’s a postscript about ChatGPT and large language models, which went big just as the book was being finished. It’s as good as anyone could manage given the timing, but how well will the SIFT approach, or PARCA, or anything else, work if (when) everything is obscured by torrential amounts of bullshit? (Except Wikipedia, I hope.)

There’s a good review by Joel Blechinger in The Journal of Radical Librarianship (vol. 10, 2024) that goes deeper into some issues. See also this thoughtful review by Stephen Michaels in College & Research Libraries (vol. 85 no. 2, 2024).

Bonnie Clearwater quote

art quotes

There are two problems an artist faces: what to paint and how to paint it.

That’s Bonnie Clearwater in an interview about Frank Stella, in this week’s episode of The Week in Art. Later she adds, “Once you decide what to paint, you can just concentrate on how to paint it.”

J;ajsdf;ajdf;sjdaf;kjsdaf;kj af;sd jksd;f jsda;jlf sda;f s;adfj sda;jf dsa;fjj dsaj;f as;jf sadf

yorku

At York University, employees (including me) sometimes need to write documents (usually letters) on university-branded letterhead.

York letterhead example

For decades this was specially printed paper that had the logo and other information printed right on it, in colour. There would be boxes of it in the office supply cabinet. You’d put it into your typewriter and type on it. Over time that changed, and you’d put it in the office printer and print on it (after making sure your margins were correct so you didn’t print on top of the logo, and perhaps telling others not to print anything because you didn’t want their documents to use your paper). More than a decade ago we moved to doing it all digitally: people could print in colour or export to PDF.

At some point, I hypothesize before 2010, someone made a Word document that had the letterhead configured nicely and made it available as a template. People copied it and made new templates from it: perhaps a form that was updated annually, or a request for letters of reference that a committee would reuse over and over. That base template went into wide use, as a good template should.

But whoever made that original gave the document a title by mashing their fingers on the keyboard: J;ajsdf;ajdf;sjdaf;kjsdaf;kj af;sd jksd;f jsda;jlf sda;f s;adfj sda;jf dsa;fjj dsaj;f as;jf sadf. That title appears in the document properties—it’s not the title you can make by setting a style on text in the document, and it’s not the file name. You might never know it’s there. But if you export to PDF, it’s probably easier to see. Here’s the title bar on Evince, my PDF viewer, showing a document named foo.pdf that was generated using this template.

Evince title bar

Here are the properties of the Word file, as shown in LibreOffice.

LibreOffice document properties

Try a DuckDuckGo search for the string. Scroll down, and down, and down … it goes on forever. The template has been used by every faculty, many branches of the administration, even in documents that people wrote for the faculty union.

DuckDuckGo screenshot

The new brand templates say, “Please note that you no longer need to download the templates for PowerPoint, Word and Excel! Simply open a new document and select the York University option and you are set.” Even so, this old template will be around for years and years to come.

In my bashrc in Conforguration I define a function that uses ExifTool and QPDF to wipe all metadata from a PDF and make it more web-friendly. Running pdfwipe foo.pdf removes these titles. I use it before I send a template-sourced PDF to anyone outside York.

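function pdfwipe() {
    for FILE in "$@"; do
        # Blank every metadata tag; ExifTool keeps a backup copy named "$FILE"_original.
        exiftool -all= "$FILE"
        # Rewrite and linearize the PDF in place, ready for the web.
        qpdf --linearize --replace-input "$FILE"
    done
}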

Keep scrolling … and scrolling … and scrolling …

STAPLR on hiatus

code4lib libraries sonification staplr

STAPLR (Sounds in Time Actively Performing Library Reference), my sonification of activity at the help and reference desks at York University Libraries, is on hiatus.

Yesterday we moved from a free and open source self-hosted system to LibAnswers (one of the proprietary hosted services rented out by Springshare, the best known of which is LibGuides). I will look at how I can adapt STAPLR to use its API.

Illustration profile of Kady MacDonald Denton

kady.macdonald.denton

I was delighted today when the Autumn 2023 issue of Illustration magazine arrived in the mail. It has a long profile of my mother Kady MacDonald Denton, rich with examples of her work from various books. It was written by Warren Clements (publisher at Nestlings Press), who does regular features in the magazine. It’s a wonderful profile, covering her life and career with thoughtful and informed insight into her art.

Cover of the issue

If you’re at all interested in illustration in books and magazines from the Victorian era onwards, I highly recommend investigating Illustration if you don’t know it. It’s very good. Also look at the books Nestlings Press has done bringing back into print the work of Mervyn Peake, Peter Newell and others.
