Miskatonic University Press

Andrew Kines and I talk about LOTR

04 April 2010 vagaries

Andrew Kines and I are in some advertorial content in The Toronto Star today, talking about how much we loved The Lord of the Rings as boys and how it influenced our lives: My Dinner with Frodo. (Actually, it was lunch: we got together for lunch a few weeks ago and talked for three hours. Andrew boiled it down.)

I first read The Hobbit and then LOTR when I was about eleven. I loved it. I was distraught when it was over and I had to leave Middle Earth. I think I reread it immediately. I read it five or six more times, but I was never a huge LOTR geek. I never read The Silmarillion or any of the other stuff. I did read a fair bit of post-Tolkien Extruded Fantasy Product, like the Belgariad and the Malloreon by David Eddings (which are awful), The Sword of Shannara, and a lot of other similar bilge. Eventually I gave up on fantasy.

I’ll never forget how Tolkien captured me as a boy, though. The Lord of the Rings was a major influence on me in lots of ways. I’ll never read it again — I tried fifteen years ago and I could no longer stand the language and style — but it was one of the books that turned me into a grown-up reader and, in part and often indirectly, set the course of my life.

(The picture is by Michael Watier. That’s me on the left and Andrew Kines on the right.)


Make my Glass the P-Glass

18 March 2010 music

Jaron Lanier jamming

Last Saturday I saw Jaron Lanier jamming with Rick Sacks at the Arraymusic studio downtown. Rick’s a percussionist, and you can see him in action in these videos.

There were only about ten people at the show, including Tim Knight, a cataloguer friend and colleague who has an exciting blog about KF Modified (the Canadian law classification scheme). The studio’s small and it was quite informal and improvisational. Lanier and Sacks just made it up as they went, and there were many times when they got into something really interesting. Sacks played all different kinds of percussion; Lanier played piano, a small Chinese harp, and several different kinds of flutes. Another fellow played drums and percussion on a couple of songs, too. Lanier told some good stories. I enjoyed it very much. It’s always worth it to see two really talented musicians improvising together.

I saw in Lanier’s Wikipedia entry that he’s collaborated with, among others, both Philip Glass and George Clinton. Not at the same time. But what if those two did collaborate? What if Glass wrote a based on Clinton’s work, the way he turned David Bowie’s “Heroes” into Symphony No. 4? I think the lyrics might go something like this:

We want the funk give up the funk
We want the funk give up the funk
We want the funk give up the funk
We want the we want the
We want the we want the
We want the give up the
Give up the give up the
We want we want we want we want
We want we want we want we want
Give up the give up the funk
Give up the give up the funk
Give up the give up the funk
We want we want we want we want
We want we want we want we want
Give up the give up the funk
Give up the give up the funk
Give up the give up the funk
We want we want we want we want
We want we want we want we want
Give up we want give up we want
We want we want we want we want

I pass between the two lions

01 March 2010 libraries

The reading room at the main branch of the New York Public Library

I left Code4Lib Thursday morning and instead of arriving back in Toronto that night, bad weather (snow and wind) stranded me in New York for the night. I stayed in a hotel near Times Square. I got there too late to do anything, but the next morning I had a couple of hours to spare before I had to get back to LaGuardia and wait some more, so I was able to see something I’d always wanted to see: the main branch of the New York Public Library, at 42nd and Fifth.

It’s an astounding building. Incredibly beautiful. The reading room on the third floor was awe-inspiring. I wandered around, looked at the reference collection, reshelved a Loeb edition of Suetonius that was out of order, and then just sat there, drinking it all in and looking up at the clouds on the ceiling. When I go back I’ll take a tour and spend the day there.

After Code4Lib last year I visited the Providence Athenaeum and the Boston Athenaeum. At both places the librarians were extremely generous with their time and gave tours and showed treasures from their collections. Those two and the NYPL branch are three of the most beautiful libraries in the world. I wonder what I’ll see after next year’s Code4Lib.


Code4Lib 2010: Wednesday 24 February

01 March 2010 code4lib

Emily Lynema, Iterative Development: Done Simply

Problems: You have too much work to do. Priorities change frequently. Requirements change. No business analysts. Emergencies happen. “IT black box” where no-one outside IT knows what’s going on inside.

Agile development as opposed to the waterfall method.

Scrum: product owner/scrum master/team. Artifacts: product backlog, sprint backlog. 2-4 week cycle. Plant, commit to certain things and estimate. Daily scrum for fifteen minutes each day. What have you done since yesterday? What will you do today? What problems might you have? After sprint, sprint review and retrospective.

Case study at NCSU Libraries. They use iterative process loosely based on Scrum. JIT planning and documentation. Collaboration with customers. Joint project ownership.

They use JIRA, Confluence. Sprint planning: Google Docs + JIRA.

Sprint planning: use one week to plan across multiple projects. Day 1: overview of next 3-6 months. Prioritize. She does up a Google Docs spreadsheet with weeks as columns and projects as rows, and puts in everything she knows about what should be done.

Days 2-5: Meet with product owners for each prioritized project.

A release in JIRA = product iteration.

Day 6: Sprint planning. Reprioritize based on estimate and time available.

Development: Working on it all. Daily meetings. Weekly review.

Challenges: Multiple small projects within a cycle. Not traditional for Agile. Lack of documented requirements: what are the user stories and when do you need them? “Teams of librarians work slowly.” Prioritization difficult for library staff. Testing: how to automate; no QA experts. Simultaneously handle support and development.

Outcomes: Projects working well. Keeping to six-week cycles keeps everything in line. 31 releases across six projects in 2009. Increased flexibility.

Agile for All. Succeeding with Agile

Bess Sadler, Vampires vs. Werewolves: Ending the War Between Developers & Sysadmins with Puppet

Developers say: Those sysadmins keep me from doing my job! My job is to write new software and add features and do cool stuff and get it into production.

Sys admins say: Those developers! My job is to keep things running and build trust and make sure things are reliable.

But if they’re arguing and systems go down or new features don’t get added, the angry villagers show up with pitchforks.

Innovation is about risk. You don’t take risks with people you don’t trust. Let go of the anger.

Testing. They use NAGIOS to watch their projects, not just uptimes: searches into systems, etc. They set up tests and use NAGIOS to run them automatically, so when there’s an upgrade they can see automatically what works and what doesn’t.

Write docs. Link to the doc from where the problems are seen (in NAGIOS for example).

Hudson, continuous integration tool.

Puppet for release management.

Naomi Dushay, Willy Mene, and Jessie Keck, I Am Not Your Mother: Write Your Test Code

They use Hudson to manage automated testing. Looks nice. Made me feel guilty because I never write tests. But then I’m just hacking on stuff myself, mostly.

Selenium: Firefox plugin to automate browsing, good for testing.

Summary of test-driven development.

Uses rSpec, Cucumber, and the Rails testing environment.

Jessie K Blacklight on his laptop. Showed an rSpec test. He knew it would fail. Ran it and it failed. Edited code to change something, reran the test, and it passed.

Cucumber. It has features that consist of scenarios. Showed an example of using this to test that something was on the home page. It wasn’t, test failed, code changed, test rerun, test passed.

Types of tests: Unit tests. Integration tests. Black box/functional/acceptance testing.

Other web testing tools: WebRat, Watir.

Chris Beer, Media, Blacklight, and Viewers Like You

Media archives. “Anatomy of a film clip.”

PBCore. Fedora. Blacklight. Solr. jQuery. MySQL. Lighttpd.

http://tinyurl.com/c4l-pbcore

They did their own video player, which handles scrolling simultaneous transcripts.

http://github.com/cbeer/ave-sync

Ian Walls, Becoming Truly Innovative: Migrating from Millennium to Koha

Had a full datestamp to the minute on his title slide.

University’s security people pulled their server offline during a retirement party; the library noticed when their ILS went down. Thought about moving to Koha.

To migrate: bib/auth data, patron, checkouts, holds, serials issues, acquisitions. Patron data not to get out and ingest. Bib data harder. A number of export methods just wouldn’t work, but they did get it working.

Explained everything about how they’d done it, and gave some advice on what to do it you’re doing it. It went pretty smoothly, from the sounds of it, and there were no all-nighters.

Dan Chudnov (facilitating), Ask Anything!

Worked very well. I didn’t have any questions to ask, or answers to give, but it was a great to watch and I think everyone enjoyed it.

Naomi Dushay and Jessie Keck, A Better Advanced Search

Stanford advanced search

Use cases: author + title, e.g. “mozart sonata 21”

Personal name in art: could be author, subject, additional author, etc.

Combining multiple facets: find books and videos, stuff in Spanish and English, stuff at this and that libraries.

People like Boolean.

Context-specific advanced seach, eg for music.

One pattern: “any of these words” “all of these words” “none of these words”

Or: Title, Author, Subject, fields.

Or: multiline form where you can pick modifiers and fields (like WebCat).

Notice on Stanford’s form: Keyword isn’t the first. Keyword searching wasn’t high demand in use cases, so they didn’t put it up front. “Subject terms” is their lingo to imply controlled vocabulary.

They had problem searching across multiple fields from one search box, because weightings didn’t figure into it. Solution: localparams in Solr.

Documented here:

http://www.stanford.edu/people/~ndushay/code4lib2010/advSearchSolrQueries.pdf

This got quite technical and I don’t now much about Solr.

Challenges in UI:

Multi-select facets. Make user easily aware of current facet selections. Integration with UI: Faceting. Search breadcrumbs.

Actionable facets in search results.

Cary Gordon, What’s New in Drupal 7

He was filling in at the last minute for two Danes who couldn’t make it because of weather-caused airplane delays.

  1. Make the most frequent tasks easy and less frequent tasks achievable.
  2. Design for 80%.
  3. Privilege the content creator.
  4. ?

Very complete update on everything that’s new and changed in Drupal 7. It’s in alpha now but I’ll upgrade when it’s ready.

Andreas Orphanides, Cory Lown, and Emily Lynema, Enhancing Discoverability With Virtual Shelf Browse

Why do a virtual shelf browse? Universal behaviour. We all browse shelves, but shelves are going away. Users like recommendations.

NOTE TO SELF: Add call numbers for all our electronic resources that just say ELECTRONIC right now. That’s a big failing. Then do a virtual shelf browse.

http://www2.lib.ncsu.edu/catalog/record/NCSU1764762

Data model goals. Browse arbitrary number of titles around known item in call number order. Include online + all locations. Support browse searching, partial and non-matches. Browse by title, not by item. Forgiving call number searching.

Cron dumps out data in delimited text, ingest into DB, call number index in MySQL.

Front-end goals: Access to infinite shelf. Interactive visual browsing experience. Design cues from Google Books. High performance. Satisfy patrons and staff.

They used jQuery, Thickbox, jCarousel, SimpleTip.

Problems: DOM is slow. Three plugins = trouble. Remote servers = latency roulette. Too much Ajax = browser bottleneck. IE is bad.

Future includes virtual browsing across other dimensions of likeness.

http://www.lib.ncsu.edu/dli/projects/virtualshelfindex/

Naomi Dushay and Jessie Keck, How to Implement A Virtual Bookshelf With Solr

Showed how shelf browse works in their system. They’re starting simple, vertical listing on left-hand side.

Described how they normalize call numbers, standardize things, but of course the data is often ugly or messed up or breaks rules.

What if you have a 40-volume enyclopedia? Don’t want to go through 40 books. Can lop off volume number etc. Various kinds of lopping done.

In a mix of different classification schemes, need to separate them so that some archival or thesis stuff starting with E isn’t mixed in with American history in LC’s E.

Naomi explained all about the detailed parts of how they got this to work. Very useful; we can use this ourselves at York. Too much for me to take in during the talk, but if we implement this then a talk like this is exactly what we need.

Interface: jQuery. Animating left/right browsing he did in 10-15 lines of jQuery, with no plugins.

There’s a lot of work involved in shelf browsing, apparently.

Lightning Talks

LibX Update, Godmar Back

http://libx.org/chrome/

Showed the new UI they did for Chrome. Looks nice.

LibX is nice software.

How to build a Virtual Bookshelf Without Solr (or MySQL) - Maccabee Levine

http://tinyurl.com/virtualbookshelf

Instead of loading stuff into a database or Solr, which requires IT department support, you can use the ILS API as it is. Simple way of doing it yourself, if your ILS has a web services API, which Voyager does.

VIVO, an interdisciplinary national network - Paul Albert

Semantic web way of connecting/relating people, grants, subjects, etc.

http://vivo.cornell.edu/

Look very interesting. Could we use this at York? OCUL?

http://vivoweb.org/

WolfWalk, two ways - Jason Casden

WolfWalk

iPhone app. Geolocation-aware way of showing images from special collections about campus history.

Ran into trouble when Apple’s lawyers and NCSU lawyers couldn’t agree on the App Store contract, so they did a web-based mobile app that will work everywhere, which he recomends.

Custom metasearch widgets - Alex Smith

http://xerxes.calstate.edu/

Node.js development - Gabriel Farrell

Node.js

Super-lively on GitHub: http://github.com/ry/node

Catalog Auto-suggest using SOLR - Jill Sexton

https://docs.google.com/present/view?id=dcz7k2rb_59xzgz36fg

http://search.lib.unc.edu/

Problem: Use of external index for library catalog limits access to authority data while searching.

So: do an autosuggest feature using library authority data.

Example search: starting to type “the big lebowski” or “dickinson emily” (notice fields on right-hand side).

This has caused a big increase in subject searches and auto-suggest search queries are used a lot, the logs show.

EmeraldView, a PHP frontend for Greenstone - Yitzchak Schaffer

Greenstone is a “digital library solution.”

Kill the Search Button - Michael Nielsen, Jørn Thøgersen [facilitated by Roy Tennant]

http://developer.statsbiblioteket.dk/kill/code4lib

You Heard It Here First… - Roy Tennant

Roy announced the new OCLC Innovation Lab. (Mike Teets, Tip House, Rob Koopman.) Be interesting to see what comes out of that.

File Information Tool Set (FITS) - Spencer McEwen

http://fits.googlecode.com/

JavaScript E-book Reader – Eric Palmitesta

Showed the ScholarsPortal Ebooks interface. They wrote an ebook reader for it, because existing ones weren’t good enough.

Faceted browse on the cheap - Tom Keays

He set up a collection of books in RefWorks, then exported to BibTeX, then ran it through Babel to turn that into JSON. Smart!

SIMILE stuff from MIT is excellent.


Code4Lib 2010: Tuesday 23 February

01 March 2010 code4lib

More notes from Code4Lib 2010, some brief, some so brief as to be nonexistent.

Cathy Marshall: People, Their Digital Stuff, and Time: Opportunities, Challenges and Life-Logging Barbie

Three or four things to think of about when we mix people, stuff and time. Ruminations about personal digital archiving. “Feral ethnography.”

“People rely on benign neglect as a de facto stewardship technique and collection policy.”

“Personal digital archiving != archiving a personal digital collection.”

Lots of laughing. Quite funny.

One can keep everything. Why might one do that? Don’t know an item’s future worth. It’s hard thankless work to delete. “Filtering and searching can locate the gems among the gravel.”

“It’s easier to keep than to cull.” Loss as a means of culling collections.

Personal scholarly archives as an example. One researcher who’d thought he’d have lots of stuff, but now has little, only things since 2001, mostly PDFs.

“Implicaton: not all long-term stores ned to perform with the same level of reliability.”

Use-based heuristics help assess value.

Second point. No single preservation technology/repository/etc. will win the battle for your stuff.

Instead of centralizing, we’ll be knittingtogether stores and services. Mobile devices, email, web sites, etc.

No single archive. The catalogue is the answer. Some things (dental records, high school photos, tax records) SHOULD be in different places. You can find them later when you need them.

Third point: Forget about digital originals or reference copies.

People have local copies of images etc. that they think of as the master copies, but there’s a lot of useful added metadata in the online versions (eg Flickr) that makes it more valuable.

Example of a picture of an enormous catfish; the image has been resized and rescaled and had the quality changed, and appears in lots of different places online in different formats: and has lots of different explanations: it’s a record-size fish, it’s this, it’s that, it was caught by X, it was caught by Y, it was caught here, it was caught there.

Where are the tools that will let us harvest the metadata that copies have grown? Where’s the search tool for gathering copies, not deduping them?

Fourth: Given 1-3 there will be some interesting opportunities to take a fresh look at searching and browsing.

Techniques for re-encounter: stable personal geography; value-based organization; better presentation of surrogates.

Possible interfaces to do all this: faceted browsing, eg LifeBits approach. Annotated timeline (also LifeBits). Hard to do.

Bottom-up efforts: lots of digitization happening, policies, tools, practices. Personal archiving as cottage industry. SALT project at Stanford.

New opportunistic uses of massed data. She did a study of photos in Flickr of the same thing, a mosaic in Milan. Superstition: you stand on it, on the bull, spin on it, and then take a picture and post it online. In the pictures you can actually see the mosaic being eroded over time!

Links:

http://www.csdl.tamu.edu/~marshall/, http://research.microsoft.com/~cathymar/

Blog: http://ccmarshall.blogspot.com/

Twitter: http://twitter.com/ccmarshall

Jeremy Frumkin and Terry Reese, Cloud4Lib

Idea: Cloud4Lib = an open library platform.

Lots of different projects out there doing different or the same things. Need to glue them all together, put all the Code4Lib work and energy into one thing. “Enable libraries to truly and collaboratively build and use common infrastructure.” Development efforts should enhance an entire platform, not just one piece of it all.

http://cloud4lib.org

They set up a wiki. Interesting: some Amazon EC2 servers where people can experiment. Sponsored by someone or other.

Use cases, mentioning University of Hobbitville

Ross Singer, The Linked Library Data Cloud: Stop Talking and Start Doing

Tim Berners-Lee’s Four Rules of Linked Data

Library linked data cloud was amazingly empty but has been growing slowly. id.loc.gov, LIBRIS and DBPedia.

Ross matched up lcsubjects.org and http://marccodes.heroku.com/gacs/.

Chronicling America.

VIAF connects to DBPedia. Can search with SRU.

Ross made LinkedLCCN

He also hacked on VuFind to add RDFa. http://dilettantes.code4lib.org/vufind/ Explore button.

mirlyn.lib.umich.edu Bill Dueber added links to his VuFind.

TODO

  • Agree on data models. FRBR or something like it. Aboutness vs isness.
  • More linked data available from very common identifiers
  • More linkages to resources outside the library domain. Who will do that? How? Tools.
  • Sustainability and preservation

Good talk. People inspired.

Harrison Dekker, Do-It-Yourself Cloud Computing with Apache and R

R. Rapache.

Good blog about R that I follow: Revolutions

Rosalyn Metz and Michael B. Klein, Public Datasets in the Cloud

Infrastructure as a Service: Amazon EC2

Platform as a Service: Google Apps, Heroku

Software as a Service: Zoho, Google Docs

Not talking about data you can download eg in a CSV.

Did a video of setting up an EC2 instance (took seconds), and attached (mounting) a volume to it (the volume being a big set of census data in this case). Very cool.

Socrata. (Expensive.)

Did an example of loading census data into Google Fusion Tables. Really wild stuff. 200 GB dataset copied into place and ready to be analyzed into three minutes. Looked like great tools for analyzing the data, visualizing, cross-tabulating, etc.

Michael Klein talked about issues and problems with all of this. Authority, provenance, preservation, access, etc.

Pushing Library Data. Secondary uses of it: research, testing, unified indexing. Not just bib data: anonymized borrowing data, etc.

Karen Coombs, 7++ Ways to Improve Library UIs with OCLC Web Services

  1. Crosslisting print and electronic materials. Can’t see all the formats all in one thing. Use WorldCat Search API to see if you have the print version of the book and add record to ebook record.

  2. Linking to libaries nearby in case a book is out.

  3. Providing journal table of contents. Use xISSN to see if feed of recent TOCs is available.

  4. Peer review indicators.

  5. Provide information about authors. Use Identities and Wikipedia API to insert author information into dialogue box in UI.

  6. Link to free online versions of books. Get OCLCnum, then use ?’s APIs to see if they have it and then link to it.

  7. Adding similar items on same screen. Use DDC classification. Makes a lot more sense than just a Title keyword match like VuFind does.

  8. Bonus: Creating a mobile catalogue.

http://librarywebchic.net/mashups/

Blog post: New York Times Mashups

Jennifer Bowen, Taking Control of Library Metadata & Websites Using the eXtensible Catalog

Extensible Catalog

Four components that can be used individually or together.

  1. User interface built on Drupal. Faceting. FRBRized. Customizable search interface.

Metadata has been restructured in a new way, FRBRy, but she didn’t have time to get into that.

In search results there’s a place to show the matching text (from record or whatever) so that users know why they’re getting the result they did. They found in research that users want that.

Web forms to let you create custom search boxes, for just journals, databases, etc. (Widget-maker, I guess.) Showed how they automatically generate a DVD browser, with no special programming.

They’ve made about 20 Drupal modules.

  1. Metadata tools. Automated processing of large batches of metadata.

Metadata Services Toolkit. Harvest, process, dedupe, clean up, aggregate, synchronize metadata.

Nice.

Version tracking of metadata through changes, so you can track the history.

  1. Connectivity tools to match up XC and ILS.

NCIP and OAI. OAI Toolkit works with virtually any ISL that exports MARC. NCIP Toolkit lets XC talk to ILS auth, circ and patron services. Real-time.

http://www.screencast.com/users/eXtensibleCatalog

Anjanette Young and Jeff Sherwood, Matching Dirty Data—Yet Another Wheel

Goal: Ingest metadata and PDFs for ETDs from UMI into DSpace.

Matching.

Exact title. Find intersection of sets. Verify intersection with exact author. Shorten author names to remove punctuation etc.

Examples of titles and names that are tricky to match.

Levenshtein Edit Distance.

Reminded me of this poem by bp Nichol that’s carved in the pavement of bp Nichol Lane:

A lake
A lane
A line
Alone

Similarity = 1 - dL / max (|s1|, |s2|)

Fuzzy query in Solr/Lucene users Levenshtein distance.

Reduce search space. Identify stop words. Throw out common words (eg “models” and “data” in their diss titles).

Got a bit lost here.

Jaro-Winkler Algorithm. (Solr spellchecker uses it.) Works best for short strings. Developed by US Census.

Code: pypi editdist

http://bit.ly/ZGSmF String Comparison Tutorial

What they were looking for but was released after they’d done all their own work: MarcXimiL: The Bibliographic Similarity Analysis Framework

Slides: http://www.slideshare.net/ghostmob/matching-dirty-data

Ryan Scherle and Jose Aguera, HIVE: A New Tool for Working With Vocabularies

HIVE = Helping Interdisciplinary Vocabulary Engineering. Jose Aguera wasn’t here.

http://ils.unc.edu/mrc/hive/

Site: http://hive.nescent.org:9090/

Code: http://hive-mrc.googlecode.com/

http://datadryad.org/

David Kennedy and David Chandek-Stark, Metadata Editing—A Truly Extensible Solution

Duke University Libraries Digital Collections.

Python, Django, Yahoo! Grids CSS, jQuery.

http://library.duke.edu/trac/dc/wiki/Trident

http://tridentproject.org/, http://blog.tridentproject.org/

Lightning Talks

UW Forward - Steve Meyer

UW Forward uses Blacklight.

Search for ‘psychology’ and they recommend the psychology subject librarian. They use WorldCat API to get links to Wikipedia entries for authors. Challenges: 14 Voyager ILS instances in the U Wisconsin system! Serials licensed differently at different campuses. They’re having various problems of the kinds such projects has and he’d like to talk to people with similar ones.

MODS4Ruby & Opinionated XML - Matt Zumwalt

Prezi presentation was a bit zoomy, but lively.

http://yourmediashelf.com/blog/

The Digital Archaeological Record - Matt Cordial

The Digital Archaeological Record

Beta application

Archaeologists can submit data encoded in whatever way they want, and then connect it to other data.

Example: fauna, searching in a square in SW United States. Integrate data tables.

Hydra: Blacklight + ActiveFedora + Rails - Willy Mene

Stanford + U Virginia + Hull.

Hydrangea Project next: Blacklight, ActiveFedora, Shelver, in Rails.

Why CouchDB? - Benjamin Young

Data gets lonely. Often depends on APIless app. Web apps a bit better. Open source apps better but data can be in RDBMSes.

CouchB

Listed advantages to using it. Portable standalone apps. Imagine as CouchDB apps: openlibrary.org. 3.5 gig dump now. If it supported replication you could get updates and parts of it.

Subject guides: ad-hoc.

couch.io does hosting and you can get one free database.

Data integrity (cheap, fast, and easy) - Gwen Exner

HathiTrust Large Scale Search update - Tom Burton-West

http://www.hathitrust.org/blogs/large-scale-search/

5.4 million full-text books. Full-text search of each, average response time of 3 seconds. They’re using Solr. Big-scale stuff.

EAD and MARC Sitting in a Tree: D-R-U-P-A-L - Mark Matienzo

http://www.slideshare.net/anarchivist/ead-and-marc-sitting-in-a-tree-drupal

When he was at NYPL they migrated to Drupal from static content, ColdFusion, other stuff. Snazzy new site: http://nypl.org/

http://nypl.org/find-archival-materials/

http://nypl.org/shrew/b16185699/mods.xml, http://nypl.org/shrew/b16185699/marcxml.xml

EZproxy Wondertool - Paul Joseph

He’s at UBC. Had a bunch of EZProxy work to do and did a thing that made all his work easier.

HathiTrust APIs - Albert Bertram

http://www.lib.umich.edu/two-over-threehundred/code4lib/

Repository of MARC Abominations - Simon Spero

Lovecraft meets MARC. Building a test set of the eldritch MARC records from non-Euclidean geometries. Send to marcthulhu@ibiblio.org.

Mystery Meat - Joe Atzberger

Stephen Abram’s anti-open-source FUD. Sirsi’s Workflows client ships with tar.exe and gzip.exe but does NOT come with a copy of the GPL, which its license says it must. Does the Software Freedom Law Center know about this?

Fuwatto Search - Masao Takaku

http://fuwat.to/worldcat, Twitter: http://twitter.com/tmasao


Code4Lib 2010: Monday 22 February

01 March 2010 code4lib

My brief notes on what I saw at Code4Lib 2010 in Asheville, North Carolina. I had a great time at the conference and am glad I went. My thanks to Kevin Clarke, Jodi Schneider, and all the other organizers and volunteers.

See also:

Preconference 1: Solr White Belt

Bess Sadler mostly ran through this Solr tutorial, elaborating with lots of examples from Blacklight work and personal experience. There were about 50 people in the room and everyone got Solr installed and working.

We tried:

  • requesting particular fields
  • highlighting
  • looking at how Solr parses a search
  • facets

Useful to know is Stanford’s Solr config which gives their relevancy weightings:

<str name="qf">
title_245a_unstem_search^100000
title_245a_search^75000
vern_title_245a_search^75000
title_245_unstem_search^50000
...

That solves the Nature Problem. Putting in Nature used to bring up a book called Naturalism, but now the unstemmed match comes up a lot higher, so Nature will be the top result.

Very good morning. Bess did a fine job and everyone learned a lot of practical, hands-on stuff.

Preconference 2: Hacker 201 with Dan Chudnov

Dan Chudnov had done Hacking 101 in the morning, which used Processing as a way of learning programming and good programming habits. The afternoon session was to use Python and pymarc to hack on MARC, but there was some trouble with everyone in the room getting all the right things installed.

I wasn’t too interested in hacking on MARC in Python, so I worked on OpenFRBR and ended up making some good progress through the afternoon and into the evening. I counted that as a hacking success.

Dan talked about the thirty-minute rule: if you’re stuck on a problem for 30 minutes, take a break. A good rule. Here’s another I’d forgotten: start with the simplest thing and then make it more complicated. I knew I wanted to deal with data from five or six sources in three different formats (JSON, RDF, XML). Different fields from different sources would mean different things and there would be different relations between them.

I was getting dismayed at how complicated this was turning out to be. Finally I realized, “I don’t need to get all that working right off the bat. I just need to get one source working. I’ll ignore everything else for now and add it later.”

And that worked very well. By doing just one thing everything fell into place for me and I got it working quickly and enjoyably. That was nice.


Meditations 7.61

27 January 2010 stoicism

Here is section 61 of chapter 7 of Meditations by Marcus Aurelius in the original Greek and a number of translations. This passage is one of my favourites and I wanted to compare how different translators handled it. Hays and the Hickses are two much-praised recent translations. Casaubon’s was the first into English, and Long’s was the standard one for several decades. (Staniforth says Long’s 1862 translation is “admirably correct, as literal as a school crib, and to me at least utterly unreadable.”)

Original Greek text, ca. AD 160-180 : “Δεῖ καὶ τὸ σῶμα πεπηγέναι καὶ μὴ διερρῖφθαι μήτε ἐν κινήσει μήτε ἐν σχέσει. οἷον γάρ τι ἐπὶ τοῦ προσώπου παρέχεται ἡ διάνοια συνετὸν αὐτὸ καὶ εὔσχημον συντηροῦσα, τοιοῦτο καὶ ἐπὶ ὅλου τοῦ σώματος ἀπαιτητέον. πάντα δὲ ταῦτα σὺν τῷ ἀνεπιτηδεύτῳ φυλακτέα.” (I think this is the correct quote. Please correct me if it’s not.)

Meric Casaubon, 1634: “The art of true living in this world is more like a wrestler’s, than a dancer’s practice. For in this they both agree, to teach a man whatsoever falls upon him, that he may be ready for it, and that nothing may cast him down.”

Thomas Gataker, 1752: “The art of life resembles more that of the wrestler, than the dancer; since the wrestler must every be ready on his guard, and stand firm against the sudden unforeseen events of his adversary.”

George Long, 1862: “The art of life is more like the wrestler’s art than the dancer’s, in respect of this, that it should stand ready and firm to meet onsets which are sudden and unexpected.”

Gerald H. Rendall, 1898: “Life is more like wrestling than dancing; it must be ready to keep its feet against all onsets however unexpected.”

Maxwell Staniforth, 1964 (Penguin): “The art of living is more like wrestling than dancing, in as much as it, too, demands a firm and watchful stance against any unexpected onset.”

C. Scot Hicks and David V. Hicks, 2002 (Scribner, The Emperor’s Handbook): “Living is more like wrestling than dancing: you have to stay on your feet, ready and unruffled, while blows are being rained down on you, sometimes from unexpected quarters.”

Gregory Hays, 2003 (Modern Library): “Not a dancer but a wrestler: waiting, poised and dug in, for sudden assaults.”

Martin Hammond, 2006 (Penguin): “The art of living is more like wrestling than dancing, in that it stands ready for what comes and is not thrown by the unforeseen.”


James Michener on How to Use a Library

24 January 2010 quotes libraries

In the early nineteen-eighties (I think it was), the International Paper Company paid for a series of two-page articles in the “Power of the Printed Word” series to appear in some large American magazines. At the bottom of each was their name and logo, an offer to send you free reprints, and this messgae:

Today, the printed word is more vital than ever. Now there is more need than ever for all of us to read better, write better, and communicate better.

International Paper offers this series in the hope that, even in a small way, we can help.

I don’t remember where I saw them, but I wrote away and they sent me the whole set. I have fourteen — I don’t know if I ever lost any over the years, but I don’t think so.

James Michener did “How to Use a Library.” Here’s a picture of him jotting down call number at the card catalogue, saying, “Every time I go to the library, I make a beeline to the card catalog. Learn to use it. It’s easy.”

Picture of James Michener standing by a card catalogue

These are the fourteen articles I have:

I made a PDF of How to Use a Library but the quality turned out fairly low, so here are higher-quality images of the two pages:

Page 1 of How to Use a Library Page 2 of How to Use a Library