Miskatonic University Press

Code4Lib 2010: Wednesday 24 February

Emily Lynema, Iterative Development: Done Simply

Problems: You have too much work to do. Priorities change frequently. Requirements change. No business analysts. Emergencies happen. "IT black box" where no-one outside IT knows what's going on inside.

Agile development as opposed to the waterfall method.

Scrum: product owner/scrum master/team. Artifacts: product backlog, sprint backlog. 2-4 week cycle. Plan, commit to certain things and estimate. Daily scrum for fifteen minutes each day. What have you done since yesterday? What will you do today? What problems might you have? After sprint, sprint review and retrospective.

Case study at NCSU Libraries. They use an iterative process loosely based on Scrum. JIT planning and documentation. Collaboration with customers. Joint project ownership.

They use JIRA, Confluence. Sprint planning: Google Docs + JIRA.

Sprint planning: use one week to plan across multiple projects. Day 1: overview of next 3-6 months. Prioritize. She does up a Google Docs spreadsheet with weeks as columns and projects as rows, and puts in everything she knows about what should be done.

Days 2-5: Meet with product owners for each prioritized project.

A release in JIRA = product iteration.

Day 6: Sprint planning. Reprioritize based on estimate and time available.

Development: Working on it all. Daily meetings. Weekly review.

Challenges: Multiple small projects within a cycle. Not traditional for Agile. Lack of documented requirements: what are the user stories and when do you need them? "Teams of librarians work slowly." Prioritization difficult for library staff. Testing: how to automate; no QA experts. Simultaneously handle support and development.

Outcomes: Projects working well. Keeping to six-week cycles keeps everything in line. 31 releases across six projects in 2009. Increased flexibility.

Agile for All. Succeeding with Agile

Bess Sadler, Vampires vs. Werewolves: Ending the War Between Developers & Sysadmins with Puppet

Developers say: Those sysadmins keep me from doing my job! My job is to write new software and add features and do cool stuff and get it into production.

Sys admins say: Those developers! My job is to keep things running and build trust and make sure things are reliable.

But if they're arguing and systems go down or new features don't get added, the angry villagers show up with pitchforks.

Innovation is about risk. You don't take risks with people you don't trust. Let go of the anger.

Testing. They use Nagios to watch their projects, not just uptimes: searches into systems, etc. They set up tests and use Nagios to run them automatically, so when there's an upgrade they can see automatically what works and what doesn't.
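A minimal sketch of what such a check could look like, assuming the usual Nagios plugin convention (one status line of output, exit code 0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN). The function name, the thresholds and the idea of judging a test search by its hit count are my own illustration, not what was shown in the talk.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

def check_search_hits(hits, warn_below=10, crit_below=1):
    """Turn the hit count of a known-good test search into a Nagios status.

    `hits` is whatever the test search returned; None means the search
    could not be run at all. The thresholds are hypothetical.
    Returns (exit_code, status_line).
    """
    if hits is None:
        return UNKNOWN, "SEARCH UNKNOWN - could not run test search"
    if hits < crit_below:
        return CRITICAL, f"SEARCH CRITICAL - {hits} hits"
    if hits < warn_below:
        return WARNING, f"SEARCH WARNING - only {hits} hits"
    return OK, f"SEARCH OK - {hits} hits"
```

A wrapper script would run the real search, print the status line and exit with the returned code, which is all Nagios needs to flag the service after an upgrade.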

Write docs. Link to the doc from where the problems are seen (in Nagios for example).

Hudson, continuous integration tool.

Puppet for release management.

Naomi Dushay, Willy Mene, and Jessie Keck, I Am Not Your Mother: Write Your Test Code

They use Hudson to manage automated testing. Looks nice. Made me feel guilty because I never write tests. But then I'm just hacking on stuff myself, mostly.

Selenium: Firefox plugin to automate browsing, good for testing.

Summary of test-driven development.

They use RSpec, Cucumber, and the Rails testing environment.

Jessie K. had Blacklight running on his laptop. Showed an RSpec test. He knew it would fail. Ran it and it failed. Edited code to change something, reran the test, and it passed.
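The demo was in RSpec against Blacklight. As a rough stand-in, here is the same fail-then-pass cycle using Python's unittest. The `display_title` helper and the sample record are hypothetical and have nothing to do with the actual Blacklight code.

```python
import unittest

def display_title(record):
    """Return the title to show for a record (hypothetical helper).

    A first version that returned record["title"] unchanged would fail
    the test below because of the trailing " /". Stripping it is the
    code change that makes the test pass.
    """
    return record["title"].rstrip(" /")

class DisplayTitleTest(unittest.TestCase):
    def test_trailing_slash_is_removed(self):
        record = {"title": "Piano sonatas /"}
        self.assertEqual(display_title(record), "Piano sonatas")

def run_tests():
    """Run the test case and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DisplayTitleTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

The point of writing the test first is that you see it fail for the right reason before you touch the code.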

Cucumber. It has features that consist of scenarios. Showed an example of using this to test that something was on the home page. It wasn't, test failed, code changed, test rerun, test passed.

Types of tests: Unit tests. Integration tests. Black box/functional/acceptance testing.

Other web testing tools: Webrat, Watir.

Chris Beer, Media, Blacklight, and Viewers Like You

Media archives. "Anatomy of a film clip."

PBCore. Fedora. Blacklight. Solr. jQuery. MySQL. Lighttpd.

http://tinyurl.com/c4l-pbcore

They did their own video player, which handles scrolling simultaneous transcripts.

http://github.com/cbeer/ave-sync

Ian Walls, Becoming Truly Innovative: Migrating from Millennium to Koha

Had a full datestamp to the minute on his title slide.

University's security people pulled their server offline during a retirement party; the library noticed when their ILS went down. Thought about moving to Koha.

To migrate: bib/auth data, patron, checkouts, holds, serials issues, acquisitions. Patron data not hard to get out and ingest. Bib data harder. A number of export methods just wouldn't work, but they did get it working.

Explained everything about how they'd done it, and gave some advice on what to do if you're doing it. It went pretty smoothly, from the sounds of it, and there were no all-nighters.

Dan Chudnov (facilitating), Ask Anything!

Worked very well. I didn't have any questions to ask, or answers to give, but it was great to watch and I think everyone enjoyed it.

Naomi Dushay and Jessie Keck, A Better Advanced Search

Stanford advanced search

Use cases: author + title, e.g. "mozart sonata 21"

Personal name in art: could be author, subject, additional author, etc.

Combining multiple facets: find books and videos, stuff in Spanish and English, stuff at this and that libraries.
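A sketch of how multi-select facets like these could be turned into Solr filter queries, assuming values within one facet are ORed and separate facets are ANDed (Solr intersects multiple `fq` parameters). The field names are made up for illustration and this is not Stanford's code.

```python
def facet_filters(selections):
    """Build Solr-style filter queries from multi-select facet choices.

    `selections` maps a field name to the list of values picked for it.
    Values within one field are ORed; each field becomes its own filter,
    and sending them as separate fq parameters ANDs them together.
    """
    filters = []
    for field, values in selections.items():
        if not values:
            continue  # nothing selected in this facet
        quoted = " OR ".join('"' + v.replace('"', '\\"') + '"' for v in values)
        filters.append(f"{field}:({quoted})")
    return filters
```

For example, `facet_filters({"format": ["Book", "Video"], "language": ["Spanish", "English"]})` gives one filter per facet, each listing its selected values.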

People like Boolean.

Context-specific advanced search, e.g. for music.

One pattern: "any of these words" "all of these words" "none of these words"

Or: Title, Author, Subject fields.

Or: multiline form where you can pick modifiers and fields (like WebCat).
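To make the first pattern concrete, here is a small sketch of how the "any/all/none of these words" boxes could be combined into one Lucene-style Boolean query. The function is hypothetical and does no escaping of special characters, which a real form would need.

```python
def words_query(any_of="", all_of="", none_of=""):
    """Combine three word boxes into one Lucene-style Boolean query.

    all_of  -> every word required (+word)
    any_of  -> at least one word required (+(a OR b))
    none_of -> every word prohibited (-word)
    """
    clauses = []
    clauses.extend("+" + word for word in all_of.split())
    if any_of.split():
        clauses.append("+(" + " OR ".join(any_of.split()) + ")")
    clauses.extend("-" + word for word in none_of.split())
    return " ".join(clauses)
```

So `words_query(any_of="sonata concerto", all_of="mozart piano", none_of="arranged")` returns `+mozart +piano +(sonata OR concerto) -arranged`.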

Notice on Stanford's form: Keyword isn't the first. Keyword searching wasn't in high demand in the use cases, so they didn't put it up front. "Subject terms" is their lingo to imply controlled vocabulary.

They had a problem searching across multiple fields from one search box, because weightings didn't figure into it. Solution: LocalParams in Solr.
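As far as I understood it, the idea is to give each form field its own nested dismax query with its own weighted field list. A hedged sketch of building such a query string, using Solr's `_query_` nested-query syntax and `{!dismax qf=$...}` LocalParams. The `qf_author` and `qf_title` parameter names are my assumption and would have to be defined in the Solr request handler; the linked PDF has the real details.

```python
def advanced_query(field_terms, op="AND"):
    """Join one nested dismax query per filled-in form field.

    Each piece uses Solr LocalParams ({!dismax ...}) inside the _query_
    pseudo-field, so every field keeps its own weighted qf list. The
    $qf_<field> names point at request parameters assumed to be defined
    elsewhere (for example in solrconfig.xml).
    """
    parts = []
    for field, terms in field_terms.items():
        terms = terms.strip()
        if not terms:
            continue  # skip boxes the user left empty
        # Escape backslashes and double quotes so the nested query stays quoted.
        escaped = terms.replace("\\", "\\\\").replace('"', '\\"')
        parts.append(f'_query_:"{{!dismax qf=$qf_{field}}}{escaped}"')
    return f" {op} ".join(parts)
```

With `{"author": "mozart", "title": "sonata 21"}` this produces two nested queries joined by AND, one weighted for author fields and one for title fields.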

Documented here:

http://www.stanford.edu/people/~ndushay/code4lib2010/advSearchSolrQueries.pdf

This got quite technical and I don't know much about Solr.

Challenges in UI:

Multi-select facets. Make user easily aware of current facet selections. Integration with UI: Faceting. Search breadcrumbs.

Actionable facets in search results.

Cary Gordon, What's New in Drupal 7

He was filling in at the last minute for two Danes who couldn't make it because of weather-caused airplane delays.

  1. Make the most frequent tasks easy and less frequent tasks achievable.
  2. Design for 80%.
  3. Privilege the content creator.
  4. ?

Very complete update on everything that's new and changed in Drupal 7. It's in alpha now but I'll upgrade when it's ready.

Andreas Orphanides, Cory Lown, and Emily Lynema, Enhancing Discoverability With Virtual Shelf Browse

Why do a virtual shelf browse? Universal behaviour. We all browse shelves, but shelves are going away. Users like recommendations.

NOTE TO SELF: Add call numbers for all our electronic resources that just say ELECTRONIC right now. That's a big failing. Then do a virtual shelf browse.

http://www2.lib.ncsu.edu/catalog/record/NCSU1764762

Data model goals. Browse arbitrary number of titles around known item in call number order. Include online + all locations. Support browse searching, partial and non-matches. Browse by title, not by item. Forgiving call number searching.

Cron dumps out data in delimited text, ingest into DB, call number index in MySQL.
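A sketch of the core browse operation described above: given a call number, find where it sits (or would sit) in the sorted index and return the titles on either side. NCSU keeps the index in MySQL; this uses a sorted in-memory list and Python's bisect module as a stand-in, and the function and parameter names are mine.

```python
import bisect

def browse_window(shelf, call_number_key, before=2, after=2):
    """Return the titles shelved around a call number.

    `shelf` is a list of (sort_key, title) tuples already sorted by
    sort_key. If the key is not in the index, the window is centred on
    the spot where it would be shelved, which is what makes partial and
    non-matching searches forgiving.
    """
    keys = [key for key, _ in shelf]
    pos = bisect.bisect_left(keys, call_number_key)
    start = max(0, pos - before)
    end = min(len(shelf), pos + after + 1)
    return [title for _, title in shelf[start:end]]
```

Asking for the next or previous screenful is then just another call with a key from the edge of the current window, which is how the "infinite shelf" can be paged.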

Front-end goals: Access to infinite shelf. Interactive visual browsing experience. Design cues from Google Books. High performance. Satisfy patrons and staff.

They used jQuery, Thickbox, jCarousel, SimpleTip.

Problems: DOM is slow. Three plugins = trouble. Remote servers = latency roulette. Too much Ajax = browser bottleneck. IE is bad.

Future includes virtual browsing across other dimensions of likeness.

http://www.lib.ncsu.edu/dli/projects/virtualshelfindex/

Naomi Dushay and Jessie Keck, How to Implement A Virtual Bookshelf With Solr

Showed how shelf browse works in their system. They're starting simple, vertical listing on left-hand side.

Described how they normalize call numbers, standardize things, but of course the data is often ugly or messed up or breaks rules.
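To give a feel for what normalizing involves, here is a much-simplified sketch for LC-style call numbers. It pads the class letters and zero-pads the class number so that plain string sorting matches shelf order. The regular expression and padding widths are my assumptions, not Stanford's rules, and the Cutter and date parts are left alone.

```python
import re

# Class letters, class number, optional decimal, then everything else.
LC_PATTERN = re.compile(r"^\s*([A-Za-z]{1,3})\s*(\d{1,4})(?:\.(\d+))?\s*(.*)$")

def lc_sort_key(call_number):
    """Turn an LC-style call number into a string that sorts in shelf order.

    Returns None when the call number does not look like LC at all, so
    the caller can decide what to do with ugly data.
    """
    match = LC_PATTERN.match(call_number)
    if not match:
        return None
    letters, number, decimal, rest = match.groups()
    rest = " ".join(rest.upper().split())  # collapse runs of whitespace
    key = f"{letters.upper():<3}{int(number):04d}.{decimal or '0':<6}{rest}"
    return key.rstrip()
```

Sorted as raw strings, "QA100 .A2" comes before "QA76.9 .C65", which comes before "QA9 .B1". Sorted by `lc_sort_key` they come out in shelf order: QA9, QA76.9, QA100.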

What if you have a 40-volume encyclopedia? Don't want to go through 40 books. Can lop off volume number etc. Various kinds of lopping done.
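A rough sketch of one kind of lopping: strip a trailing volume designation so that all 40 volumes collapse into one shelf entry. The list of suffixes is illustrative only; real data would need many more rules than this.

```python
import re

# Volume-like suffixes such as "v.12", "vol. 3", "no. 4", "pt. 2".
# A period is required after the short forms so that a second Cutter
# like "V5" is not lopped by mistake. The list is an assumption.
VOLUME_SUFFIX = re.compile(r"\s+(?:v\.|vol\.?|no\.|pt\.|bd\.)\s*\d+.*$", re.IGNORECASE)

def lop(call_number):
    """Strip the volume designation from the end of a call number."""
    return VOLUME_SUFFIX.sub("", call_number).strip()

def unique_shelf_entries(call_numbers):
    """Collapse items that differ only by volume into one shelf entry,
    keeping the order in which they were first seen."""
    seen = []
    for call_number in call_numbers:
        lopped = lop(call_number)
        if lopped not in seen:
            seen.append(lopped)
    return seen
```

With this, "AE5 .E363 2003 v.1" through "AE5 .E363 2003 v.40" all lop to "AE5 .E363 2003" and show up once in the browse.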

In a mix of different classification schemes, need to separate them so that some archival or thesis stuff starting with E isn't mixed in with American history in LC's E.

Naomi explained all about the detailed parts of how they got this to work. Very useful; we can use this ourselves at York. Too much for me to take in during the talk, but if we implement this then a talk like this is exactly what we need.

Interface: jQuery. Jessie did the left/right browsing animation in 10-15 lines of jQuery, with no plugins.

There's a lot of work involved in shelf browsing, apparently.

Lightning Talks

LibX Update, Godmar Back

http://libx.org/chrome/

Showed the new UI they did for Chrome. Looks nice.

LibX is nice software.

How to build a Virtual Bookshelf Without Solr (or MySQL) - Maccabee Levine

http://tinyurl.com/virtualbookshelf

Instead of loading stuff into a database or Solr, which requires IT department support, you can use the ILS API as it is. Simple way of doing it yourself, if your ILS has a web services API, which Voyager does.

VIVO, an interdisciplinary national network - Paul Albert

Semantic web way of connecting/relating people, grants, subjects, etc.

http://vivo.cornell.edu/

Looks very interesting. Could we use this at York? OCUL?

http://vivoweb.org/

WolfWalk, two ways - Jason Casden

WolfWalk

iPhone app. Geolocation-aware way of showing images from special collections about campus history.

Ran into trouble when Apple's lawyers and NCSU lawyers couldn't agree on the App Store contract, so they did a web-based mobile app that will work everywhere, which he recommends.

Custom metasearch widgets - Alex Smith

http://xerxes.calstate.edu/

Node.js development - Gabriel Farrell

Node.js

Super-lively on GitHub: http://github.com/ry/node

Catalog Auto-suggest using SOLR - Jill Sexton

https://docs.google.com/present/view?id=dcz7k2rb_59xzgz36fg

http://search.lib.unc.edu/

Problem: Use of external index for library catalog limits access to authority data while searching.

So: do an autosuggest feature using library authority data.

Example search: starting to type "the big lebowski" or "dickinson emily" (notice fields on right-hand side).

This has caused a big increase in subject searches and auto-suggest search queries are used a lot, the logs show.
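The talk's version is built on Solr. Just to illustrate the idea of suggesting headings from authority data as the user types, here is a stand-in that does a prefix lookup over a sorted in-memory list. The data layout and function name are assumptions of mine.

```python
import bisect

def suggest(headings, typed, limit=5):
    """Return up to `limit` headings that start with what the user typed.

    `headings` is a list of (normalized_heading, field) tuples sorted by
    heading, for example built from authority records. Matching is a
    simple case-insensitive prefix match.
    """
    prefix = typed.strip().lower()
    if not prefix:
        return []
    # A 1-tuple sorts just before any 2-tuple starting with the same string.
    start = bisect.bisect_left(headings, (prefix,))
    matches = []
    for heading, field in headings[start:]:
        if not heading.startswith(prefix) or len(matches) == limit:
            break
        matches.append((heading, field))
    return matches
```

Keeping the field alongside each heading is what lets the interface show whether a suggestion is an author, title or subject, as in the fields on the right-hand side.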

EmeraldView, a PHP frontend for Greenstone - Yitzchak Schaffer

Greenstone is a "digital library solution."

Kill the Search Button - Michael Nielsen, Jørn Thøgersen [facilitated by Roy Tennant]

http://developer.statsbiblioteket.dk/kill/code4lib

You Heard It Here First... - Roy Tennant

Roy announced the new OCLC Innovation Lab. (Mike Teets, Tip House, Rob Koopman.) Be interesting to see what comes out of that.

File Information Tool Set (FITS) - Spencer McEwen

http://fits.googlecode.com/

JavaScript E-book Reader - Eric Palmitesta

Showed the ScholarsPortal Ebooks interface. They wrote an ebook reader for it, because existing ones weren't good enough.

Faceted browse on the cheap - Tom Keays

He set up a collection of books in RefWorks, then exported to BibTeX, then ran it through Babel to turn that into JSON. Smart!

SIMILE stuff from MIT is excellent.
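Once the records are in JSON, the counting behind a faceted browse is simple. A sketch, assuming the JSON has a top-level "items" list of records (that layout and the field names are my assumption about the export, not something shown in the talk):

```python
import json
from collections import Counter

def facet_counts(json_text, field):
    """Count how many items carry each value of `field`.

    A field may hold a single string or a list of strings; items
    without the field are skipped.
    """
    items = json.loads(json_text)["items"]
    counts = Counter()
    for item in items:
        values = item.get(field, [])
        if isinstance(values, str):
            values = [values]
        counts.update(values)
    return counts
```

Calling `facet_counts(text, "year").most_common()` gives the value/count pairs a facet sidebar would display.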