I’ve collected most of the numbers from the York University Libraries annual reports into one CSV file covering 2001–2013 (those are academic years, so 2013 is 2012–2013). I just realized how to do nice year-to-base comparisons to see how things have been growing and changing since 2001, and to make it easier for myself later, I’ll post it all here.
First thing, load up two R packages we’ll need, then read in the CSV file. It’s nice how R can read in a file on the web without doing anything special. Second, glom together the archives, maps and film/audio libraries into “Scott,” which is the name of the biggest library at York. They are all small branches inside Scott. The comparisons are easier when it’s four branches, not seven. (The law library on campus is a separate unit and its numbers aren’t part of our reports.)
The above shows how the small branches were renamed, and then, using dplyr, the numbers I want are picked out and put into a nice small data frame. The question counts from the branches that were folded into Scott are summed into one number.
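For anyone who wants to try the rename-and-sum step without the real file, here is a minimal sketch. The data frame, its column names and every number in it are made up for illustration; only the idea (relabel the small branches as Scott, then sum with dplyr) comes from the description above.

```r
library(dplyr)

# Hypothetical toy data: column names and values are invented, not from the reports
toy.raw <- data.frame(
  year      = c(2001, 2001, 2001, 2001),
  branch    = c("Scott", "Archives", "Maps", "Bronfman"),
  questions = c(100, 10, 5, 40),
  stringsAsFactors = FALSE
)

# Labels assumed for the small branches that live inside Scott
small.branches <- c("Archives", "Maps", "Film/Audio")

toy.reference <- toy.raw %>%
  # Relabel the small branches so they are counted as part of Scott
  mutate(branch = ifelse(branch %in% small.branches, "Scott", branch)) %>%
  # One row per year and branch, with the question counts added together
  group_by(year, branch) %>%
  summarise(questions = sum(questions), .groups = "drop")

print(toy.reference)
```

In this toy example the Scott row ends up holding the combined count for Scott, Archives and Maps, and Bronfman is left alone.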
A note about the users column:
I want to calculate how things have changed since 2001, so I want to make ratios for all later years by dividing their numbers by those from 2001. (This assumes 2001 is an average year—I don’t know if it is, but it’s 12 years ago, which is about one-quarter of York’s existence, so it seems long enough, and besides, that’s as far back as I could easily get the numbers.)
Make a data frame holding just the 2001 numbers:
Then merge that with the yul.reference data frame. R duplicates everything as necessary.
The rows got jumbled up, but that doesn’t matter. Notice how the right base.users and base.questions numbers were repeated for every branch’s row in yearly.reference.
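The base-year merge can be sketched on toy data like this. The numbers are invented and the data frame names are placeholders; the base.users and base.questions column names are the ones mentioned above. It uses only base R's subset and merge.

```r
# Hypothetical toy data: values are invented for illustration
toy.reference <- data.frame(
  year      = c(2001, 2002, 2001, 2002),
  branch    = c("Scott", "Scott", "Bronfman", "Bronfman"),
  users     = c(1000, 1100, 200, 260),
  questions = c(500, 450, 50, 70),
  stringsAsFactors = FALSE
)

# Keep only the 2001 rows and rename the columns so they read as baselines
toy.base <- subset(toy.reference, year == 2001,
                   select = c(branch, users, questions))
names(toy.base) <- c("branch", "base.users", "base.questions")

# Merging on branch repeats each branch's 2001 numbers on every one of its rows
toy.yearly <- merge(toy.reference, toy.base, by = "branch")

print(toy.yearly)
```

Because the merge key is just the branch, each branch's single baseline row is matched against all of that branch's yearly rows, which is where the repetition comes from.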
Now all the numbers are in the right places, and it’s just a matter of dividing this by that and that by the other to find all the ratios I want. (If you want to compare one year to the previous one, lag is the way to go, as “Calculate groupwise ratio of consecutive values in R” at Stack Overflow explains.)
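A rough sketch of both kinds of ratio, again on invented numbers. The new column names (questions.ratio, questions.vs.previous) are placeholders I made up here, and the year-over-year part assumes dplyr's lag inside a per-branch grouping.

```r
library(dplyr)

# Hypothetical toy data, already carrying each branch's 2001 baseline
toy.yearly <- data.frame(
  year           = c(2001, 2002, 2003, 2001, 2002, 2003),
  branch         = rep(c("Scott", "Bronfman"), each = 3),
  questions      = c(500, 450, 400, 50, 60, 75),
  base.questions = rep(c(500, 50), each = 3),
  stringsAsFactors = FALSE
)

toy.ratios <- toy.yearly %>%
  # Year-to-base ratio: 1 means "same as 2001"
  mutate(questions.ratio = questions / base.questions) %>%
  # Sort so that lag() really does pick up the previous year within a branch
  arrange(branch, year) %>%
  group_by(branch) %>%
  # Year-over-year ratio; the first year of each branch has no predecessor, so NA
  mutate(questions.vs.previous = questions / dplyr::lag(questions)) %>%
  ungroup()

print(toy.ratios)
```

Multiplying (questions.ratio - 1) by 100 turns the year-to-base ratio into a percentage change since the base year.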
First, we can see how the number of home users has been growing at the branches:
Next, percentage growth compared to 2001. The science program at York used to be surprisingly small, but it’s been growing the last few years, and will continue to grow, and this shows it:
Next, total number of questions asked each year. It’s going down.
The dip in 2009 is explained by the lengthy strike that stopped classes for three months. Without the strike, I imagine the line from 2008–2010 might have been fairly straight, or perhaps 2010 would have been a little lower so the decline would actually be seen starting in 2007. That’s just a guess.
(There’s no way from these aggregate numbers to tell what types of questions are being asked less, or where, but the LibStats tracking we do shows that it’s almost entirely directional and tech support questions.)
Finally, looking at the number of questions per home user at each branch shows something striking:
I don’t know why there’s a dip in 2008, but you can see how clearly the Bronfman numbers went up over the last five years (until the slight decline last year) while the other branches are falling. If 2008 is anomalous and set aside, Bronfman has increased for quite a few years while the others decline.
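The questions-per-home-user figure is just one column divided by another. A minimal sketch with made-up numbers (none of these values come from the reports, and the questions.per.user column name is a placeholder):

```r
# Hypothetical toy data: values are invented for illustration
toy <- data.frame(
  year      = c(2001, 2013, 2001, 2013),
  branch    = c("Scott", "Scott", "Bronfman", "Bronfman"),
  users     = c(1000, 1250, 200, 250),
  questions = c(500, 250, 50, 100),
  stringsAsFactors = FALSE
)

# Questions asked per home user, for each branch in each year
toy$questions.per.user <- toy$questions / toy$users

print(toy)
```

Dividing by users this way makes branches of very different sizes comparable on the same scale.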