Today at work I was doing some analysis of spending on electronic resources. I’d done it a few months ago on fiscal year 2015, in a hacky kind of way, but now that F2016 is complete I had two years of data to work with. As usual I used Org and R, but this time I rejigged everything to use Hadley Wickham’s idea of tidy data and his tools for working with such data, and it made things simpler not only to work with in R but also to present in Org.
Here’s a simplified example of what it looked like.
First, I load in the R packages I’ll need for this brief example. In Org, hitting Ctrl-c Ctrl-c runs these code blocks. This one is configured to have no output.
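A minimal sketch of what such a package-loading block might contain. The choice of packages is an assumption based on the functions mentioned later in this post (gather from tidyr, geom_bar from ggplot2). In Org, a header argument such as :results silent is one way to keep a block like this from producing output.

```r
# Assumed package list: tidyr for gather(), ggplot2 for geom_bar().
# suppressPackageStartupMessages() keeps the load quiet even outside Org.
suppressPackageStartupMessages({
  library(tidyr)
  library(ggplot2)
})
```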
Next, a table of data: costs of things that librarians spend money on. (We can’t share our eresource spending data … perhaps some day.) This table is meant for people to read and it will appear in the exported PDF. The way it’s presented is good for humans, but not right for machines. I call it tab_costs because it’s a table of costs and I’m going to need to refer to the table later.
The way I have Emacs configured, that looks like this (with extra-prettified source blocks):
The next bit of R reads that table into the variable costs_raw, which I then transform with tidyr’s gather function into something more machine-usable. The gather statement says: take all the columns except “name”, and turn the column names into “year” and the cell values into “cost”. So that I can see it and make sure it’ll work, the output is given, but :exports none means that this table won’t be exported when the document is turned into a PDF. Only I can see this, in Emacs.
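The reshaping step can be sketched roughly as follows. The item names and dollar figures are invented placeholders, not real spending data, and the column names F2015 and F2016 are assumptions. In Org the table would normally be handed to the block with a :var header argument; here costs_raw is built by hand so the snippet runs on its own.

```r
library(tidyr)

# Hypothetical stand-in for the human-readable table (made-up values).
costs_raw <- data.frame(
  name  = c("Books", "Journals", "Databases"),
  F2015 = c(100, 200, 300),
  F2016 = c(110, 210, 290),
  stringsAsFactors = FALSE
)

# Gather every column except "name": the old column names go into "year",
# the cell values go into "cost". One row per (name, year) pair.
costs <- gather(costs_raw, year, cost, -name)
print(costs)
```

The wide table had one row per item and one column per year; the gathered one has three columns (name, year, cost) and one row for each combination, which is the shape ggplot2 expects.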
That’s hard for humans to read, but it means making a chart comparing spending across the two years is easy.
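A sketch of the kind of chart this makes easy, assuming a tidy data frame called costs with columns name, year and cost. The values below are invented placeholders, and the aesthetic mapping is one reasonable choice, not necessarily the one used in the original analysis.

```r
library(ggplot2)

# Hypothetical tidy data: one row per item per year (made-up values).
costs <- data.frame(
  name = rep(c("Books", "Journals", "Databases"), times = 2),
  year = rep(c("F2015", "F2016"), each = 3),
  cost = c(100, 200, 300, 110, 210, 290),
  stringsAsFactors = FALSE
)

# One group of bars per item, with the two years side by side.
# stat = "identity" uses the cost values as bar heights instead of counting rows.
p <- ggplot(costs, aes(x = name, y = cost, fill = year)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(x = "", y = "Cost", fill = "Year")
print(p)
```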
Or (see the geom_bar docs for more):
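One possible variant, sketched under the same assumptions (a tidy costs data frame with invented values): swap the roles of year and name, so the bars are grouped by year and coloured by item. This is only an illustration of how little has to change once the data is tidy, not a reconstruction of the original chart.

```r
library(ggplot2)

# Hypothetical tidy data: one row per item per year (made-up values).
costs <- data.frame(
  name = rep(c("Books", "Journals", "Databases"), times = 2),
  year = rep(c("F2015", "F2016"), each = 3),
  cost = c(100, 200, 300, 110, 210, 290),
  stringsAsFactors = FALSE
)

# Same geom, different mapping: years on the x axis, items as fill colours.
p_by_year <- ggplot(costs, aes(x = year, y = cost, fill = name)) +
  geom_bar(stat = "identity", position = "dodge") +
  labs(x = "", y = "Cost", fill = "Item")
print(p_by_year)
```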
Another Emacs screenshot showing how Org mixes code, graphics and text (well, text if I’d written some, but I didn’t here):