Brown: It’s a podcast. It’s kind of like your radio show, except people listen to it on purpose.
Enright: OK. Good. You’ve put on weight.
Episode 20 of the show, a few months back, was “Rex Murphy is Paid by the Oil Sands and the CBC Won’t Disclose or Discuss It.” Now, Rex Murphy is a blowhard who appears on the CBC news every week offering some eminently ignorable opinion about something. I stopped paying attention to him a long time ago after he spent five minutes explaining, with his tedious sesquipedalian loquaciousness, how he was tired of Britney Spears.
Any road up, turns out Murphy was taking money from Big Oil to talk at their events and then he was getting on CBC TV and saying the tar sands are a good thing, and he never mentioned he’d taken the money.
It all blew up into a bit of a storm, and the CBC did not behave properly, but one outcome is that a listing of public appearances by CBC staff appeared on their web site a little while ago. I think I saw Jesse (I hope he won’t mind if I call him Jesse, he seems familiar, I’ve been listening to his podcasts for so long) mention it on Twitter.
There’s a lot in the file, but here’s what’s most interesting:
In each entry, there’s some stuff we don’t need to bother with, but these fields look good for an initial analysis: name, date, event, role, fee. Fee is either “Paid” or “Unpaid”; it doesn’t say how much was paid.
That’s not all the data, though. There’s also the Network Radio tab. The April data in XML is available. (The April and May URLs are different; I don’t see immediately how they’re structured so I don’t know how to get at all the data in one go when there’s more than one month’s information available. This first stab at it does use all the available data, though.)
I decided to load this into R to see what I could make of it. You can load the data up into your favourite data tool or language. Have a go! Here I slurp up the XML, convert it into data frames, pick out the rows with actual data, and glom them together into one data frame, appearances.
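If R isn't your thing, here is a rough sketch of the same slurp-and-glom step in Python, using only the standard library. It is not the code used in this post. The tag names (`appearances`, `entry`, and the five field tags) and the sample rows are made up for illustration; the real CBC export will almost certainly be laid out differently, so treat this as a starting point only.

```python
import xml.etree.ElementTree as ET

# Made-up sample standing in for one tab of the export.
# Tag names and values are assumptions, not the real CBC format.
SAMPLE_XML = """
<appearances>
  <entry>
    <name>Person A</name>
    <date>2000-01-03</date>
    <event>Example Gala</event>
    <role>Host</role>
    <fee>Paid</fee>
  </entry>
  <entry>
    <name>Person B</name>
    <date>2000-01-05</date>
    <event>Example Panel</event>
    <role>Moderator</role>
    <fee>Unpaid</fee>
  </entry>
  <entry>
    <name></name>
    <date></date>
    <event></event>
    <role></role>
    <fee></fee>
  </entry>
</appearances>
"""

FIELDS = ["name", "date", "event", "role", "fee"]


def parse_appearances(xml_text):
    """Return one dict per entry, skipping rows with no name (no actual data)."""
    root = ET.fromstring(xml_text)
    rows = []
    for entry in root.iter("entry"):
        row = {field: (entry.findtext(field) or "").strip() for field in FIELDS}
        if row["name"]:
            rows.append(row)
    return rows


# Pretend we fetched two tabs, then glom them together into one list.
appearances = parse_appearances(SAMPLE_XML) + parse_appearances(SAMPLE_XML)
print(len(appearances), "rows")
```

A plain list of dicts is enough for the small counts that follow; a data frame library would be the natural next step once the files grow.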
OK, we’ve got a data frame that’s got all the information we need in it. What’s the extent of the dates?
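Checking the extent of the dates is just a min and a max once the dates are parsed. A small Python sketch, assuming the dates arrive as ISO-style strings (the real file may use another format) and using placeholder rows:

```python
from datetime import date

# Placeholder rows; the ISO date format is an assumption about the export.
appearances = [
    {"name": "Person A", "date": "2000-01-03"},
    {"name": "Person B", "date": "2000-01-07"},
    {"name": "Person C", "date": "2000-01-05"},
]

dates = [date.fromisoformat(row["date"]) for row in appearances]
first, last = min(dates), max(dates)
span_days = (last - first).days  # days between earliest and latest entry

print(first, "to", last, "=", span_days, "days")
```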
Only a week’s worth of data so far, and they’re a month behind. Well, here’s hoping they keep it up. It’s a great step, making this data public, and I hope they’re committed to it.
Let’s simplify what we have and make some counts. First, totals of paid and unpaid events.
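For anyone following along in Python, the counting can be done with `collections.Counter`. This is a sketch on invented rows, not the real data, and it shows counts per person and fee type, per person overall, and per fee type:

```python
from collections import Counter

# Invented rows for illustration only.
appearances = [
    {"name": "Person A", "fee": "Unpaid"},
    {"name": "Person A", "fee": "Unpaid"},
    {"name": "Person A", "fee": "Paid"},
    {"name": "Person B", "fee": "Paid"},
    {"name": "Person C", "fee": "Unpaid"},
]

# Events per (person, fee type), per person, and per fee type.
by_person_fee = Counter((row["name"], row["fee"]) for row in appearances)
by_person = Counter(row["name"] for row in appearances)
by_fee = Counter(row["fee"] for row in appearances)

# The busiest person is the most common name.
busiest, busiest_count = by_person.most_common(1)[0]

print(by_fee)
print(busiest, busiest_count)
```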
Adrian Harewood had four engagements that week, setting the record for busiest person. Most people had one. And most were unpaid. How many?
10 paid, 25 unpaid, that’s 35 total, so 29% were paid and 71% unpaid in this small data set. Worth watching.
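The percentages are easy to double-check. Using the two counts quoted above (everything else here is plain arithmetic):

```python
paid, unpaid = 10, 25  # counts quoted in the text above

total = paid + unpaid
paid_pct = round(100 * paid / total)      # 28.57... rounds to 29
unpaid_pct = round(100 * unpaid / total)  # 71.43... rounds to 71

print(total, paid_pct, unpaid_pct)  # 35 29 71
```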
We can make a basic chart of paid and unpaid appearances.
That could definitely do with some tidying. But for now, let’s just pick out who’s been getting paid.
Peter Mansbridge’s paid event was to speak at the Halifax Chamber of Commerce spring dinner (“Peter Mansbridge stands at such a level of public respect and visibility that, for many Canadians, he is not simply a newscaster: he is the daily voice of a nation”). I couldn’t begin to guess at how much he got paid for that, or how much people paid to attend, but I could begin to guess at how much you’d have to pay me to go. Jian Ghomeshi got paid to host the Amazon.ca First Novel Award, which seems like a good pairing.
Let’s narrow it down to just counts of who’s been paid to speak:
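A Python sketch of that narrowing, again on invented rows: filter to the paid entries, count per person, and print a crude text bar chart. The "Paid" label matches the fee values described earlier; the names and counts are placeholders.

```python
from collections import Counter

# Invented rows for illustration only.
appearances = [
    {"name": "Person A", "fee": "Paid"},
    {"name": "Person A", "fee": "Unpaid"},
    {"name": "Person B", "fee": "Paid"},
    {"name": "Person B", "fee": "Paid"},
    {"name": "Person C", "fee": "Unpaid"},
]

# Keep only paid appearances, then count them per person.
paid_counts = Counter(row["name"] for row in appearances if row["fee"] == "Paid")

# One line per person, most paid events first.
for name, count in paid_counts.most_common():
    print(f"{name:<10} {'#' * count} {count}")
```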
A dull chart, but as those spreadsheets grow, it could get more interesting. Analysis of where people are speaking, and who their audiences are, will also be interesting. And of course there’s the question of what effect, if any, this has on their reporting, and whether they disclose their paid work.
This is just a first stab at the initial small data set, but with the CBC making the information easily available, anyone can hack on this. It will be interesting to watch. Congratulations to the CBC for opening this up.