Miskatonic University Press

Books and chapters in Zotero


Last week I had two questions about book-chapter relations in Zotero. Is there a good way of handling them and making it easier to cite different chapters from the same book? This question arises in research fields where it’s common to publish edited collections, where editors oversee an entire book but each chapter is written by different authors. It’s rare to see these kinds of books in regular bookstores and libraries, but in academic libraries we see them a lot. It’s equivalent to citing poems or short stories collected in an anthology.

Can we manage this nicely in Zotero? Not really, it turns out. In your head, you know the chapters are part of the book. Structurally, the book contains the chapters. Maybe the print book is sitting on the table in front of you and physically the covers contain the chapters. In Zotero, however, the chapters and the book are at the same level: they are all equal items in the library. This makes part-whole relations more difficult.

I asked about this on Mastodon last week and got some good replies, so I’m writing this to document what I understand of it all, which isn’t much. If I’ve made any mistakes or overlooked anything, please let me know. This won’t fix anything for anyone, but I thought it was still worth noting.

Citing book sections

Let’s start with citing. You might want to cite just one chapter from one of these edited collections. Citation style guides have rules for this—see for example the Chicago Manual of Style’s sample citations for a chapter or other part of an edited book. You might want to cite two or three chapters from one book—each deserves its own citation because it’s an individual work—and CMOS 18 14.10 (paywalled), “Several contributions to the same multiauthor book,” says:

If two or more contributions to the same multiauthor book are cited, the book itself, as well as the specific contributions, may be listed in the bibliography. The entries for the individual contributions may then cross-refer to the book’s editor, thus avoiding clutter. In notes, details of the book may be given the first time it is mentioned, with subsequent references in shortened form.

It gives this example, with notes:

  1. William H. Keating, “Fort Dearborn and Chicago,” in Prairie State: Impressions of Illinois, 1673–1967, by Travelers and Other Observers, ed. Paul M. Angle (University of Chicago Press, 1968), 84–85.

  2. Sara Clarke Lippincott, “Chicago,” in Angle, Prairie State, 363.

And then bibliography:

Draper, Joan E. “Paris by the Lake: Sources of Burnham’s Plan of Chicago.” In Zukowsky, Chicago Architecture.

Harrington, Elaine. “International Influences on Henry Hobson Richardson’s Glessner House.” In Zukowsky, Chicago Architecture.

Zukowsky, John, ed. Chicago Architecture, 1872–1922: Birth of a Metropolis. Prestel-Verlag in association with the Art Institute of Chicago, 1987.

Very readable! That is nice. For author-date style, it says:

In a reference list, individual components of a multiauthor work should not be listed separately and then cross-referenced to the work as a whole (a departure from advice in previous editions that recognizes the difficulty of cross-linking references in electronic journals and similar publications). Instead, list only the multiauthor work as a whole. Components can then be mentioned in the text and cited to the collection.

This new approach (the eighteenth edition came out in September) should make things easier one day, at least for Chicago style, but I’m not going to get into that.

Example bibliography of book and chapter

Let’s start with an example I cooked up. Here’s a sample book, titled THIS IS A BOOK, as seen in Zotero. Notice it has an Editor, and Item Type is Book.

That book has a chapter, Chapter One. Here the Item Type is Book Section, and the chapter has an Author while the book has an Editor. The publication information (publisher, date, location) is the same.

There’s nothing inherent in this that says the chapter is in the book. There’s no relation between the two. We know the chapter is in the book, but Zotero doesn’t. It can make a proper Chicago citation for one chapter (exported with Zotero’s Chicago style, seventeenth edition because the update to the eighteenth is being worked on):

Sample, Marjorie Q. “Chapter One.” In THIS IS A BOOK, edited by Reginald Example, 2d ed. Toronto: Example Press, 1976.

But if I do the chapter and book, I get this, with a full citation for both the chapter and the book:

Example, Reginald, ed. THIS IS A BOOK. 2d ed. Toronto: Example Press, 1976.

Sample, Marjorie Q. “Chapter One.” In THIS IS A BOOK, edited by Reginald Example, 2d ed. Toronto: Example Press, 1976.

There’s no way to have Zotero do a short citation for the chapter and a full one for the book, as far as I can tell.

I messed around with making a stub version of the chapter item, with little information so that Zotero had to make a short citation because it didn’t have more to use, but I couldn’t get it working. I’d end up with something like this (where “n.d.” means “no date”):

Sample, Marjorie Q. “Chapter One.” In THIS IS A BOOK, edited by Example, n.d.

If you know how to get Zotero to handle citations this way, I’d love to hear how you did it. I hope I haven’t overlooked something obvious, but is everyone who’s managing this problem doing these citations by hand the hard way, or with a huge kludge?

It seems about all we can do in Zotero is tie the chapter and book together as related items. Here’s the Related tab for the book.

If I click on the plus sign (out of screenshot) Zotero pops up a menu and I can pick an item to relate. If I choose the chapter, we get this:

Clicking on “Chapter One” under Related changes the focus to the chapter, and we see there is a reciprocal relation: the chapter is related to the book.

That’s useful but limited. The relation has no type or direction: Zotero knows this item is related to that item, but we know that one is a part contained in the other, that this chapter is part of this book. Maybe an article refutes a book, or a book reviews a film. Zotero doesn’t allow us to give those details.

Requests to understand hierarchy go back a long time. Here’s “Hierarchical item relationships” from the Zotero forums, a thread that started in 2007, where four years ago someone helpfully added links to several other similar discussions. There don’t seem to be any plans to add this feature.

Creating sections from books, and vice versa

Zotero can generate a book section from a book, and a book from a book section. Here I right-click on the book, and the menu has a Create Book Section option:

That creates a new entry like this, which knows it’s a book section and that it’s related to the book (out of screenshot). You’ll need to add an Author field, then fill it in along with the Title.

It works similarly the other way. This does make it easier to work with Zotero as it is, but we’re still left with chapters and books being equal in Zotero.

Chapters in a subcollection?

It occurred to me that maybe something could be done by putting chapters in a subcollection under the book, but I didn’t get anywhere.

The future?

Perhaps the change will come through or in concert with the Citation Style Language, which is where Zotero gets all its citation styles. I saw this comment from early this year in a discussion of handling multi-volume works: “There were discussions about changing the currently flat data model to a hierarchical/structured data model, which would allow for such things. But don’t expect anything to happen regarding this in the short term.” CSL is an incredible piece of work. It’s astounding how much it handles and how well it works. I’m a professional librarian, and you can trust me when I say the thousands of bibliographic citation styles that exist are almost entirely unnecessary and dealing with them is maddeningly finicky. But CSL handles them, for free.

Zotero itself is a vastly complex project, with a PDF viewer and editor, cloud storage, cross-device synchronization, code that understands umpteen different publisher platforms, and more, all working on multiple operating systems. For free! It’s one of the best free and open source software projects in the world, and has made a huge difference to the lives and work of researchers of all sorts all over the world.

Many thanks to Zotero and CSL and everyone involved with the projects for all the incredible work they do!

Database queries

This section is technical. I got curious about how Zotero’s database is set up to contain all this information, and spent a little while digging into it. “Exploring Zotero Data Model for Direct Database Access” by GitHub user pchemguy was really helpful with this. I knew Zotero uses SQLite, but I didn’t know it uses the entity-attribute-value model, which was completely new to me. It’s good for dealing with flexible and sparse data such as bibliographic metadata, but it sure leads to a lot of table joins.
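To make the entity-attribute-value idea concrete, here is a minimal sketch in Python using an in-memory SQLite database. The three tables are a cut-down imitation of Zotero’s fields, itemDataValues and itemData tables, with only the columns that show up in the queries in this section; the real schema has many more tables and columns. The helper name item_fields is made up for illustration, and the ID numbers are copied from the example output further down.

```python
import sqlite3

# Toy in-memory database: a simplified imitation of three Zotero tables,
# NOT the real schema.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE fields (fieldID INTEGER PRIMARY KEY, fieldName TEXT);
CREATE TABLE itemDataValues (valueID INTEGER PRIMARY KEY, value TEXT);
CREATE TABLE itemData (itemID INTEGER, fieldID INTEGER, valueID INTEGER);

INSERT INTO fields VALUES (1, 'title'), (44, 'bookTitle');
INSERT INTO itemDataValues VALUES (13352, 'THIS IS A BOOK'), (13353, 'Chapter One');
-- One row per fact: entity (itemID), attribute (fieldID), value (valueID).
INSERT INTO itemData VALUES (4436, 1, 13353), (4436, 44, 13352), (4437, 1, 13352);
""")


def item_fields(con, item_id):
    """Return {fieldName: value} for one item by joining the three tables."""
    rows = con.execute(
        """
        SELECT f.fieldName, idv.value
        FROM itemData id
        JOIN fields f ON id.fieldID = f.fieldID
        JOIN itemDataValues idv ON id.valueID = idv.valueID
        WHERE id.itemID = ?
        """,
        (item_id,),
    ).fetchall()
    return dict(rows)


print(item_fields(con, 4436))  # the chapter
print(item_fields(con, 4437))  # the book
```

Each fact about an item is one narrow row, so reading even a single field back as text takes two joins. The string THIS IS A BOOK is stored only once and shared by both items.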

Here’s how I got started. I looked at Zotero’s documentation on its SQLite database and classified it as “too technical for me,” so I just started poking around. I closed Zotero and ran this in a shell to open up its database, knowing where the data directory is:

sqlite3 ~/Zotero/zotero.sqlite
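For scripted poking around, one cautious option is to copy the database file and open the copy read-only, so nothing can be changed by accident. This is only a sketch: the function name open_copy_readonly is made up, the path is the one used above and may differ on other machines, and Zotero should still be closed before copying so the copy is consistent.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def open_copy_readonly(db_path):
    """Copy the database file to a temp directory and open the copy read-only."""
    src = Path(db_path).expanduser()
    tmp_dir = Path(tempfile.mkdtemp(prefix="zotero-copy-"))
    dst = tmp_dir / src.name
    shutil.copy2(src, dst)
    # A file: URI with mode=ro makes SQLite refuse writes on this connection.
    return sqlite3.connect(f"{dst.as_uri()}?mode=ro", uri=True)


if __name__ == "__main__":
    db = Path("~/Zotero/zotero.sqlite").expanduser()  # assumed location
    if db.exists():
        con = open_copy_readonly(db)
        # Rough equivalent of the shell's .tables command.
        names = [row[0] for row in con.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        print(names)
        con.close()
```

Any INSERT or UPDATE on such a connection fails with an error instead of touching the data.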

I ran these commands. The first two set up some formatting, then .tables outputs a list of all the database tables:

sqlite> .headers on
sqlite> .mode column
sqlite> .tables
baseFieldMappings          itemData
baseFieldMappingsCombined  itemDataValues
charsets                   itemNotes
collectionItems            itemRelations
collectionRelations        itemTags
collections                itemTypeCreatorTypes
creatorTypes               itemTypeFields
creators                   itemTypeFieldsCombined
customBaseFieldMappings    itemTypes
customFields               itemTypesCombined
customItemTypeFields       items
customItemTypes            libraries
dbDebug1                   proxies
deletedCollections         proxyHosts
deletedItems               publicationsItems
deletedSearches            relationPredicates
feedItems                  retractedItems
feeds                      savedSearchConditions
fieldFormats               savedSearches
fields                     settings
fieldsCombined             storageDeleteLog
fileTypeMimeTypes          syncCache
fileTypes                  syncDeleteLog
fulltextItemWords          syncObjectTypes
fulltextItems              syncQueue
fulltextWords              syncedSettings
groupItems                 tags
groups                     translatorCache
itemAnnotations            users
itemAttachments            version
itemCreators

Let’s try digging for a known value: the title of the book.

sqlite> SELECT * FROM itemDataValues WHERE value = 'THIS IS A BOOK';
valueID  value
-------  --------------
13352    THIS IS A BOOK

Let’s find the items that have data with that value.

sqlite> SELECT * FROM itemData WHERE valueID = 13352;
itemID  fieldID  valueID
------  -------  -------
4436    44       13352
4437    1        13352

That says some items have fields with a value we know is THIS IS A BOOK, but not what the fields are. For that we need a more complex query:

sqlite> SELECT i.itemID, i.fieldID, i.valueID, f.fieldName
FROM itemData i, fields f
WHERE i.fieldID = f.fieldID
AND i.valueID = 13352;
itemID  fieldID  valueID  fieldName
------  -------  -------  ---------
4436    44       13352    bookTitle
4437    1        13352    title

All right, that’s getting somewhere. Let’s join more tables:

sqlite> SELECT i.itemID, i.fieldID, i.valueID, f.fieldName, iv.value
FROM itemData i, itemDataValues iv, fields f
WHERE i.fieldID = f.fieldID
AND i.valueID = iv.valueID
AND i.valueID = 13352;
itemID  fieldID  valueID  fieldName  value
------  -------  -------  ---------  --------------
4436    44       13352    bookTitle  THIS IS A BOOK
4437    1        13352    title      THIS IS A BOOK

Things with titles that we’re looking for: good. Let’s find more about items with IDs 4436 and 4437. I’ll skip over a few steps here, where I was joining more tables, and jump to this next query. Remember, I’m no database expert, and I’m sure there are tidier and more efficient ways to do this, but it works.

sqlite> SELECT i.itemID, i.itemTypeID, it.typeName, i.key, id.fieldID, f.fieldName, id.valueID, idv.value
FROM items i, itemData id, fields f, itemTypes it, itemDataValues idv
WHERE i.itemID = id.itemID
AND id.fieldID = f.fieldID
AND i.itemTypeID = it.itemTypeID
AND id.valueID = idv.valueID
AND i.itemID IN (4436, 4437);

itemID  itemTypeID  typeName     key       fieldID  fieldName       valueID  value
------  ----------  -----------  --------  -------  --------------  -------  --------------------------------
4436    7           bookSection  DV3YIBUF  1        title           13353    Chapter One
4436    7           bookSection  DV3YIBUF  6        date            13094    1976-00-00 1976
4436    7           bookSection  DV3YIBUF  7        language        404      eng
4436    7           bookSection  DV3YIBUF  11       libraryCatalog  11781    ocul-yor.primo.exlibrisgroup.com
4436    7           bookSection  DV3YIBUF  21       place           9267     Toronto
4436    7           bookSection  DV3YIBUF  23       publisher       13371    Example Press
4436    7           bookSection  DV3YIBUF  42       edition         13099    2d ed.
4436    7           bookSection  DV3YIBUF  44       bookTitle       13352    THIS IS A BOOK
4437    6           book         YT4DGR9U  1        title           13352    THIS IS A BOOK
4437    6           book         YT4DGR9U  6        date            13094    1976-00-00 1976
4437    6           book         YT4DGR9U  7        language        404      eng
4437    6           book         YT4DGR9U  11       libraryCatalog  11781    ocul-yor.primo.exlibrisgroup.com
4437    6           book         YT4DGR9U  21       place           9267     Toronto
4437    6           book         YT4DGR9U  23       publisher       13371    Example Press
4437    6           book         YT4DGR9U  42       edition         13099    2d ed.

There we have the chapter and the book, with all the publishing information detailed, and database ID numbers that show how it’s all tied together. The key, for example YT4DGR9U, is used by Zotero when it’s storing attachments on disk. Look in storage/ in your Zotero data directory and you’ll see lots of directories with names like this.
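One detail in that output is that the chapter’s bookTitle and the book’s title share the same valueID, 13352. As a heuristic sketch only, a self-join on itemData can list candidate chapter and book pairs that share a title value. The toy tables below are simplified stand-ins for Zotero’s, and the approach is fragile: it misses pairs whose titles differ by even one character, and it would wrongly pair a chapter with any other item that happens to have the same title.

```python
import sqlite3

# Toy stand-ins for two Zotero tables, loaded with values from the output above.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE fields (fieldID INTEGER PRIMARY KEY, fieldName TEXT);
CREATE TABLE itemData (itemID INTEGER, fieldID INTEGER, valueID INTEGER);
INSERT INTO fields VALUES (1, 'title'), (44, 'bookTitle');
INSERT INTO itemData VALUES (4436, 1, 13353), (4436, 44, 13352), (4437, 1, 13352);
""")

# Pair every item whose bookTitle value is some other item's title value.
PAIRS_SQL = """
SELECT ch.itemID AS chapterID, bk.itemID AS bookID
FROM itemData ch
JOIN fields cf ON ch.fieldID = cf.fieldID AND cf.fieldName = 'bookTitle'
JOIN itemData bk ON bk.valueID = ch.valueID
JOIN fields bf ON bk.fieldID = bf.fieldID AND bf.fieldName = 'title'
ORDER BY chapterID, bookID
"""

pairs = con.execute(PAIRS_SQL).fetchall()
print(pairs)
```

A more careful version would also join to the item types and keep only pairs where one side is a bookSection and the other a book.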

But there is nothing here about creators or relations! That information is in other tables. We can do some joins to get at the creators and their roles:

sqlite> SELECT ic.itemID, ic.creatorID, ic.creatorTypeID, c.firstName, c.lastName, ct.creatorType
FROM itemCreators ic, creators c, creatorTypes ct
WHERE ic.creatorID = c.creatorID
AND ic.creatorTypeID = ct.creatorTypeID
AND ic.itemID IN (4436, 4437);
itemID  creatorID  creatorTypeID  firstName    lastName  creatorType
------  ---------  -------------  -----------  --------  -----------
4436    1506       10             Reginald     Example   editor
4436    1509       8              Marjorie Q.  Sample    author
4437    1506       10             Reginald     Example   editor

That’s easy enough to read. The relations are trickier. With a bit of work I got this:

sqlite> SELECT ir.itemID, ir.object, ir.predicateID, rp.predicate
FROM itemRelations ir, relationPredicates rp
WHERE ir.predicateID = rp.predicateID
AND itemID in (4436, 4437);
itemID  object                                       predicateID  predicate
------  -------------------------------------------  -----------  -----------
4436    http://zotero.org/users/4291/items/YT4DGR9U  3            dc:relation
4437    http://zotero.org/users/4291/items/DV3YIBUF  3            dc:relation
4437    http://zotero.org/users/4291/items/Q32SKZ3X  3            dc:relation

What’s with those object values? They are URLs, but they don’t work. Now, as it happens, 4291 is the ID for my Zotero account (except it isn’t, I made that up just in case I shouldn’t let my ID number be known). I know that from looking at URLs when I’m looking at my account details on the Zotero site.

I didn’t look into this at all, so forgive my guess, but with an object and a predicate and a URL I got to thinking about the RDF model of triples. The DC in dc:relation is Dublin Core, a simple metadata schema that has fifteen elements, one of which is relation: “A reference to a related resource. Recommended best practice is to reference the resource by means of a string or number conforming to a formal identification system.” Simple Dublin Core leaves the relation at that, but Qualified Dublin Core (which came later and is a little more complicated) has some defined relationships, one of which is IsPartOf. Maybe in the future Qualified Dublin Core might help?
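To show what a typed, directed relation could look like in this kind of table layout, here is a purely hypothetical sketch. The predicate dcterms:isPartOf and its ID 99 are invented for the example and do not appear in any output in this post; the toy tables imitate only the columns seen in the relations query above, and the function name parts_of is made up.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE relationPredicates (predicateID INTEGER PRIMARY KEY, predicate TEXT);
CREATE TABLE itemRelations (itemID INTEGER, predicateID INTEGER, object TEXT);
INSERT INTO relationPredicates VALUES (3, 'dc:relation');
-- Hypothetical predicate, invented for this sketch.
INSERT INTO relationPredicates VALUES (99, 'dcterms:isPartOf');
-- The chapter (4436) points at the book's URI: the relation has a direction.
INSERT INTO itemRelations VALUES
  (4436, 99, 'http://zotero.org/users/4291/items/YT4DGR9U');
""")


def parts_of(con, book_uri):
    """Return itemIDs that declare themselves part of the given book URI."""
    rows = con.execute(
        """
        SELECT ir.itemID
        FROM itemRelations ir
        JOIN relationPredicates rp ON ir.predicateID = rp.predicateID
        WHERE rp.predicate = 'dcterms:isPartOf' AND ir.object = ?
        ORDER BY ir.itemID
        """,
        (book_uri,),
    ).fetchall()
    return [row[0] for row in rows]


print(parts_of(con, "http://zotero.org/users/4291/items/YT4DGR9U"))
```

Storing a relation like this is the easy part. The hard part would be getting Zotero’s interface, sync and citation styles to understand it.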

In any case, the last part of the object URI is familiar: we saw YT4DGR9U above because it is the key for item 4437. I didn’t try to pick that piece of the object value out, but with this query we can see by eye that each thing we’re interested in is related to the other. In other words, this is where it’s defined that the book is related to the chapter.

sqlite> SELECT i.key, ir.itemID, ir.object, ir.predicateID, rp.predicate
FROM items i, itemRelations ir, relationPredicates rp
WHERE i.itemID = ir.itemID
AND ir.predicateID = rp.predicateID
AND i.itemID in (4436, 4437);
key       itemID  object                                       predicateID  predicate
--------  ------  -------------------------------------------  -----------  -----------
DV3YIBUF  4436    http://zotero.org/users/4291/items/YT4DGR9U  3            dc:relation
YT4DGR9U  4437    http://zotero.org/users/4291/items/DV3YIBUF  3            dc:relation
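Picking the key out of the object value is not hard to sketch. The example below assumes every object URI ends in a slash followed by an item key, as the ones above do; the toy tables hold only the columns needed, and key_from_uri is a made-up helper registered as an SQL function so that it can be used inside a join.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE items (itemID INTEGER PRIMARY KEY, key TEXT);
CREATE TABLE itemRelations (itemID INTEGER, predicateID INTEGER, object TEXT);
INSERT INTO items VALUES (4436, 'DV3YIBUF'), (4437, 'YT4DGR9U');
INSERT INTO itemRelations VALUES
  (4436, 3, 'http://zotero.org/users/4291/items/YT4DGR9U'),
  (4437, 3, 'http://zotero.org/users/4291/items/DV3YIBUF');
""")


def key_from_uri(uri):
    """Return the text after the last slash of a relation object URI."""
    return uri.rsplit("/", 1)[-1]


# Make the Python helper callable from SQL on this connection.
con.create_function("key_from_uri", 1, key_from_uri)

rows = con.execute("""
SELECT ir.itemID, rel.itemID AS relatedItemID
FROM itemRelations ir
JOIN items rel ON rel.key = key_from_uri(ir.object)
ORDER BY ir.itemID
""").fetchall()
print(rows)
```

Each row pairs an item with the itemID of the item it is related to, so the reciprocal relation shows up as numbers rather than having to be matched by eye. URIs that point at items not in the local items table simply drop out of the join.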

I already knew that bibliographic metadata is difficult, and that databases are difficult, and a few hours of ploughing through this showed me in a new way that storing bibliographic metadata in a database is difficult. Adding part-whole relations seems like it would be a major project.

I enjoyed exploring the Zotero database for a little while. It gave me a better understanding of how Zotero works, and greater appreciation for all the developers who make tools and plug-ins that work with Zotero. Thank you!

PS

I asked one of DuckDuckGo’s AI chat bots to help me write some SQL queries on the Zotero database. I made a simple request, and it gave me a query that didn’t work: it referenced columns that don’t exist. I told it the query didn’t work, and it gave me a new query, which didn’t work. I told it the query didn’t work, and it gave me another query, which also didn’t work. I told it none of these queries worked and expressed some displeasure, and it asked me for help in understanding the database model. Bah!

Speaking of edited collections, my first publication was a chapter in the great Arlene Taylor’s collection Understanding FRBR: What It Is and How It Will Affect Our Retrieval Tools. She was very generous and I am forever grateful to her.