As the Digital Publishing Librarian, I am frequently asked what format researchers should use to publish their materials in our open access institutional repository or in our open access journals. Leaving other media aside for now, I will focus only on text files in this post.
The truth is, I spend a lot of time thinking about how best to advise researchers on this question, and no great answer seems readily available. My default response is: if you want to use PDF, please use archival PDF/A-1. In my position I recognize how important it is to authors and editors that a document look its best, but I also need to think about how best to preserve it, digitally, for a long time. I’m not a preservationist by a long shot, but these matters still keep me up some nights.
We have experimented with a few projects that stray from the PDF. For example, The Medieval Review produces its articles in XML and supplies those files to our repository, where we transform them into HTML. We hold on to the XML files primarily because we think they could be useful to our preservation strategy.
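To make the XML-to-HTML step a little more concrete, here is a minimal sketch in Python. It is not the tooling actually used for The Medieval Review. The sample article, the element names (borrowed from the NISO Journal Article Tag Suite), and the function name `article_to_html` are all assumptions for illustration, and a real workflow would more likely rely on a maintained XSLT stylesheet.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample article; the element names follow JATS conventions,
# but this is an assumed structure, not any journal's real schema.
SAMPLE_XML = """<article>
  <front>
    <article-meta>
      <title-group>
        <article-title>A Sample Review</article-title>
      </title-group>
    </article-meta>
  </front>
  <body>
    <sec>
      <title>Introduction</title>
      <p>First paragraph of the review.</p>
      <p>Ampersands &amp; angle brackets are escaped.</p>
    </sec>
  </body>
</article>"""


def article_to_html(xml_text: str) -> str:
    """Turn a very simple XML article into a standalone HTML page."""
    root = ET.fromstring(xml_text)
    title = root.findtext(".//article-title", default="Untitled")

    parts = [
        "<!DOCTYPE html>",
        '<html lang="en">',
        "<head>",
        '<meta charset="utf-8">',
        f"<title>{escape(title)}</title>",
        "</head>",
        "<body>",
        f"<h1>{escape(title)}</h1>",
    ]

    # Each top-level section becomes a heading followed by its paragraphs.
    for sec in root.findall("./body/sec"):
        heading = sec.findtext("title")
        if heading:
            parts.append(f"<h2>{escape(heading)}</h2>")
        for p in sec.findall("p"):
            # Flatten any inline markup to plain text for this sketch.
            text = "".join(p.itertext()).strip()
            parts.append(f"<p>{escape(text)}</p>")

    parts += ["</body>", "</html>"]
    return "\n".join(parts)


if __name__ == "__main__":
    print(article_to_html(SAMPLE_XML))
```

The point of the sketch is that the XML stays the master copy and the HTML is a derived view that can be regenerated whenever the template changes, which is part of why the XML seems worth keeping.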
We’ve also worked with Museum Anthropology Review (see volume 5, issues 1-2), as time and staff help permit, to create HTML versions of their PDF articles using a template a crafty student created. While these files in particular are great HTML files, they take quite a bit of time to create, as I learned one Friday afternoon last month when the editor and I sat down to try to create them ourselves!
Yes, time is a large part of the problem, and staff expertise as well. I have asked other library staff doing similar work about their editorial practices and their support for creating well-formed, preservable articles, and the general response boils down to this: we’re a shoestring shop, trying to get by and do good work without spending a lot of time and money on the format of the output, so we resort to what seems good enough and what people like: the PDF.
In our spare time, folks like me keep abreast of the NISO Journal Article Tag Suite – Standardized Markup for Journal Articles. We play around with Annotum, an open-source, open-process, open-access scholarly authoring and publishing platform based on WordPress, which allows for the easy creation of XML-based articles. We try to create XML templates in Microsoft Word. If you read into these projects you may notice that many of them focus on scientific publications, and I thank these developers for venturing into these arenas. Most of the publications I support to date are humanities-based, and I am hopeful that humanists will continue to explore viable options: ones that are easy for authors, editors, and peer reviewers to use and, of course, that readers like to read. I look forward to the possibility of discussing these questions at venues such as ThatCamp Publishing 2012.
This post is as much a call for responses as it is an attempt to point others wondering about these matters to useful resources. I thank people like Michael Fenner at PLoS and Matthew Gold at CUNY for delving into these matters as well.