Improve version control of research assets
Research outputs typically go through several iterations or versions over the course of their lifecycle. NISO, for example, defines up to seven versions of scholarly journal articles (http://www.niso.org/publications/niso-rp-8-2008-jav). To avoid confusion, support proper citation, and ensure compliance with publisher open access policies, it is critically important that institutional repositories clearly distinguish between different versions of the same work by prominently displaying this information in both the asset record and the PDF cover sheet.
Esploro currently offers two options for communicating version: asset type and file type.
Asset type refers to the type of resource being deposited. When categorizing asset types, Esploro distinguishes between “Publications” and “Posted Content”. “Publications” refers to published works such as journal articles, book chapters, conference proceedings, etc. “Posted Content” refers to works that have not been formally published, which Esploro defines as working papers, preprints, and accepted manuscripts. Each asset can be assigned only one asset type. Asset type values are predefined and not customizable.
File type is used to describe the individual files or external links associated with an asset. The list of default values consists of a mix of terms used to indicate format (text, image, audio, video) and version (preprint, submitted, accepted, published, correction). Unlike asset type, however, file type values are customizable. Although choosing a file type is not required for deposit, this is the field Esploro uses to display version information on PDF cover sheets.
We believe the overlapping use of asset type and file type to capture version information is a potential source of confusion and data inconsistency. For example, if a researcher wishes to deposit an article that has been accepted for publication, should they deposit it as an "Accepted Manuscript" or as a "Journal Article" with the file type "Accepted"? Either is possible, but having two ways to deposit the same work will likely confuse researchers. This inconsistency also creates problems for discovery and analytics. For example, a user who wanted a list of all preprints would retrieve only those defined as preprints at the asset level and would miss those identified only at the file level. Meanwhile, version information appears on cover sheets only if it is specified at the file level. Avoiding this confusion requires a single, consistent model for representing versions in Esploro.
Ultimately, we don’t think it makes sense to manage version as an asset type or a file type; it is neither. Instead, we recommend version be managed as a completely separate field. When a researcher deposits an asset, they should be prompted to select the version of the asset being deposited. This information should be prominently displayed in both the asset record and cover sheet and be queryable in Analytics.
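To illustrate the recommendation, here is a minimal sketch of a record model in which version is its own field, independent of asset type and file type. It is not Esploro's actual data model or API: the names Version, Asset, cover_sheet_line, and assets_with_version are hypothetical, and the version values are simply the version terms from the default file type list described above.

```python
from dataclasses import dataclass
from enum import Enum


class Version(Enum):
    # Hypothetical controlled list, taken from the version terms
    # that currently appear among the default file type values.
    PREPRINT = "preprint"
    SUBMITTED = "submitted"
    ACCEPTED = "accepted"
    PUBLISHED = "published"
    CORRECTION = "correction"


@dataclass
class Asset:
    title: str
    asset_type: str   # what the work is, e.g. "Journal Article"
    version: Version  # which iteration of the work was deposited


def cover_sheet_line(asset: Asset) -> str:
    """Build the text shown on the record and the cover sheet."""
    return f"{asset.title} ({asset.asset_type}, {asset.version.value} version)"


def assets_with_version(assets: list, version: Version) -> list:
    """Return every asset deposited as the given version."""
    return [a for a in assets if a.version is version]
```

With a single field, the same value drives the asset record, the cover sheet, and any analytics query, so a search for preprints no longer depends on how the depositor chose to encode the version.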
We further recommend that Esploro allow researchers to submit new versions of previously deposited works. Different versions of the same work should either be FRBRized or displayed in a way that directs users from superseded version(s) to the current preferred version.
i. When depositing an asset, prompt researchers to indicate if the asset is a new version of a previously deposited work.
ii. If the asset is a new version, allow the researcher to select the previous version from a list of their previously deposited assets.
iii. After the previous version has been selected, a) auto-populate the deposit form with the metadata from the previous version, and b) automatically link the old and new versions via the appropriate relationships (is new version of / is previous version of).
iv. Once the new version has been approved, either a) FRBRize all versions with the latest as the preferred record, or b) exclude the previous version from search indexing (but continue to allow it to be accessible via DOI) and add a prominent note notifying users that this version has been superseded and directing them to the latest version.
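The steps above can be sketched as follows, using option (b) of step iv. This is an illustration under stated assumptions, not a description of how Esploro works: the Deposit record, the in-memory repo dictionary, and the functions start_new_version and approve are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Deposit:
    asset_id: str
    title: str
    authors: list
    version: str
    previous_id: Optional[str] = None  # "is new version of"
    next_id: Optional[str] = None      # "is previous version of"
    approved: bool = False
    indexed: bool = True               # included in search results
    note: str = ""


def start_new_version(repo: dict, previous_id: str,
                      new_id: str, version: str) -> Deposit:
    """Steps i-iii: copy metadata from the chosen previous version and link both records."""
    prev = repo[previous_id]
    new = Deposit(
        asset_id=new_id,
        title=prev.title,            # auto-populated metadata
        authors=list(prev.authors),  # copied so later edits stay independent
        version=version,
        previous_id=previous_id,
    )
    prev.next_id = new_id
    repo[new_id] = new
    return new


def approve(repo: dict, asset_id: str) -> None:
    """Step iv, option (b): on approval, retire the previous version from search."""
    new = repo[asset_id]
    new.approved = True
    if new.previous_id is not None:
        prev = repo[new.previous_id]
        prev.indexed = False  # hidden from search, record itself is kept
        prev.note = f"Superseded. See the latest version: {new.asset_id}"
```

In this sketch the earlier record is never deleted, so it would remain reachable by its identifier; only its search visibility and its note change once the new version is approved.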
To summarize, we believe the following enhancements would significantly improve version control in Esploro:
1) Create a single, dedicated, and prominently displayed field for version
2) Allow researchers to easily deposit new versions of previously deposited works
3) FRBRize different versions of the same work OR provide some mechanism for identifying and directing users to the latest or preferred version of a work
Removing the "accepted" status so that institutions can continue voting, as agreed with the PWG.