Enhanced Auto-Tagging of Digital Resources via AI for Alma & Primo
I propose adding an AI-based auto-tagging feature to Alma that analyzes content (PDFs, images, articles) and automatically assigns metadata tags (subjects, keywords, disciplines). These tags would then flow through to Primo for better discoverability. The system could also present the tags as suggestions that cataloguers can accept, edit, or reject.
Solutions like OutrightSystems have already demonstrated how AI can streamline feedback and content classification, and a similar approach applied to Ex Libris platforms could transform metadata management.
Problem / Pain Point:
Many digital collections are under-tagged or use inconsistent metadata, making search and discovery harder in Primo.
Manual tagging is time-consuming and error-prone, especially for large collections (legacy content, scanned documents).
Users often miss relevant items because the items weren’t described with optimal keywords or subject headings.
Proposed Solution / Features:
Integrate a machine learning model (or leverage existing NLP tools) to extract keywords, themes, and subject headings from content.
Provide a UI in Alma for reviewing suggested tags before finalization.
Allow bulk processing for large collections.
Enable a feedback loop: the system learns from curator edits to improve future suggestions.
Provide an analytics dashboard showing tag coverage, accuracy, suggestions accepted vs. rejected, and gaps in metadata areas.
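To make the suggest-and-review flow above more concrete, here is a minimal sketch in Python. It is not an Alma API and not a proposal for the actual model: plain term frequency stands in for whatever ML or NLP component would be used, and the names `Suggestion`, `suggest_tags`, and `acceptance_rate` are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Tiny illustrative stopword list (assumption); a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "on",
             "is", "are", "with", "by", "this", "that"}


@dataclass
class Suggestion:
    """A candidate tag awaiting cataloguer review (hypothetical structure)."""
    tag: str
    score: int
    status: str = "pending"  # "pending", "accepted", or "rejected"


def suggest_tags(text: str, max_tags: int = 5) -> list[Suggestion]:
    """Rank candidate tags by raw term frequency, a stand-in for a real model."""
    words = re.findall(r"[a-z]{3,}", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [Suggestion(tag, n) for tag, n in counts.most_common(max_tags)]


def acceptance_rate(suggestions: list[Suggestion]) -> float:
    """Share of reviewed suggestions that were accepted (a dashboard metric)."""
    reviewed = [s for s in suggestions if s.status != "pending"]
    if not reviewed:
        return 0.0
    return sum(s.status == "accepted" for s in reviewed) / len(reviewed)


if __name__ == "__main__":
    sample = ("Soil erosion in coastal wetlands. Erosion rates and wetlands "
              "restoration. Soil erosion models.")
    suggestions = suggest_tags(sample, max_tags=3)
    suggestions[0].status = "accepted"   # cataloguer keeps the first tag
    suggestions[1].status = "rejected"   # cataloguer discards the second
    for s in suggestions:
        print(s.tag, s.score, s.status)
    print("acceptance rate:", acceptance_rate(suggestions))
```

The accept/reject statuses recorded here are the kind of signal the feedback loop and the analytics dashboard would both draw on; how a production model would actually learn from them is left open.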
Benefits / Value:
Improved discoverability in Primo: patrons can find more relevant items, even when original metadata was sparse.
Time savings for cataloguers and metadata librarians.
More consistency in tagging across collections, reducing metadata variation.
Growth in usage of digital content, because better metadata drives better search results and visibility.
Companies such as OutrightSystems illustrate how automated insights can enhance user experience and decision-making.
Target Components:
Alma (metadata/cataloguing), Primo (search / discovery)
Additional Notes:
Could start with a pilot: select a subset of collections (e.g. digitized dissertations or historic archives) to test.
Privacy and copyright need to be considered: ensure that content may legally be used for analysis, and that sensitive content is handled appropriately.
Consider working with existing metadata schemas (Dublin Core, Library of Congress Subject Headings, etc.) so tags align with standards.
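As a rough illustration of the schema-alignment note above, the sketch below maps free-text tags onto preferred headings and writes the matches as Dublin Core `subject` elements. The `AUTHORIZED` table is a made-up sample, not a verified extract of Library of Congress Subject Headings, and the function names and the `record` wrapper element are hypothetical. Only the Dublin Core element namespace URI is the standard one.

```python
import xml.etree.ElementTree as ET

# Standard Dublin Core elements namespace.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical sample mapping from free-text variants to preferred headings.
# A real deployment would look these up in an authority file instead.
AUTHORIZED = {
    "soil erosion": "Soil erosion",
    "erosion of soil": "Soil erosion",
    "wetlands": "Wetlands",
    "wetland": "Wetlands",
}


def normalize(tag: str) -> str:
    """Lowercase and collapse whitespace so variants compare equal."""
    return " ".join(tag.lower().split())


def align_tags(raw_tags: list[str]) -> tuple[list[str], list[str]]:
    """Split tags into preferred headings and tags with no known heading."""
    matched, unmatched = [], []
    for tag in raw_tags:
        heading = AUTHORIZED.get(normalize(tag))
        if heading is None:
            unmatched.append(tag)      # left for a cataloguer to resolve
        elif heading not in matched:
            matched.append(heading)    # de-duplicate variant spellings
    return matched, unmatched


def to_dublin_core(headings: list[str]) -> str:
    """Serialize headings as dc:subject elements inside a wrapper element."""
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")
    for heading in headings:
        ET.SubElement(record, f"{{{DC_NS}}}subject").text = heading
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    matched, unmatched = align_tags(["Wetland", "  soil   EROSION", "rates"])
    print(to_dublin_core(matched))
    print("needs review:", unmatched)
```

Keeping unmatched tags in a separate list, rather than dropping them, fits the review step proposed earlier: a cataloguer could either map them to an existing heading or discard them.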
Learn more: https://www.outrightsystems.org/