[PRIMO] Allow normalization of Primo Central Index (PC) records
It would be great to be able to normalize Primo Central records using a normalization rule (NR) set, as we do for local records.
This would be very beneficial. For example:
- Customers could choose their own facet values and harmonize them with local records.
- Values that exist only in PC records could be removed or modified (e.g. top-level facet values for Open Access and Peer Reviewed).
- FRBR and Dedup could also be performed against local records, thus greatly improving results in blended search.
Stacey van Groll commented
Thanks for providing more detail as to how you would like to see this working in practice. It helps to conceptualise it.
I'm stuck on the fact that local norm rules apply to your own Primo database of millions of records at most, whereas PCI holds billions. Our expanded search is over 950,000,000 records. At the moment there is only one PCI server globally, with another hopefully coming on board soon. I would hazard a guess that sheer size is part of the reason there are only one or two. But this idea would result in every site storing a local copy of each activated PCI record in its own Primo database. I can maybe see the Dedup / FRBR portion working on the Primo VE model, where records are dynamically grouped at query time, but I don't see how it would be possible for the Primo Back Office model, where they are grouped at index time.
But this is definitely an IMHO response, and I could be way off the mark with my understanding of the underlying infrastructure and capacity variables in play.
Luigi Siciliano commented
Dear Stacey, dear all, the idea is not to have a pipe and a norm rule set for each collection.
The customers should be able to select sources for PC exactly as they are doing now. And records in PC should be normalized by Ex Libris in the current PNX format exactly as they are now.
But in Primo BE we should have an additional layer (it could be a pipe or something similar) with its own norm rule set. This norm rule set would normalize, where required, the PC PNX records delivered by Ex Libris.
In other words, I can imagine a pipe named "customerPCI" that has the customer's collections as its data source and the current PNX as its source format, with a normalization rule set that takes PNX fields as source and applies transformation routines where needed.
That would make local PNX and PC PNX records more consistent and improve blended search.
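To make the idea concrete, here is a minimal sketch in Python of what such an extra normalization layer might do. It is only an illustration, not how Primo implements norm rules (those are configured in the Back Office, not written in Python). The flat "section/field" record layout, the function names, the rule table CUSTOMER_PCI_RULES, and the facet values are all assumptions made up for this example.

```python
from typing import Callable, Dict, List, Set

# Assumption: a PNX record is modelled as {"section/field": [values]}.
PnxRecord = Dict[str, List[str]]
Rule = Callable[[List[str]], List[str]]


def map_values(mapping: Dict[str, str]) -> Rule:
    """Build a rule that rewrites values found in `mapping`, keeping the rest."""
    def rule(values: List[str]) -> List[str]:
        return [mapping.get(v, v) for v in values]
    return rule


def drop_values(unwanted: Set[str]) -> Rule:
    """Build a rule that removes any value listed in `unwanted`."""
    def rule(values: List[str]) -> List[str]:
        return [v for v in values if v not in unwanted]
    return rule


def normalize(record: PnxRecord, rules: Dict[str, List[Rule]]) -> PnxRecord:
    """Return a copy of `record` with each field's rules applied in order."""
    out = {field: list(values) for field, values in record.items()}
    for field, field_rules in rules.items():
        if field not in out:
            continue  # nothing to normalize for this field
        for rule in field_rules:
            out[field] = rule(out[field])
    return out


# Hypothetical rule set for a "customerPCI" pipe: PNX in, PNX out.
CUSTOMER_PCI_RULES: Dict[str, List[Rule]] = {
    # Harmonize a PC facet value with the value used for local records.
    "facets/rsrctype": [map_values({"articles": "journal_articles"})],
    # Remove a top-level facet value that exists only in PC records.
    "facets/toplevel": [drop_values({"peer_reviewed"})],
}

# Hypothetical PC PNX record as delivered, before the local layer runs.
pc_record: PnxRecord = {
    "facets/rsrctype": ["articles"],
    "facets/toplevel": ["peer_reviewed", "online_resources"],
    "display/title": ["An example title"],
}

normalized = normalize(pc_record, CUSTOMER_PCI_RULES)
print(normalized)
```

The point of the sketch is that the source and target are both PNX, so the rule set only needs to touch the few fields a site wants to change and can leave everything else as delivered.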
Stacey van Groll commented
Wow! That would be a big undertaking! Wouldn't that mean that each site is doing the work of the Ex Libris Primo Central Team individually? Would you have a pipe and a norm rule set for each collection? The mind boggles at the thought of how this would work in practice!
Cheers, Stacey van Groll, Discovery and Access Coordinator