Ability to dedup or FRBR between local data and CDI data
For Primo Central Index data collections, the PCI citation often represents a title that we also hold in our local data. The current inability to dedupe or cluster between PCI and local data means that users will always be presented with two hits when they are on the combined/everything scope.
With the new Central Discovery Index (CDI), we would like the ability to configure a deduping or clustering algorithm such that we only present one hit to the user.

We have reviewed this idea carefully and discussed it with the Primo working group while estimating NERS enhancements. Currently, since there are two separate indexes for local and central data, we do not have any technical solution to fulfill this request. However, with the upcoming NERS 6702 we are trying to reduce the issue caused by book duplication by adding a configuration option to exclude ebooks from your CDI results. We have decided to close this idea to release your votes.
-
Lars Iselid commented
NERS case 6702 has exactly the same heading and it will be solved in Nov 2021: https://igelu.org/products/primo/primo-enhancements However, the solution is far from what the heading suggests: "The solution will allow more control for customers by not grouping the results, but rather allowing more control on the activation side such as to not include a Book package from CDI, which will mean no duplicates with the local Books, but to choose to still include all the Book Chapters of that package". Christine Stohn has said in a webinar or session that it's technically very hard (my interpretation: impossible) to solve what this idea asks for. So why don't Ex Libris review this idea with so many votes (766 at the moment) and tell the truth? Then we could use our votes for ideas that we still think are technically possible to implement.
-
Brian commented
Wild that this isn't possible yet.
-
Manu Schwendener commented
NERS 7433, open for voting now.
-
H. Baleix commented
Great!
-
François Renaville commented
Hi Tom,
If I'm not mistaken, they do not plan to develop what has been suggested in this idea, but they will rather offer some flexibility for ebooks. See slide 26 in the 2021 Roadmap https://t.co/JNscsbzeca?amp=1 --> We will be able to hide CDI ebook records and to keep book chapters (if my understanding is correct). It should be for H2 2021.
-
Tom Kistell commented
Ex Libris: given the level of interest, could you clarify whether development on this will go ahead please?
-
Manu Schwendener commented
From the Go VE webinar 9.2.2021: this will probably not be done, because of performance problems.
-
Ibrahim Ali commented
This is really needed to eliminate the duplication between local and CZ records in Primo.
-
Manu Schwendener commented
NERS voting second round open until 31.8.2020
6702 – Ability to dedup or FRBR between local data and CDI data
-
Katharina Wolkwitz commented
I wonder how to explain to our users that there are some clustered records of e.g. "Dubbel: Taschenbuch für den Maschinenbau" with different editions and years nicely sorted together under one entry (coming from our local library-catalog-data) and the exact same editions and years appear as single entries in the search-result-list from the CDI.
And the final fulltext-link ends up at exactly the same page.
If there ever was a need for FRBR this is it!
-
Laura Akerman commented
What I would hope is that a better "solution" than either dedup or FRBR could be found for bringing together different instances of the same "expression" or edition - same content - of a work.
How about this: from any record, I'd like users to easily be able to see other records for other formats or issuances of the same content, whether physical or electronic, in CDI or local resources (whether from Alma or other sources). This would have to be based on identifiers, as it is now (the dedup and FRBR processes create shared identifiers), but those identifiers, whether machine-generated or human-generated, should be stored as part of the metadata in a way that allows customers to see what is happening and fix mistakes, at least on their end.
I hope Ex Libris is at least thinking in this direction. In the meantime, clustering (but not deduping please! Since we have no control over CDI metadata) local and CDI records would be of great value.
-
Jane Daniels commented
I don't have any votes left but would support this.
Longer term we should also consider why we have to contend with multiple records for the same resource and why Ex Libris have to maintain so many knowledge bases. I presume all the other companies vying for library business with web discovery layers are doing the same thing.
There is a real need to rationalise the current metadata workflows so that less time is spent on devising systems that match, merge or dedup records of varying quality, and the focus is instead on using a single high-quality record to further the search & discovery experience for end-users.
Jane
-
Lars Iselid commented
The concept of local data excludes external data like repositories (from pipes), or am I wrong? But what if you import your repository into Alma? Will it then be included in the concept of "local data"?
-
Brenda Norton commented
This is one of the most-requested improvements that we receive from our clients.
-
Asbjørn Risan commented
We would also like to see this functionality. It's important that this is also supported for non-Primo VE institutions.
-
Manu Schwendener commented
+1
-
Aya Steinig commented
Extremely important for our patrons!
-
Anonymous commented
Very good idea
-
G. Marchais commented
Would be much appreciated. We absolutely need this option too.
-
Sylvain Machefert commented
If what seems to be a huge project like CDI does not include this possibility, that would really be an error.
Thanks François for creating this suggestion.