OAI Pipe date options to facilitate CONTENTdm automated harvesting
We would like our pipe that harvests the OAI feed from CONTENTdm to run automatically either weekly or monthly.
However, it is a known bug with CONTENTdm that if the collection is large and a small date range is used with the OAI data feed, the process is likely to hang. The current workaround with Ex Libris is to manually set the start date to 100 years ago, and then run the pipe.
The enhancement I would like is for the pipe configuration to add either:
a.) A checkbox which adds the option of not automatically updating the start date after each run OR
b.) A checkbox to remove the date altogether from the pipe run and just use the continuation tokens to iterate through the whole collection
Either would enable this pipe to be set up to run automatically.
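To illustrate option b, here is a minimal sketch of an OAI-PMH harvest that sends no date at all and simply follows the tokens (called resumption tokens in the OAI-PMH specification) until the feed is exhausted. It is not how Primo implements its pipes; the function names, the `oai_dc` metadata prefix, and the base URL passed in are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import urlopen

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def build_params(metadata_prefix="oai_dc", from_date=None, token=None):
    """Build ListRecords arguments (hypothetical helper)."""
    if token:
        # In OAI-PMH, resumptionToken is an exclusive argument:
        # it is sent with the verb only, never with dates or a prefix.
        return {"verb": "ListRecords", "resumptionToken": token}
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Optional: omitted entirely when no start date is wanted.
        params["from"] = from_date
    return params


def fetch_xml(base_url, params):
    """Fetch one page of the feed over HTTP."""
    with urlopen(base_url + "?" + urlencode(params)) as resp:
        return resp.read()


def harvest_all(base_url, fetch=fetch_xml, from_date=None):
    """Yield every record, following resumption tokens page by page."""
    token = None
    while True:
        params = build_params(from_date=from_date, token=token)
        root = ET.fromstring(fetch(base_url, params))
        for record in root.iter(OAI_NS + "record"):
            yield record
        token_el = root.find(f"{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
        token = (token_el.text or "").strip() if token_el is not None else ""
        if not token:
            # A missing or empty token means the list is complete.
            break
```

Calling `harvest_all(base_url)` with no `from_date` walks the whole collection, which is the behaviour requested in option b; passing `from_date` reproduces a dated harvest.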
Deborah Fitchett commented
We're planning to move to VE (as Ex Libris seem to be strongly encouraging it!) too and would hate to lose this function when we do.
Katie Sanders commented
Yes, we are on VE now and would like this corrected in Primo VE.
Dean Lingley commented
This was completed for Primo BO in May 2020. Now it needs to be implemented for Primo VE discovery harvest.
Simplify Data Reload Flow (NERS #6184)
Previously, the Delete Data Source and Reload pipe allowed you to harvest records that had been exported since the last time the records were harvested and loaded into Primo. With this enhancement, you can retain the same date for subsequent pipe executions so that you can automatically reload the same set of harvested records.
The Use Static Date field has been added to the Define Pipe page for Regular and Delete Data Source and Reload pipes. When selected, the Start harvesting files/records from field retains the date on which the pipe was first run instead of automatically changing it to the date on which the pipe was last run.
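As I read the release note, the difference between the two behaviours can be modelled roughly as below. This is only a sketch of the described logic, not Primo's code; `PipeConfig`, `use_static_date`, and `after_run` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PipeConfig:
    """Hypothetical stand-in for the relevant Define Pipe settings."""
    start_date: date               # "Start harvesting files/records from"
    use_static_date: bool = False  # "Use Static Date" checkbox


def after_run(config: PipeConfig, run_date: date) -> PipeConfig:
    """Update the start date once a pipe execution has finished."""
    if not config.use_static_date:
        # Default behaviour: the next run starts from the last run date.
        config.start_date = run_date
    # With the checkbox selected, the start date is left untouched.
    return config
```

With `use_static_date=True`, every scheduled run starts from the same date, so the same set of records is reloaded each time.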
Natasha Stephan commented
Just adding a comment that we've moved to Primo VE/Alma, and this is still a problem. I want to schedule a run of the Discovery Import Profile we created for CONTENTdm, but it will run with a Harvest Start Date of the last run date.
The only way to get this not to hang is to select Connect and Edit, set the Harvest Start Date to 5 years ago, save, and then run manually.
Elliot Williams commented
We also have the same problem with both regular update pipes and delete-and-reload pipes, both for CONTENTdm and other systems we harvest. It would be great to have the option not to include dates in the OAI-PMH request at all.
Colin Meikle commented
We have the same problem Deborah mentions. Having to manually set the date back each time the delete-and-reload pipe is run precludes being able to schedule such a pipe to run at set times.
Gordon Andrew commented
Just being able to override the date would help a lot. We also have to run regular Delete and Reload pipes, and we have to manually edit the start date each time the pipe is run.
Deborah Fitchett commented
This would also help with a use case we have: We're harvesting from a small OAI feed that can't supply 'delete' data, so we have to run regular delete-and-reload pipes. Obviously for this case we have to set the date back to before the repository began every time we run it.