OAI Pipe date options to facilitate CONTENTdm automated harvesting
We would like our pipe that harvests the OAI feed from CONTENTdm to run automatically either weekly or monthly.
However, there is a known bug in CONTENTdm: if the collection is large and a small date range is used with the OAI data feed, the process is likely to hang. The current workaround from Ex Libris is to manually set the start date to 100 years ago and then run the pipe.
The enhancement I would like is to extend the pipe configuration with either
a.) A checkbox to stop the pipe from automatically updating the start date after each run, OR
b.) A checkbox to omit the date from the pipe run altogether and just use the resumption tokens to iterate through the whole collection
Either would enable this pipe to be set up to run automatically.
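Option b.) is how a plain OAI-PMH harvester typically walks an entire set: the first ListRecords request carries no from/until dates, and every subsequent request carries only the resumptionToken returned by the previous page. A minimal sketch of that loop's building blocks, using only the Python standard library (the endpoint URL and set name here are placeholders, not actual Ex Libris or CONTENTdm configuration):

```python
# Sketch: dateless OAI-PMH harvesting via resumptionToken only.
# The base URL and set name used in examples are hypothetical.
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"

def first_request_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the initial ListRecords URL with no `from`/`until` dates."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)

def next_request_url(base_url, token):
    """Per the OAI-PMH spec, follow-up requests carry ONLY the token."""
    return base_url + "?" + urlencode(
        {"verb": "ListRecords", "resumptionToken": token})

def parse_page(xml_text):
    """Return (records, resumption_token) from one response page.

    A missing or empty <resumptionToken> means the list is complete.
    """
    root = ET.fromstring(xml_text)
    records = root.findall(f".//{OAI_NS}record")
    tok = root.find(f".//{OAI_NS}resumptionToken")
    token = tok.text if tok is not None and tok.text else None
    return records, token
```

A harvester would call first_request_url once, then loop on next_request_url until parse_page returns no token; because no date range is ever sent, there is nothing for the pipe to "reset" between scheduled runs.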
Natasha Stephan commented
Just adding a comment that we've moved to Primo VE/Alma, and this is still a problem. I want to schedule a run of the Discovery Import Profile we created for CONTENTdm, but it will run with a Harvest Start Date of the last run date.
The only way to get this not to hang is to select Connect and Edit, set the Harvest Start Date to 5 years ago, save, and then run manually.
Elliot Williams commented
We also have the same problem with both regular update pipes and delete-and-reload pipes, both for CONTENTdm and other systems we harvest. It would be great to have the option not to include dates in the OAI-PMH request at all.
Colin Meikle commented
We have the same problem Deborah mentions. Having to manually set the date back each time the delete-and-reload pipe is run precludes being able to schedule such a pipe to run at set times.
Gordon Andrew commented
Just being able to override the date would help a lot. We also have to run regular Delete and ReLoad pipes, and we have to manually edit the start date each time the pipe is run.
Deborah Fitchett commented
This would also help with a use case we have: We're harvesting from a small OAI feed that can't supply 'delete' data, so we have to run regular delete-and-reload pipes. Obviously for this case we have to set the date back to before the repository began every time we run it.