How can we improve Rosetta?

Recalculate fixity SHA2 and insert into Rosetta

We have around 22,000,000 files for which we would like to calculate a new fixity hash, this time using SHA-256, and we need a good way of getting that fixity back into Rosetta. We would like to store it alongside the SHA-1 value we previously generated. We are aware that the SHA-1 fixity is located in the XML CLOB data in the HDEMETADATA table; we are wondering whether this SHA-1 value is also stored in another table.

2 votes
Aly Conteh shared this idea

3 comments

  • Andreas Romeyke commented  · 

    If you use the AIP update functionality via update_representation(), you could provide the correct checksum. You may also need to build a validation plugin for your required fixity method.

  • Kris Dekeyser commented  · 

    You may want to be very careful in doing this. I ran the 'Identify storage dead references' task chain on our repository and it created a new version for each IE. This caused problems for our database setup because the archive logs were filling up faster than the DB Data Guard could process them. See case #00480085.

    My intention was not even to update the checksums, but merely to check the files; a bug, however, caused the process to recalculate the checksum and create an event for each IE, resulting in an update and a new version for every IE in the system. But that's another story...

    So, if you expect to perform the IE updates at a high rate, I would strongly recommend making a full DB backup first, then turning off archive logging and other time-critical settings before starting the update of 22 million IEs.
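Beyond the database-side precautions, the client side of a run this large can also be throttled so redo/archive-log generation never outpaces what the DB can absorb, and so the job can be resumed. A generic sketch, where `process` stands in for whatever performs a single update (all names here are illustrative, not Rosetta API calls):

```python
import time

def run_batches(ids, process, batch_size=1000, pause_s=5.0, log=print):
    """Process ids in fixed-size batches with a pause between batches,
    giving the database time to drain redo/archive logs.
    `process` is a placeholder for whatever performs one update."""
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        for item in batch:
            process(item)
        log(f"completed {start + len(batch)}/{len(ids)}")
        if start + batch_size < len(ids):
            time.sleep(pause_s)
```

Logging the running count after each batch doubles as a crude checkpoint: if the job dies, it can be restarted from the last reported offset instead of from zero.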

  • Aly Conteh commented  · 

    We are aware that there is a task chain within Rosetta that can recalculate the fixity. However, we are running a process outside of Rosetta that will require us to touch all 22 million files anyway, so it makes sense to do the recalculation at that point and use an API call to update the metadata in Rosetta.
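Since Rosetta stores fixity in DNX metadata, one hedged way to prepare the write-back is to assemble the DNX-style fileFixity fragment yourself and submit it through the update call mentioned in the comments. The sketch below only builds the XML fragment; the section id `fileFixity` and the `fixityType`/`fixityValue` key names are assumptions that should be checked against your Rosetta version's DNX documentation, and the actual submission (e.g. via update_representation()) is deliberately left out.

```python
import xml.etree.ElementTree as ET

def fixity_dnx(sha256_hex):
    """Build a DNX-style fileFixity record carrying the new SHA-256.
    Section and key ids are assumptions; verify them against your
    Rosetta version's DNX documentation before submitting."""
    section = ET.Element("section", id="fileFixity")
    record = ET.SubElement(section, "record")
    for key_id, value in (("fixityType", "SHA256"), ("fixityValue", sha256_hex)):
        key = ET.SubElement(record, "key", id=key_id)
        key.text = value
    return ET.tostring(section, encoding="unicode")
```

Keeping fragment construction separate from the web-service call makes it easy to diff the generated XML against a record exported from a test IE before running anything against 22 million files.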
