Full Validation Stack for BYTESTREAM Objects

We have a growing number of deposits where data enters the archive in container formats such as .zip and is usually not extracted, for various reasons. Examples include zip containers that are part of a larger complex object IE, such as the content of a CD, or zipped supplementary data to a journal article where the zip container is referenced from within the publication.

In this case, files in BYTESTREAM are extracted during ingest for metadata extraction but are stored as container formats in permanent storage. However, there are some shortcomings compared to the regular validation stack:

• no validation takes place
• techMD extraction takes place but is only stored in very limited form
• no fixity checksum is created for files in bytestream

At the container level itself, i.e. the file level for the object in question, everything takes place as expected.

We therefore suggest running the full validation stack on objects in BYTESTREAM, which includes:

• validation
• full techMD extraction
• virus check
• creation of fixity checksums
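
To illustrate what per-file fixity inside a container could look like, here is a minimal Python sketch that computes a checksum for every member of a zip file without unpacking it to disk. It is not how Rosetta works internally. The function name `member_checksums` and its parameters are hypothetical, and SHA-256 is only an assumed default algorithm.

```python
import hashlib
import io
import zipfile


def member_checksums(container, algorithm="sha256", chunk_size=65536):
    """Return {member_name: hex_digest} for every file inside a zip container.

    `container` may be a path or a binary file-like object.
    Hypothetical helper for illustration only, not a Rosetta API.
    """
    digests = {}
    with zipfile.ZipFile(container) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue  # directory entries carry no bytes to hash
            h = hashlib.new(algorithm)
            # Stream the member in chunks so large files are not loaded into memory at once.
            with zf.open(info) as member:
                for chunk in iter(lambda: member.read(chunk_size), b""):
                    h.update(chunk)
            digests[info.filename] = h.hexdigest()
    return digests


if __name__ == "__main__":
    # Build a small in-memory zip so the example runs without external files.
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("data/readme.txt", "hello")
    buf.seek(0)
    for name, digest in member_checksums(buf).items():
        print(name, digest)
```

The same loop over the container members is where validation, techMD extraction and a virus check would be called per file in a full stack.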

33 votes

Michelle Lindlar shared this idea · Apr 13, 2021

elenafontana commented · July 15, 2024 4:12 AM

Hello,
In my opinion, to improve our handling of deposits with data in container formats such as .zip, where files are currently not extracted for various reasons, we propose running the full validation stack on objects in BYTESTREAM. This would address the current shortcomings, including the lack of validation, the limited techMD extraction, and the absence of fixity checksums.

Fabian Schneider commented · April 22, 2021 12:07 AM

We also have a large number of submitted container formats and are very interested in having full validation of the files within the containers. However, depending on the use case, one might want to avoid having all the files end up in the TA workbench, e.g. if the files within the container are in a specific format that can't be handled anyway. It would therefore be nice to have the option to activate or deactivate the validation in such cases.
