Internet Archive
The Internet Archive (https://archive.org) is a 501(c)(3) non-profit library founded in 1996. Its mission is to provide Universal Access to All Knowledge. It collects published works and makes them available in digital formats.
Nowadays, the Archive contains 11 million books and texts, 4 million audio recordings (including 160,000 live concerts), 3 million videos (including 1 million Television News programs), and 1 million images. Books published prior to 1923 are available for download, and hundreds of thousands of modern books can be borrowed (free of charge).
Certainly a good candidate for Primo Central and Summon indexes...
Hello all,
Thank you for contributing this idea.
Following numerous attempts to reach out to the provider, we have not been able to establish contact and therefore must decline this request.
Should the provider be interested in establishing contact with us, we would be happy to revisit this request.
Best,
Rael
-
Sarah commented
Any news on this? We also would be interested.
-
Rachel Smith commented
I support this idea. A few of our members have expressed strong interest in having Internet Archive metadata available in the CZ and in the CDI in Alma and Primo VE.
-
Lars Iselid commented
I have been in contact with Internet Archive and they answered: "Our APIs are public so other services are welcome to import our metadata." So, Rael, what is your problem here?
-
Michele Ruth commented
I agree with Jacqueline Carrell. Please make these collections available.
-
Jacqueline Carrell commented
Now that the Internet Archive has announced its National Emergency Library, perhaps they would be more amenable to discussing this? There are two collections available in the OCLC Knowledge Base called "Internet Archive: Free to Lend" and "Internet Archive: Free to Read." They have been recently updated. I would love to have similar collections available in the community zone. I've already funneled a couple of users looking for specific books to the Internet Archive. It would be nice if they were in our catalog.
-
Anonymous commented
Since there is now an official partnership between Internet Archive and Better World Books (the new Better World Libraries initiative) to digitize library-donated books for single-user open access, perhaps they might be a better contact. Since many of us are donating books to this project, it is crucial for us to provide visibility to these in Primo. Their contact form is at https://services.betterworldbooks.com/libraries/contact-us/
--Jonathan H. Harwell, Head of Collections & Systems, Rollins College
-
Anonymous commented
I wonder why they won't take on newspaper archives? Digitizing newspapers is their next big challenge.
-
Hi,
Thank you, Francois, for your input and for providing a narrower list to address.
We will review the list provided.
Best,
Rael
-
François Renaville commented
My two cents…
There are currently about 320,000 collections on archive.org (see https://archive.org/search.php?query=mediatype:collection&sort=-downloads ). This high number is not surprising because a new collection can be requested and built on archive.org for users who have a minimum of 50 items created on archive.org that are related and of the same mediatype (see https://archive.org/about/faqs.php#1051 ).
I would suggest keeping it simple: Ex Libris could consider creating one collection per “media type”:
- eBooks and Texts (15 million) https://archive.org/details/texts
- Images (3.8 million) https://archive.org/details/image
- Moving Image Archive (3.9 million) https://archive.org/details/movies
- Audio Archive (3.9 million) https://archive.org/details/audio
- TV News Archive (1.5 million) https://archive.org/details/tv
- Software Collection (ca 193,000) https://archive.org/details/software
- Live Music Archive (ca 181,000) https://archive.org/details/etree
Not all of these collections would be useful to all customers, but this would at least give us some flexibility in selecting the content types.
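As a rough illustration of what a per-media-type harvest could start from, here is a minimal Python sketch that only builds query URLs for archive.org's public advanced search endpoint; it does not send any request. The helper name `build_search_url`, the chosen fields and the page size are assumptions made for this example, and treating the identifiers in the URLs above as `mediatype:` values is also an assumption (some of them, such as `tv`, may be collections rather than media types).

```python
from urllib.parse import urlencode

# Public advanced search endpoint of archive.org.
SEARCH_ENDPOINT = "https://archive.org/advancedsearch.php"


def build_search_url(query, fields=("identifier", "title"), rows=50, page=1):
    """Build (but do not fetch) a JSON search URL for the given query.

    Hypothetical helper for illustration only; the defaults are arbitrary.
    """
    params = [("q", query)]
    # Each requested metadata field is passed as a repeated "fl[]" parameter.
    params += [("fl[]", field) for field in fields]
    params += [("rows", rows), ("page", page), ("output", "json")]
    return SEARCH_ENDPOINT + "?" + urlencode(params)


# Assumption: these identifiers can be used as mediatype values.
MEDIA_TYPES = ["texts", "image", "movies", "audio", "software", "etree"]

for media_type in MEDIA_TYPES:
    print(build_search_url(f"mediatype:{media_type}", rows=10))
```

Each printed URL would return one page of records for that media type; a real harvest would still need paging, rate limiting and a mapping of the metadata into the index format, none of which is shown here.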
For my part, web crawl activities are not of any interest for the index.
There is also the possibility of harvesting, for example, the top 100 or 200 largest collections, but not all of them are meaningful (‘additional_collections’, ‘inlibrary’, ‘Podcasts’…), and some, like ‘Arxiv.org’, ‘JSTOR Early Journal Content’, ‘Biodiversity Heritage Library’ or ‘PubMed Central’, would certainly only duplicate what has been in the index for years. See also https://www.screencast.com/t/a06ztMIrO
Anyway, records are often in more than one collection. For example, https://archive.org/details/FrankensteinfullMovie is in ‘Sci-Fi / Horror’, ‘Feature Films’ and ‘Movies’, and https://archive.org/details/wilsonharris00maes is in ‘Daisy Books for the Print Disabled’, ‘Books to Borrow’, ‘Borrow in Browser’, ‘Internet Archive Books’, and ‘Scanned in China’. For me, this argues in favor of collections per media type.
Having collections per original provider or thematic collections may also be of interest, but I have strong doubts as to the Content Team’s and the customers' ability to regularly monitor the situation and later to add new thematic collections once they become available on archive.org! As I said earlier: KISS… ;-)
Sure, archive.org is not a small collection for the index, but it is much smaller than, for example, Scopus, which currently contains more than 70 million records, and hardly larger than PubMed, which comprises over 27 million citations for biomedical literature. With archive.org, however, there will certainly be far fewer duplicates of records coming from other Primo Central / Summon collections. And this makes the content of archive.org all the more interesting!
-
Dana Sharvit (Admin, Ex Libris) commented
As Internet Archive is very big, it would be very helpful to get a list of collections that you wish to be added.
-
Mary Grenci commented
Yes, please. It would be very useful to have records for both the entire Internet Archive and individual collections. If the entire Archive is too huge for one PCI collection, then just having the individual Internet Archive collections would still be great.
-
G. Marchais commented
That would be extremely useful, especially if we could select the collections they belong to (activate several sets by document type)
-
Sylvain Machefert commented
That would be extremely useful, especially if we could add titles based on the collections they belong to.