Our work for Jisc will give universities and colleges tools for exposing their digital archive collections.
In April we kicked off a new project for Jisc, working with colleagues from SERO. The Digital Archive Collections (DAC) project will explore ways to collect, collate and share information about a class of digital material often neglected in the electronic resource management space. Digital archive collections are often hard to manage because of their historic nature and their lack of reliable identifiers.
On a technical level, we will be exploiting work done for the Open Library Foundation on the GOKb system. We will be:
- extending GOKb, adding capabilities for aggregating and describing materials in the archive
- leveraging the Elasticsearch index to build a registry which is both quick and easy to search
Owen will be looking at defining the import formats and resolving data issues. The team have already contributed two pieces of technical work to:
- upgrade GOKb to the latest version of Spring Boot (part of the work to dockerize GOKb)
- make GOKb custom fields available to the standardised import format.
Anyone interested can access our latest code in the repository: the OLF GOKb open source project on GitHub.
We’ve also shared screen-scraping code to aggregate proprietary DAC sources into the system. We’re hopeful that we can push DAC providers towards KBART and KBART-like formats for data sharing.
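To give a feel for what a KBART-like file involves, here is a minimal Python sketch that reads a tab-delimited title list and flags rows with no print or online identifier, the problem noted above for archive collections. The column names are a subset of those in the KBART recommended practice; the sample row, the `read_kbart` and `lacks_identifiers` helpers, and the `dac-0001` ID are all hypothetical illustrations, not part of GOKb or the project's import format.

```python
import csv
import io

# A subset of the column names defined by the KBART recommended practice.
KBART_COLUMNS = [
    "publication_title",
    "print_identifier",
    "online_identifier",
    "title_url",
    "title_id",
    "publisher_name",
]

# Invented sample data: one archive collection with no ISSN/ISBN-style identifier.
SAMPLE = (
    "\t".join(KBART_COLUMNS) + "\n"
    "Example Pamphlet Collection\t\t\t"
    "https://archive.example.org/pamphlets\tdac-0001\tExample Archive\n"
)


def read_kbart(text):
    """Parse tab-delimited KBART-like text into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    missing = [c for c in KBART_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError("missing KBART columns: " + ", ".join(missing))
    return [dict(row) for row in reader]


def lacks_identifiers(row):
    """True when a row has neither a print nor an online identifier."""
    return not (row["print_identifier"] or row["online_identifier"])


rows = read_kbart(SAMPLE)
unidentified = [r["publication_title"] for r in rows if lacks_identifiers(r)]
```

In a file like this, a provider-assigned `title_id` is often the only stable handle on a collection, which is one reason a shared format matters more than usual for this kind of material.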
The project team will develop and host a demonstrator system for six months.