The challenge

The Archives Hub aggregates descriptions of archival materials from over 300 UK institutions. It aims to provide an interface to meet academic research needs as well as the diverse needs of anyone with an interest in using archival sources.

We were contracted by Jisc to redevelop the existing system. A core aim was to improve the interface used by researchers to discover content. The more ambitious requirements centred on new tools to manage the data and content supplied by a wide range of source systems. Jisc needed to:

  • improve workflow for the ingest of data
  • check the quality of data for ingest
  • enhance the data
  • aggregate content for use by international initiatives such as Archives Portal Europe

The size and complexity of the source files were particularly challenging. Although the third-party data describing archives conforms to a standard structure, EAD (Encoded Archival Description), in practice institutions and vendors interpret and apply the standard very differently, causing considerable variation in the data provided to the Archives Hub.

Another challenge was the need to resolve URL structures from the old site to maintain backward compatibility.

What we delivered

The result of the project is a solution based on our CIIM product with a completely redesigned user interface and new administration tools.

The user interface was developed in partnership with Gooii Ltd and includes innovative navigation features for presenting archives of varying complexity, from simple high-level descriptions to hierarchical documents that go down to individual item level.

The new administration tools give staff at the Archives Hub much more control over the data. We extended the existing CIIM admin interface to include tools for managing the data ingest, enabling the use of “pipelines” to clean and enhance the data. These tools replace many manual data manipulation routines, allowing subject specialists to undertake tasks which previously required technical input.
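To illustrate the idea of a pipeline, here is a minimal sketch in Python. It is not the CIIM implementation: the step names, the sample record and the assumption that the EAD has no XML namespace are all hypothetical. It shows one cleaning step (collapsing stray whitespace) and one enhancement step (adding a machine-readable `normal` attribute to simple year-range dates) applied in sequence.

```python
import re
import xml.etree.ElementTree as ET
from typing import Callable, Iterable

# A pipeline step takes the parsed record and returns it (possibly modified).
Step = Callable[[ET.Element], ET.Element]

# Matches simple year ranges such as "1900-1950" or "1900 - 1950".
YEAR_RANGE = re.compile(r"^(\d{4})\s*-\s*(\d{4})$")


def normalise_whitespace(root: ET.Element) -> ET.Element:
    """Cleaning step: collapse runs of whitespace in leaf elements."""
    for el in root.iter():
        # Only touch elements without children, to leave mixed content alone.
        if el.text and len(el) == 0:
            el.text = " ".join(el.text.split())
    return root


def add_normalised_dates(root: ET.Element) -> ET.Element:
    """Enhancement step: add a 'normal' attribute to simple year ranges."""
    for date in root.iter("unitdate"):
        match = YEAR_RANGE.match(date.text or "")
        if match and "normal" not in date.attrib:
            date.set("normal", f"{match.group(1)}/{match.group(2)}")
    return root


def run_pipeline(xml_text: str, steps: Iterable[Step]) -> str:
    """Parse a record, apply each step in order, and serialise the result."""
    root = ET.fromstring(xml_text)
    for step in steps:
        root = step(root)
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily simplified EAD fragment.
sample = (
    "<ead><archdesc><did>"
    "<unittitle>  Papers of   J. Smith </unittitle>"
    "<unitdate>1900 - 1950</unitdate>"
    "</did></archdesc></ead>"
)

cleaned = run_pipeline(sample, [normalise_whitespace, add_normalised_dates])
print(cleaned)
```

Because each step is an ordinary function, steps can be reordered, added or removed per contributor without changing the rest of the process, which is the property that makes this style of processing manageable by non-developers once it is wrapped in an interface.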

The CIIM product has been configured with custom ingest parsing for full archival descriptions (EAD), people and organisations (EAC) and repositories (EAG). The pre-processing pipeline is managed via an interface with the client’s existing Mercurial repository to ensure version control. The public web interface is driven by CIIM’s Elasticsearch endpoint, which also drives an OAI-PMH target (repository) used by Archives Portal Europe to harvest UK data.
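For readers unfamiliar with OAI-PMH, the sketch below shows how a harvester typically pages through a repository with the standard `ListRecords` verb. The base URL is a placeholder, not the Archives Hub endpoint, and `oai_dc` is used only because every OAI-PMH repository must support it; the formats the Hub actually exposes are not stated here. No network request is made.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from urllib.parse import urlencode

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(
    base_url: str,
    metadata_prefix: str = "oai_dc",
    resumption_token: Optional[str] = None,
) -> str:
    """Build a ListRecords request URL for a first or follow-up page."""
    if resumption_token:
        # A resumptionToken is an exclusive argument: send it with the verb only.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def next_token(response_xml: str) -> Optional[str]:
    """Return the resumptionToken from a response, or None on the last page."""
    root = ET.fromstring(response_xml)
    token = root.find(".//oai:resumptionToken", OAI_NS)
    if token is not None and token.text and token.text.strip():
        return token.text.strip()
    return None


# Placeholder endpoint and a hand-written response, for illustration only.
BASE_URL = "https://example.org/oai"
sample_response = (
    '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">'
    "<ListRecords><resumptionToken>page2</resumptionToken></ListRecords>"
    "</OAI-PMH>"
)

first_url = list_records_url(BASE_URL)
follow_up_url = list_records_url(BASE_URL, resumption_token=next_token(sample_response))
print(first_url)
print(follow_up_url)
```

A harvester repeats the follow-up request until the response contains no token, or an empty one, at which point the full record set has been retrieved.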
