Archivists at the ACT Heritage Library discovered this week that an automated ingestion process used to digitise roughly 40,000 historical photographs had created duplicate copies at a rate high enough to compromise the integrity of the public-facing collection. The library, housed within ACT Libraries at the Civic branch on Petrie Plaza, halted new uploads to its online catalogue on Tuesday while staff work through a remediation process.
The timing matters. The duplication problem surfaced just as the ACT Government's broader Digital Canberra infrastructure program was preparing to publicise a milestone: the transfer of the pre-1980 photographic holdings from the Noel Butlin Archives Centre at the Australian National University to a jointly managed digital repository. That handover, years in the negotiation, was expected to give researchers and residents free online access to images spanning Canberra's early settlement and post-war growth across Gungahlin, Belconnen and the inner north.
How the Problem Emerged
The duplication issue is a known risk in large-scale digitisation work. When institutions use batch-processing software to ingest scanned files, the same photograph scanned at different resolutions or saved under inconsistent file names can be logged as separate records. The ACT Heritage Library project is understood to have used at least two separate scanning contractors across its three-year lifespan, which archivists believe contributed to the inconsistency.
The practical consequence is that the library's internal database now contains multiple catalogue entries pointing to the same image — in some cases, the same photograph appears under four or five distinct record numbers. For a collection intended to be searchable by street address, suburb and decade, the duplicates generate false results and inflate apparent holdings. Staff at the Civic branch are cross-referencing file metadata and image-hash records to identify and consolidate the redundant entries, a process the library estimates will take several weeks.
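For readers curious about the mechanics, the simplest form of hash-based cross-referencing can be sketched in a few lines of Python. This is an illustration only, not the library's actual tooling: the record numbers, the sample data and the `group_by_content_hash` function are hypothetical, and an exact hash such as SHA-256 only catches files that are byte-for-byte identical, not the same photograph scanned at different resolutions.

```python
import hashlib
from collections import defaultdict


def group_by_content_hash(records):
    """Map each SHA-256 digest to the record numbers that share it.

    `records` is an iterable of (record_number, image_bytes) pairs.
    Both field names are hypothetical, not the library's real schema.
    """
    groups = defaultdict(list)
    for record_number, image_bytes in records:
        digest = hashlib.sha256(image_bytes).hexdigest()
        groups[digest].append(record_number)
    # Keep only digests that appear under more than one record number.
    return {d: nums for d, nums in groups.items() if len(nums) > 1}


# Made-up sample: two catalogue entries point at identical file contents.
sample = [
    ("REC-0001", b"scan-of-photo-A"),
    ("REC-0002", b"scan-of-photo-B"),
    ("REC-0417", b"scan-of-photo-A"),
]
duplicates = group_by_content_hash(sample)
```

In this made-up sample, `duplicates` holds a single group containing `REC-0001` and `REC-0417`, which a staff member would then review before consolidating the entries.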
The University of Canberra's Cultural Informatics research group, based at the Bruce campus, has worked on similar deduplication challenges for state and territory institutions nationally. Automated perceptual-hashing tools, which reduce each image to a compact fingerprint and compare fingerprints to find near-identical copies, have become standard practice, though their matches require manual verification before any records are permanently deleted from a heritage database.
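One of the simplest perceptual hashes, often called an average hash, gives a sense of how such tools work. The sketch below is a minimal illustration under stated assumptions, not the method used by the library or the research group: it assumes each image has already been converted to a small grayscale grid of brightness values, and the function names and the threshold of 2 are hypothetical choices.

```python
def average_hash(pixels):
    """Return a bit-string fingerprint of a small grayscale grid.

    `pixels` is a list of rows of brightness values (0-255), assumed to
    have been downscaled to a fixed small size beforehand.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    # Each bit records whether a pixel is brighter than the grid's mean.
    return "".join("1" if value > mean else "0" for value in flat)


def hamming_distance(a, b):
    """Count the positions at which two equal-length fingerprints differ."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    return sum(x != y for x, y in zip(a, b))


def likely_duplicates(a, b, threshold=2):
    """Flag two grids as candidate duplicates; the threshold is illustrative."""
    return hamming_distance(average_hash(a), average_hash(b)) <= threshold


# Made-up 4x4 grids: a scan, a slightly brighter rescan, an unrelated image.
original = [[10, 200, 10, 200], [200, 10, 200, 10],
            [10, 200, 10, 200], [200, 10, 200, 10]]
rescanned = [[value + 5 for value in row] for row in original]
different = [[200] * 4, [200] * 4, [10] * 4, [10] * 4]

same_photo = likely_duplicates(original, rescanned)
other_photo = likely_duplicates(original, different)
```

Because the fingerprint depends on each pixel's brightness relative to the image's own average, a uniformly brighter rescan produces the same bits as the original, while an unrelated image does not. A match is only a candidate, which is why a person still checks it before any record is removed.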
What Comes Next for the Collection
The ACT Heritage Library confirmed through its public website this week that the online catalogue search function for the photographic collection remains active, but users may encounter duplicate results until remediation is complete. The library advises researchers who find what appears to be the same image listed twice to use the feedback form on its website to flag the records, helping archivists prioritise which entries to consolidate first.
For the broader ANU-ACT digital repository project, the duplication audit adds an unplanned phase before the collection can go live to the public. The project had been working toward a public launch in the third quarter of 2026. Whether the remediation pushes that date into late 2026 or beyond will depend on how quickly staff can clear the backlog.
Public servants and researchers at institutions along Northbourne Avenue who rely on historical images for heritage assessments, planning submissions and academic work are the most directly affected. Several heritage consultants who regularly access the library's collection for development application (DA) documentation in suburbs like Ainslie and Reid will face delays if the catalogue remains partially unreliable through August.
The ACT Heritage Library's Petrie Plaza branch is open Monday to Saturday. Researchers needing access to physical copies of affected photographs during the remediation period can lodge a request through the library's reference service. The Noel Butlin Archives Centre at ANU remains separately accessible for original holdings not yet transferred to the digital repository.