Canberra's public digital repositories are carrying tens of thousands of duplicate image files, and the agencies responsible for managing them are under growing pressure to do something about it. The problem is neither new nor unique to the capital, but a global push toward leaner, more searchable archives has put local institutions on notice: clean house, or fall behind.
The issue is pressing now because federal and territory agencies are midway through a broader digital transformation drive. The Australian Public Service Commission's ongoing data governance push, accelerated after the 2023 Data and Digital Government Strategy was released, has made deduplication (the process of identifying and removing redundant copies of images and files) a formal compliance concern rather than an optional housekeeping task. For a city whose economy runs on public administration, that shift has real consequences for day-to-day information management across dozens of departments.
What Canberra's Institutions Are Actually Doing
The National Library of Australia, headquartered on Parkes Place in the parliamentary triangle, maintains the Trove platform, which aggregates digitised collections from institutions across the country. Library staff have acknowledged, in published technical documentation, that managing image duplication across contributor feeds is a persistent challenge, particularly as state and territory partners upload overlapping historical photograph sets. The Library uses automated hash-matching tools to flag likely duplicates, but human review is still required before deletion — a labour-intensive process that creates backlogs.
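For readers unfamiliar with hash-matching, the idea can be sketched in a few lines of Python. This is an illustration only, not the Library's actual tooling: the file names, the byte contents and the function names are all hypothetical, and SHA-256 is simply one common choice of hash. The sketch flags byte-for-byte identical files for review and deletes nothing, in keeping with the human-review step described above.

```python
import hashlib
from collections import defaultdict


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest of a file's raw bytes."""
    return hashlib.sha256(data).hexdigest()


def flag_exact_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names whose contents are byte-for-byte identical.

    Only groups with more than one member are returned. Nothing is
    deleted: the result is a queue for a human reviewer.
    """
    by_digest: dict[str, list[str]] = defaultdict(list)
    for name, data in files.items():
        by_digest[sha256_of(data)].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


# Hypothetical contributor feeds: two partners upload the same scan.
uploads = {
    "partner_a/harbour_1923.tif": b"\x49\x49\x2a\x00scan-bytes",
    "partner_b/harbour_1923_copy.tif": b"\x49\x49\x2a\x00scan-bytes",
    "partner_a/parade_1927.tif": b"\x49\x49\x2a\x00other-bytes",
}
review_queue = flag_exact_duplicates(uploads)
print(review_queue)
```

Exact hashing of this kind only catches identical files. A copy that has been resized, recompressed or re-scanned produces a different digest and slips through, which is one reason manual review remains part of the process.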
Closer to the suburban end of the city, the ACT Government's own digital archive unit, operating under the Territory Records Office in Greenway, handles imagery from planning, infrastructure and events collections. The office adopted a new records classification framework in July 2024, which for the first time included explicit guidance on how duplicate digital assets should be treated before long-term retention decisions are made.
The Australian National University's library system, which manages research image datasets across its Acton campus, has taken a different tack. ANU Libraries began piloting AI-assisted deduplication tools in early 2025 as part of a broader research data management initiative. The pilot drew on open-source perceptual hashing software to compare images at scale, a method increasingly common in research libraries internationally.
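Perceptual hashing differs from ordinary file hashing in that visually similar images produce similar hashes. The sketch below shows one well-known variant, the difference hash, in plain Python. It is not the software ANU piloted, which the university has not been quoted as naming here. To stay self-contained it works on a small hand-written grid of grayscale values rather than decoding real image files, and the sample grids and the distance threshold of 2 are assumptions chosen for illustration.

```python
def dhash(pixels: list[list[int]]) -> int:
    """Difference hash: one bit per horizontally adjacent pixel pair.

    `pixels` is a small grayscale grid (rows of 0-255 values). A real
    pipeline would first decode the image and shrink it to a tiny grid.
    """
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: list[list[int]], b: list[list[int]], max_distance: int = 2) -> bool:
    """Treat two grids as likely duplicates if their hashes are close.

    The threshold is an illustrative assumption, not a standard value.
    """
    return hamming(dhash(a), dhash(b)) <= max_distance


original = [
    [10, 40, 30, 80, 70],
    [90, 60, 65, 20, 25],
    [15, 15, 50, 45, 100],
    [200, 150, 160, 110, 120],
]
# The same picture, uniformly brightened by 10: the bytes differ, the gradients do not.
brightened = [[value + 10 for value in row] for row in original]
# A tonally inverted picture, standing in for an unrelated image.
inverted = [[255 - value for value in row] for row in original]

print(is_near_duplicate(original, brightened))  # True
print(is_near_duplicate(original, inverted))    # False
```

Because the hash records only whether each pixel is brighter than its neighbour, a uniform brightness change leaves it untouched, whereas a cryptographic hash of the two files would share nothing in common.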
How Canberra Compares to Wellington, Edinburgh and Lausanne
The global picture is instructive. Wellington's National Library of New Zealand completed a major deduplication audit of its Papers Past photographic collection in 2023, removing more than 14,000 redundant image records and crediting the exercise with a measurable improvement in search result quality. Edinburgh's City Libraries, which manage the Capital Collections portal, have run a rolling deduplication program since 2021 and now process new uploads through automated checks before they enter the live catalogue. Lausanne's municipal archive adopted a centralised digital asset management system in 2022 that prevents duplicate ingestion at the point of upload — arguably the most efficient model of the three.
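Prevention at the point of upload can be pictured as a gate that checks each incoming file against everything already stored. The Python sketch below is a hypothetical illustration of that pattern and makes no claim about how the Lausanne or Edinburgh systems are built. The class name, the asset IDs and the in-memory index are assumptions; a production system would keep the index in a database.

```python
import hashlib


class IngestGate:
    """Hypothetical upload gate that refuses content it has already stored."""

    def __init__(self) -> None:
        # Maps content digest -> asset ID of the first accepted copy.
        self._seen: dict[str, str] = {}

    def submit(self, asset_id: str, data: bytes) -> tuple[bool, str]:
        """Return (accepted, asset_id_of_stored_copy)."""
        digest = hashlib.sha256(data).hexdigest()
        existing = self._seen.get(digest)
        if existing is not None:
            # Duplicate: reject and point the uploader at the stored copy.
            return False, existing
        self._seen[digest] = asset_id
        return True, asset_id


gate = IngestGate()
first = gate.submit("photo-001", b"image-bytes")
second = gate.submit("photo-002", b"image-bytes")      # same content
third = gate.submit("photo-003", b"different-bytes")
print(first, second, third)
```

The design point is that the check runs before anything enters the catalogue, so no later clean-up pass is needed for exact copies.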
By those benchmarks, Canberra's approach looks reactive rather than preventive. The Territory Records Office's 2024 framework was a meaningful step, but it addressed classification and retention policy rather than technical prevention. Neither the National Library nor ANU has yet published outcomes data from their deduplication work, making direct comparison difficult. What is clear is that Edinburgh and Lausanne both reached the prevention stage, stopping duplicates before they enter collections, years before Canberra's institutions adopted consistent detection protocols.
The cost of inaction is not trivial. Cloud storage pricing for government archives in Australia typically runs between $0.02 and $0.05 per gigabyte per month for warm storage tiers, and image collections are among the most storage-intensive asset types any agency manages. An unaudited collection carrying 20 percent redundancy — a figure cited in international library standards literature as a common baseline for institutions without active deduplication programs — can translate into hundreds of thousands of dollars in avoidable annual storage costs across a portfolio the size of the ACT's combined government holdings.
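The arithmetic behind that estimate can be checked with a short Python calculation. The price range and the 20 percent redundancy figure come from the paragraph above. The total holdings figure of 2,500 terabytes is purely hypothetical, since no published total for the ACT's combined holdings is cited here, and the function name is likewise an assumption.

```python
def avoidable_annual_cost(total_tb: float, redundancy: float, price_per_gb_month: float) -> float:
    """Yearly spend on storing redundant copies, in dollars."""
    redundant_gb = total_tb * 1000 * redundancy  # decimal terabytes to gigabytes
    return redundant_gb * price_per_gb_month * 12


HOLDINGS_TB = 2500   # hypothetical portfolio size, not a published figure
REDUNDANCY = 0.20    # share of stored data that is duplicate

low = avoidable_annual_cost(HOLDINGS_TB, REDUNDANCY, 0.02)
high = avoidable_annual_cost(HOLDINGS_TB, REDUNDANCY, 0.05)
print(f"${low:,.0f} to ${high:,.0f} per year")
```

Under these assumptions the avoidable cost runs from about $120,000 to $300,000 a year. The result scales linearly with holdings, so at the quoted prices the figure only reaches the hundreds of thousands for portfolios measured in petabytes.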
For Canberrans working in or around the public service, the practical upshot is straightforward: check whether your agency's digital asset management policy has been updated since mid-2024, and whether it includes technical controls rather than just policy guidance. Institutions that want to match the Lausanne model should be looking at ingestion-level duplicate prevention tools now, before the next audit cycle begins in earnest.