Canberra's public institutions are sitting on digital image libraries bloated with duplicates — redundant files consuming server capacity, inflating licensing costs, and making archival retrieval slower across agencies that together hold some of the most significant documentary collections in the country. The problem is not new, but a confluence of factors in 2026 is pushing it toward the top of IT budgets across the ACT.
The issue matters now because the federal government's ongoing data consolidation push, part of broader whole-of-government digital reform, is exposing just how disorganised image asset management has become inside departments clustered along the Parliamentary Triangle and beyond. When agencies began migrating legacy storage to cloud infrastructure over the past two years, internal audits repeatedly identified duplicate imagery as a primary driver of unexpected storage overruns. The Australian National Audit Office has raised digital asset governance as a recurring concern across multiple agency reviews in recent years.
What Canberra Is Actually Doing About It
Two institutions stand out for taking the problem seriously. The National Library of Australia, based at Parkes Place in the Parliamentary Triangle, has been running a deduplication program across its Trove platform since at least 2024, using perceptual hashing tools to identify near-identical digitised images across its holdings. The library's digital preservation team has described the challenge publicly as managing millions of image files where even small format conversions can produce what the system registers as distinct assets.
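To illustrate why a format conversion can defeat a naive duplicate check, here is a minimal sketch of one simple perceptual hash (an "average hash") next to an exact byte hash. It is not the library's tooling, which has not been described in detail; the function names and the toy 4x4 grayscale grids are assumptions made purely for illustration, and a real pipeline would first decode and downscale each image.

```python
import hashlib

def average_hash(pixels):
    """Average hash of a small grayscale grid (2D list of 0-255 values).

    Each pixel contributes one bit: 1 if brighter than the grid mean, else 0.
    """
    flat = [v for row in pixels for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits

def hamming(a, b):
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")

def exact_digest(pixels):
    """SHA-256 of the raw pixel bytes; changes if any single value changes."""
    return hashlib.sha256(bytes(v for row in pixels for v in row)).hexdigest()

# Hypothetical toy image: dark left half, bright right half.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 198, 220],
    [11, 21, 202, 212],
]
# Stand-in for a format conversion: every pixel shifts slightly.
converted = [[min(255, v + 2) for v in row] for row in original]
# A genuinely different image: the inverted picture.
different = [[255 - v for v in row] for row in original]

print(exact_digest(original) == exact_digest(converted))             # exact hashes differ
print(hamming(average_hash(original), average_hash(converted)))     # perceptual distance stays small
print(hamming(average_hash(original), average_hash(different)))     # distance is large
```

In this toy case the exact digests differ while the perceptual hashes match, which is the gap that near-duplicate detection is meant to close. Where to set the distance threshold is a judgment call that depends on the collection.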
Across Lake Burley Griffin, the Australian War Memorial in Campbell has similarly been working through its digitised photographic collection — one of the largest military image archives in the southern hemisphere — using automated tools to flag duplicates before human review. The scale of the task is significant: the Memorial's collection spans physical and born-digital assets accumulated over more than a century.
At the Australian National University in Acton, researchers within the College of Engineering, Computing and Cybernetics have been examining deduplication algorithms as part of broader machine learning projects. The university's proximity to so many major public collections has made it a natural partner for institutions looking for cost-effective technical solutions, though formal joint programs remain limited in scope.
How Canberra Compares to Wellington, Ottawa and Edinburgh
Cities with comparable profiles (mid-sized capitals dominated by government and cultural institutions) offer a useful benchmark. Wellington's Archives New Zealand began a structured digital deduplication audit in 2023. The agency has publicly described the resulting reduction in its active image storage footprint as substantial, though precise figures have not been independently verified. Ottawa's Library and Archives Canada rolled out an enterprise digital asset management platform in stages between 2022 and 2025, with duplicate detection built into the ingestion workflow from the start, a design choice Canberra agencies largely did not make when building their own systems a decade ago.
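What duplicate detection at the point of ingestion can look like, in its simplest form, is sketched below. This is an assumption-laden illustration, not a description of any agency's platform: the IngestIndex class and the asset identifiers are hypothetical, and the check catches only byte-identical files, not near-duplicates.

```python
import hashlib

class IngestIndex:
    """Hypothetical in-memory index that rejects byte-identical files at ingest."""

    def __init__(self):
        self._by_digest = {}  # SHA-256 hex digest -> ID of the first asset seen

    def ingest(self, asset_id, data):
        """Register an asset.

        Returns None if the content is new, or the ID of the existing asset
        that holds the same bytes.
        """
        digest = hashlib.sha256(data).hexdigest()
        existing = self._by_digest.get(digest)
        if existing is not None:
            return existing
        self._by_digest[digest] = asset_id
        return None

index = IngestIndex()
print(index.ingest("scan-0001", b"example image bytes"))  # None: first copy is stored
print(index.ingest("scan-0002", b"example image bytes"))  # scan-0001: duplicate caught
```

The appeal of checking at ingest is that a duplicate never enters storage in the first place, so nothing has to be found and removed later.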
Edinburgh's National Records of Scotland took a hybrid approach: automated flagging followed by mandatory curatorial sign-off before any file is deleted. That model is closer to what the National Library in Parkes is doing now, but Edinburgh had the advantage of implementing it at the point of digitisation rather than retrospectively.
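The hybrid model described above, automated flagging followed by mandatory sign-off, can be sketched as a small review queue in which nothing becomes eligible for deletion until a person has approved it. The class, field and identifier names here are hypothetical and chosen only for illustration; no institution's actual workflow software is being described.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class DuplicateFlag:
    keep_id: str                       # asset the tool proposes to retain
    candidate_id: str                  # asset the tool suspects is a duplicate
    approved_by: Optional[str] = None  # set only after human sign-off

class ReviewQueue:
    """Hypothetical queue: automation may flag, but only reviewers release."""

    def __init__(self):
        self._flags = {}

    def flag(self, keep_id, candidate_id):
        """Record an automated suspicion. This never deletes anything."""
        self._flags[candidate_id] = DuplicateFlag(keep_id, candidate_id)

    def sign_off(self, candidate_id, reviewer):
        """Record that a named reviewer confirmed the duplicate."""
        self._flags[candidate_id].approved_by = reviewer

    def deletable(self):
        """IDs cleared for deletion: flagged and signed off."""
        return sorted(c for c, f in self._flags.items() if f.approved_by)

queue = ReviewQueue()
queue.flag("img-001", "img-002")
queue.flag("img-001", "img-003")
print(queue.deletable())               # [] until someone signs off
queue.sign_off("img-002", "curator-a")
print(queue.deletable())               # ['img-002']
```

Keeping the reviewer's name on each record also leaves an audit trail, which matters if a deletion is later questioned.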
The retrospective nature of Canberra's problem is its defining difficulty. Storage costs on government-contracted cloud platforms are not trivial: at typical enterprise rates for large image files, even a modest reduction in redundant assets can save a large agency tens of thousands of dollars a year. For the ACT government's own holdings, including records managed through the Territory Records Office in Greenway, the same logic applies at a smaller but still meaningful scale.
For public servants and researchers who rely on these collections — whether pulling historical imagery for policy documents or accessing digitised maps from the Nolan Collection in Acton — the practical upside of better deduplication is faster search results and more reliable version control. The risk of getting it wrong is deletion of a file that only looked like a duplicate. That is why every institution contacted for background on this story described human review as non-negotiable, regardless of how good the algorithm is. The next twelve months, as whole-of-government cloud migration deadlines press closer, will test whether that caution survives budget pressure.