Tens of thousands of duplicate image files are sitting inside Canberra's public sector digital archives, quietly inflating storage costs and slowing down the systems that government workers depend on every day. The problem isn't new, but the scale of it — now measurable in ways that weren't possible five years ago — is forcing a reckoning for agencies, universities, and local institutions alike.
The timing matters. The ACT government's Digital Strategy, which set out data management benchmarks through 2025, has now reached its review window. At the same time, federal departments headquartered on Northbourne Avenue and around the Parliamentary Triangle are under pressure from the Australian National Audit Office to demonstrate tighter information governance. Duplicate imagery — a mundane but expensive byproduct of large organisations sharing files across teams — has emerged as a concrete, quantifiable target.
What the Numbers Actually Show
Industry benchmarks published by data management consultancies suggest that between 20 and 30 percent of files stored in large organisational repositories are exact or near-exact duplicates. Apply that to a mid-sized federal agency with 10 terabytes of image storage (a conservative estimate for departments with communications, HR, and policy teams all uploading imagery independently) and you're looking at 2 to 3 terabytes of redundant data. At current enterprise cloud storage rates in Australia, which sit around $30 to $50 per terabyte per month depending on the provider and contract, that works out to $60 to $150 a month, so a single agency could be spending roughly $720 to $1,800 a year on storage that does nothing but replicate files already held elsewhere on the same system.
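The arithmetic behind that estimate is simple enough to re-run with an agency's own figures. The short Python sketch below is illustrative only: the function name and parameters are hypothetical, and the inputs are the benchmark ranges quoted above, not measured data from any agency.

```python
def annual_redundant_cost(total_tb, duplicate_percent, rate_per_tb_month):
    """Estimate the yearly cost of storing duplicate files.

    total_tb           -- total image storage in terabytes
    duplicate_percent  -- share of files assumed to be duplicates (e.g. 20)
    rate_per_tb_month  -- storage price per terabyte per month, in dollars
    """
    redundant_tb = total_tb * duplicate_percent / 100
    return redundant_tb * rate_per_tb_month * 12


# Low and high ends of the ranges quoted in the article.
low = annual_redundant_cost(10, 20, 30)
high = annual_redundant_cost(10, 30, 50)
print(f"Estimated annual cost of duplicates: ${low:,.0f} to ${high:,.0f}")
```

With the article's figures, the low end is 2 terabytes at $30 a month and the high end is 3 terabytes at $50 a month. The result scales linearly, so an agency holding ten times the imagery would face ten times the bill.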
The Australian National University on Acton Peninsula and the University of Canberra at Bruce both operate large digital asset management systems for research imagery, campus photography, and publication libraries. Academic institutions of their size routinely accumulate duplicate image sets when researchers work across departments or when project handovers happen without proper file hygiene protocols. A 2023 report by the Australasian Research Data Commons — which funds infrastructure used by both ANU and UC — identified redundant file storage as one of the top three avoidable costs in research data management, though it did not publish institution-specific figures.
For the ACT government's own digital holdings, the problem intersects with the Territory Records Act 2002, which requires agencies to maintain accessible, well-organised records. Duplicate files aren't just a storage cost — they create compliance risk. If an agency holds three versions of the same photograph with different metadata tags, determining which is the authoritative record takes staff time and, occasionally, legal advice.
Local Efforts and What Comes Next
The ACT's Digital, Data and Technology Solutions directorate, based in Canberra City, has been piloting deduplication software across selected government shared drives as part of a broader cloud migration program that began rolling out in late 2024. The tools work by computing a hash of each image file's contents and flagging files whose hashes match, even when file names have been changed, for review and consolidation.
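For readers curious how hash-based matching works in practice, the following is a minimal Python sketch of the general technique, not the software used in the ACT pilot. The function names, the choice of SHA-256, and the list of image extensions are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed set of image extensions; a real tool would likely be configurable.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff"}


def content_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate_images(root: Path) -> list[list[Path]]:
    """Group image files under root whose contents are byte-identical."""
    by_hash = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            by_hash[content_hash(path)].append(path)
    # Only groups with more than one file are duplicates worth flagging.
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

Because the hash is computed from file contents and not from names or folder locations, a renamed copy still lands in the same group as the original. The sketch only reports groups for human review and deletes nothing. It also catches only byte-identical copies: an image that has been resized or re-saved in another format produces a different hash and would need a different technique, such as perceptual hashing, which is not shown here.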
Organisations working through the same process should expect to devote initial staff time to validation: automated tools flag duplicates, but humans still need to confirm which version carries the correct metadata before deletion. Agencies that have completed similar programs elsewhere in Australia have reported storage reductions of between 15 and 25 percent in image-heavy directories, with the work typically taking three to six months for a team of two or three information management officers.
For Canberra's large public service workforce, many of whom work from offices on Callam Street in Woden or across the various Russell offices and store imagery on shared drives, the practical upshot is likely to arrive as a prompted clean-up exercise rather than a sweeping system change. Departments will receive flags, staff will confirm or dispute them, and the redundant files will eventually be removed.
The deeper issue is cultural: file-sharing habits formed during the pandemic, when teams across Civic and Barton were working remotely and emailing image attachments rather than linking to centralised repositories, created layers of duplication that are still being unwound. Bringing duplication rates down requires both the right software and the institutional will to use it consistently.