Canberra's federal and territory agencies are sitting on a backlog of duplicate digital images stretching back to early scanning programs begun around 2014, when the Australian Public Service Commission pushed departments to accelerate paperless workflows. The duplication problem, in which identical or near-identical files are stored multiple times across siloed systems, is now drawing scrutiny from records managers and IT procurement officers across the capital, at a moment when cloud storage costs and data governance obligations are rising together.
The timing matters. The federal government's Digital Transformation Agency has been pressing agencies since early 2025 to meet updated records management standards under the Archives Act review framework, and the ACT government is simultaneously modernising its own holdings ahead of light rail Stage 2 corridor planning, which will generate large volumes of environmental assessment imagery, survey photographs and engineering scan records. Duplication in those holdings isn't just an administrative nuisance — it creates legal exposure when agencies cannot confirm which version of a document is authoritative.
What's Happening on the Ground in Canberra
Two institutions are at the front of local efforts to get this under control. The National Archives of Australia, headquartered on Queen Victoria Terrace in Parkes, has been piloting a deduplication workflow using perceptual hashing tools — software that compares image fingerprints rather than file names — as part of a broader digitisation quality project. Separately, the Australian National University's Scholarly Information Services team in Chifley Library has been running a smaller-scale program since mid-2024 to purge duplicate scan outputs from its institutional repository, which holds research imagery produced by ANU colleges including the Research School of Earth Sciences.
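For readers unfamiliar with the technique, the sketch below shows one common form of perceptual hashing, a "difference hash", in plain Python. It is an illustration only and not the National Archives' tool or workflow: the function names (`shrink`, `dhash`, `hamming`), the 8x8 fingerprint size and the crude nearest-neighbour downsampling are all assumptions made for the example, and the image is represented as a simple grid of grayscale values rather than a real file.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale values (assumed input format)


def shrink(pixels: Gray, width: int, height: int) -> Gray:
    """Downsample by nearest-neighbour sampling (crude; real tools resample properly)."""
    src_h, src_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def dhash(pixels: Gray, size: int = 8) -> int:
    """Difference hash: one bit per horizontally adjacent pair in a (size+1) x size thumbnail."""
    small = shrink(pixels, size + 1, size)
    bits = 0
    for row in small:
        for left, right in zip(row, row[1:]):
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two fingerprints (0 means identical)."""
    return bin(a ^ b).count("1")


# Hypothetical 32x32 test images: a gradient, a brighter copy, and a mirror image.
original = [[x * 4 + y * 2 for x in range(32)] for y in range(32)]
brighter = [[value + 10 for value in row] for row in original]
mirrored = [list(reversed(row)) for row in original]

print(hamming(dhash(original), dhash(brighter)))  # small distance: likely the same image
print(hamming(dhash(original), dhash(mirrored)))  # large distance: a different image
```

Because the fingerprint records only whether each pixel is brighter than its neighbour, a uniformly brightened copy produces the same bits even though its file contents, name and checksum all differ. That is the property that lets this family of tools catch duplicates a file-name or byte-level comparison would miss.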
Neither program is operating at the scale that comparable institutions elsewhere have reached. The ACT government's own records arm, Access Canberra, acknowledged in documents tabled with its 2024–25 annual report that interoperability between territory and Commonwealth image repositories remains incomplete, meaning duplicates of the same project files can exist simultaneously in both systems. Planning imagery linked to the Molonglo Valley development corridor is one example.
How Helsinki and Singapore Compare
The gap becomes stark when Canberra's approach is set against what two similarly sized, governance-heavy capitals have done. Helsinki's City Executive Office completed a system-wide image deduplication audit across its 47 municipal departments in 2023, using an AI-assisted matching tool developed with Aalto University. According to documentation published by the city, the audit recovered roughly 34 terabytes of redundant storage across a 2.1 petabyte estate — savings that translated directly into reduced annual cloud licensing costs.
Singapore's Integrated Land Authority, a reasonable functional analogue to the ACT Planning directorate given that both manage dense urban land records, embedded automated duplicate detection at the point of ingest from 2022 onward: files are checked against existing holdings before they are written to storage rather than cleaned up retrospectively. Records management practitioners widely regard that upstream approach as significantly cheaper than post-hoc audits, because retrospective deduplication requires human review of flagged near-matches where images differ only slightly, such as a site photograph with a different timestamp or a cropped version of a heritage building scan.
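In outline, an ingest-point check of this kind might look like the following sketch. It is not a description of Singapore's system: the `IngestIndex` class, its in-memory storage, the `max_distance` threshold of 5 bits and the three outcomes ("stored", "duplicate", "review") are hypothetical choices made for illustration, and the perceptual fingerprint is assumed to be computed elsewhere and passed in as an integer.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class IngestIndex:
    """Hypothetical in-memory index of holdings; a real system would use a database."""

    max_distance: int = 5  # assumed Hamming threshold for a "near match"
    by_digest: Dict[str, str] = field(default_factory=dict)  # SHA-256 -> record id
    fingerprints: List[Tuple[int, str]] = field(default_factory=list)  # (hash, record id)

    def check_and_store(self, record_id: str, data: bytes, fingerprint: int) -> Tuple[str, str]:
        """Return (decision, record id of the match, or of the new record if stored)."""
        digest = hashlib.sha256(data).hexdigest()

        # Byte-identical file already held: reject without human involvement.
        if digest in self.by_digest:
            return ("duplicate", self.by_digest[digest])

        # Visually similar file already held: hold the new file for human review.
        for known, known_id in self.fingerprints:
            if bin(known ^ fingerprint).count("1") <= self.max_distance:
                return ("review", known_id)

        # Nothing similar on record: write to storage and index it.
        self.by_digest[digest] = record_id
        self.fingerprints.append((fingerprint, record_id))
        return ("stored", record_id)


index = IngestIndex()
print(index.check_and_store("site-photo-001", b"scan-1", 0b11110000))
print(index.check_and_store("site-photo-002", b"scan-1", 0b11110000))
print(index.check_and_store("site-photo-003", b"scan-1-cropped", 0b11110001))
```

The design point is where the human review lands. At ingest, a reviewer sees one flagged file at a time, alongside the person who just submitted it; in a retrospective audit, the same near-match decisions arrive in bulk, years after anyone remembers which version was meant to be authoritative.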
Canberra agencies are still largely doing it retrospectively. Storage costs for Australian government cloud infrastructure under the whole-of-government Microsoft Azure arrangement, the terms of which are set through coordinated procurement by the Digital Transformation Agency, are not publicly broken down by agency, making it difficult to quantify what the duplication backlog costs in dollar terms. Independent estimates from the records management sector have put annual per-terabyte costs for compliant government cloud storage at between A$30 and A$60, though those figures vary by contract tier.
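To give a sense of how that per-terabyte range would translate into an annual figure, the short calculation below applies it to a given volume of duplicates. It is a back-of-envelope sketch, not an estimate of Canberra's backlog, whose size is not public. The function names are hypothetical, the example input borrows Helsinki's reported 34 terabytes purely as an illustrative volume, and 1 petabyte is assumed to equal 1,000 terabytes.

```python
from typing import Tuple


def annual_cost_range(
    duplicate_tb: float, low_per_tb: float = 30.0, high_per_tb: float = 60.0
) -> Tuple[float, float]:
    """Annual storage cost (A$) of holding duplicate_tb at the quoted per-terabyte range."""
    return duplicate_tb * low_per_tb, duplicate_tb * high_per_tb


def duplicate_share(duplicate_tb: float, estate_tb: float) -> float:
    """Fraction of the total estate taken up by duplicates."""
    return duplicate_tb / estate_tb


# Illustrative input only: Helsinki's reported figures, priced at the Australian range.
low, high = annual_cost_range(34)
share = duplicate_share(34, 2.1 * 1000)  # 2.1 PB expressed in TB (decimal units assumed)

print(f"A${low:,.0f} to A${high:,.0f} per year")
print(f"{share:.1%} of the estate")
```

Raw storage is only one part of the cost. The calculation leaves out the staff time spent reviewing near-matches and the legal exposure that arises when an agency cannot confirm which version of a document is authoritative.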
For public servants and records staff dealing with this daily — particularly those working on the growing image libraries tied to the Gungahlin town centre redevelopment approvals and the Belconnen Arts Centre heritage documentation project — the practical next step is to push for ingest-point detection before the next major digitisation contract is signed. The National Archives is expected to release updated digitisation procurement guidance before the end of the 2026 calendar year. Whether territory agencies plug into that guidance or continue procuring separately will shape how quickly Canberra closes the gap on Helsinki and Singapore.