Canberra's major public institutions are sitting on hundreds of thousands of duplicate digital images — redundant scans, re-uploaded photographs and copied file sets that are quietly inflating storage costs and undermining the integrity of records held by some of Australia's most consequential bureaucracies. The problem, long treated as a housekeeping issue, is now drawing serious attention as agencies face pressure to make digitised records searchable, shareable and legally defensible.
The timing matters. The federal government's Digital Records Transformation Initiative, which has pushed agencies toward cloud-based document management since 2023, has increased the volume of images entering departmental systems without proportional investment in deduplication tools. The National Archives of Australia, based on Queen Victoria Terrace in Parkes, has publicly acknowledged the scale of its digitisation backlog but has not released specific figures on how much of that digitised material is redundant.
What Other Capital Cities Are Doing Differently
Wellington's Archives New Zealand adopted a mandatory deduplication protocol in 2021 as part of its Public Records Act reforms. Ottawa's Library and Archives Canada integrated perceptual hashing, a technique that detects visually similar images regardless of file format, into its ingest pipeline in 2022. Singapore's National Archives, operating under the National Library Board, runs automated duplicate detection on every image submission before it enters the permanent collection. None of these systems is perfect, but each treats deduplication as infrastructure, not an afterthought.
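For readers unfamiliar with perceptual hashing, the idea can be illustrated with one of its simplest variants, the average hash. The sketch below is an illustration only and is not a description of any archive's actual pipeline. It assumes each image has already been decoded and shrunk to a small grayscale grid (a step a real system would hand to an imaging library), and the function names are hypothetical.

```python
def average_hash(pixels: list[list[int]]) -> int:
    """Return a bit fingerprint: 1 where a pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


# A tiny 4x4 "image", a uniformly brightened copy, and an unrelated pattern.
original = [
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
]
brighter = [[value + 10 for value in row] for row in original]
unrelated = [[200, 200, 10, 10] for _ in range(4)]

same_picture = hamming_distance(average_hash(original), average_hash(brighter))
other_picture = hamming_distance(average_hash(original), average_hash(unrelated))
```

Because the fingerprint depends on the pattern of light and dark rather than on the stored bytes, a re-saved or slightly adjusted copy lands at or near a distance of zero, while an unrelated image lands far away. Where to draw the threshold between "duplicate" and "different" is a policy choice that each institution would have to tune.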
Canberra has no equivalent mandatory standard. Individual agencies make their own choices. The ACT Government's Shared Services ICT division, which manages technology platforms for territory-level bodies, uses Microsoft SharePoint and an assortment of legacy content management systems across directorates based on London Circuit and Macarthur Avenue. Whether those systems flag or remove duplicates depends entirely on how each directorate has configured its environment, and many have not configured it at all.
A 2024 report by the Australian Information Commissioner's office noted that poor data quality, including redundant records, was among the top compliance risks identified across surveyed federal agencies, though the report did not break out image duplication as a discrete category. Storage costs for federal government cloud environments have risen sharply since 2022, driven partly by unmanaged data growth.
The Local Stakes
For Canberra, this is not merely a bureaucratic inconvenience. The ACT's growing suburbs — Gungahlin in particular, where planning disputes routinely hinge on photographic evidence of development conditions — generate large volumes of site images that move through territory agency systems. When duplicate images persist across multiple versions of a planning file, the legal coherence of that record can be challenged. The ACT Civil and Administrative Tribunal has had to manage evidentiary disputes involving digital records, though no ruling specific to image duplication has been publicly reported.
The University of Canberra's Institute for Governance, on Kirinari Street in Bruce, has flagged the issue in submissions to the ACT Legislative Assembly's standing committee on justice and community safety, arguing that records management needs to be treated as a governance priority rather than an IT budget line.
Agencies and territory bodies that want to get ahead of the problem have practical options available now. Open-source tools such as dupeGuru and rmlint can scan file systems for duplicates at no cost. Commercial platforms including Proofpoint's Intelligent Compliance suite and Microsoft Purview offer enterprise-grade deduplication with audit trails. The National Archives has published guidance on digital preservation standards under its Digital Preservation Policy, updated in December 2024, which provides a baseline framework even if it stops short of mandating deduplication at ingest.
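The simplest form of file-system scanning, finding byte-for-byte identical copies, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of how dupeGuru, rmlint or any commercial platform works internally; the function names are hypothetical, and it only reports duplicates rather than deleting anything.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large scans do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(root: Path) -> list[list[Path]]:
    """Return groups of files under root whose contents are identical."""
    # Cheap first pass: files of different sizes cannot be identical.
    by_size: dict[int, list[Path]] = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_file():
            by_size[path.stat().st_size].append(path)

    # Second pass: hash only the files that share a size with another file.
    by_digest: dict[str, list[Path]] = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        for path in paths:
            by_digest[sha256_of_file(path)].append(path)

    return [sorted(group) for group in by_digest.values() if len(group) > 1]
```

Calling `find_exact_duplicates(Path("/path/to/records"))` (a placeholder path) would return one list per set of identical files. A scan like this catches copied file sets and re-uploads, but not re-scans or re-encoded versions of the same photograph, and in a records context any removal would still need to follow the agency's disposal authorities and leave an audit trail.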
The federal budget handed down in May 2026 did not include a dedicated line for records deduplication infrastructure, but agencies with existing digital transformation funding can direct resources toward the problem without new appropriation. The question is whether anyone in the relevant directorates treats it as urgent enough to act before the backlog grows larger.