Thousands of duplicate images are sitting inside the digital storage systems of Canberra's largest public institutions, and the officials responsible for managing them say the problem is bigger, and more expensive, than is widely realised. Archivists, IT procurement specialists and records management officials across the capital have spent much of 2026 debating the best approach to cleaning up what has become a significant administrative headache for agencies concentrated along the Northbourne Avenue corridor and in the parliamentary precinct.
The issue has sharpened this year partly because of cost. Cloud storage prices charged to government agencies have risen substantially since 2023, and federal departments operating under the Australian Government Information Management Office's storage frameworks are under renewed pressure to audit what they actually hold. Duplicate image files — created when staff scan, re-upload, or migrate records without deduplication protocols — represent a measurable drain on budgets that agencies would rather spend elsewhere.
What the Experts Are Saying
Records management specialists at the Australian National University's School of Computing have been examining the problem through an ongoing project focused on public sector digital asset governance. The core argument from researchers in that program is that the duplication problem is not primarily a technology failure — it is a workflow failure. Staff across large departments routinely save images in multiple locations, across shared drives, email attachments and content management systems, with no automated check to flag what already exists.
The National Archives of Australia, based in Parkes on Queen Victoria Terrace, has its own framework for managing digital records under the Archives Act 1983, but enforcement at the agency level varies considerably. Institutions that migrated records to cloud platforms between 2019 and 2022, a period of rapid digitisation driven by COVID-era remote work, are now discovering that those migrations were rarely accompanied by rigorous deduplication steps. The resulting storage bloat is a known problem inside several Canberra-based departments, according to publicly available Australian National Audit Office performance audit reports on digital records management published in recent years.
At the University of Canberra's Faculty of Science and Technology in Bruce, researchers working on machine learning applications for image classification have argued in published work that hash-based deduplication tools can remove the majority of exact duplicates automatically. Such software computes a fingerprint from each file's contents, so byte-for-byte identical files share the same value. The harder category is near-duplicates: slightly cropped versions, different file formats of the same original, or images re-exported at different resolutions. Because any change to the file changes its fingerprint, those require either manual review or more sophisticated perceptual hashing algorithms, which compare what an image looks like rather than its raw bytes and which the Australian Public Service has been slow to adopt at scale.
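For readers who want to see the distinction in practice, the following is a minimal sketch rather than any agency's or researcher's actual tooling. It contrasts an exact fingerprint (SHA-256 from Python's standard library) with a simplified "average hash", one common perceptual technique. The function names and the tiny 4x4 grayscale grid are illustrative assumptions; a real tool would first decode and downscale each image with an imaging library.

```python
import hashlib


def exact_fingerprint(data: bytes) -> str:
    """SHA-256 of the raw bytes: identical files give identical digests."""
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels: list[list[int]]) -> int:
    """Simplified perceptual hash (assumption: `pixels` is a small grayscale
    grid, values 0-255, already decoded and downscaled elsewhere).
    Each bit records whether a pixel is brighter than the grid's mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits; small distances suggest near-duplicates."""
    return bin(a ^ b).count("1")


# Hypothetical example data: an image and a slightly brightened re-export.
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [12, 22, 202, 212],
    [11, 21, 201, 211],
]
brighter = [[min(255, value + 3) for value in row] for row in original]

original_bytes = bytes(v for row in original for v in row)
brighter_bytes = bytes(v for row in brighter for v in row)

# The exact fingerprints differ because the bytes differ...
exact_match = exact_fingerprint(original_bytes) == exact_fingerprint(brighter_bytes)
# ...but the perceptual hashes can still be compared bit by bit.
perceptual_gap = hamming_distance(average_hash(original), average_hash(brighter))
```

In this toy case the exact fingerprints disagree while the perceptual hashes are identical, which is the gap the researchers describe. How many differing bits should count as a near-duplicate is a tuning decision that would still need human review.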
Local Institutions Caught in the Middle
The ACT government's own Digital Strategy, updated in late 2024, identifies digital asset management as a priority area, but the Territory Records Office on Rudd Street has limited capacity to police how images are stored across Health, Education and Transport Canberra directorates. Each directorate manages its own systems, and integration between them remains partial at best.
For institutions like the Australian War Memorial on Treloar Crescent in Campbell, which holds one of the country's largest collections of digitised historical photographs, the stakes are different. There, duplication is partly deliberate — preservation copies are kept in geographically separate locations — but cataloguing errors mean the same image can appear under different reference numbers, confusing researchers and adding costs to collection management.
Staff at smaller ACT agencies told The Daily Canberra the practical barrier to fixing the problem is time. A meaningful deduplication audit for a mid-sized agency with several terabytes of image data can take months if done carefully, and no dedicated budget line exists in most agency appropriations for that kind of remediation work.
The consensus among specialists is that agencies should start with a scoped pilot — selecting one file repository, running a deduplication tool against it, and documenting the results — before attempting anything system-wide. The Australian Signals Directorate has published technical guidance on data hygiene that touches on this approach, though its framing is primarily security-focused rather than administrative. For the public servants in Gungahlin and Belconnen processing paperwork every day, the practical upshot is straightforward: the next time your agency migrates to a new platform, push for deduplication to be written into the contract from day one.
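As a rough illustration of what such a scoped pilot might look like, the sketch below scans a single folder tree, groups byte-identical image files and reports how much space the extra copies occupy. It is an assumption-laden example, not guidance from any agency named above: the function names, the list of file extensions and the report fields are all hypothetical, and it only reports findings without deleting anything.

```python
import hashlib
import os
from collections import defaultdict

# Assumed set of image extensions; a real audit would agree this list up front.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".gif", ".bmp"}


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large scans do not load whole files into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_repository(root):
    """Report exact-duplicate image files under `root` (read-only)."""
    # Step 1: group by size, since files of different sizes cannot be identical.
    by_size = defaultdict(list)
    for folder, _subdirs, names in os.walk(root):
        for name in names:
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS:
                path = os.path.join(folder, name)
                by_size[os.path.getsize(path)].append(path)

    # Step 2: hash only the files that share a size with at least one other file.
    by_digest = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue
        for path in paths:
            by_digest[(size, sha256_of_file(path))].append(path)

    # Step 3: summarise. Each group keeps one copy; the rest is reclaimable.
    groups = []
    reclaimable = 0
    for (size, _digest), paths in by_digest.items():
        if len(paths) > 1:
            groups.append(sorted(paths))
            reclaimable += size * (len(paths) - 1)

    return {
        "files_scanned": sum(len(paths) for paths in by_size.values()),
        "duplicate_groups": groups,
        "reclaimable_bytes": reclaimable,
    }
```

Calling `audit_repository` on one shared drive folder would give an agency a documented baseline (files scanned, duplicate groups, reclaimable bytes) before anyone decides what to remove. Deliberate preservation copies, such as those the War Memorial keeps, would need to be excluded from any follow-up clean-up.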