Federal agencies based in Canberra collectively manage tens of millions of digital image files, and a growing number of those files are exact or near-exact duplicates sitting redundantly across departmental servers — costing taxpayers in storage, slowing workflows, and complicating records management obligations under the Archives Act 1983. The problem did not emerge overnight. It is the product of decades of ad hoc digitisation drives, machinery-of-government changes, and the steady expansion of hybrid working arrangements that accelerated sharply after 2020.
The issue matters acutely right now because the National Archives of Australia, headquartered on Queen Victoria Terrace in Parkes, has been stepping up enforcement of digital records standards ahead of a broader overhaul of Commonwealth information governance expected to land before parliament later this year. Agencies that cannot demonstrate clean, de-duplicated image libraries face the prospect of compliance notices and, in some cases, mandatory remediation programs that carry their own costs and timelines.
How the Backlog Built Up
The roots of the duplicate-image problem trace back to the wave of machinery-of-government changes that reshuffled departments repeatedly between 2013 and 2022. Every time a division moved from one agency to another, its digital assets — including images from reports, ministerial briefs, communications campaigns and internal training materials — were typically migrated in bulk with minimal curation. Duplicates from the source environment came along for the ride and were then layered on top of whatever already existed at the receiving agency.
The Australian Public Service Commission, based in Chifley Square on Chifley Street in the city centre, has been tracking workforce digitalisation trends and noted in its most recent State of the Service Report that the number of APS employees using shared cloud-based file platforms crossed the majority threshold for the first time in the 2023-24 financial year. Shared platforms reduce some duplication at the point of creation but do nothing to address the legacy archives already baked into older on-premises systems that many larger departments have been slow to decommission.
At the Australian National University in Acton, researchers attached to the 3A Institute and the School of Computing have separately documented how image deduplication at scale requires more than running a basic hash-matching script. Visual near-duplicates — images that are cropped, resized, colour-adjusted or lightly edited versions of the same original — are not caught by simple file-comparison tools. Identifying them requires perceptual hashing algorithms or machine-learning classifiers, technologies that smaller agencies rarely have in-house.
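The gap between the two approaches can be illustrated with a minimal Python sketch. It is not the method used by the ANU researchers or by any agency toolkit. It assumes the image has already been decoded into rows of grayscale values from 0 to 255, a step that in practice would be handled by an imaging library, and the function names are hypothetical. The average hash shown here is only one of the simpler perceptual hashing techniques.

```python
import hashlib


def exact_digest(data: bytes) -> str:
    """SHA-256 of the raw bytes; matches only byte-identical content."""
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels, size=8):
    """Average hash of a grayscale image given as a list of pixel rows.

    The image is reduced to a size x size grid by block averaging, and
    each grid cell contributes one bit: 1 if it is brighter than the
    mean of all cells, otherwise 0.
    """
    height, width = len(pixels), len(pixels[0])
    if height < size or width < size:
        raise ValueError("image must be at least size x size pixels")
    cells = []
    for r in range(size):
        y0, y1 = r * height // size, (r + 1) * height // size
        for c in range(size):
            x0, x1 = c * width // size, (c + 1) * width // size
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for value in cells:
        bits = (bits << 1) | int(value > mean)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of differing bits between two hashes."""
    return bin(a ^ b).count("1")
```

A resized or slightly brightened copy of an image produces a different SHA-256 digest but an average hash that is identical or only a few bits away, so a tool would flag pairs whose Hamming distance falls under a chosen threshold. That threshold is a tuning decision, and heavier edits such as tight crops generally call for more robust methods than this one.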
The Compliance Pressure Forcing Action
Under the current iteration of the Archives Act, Commonwealth agencies are required to maintain accurate and accessible records, and image files are explicitly covered. The National Archives published updated guidance on digital asset management in March 2025, giving agencies an 18-month window to bring their holdings into line. That clock runs out in September 2026 — less than three months away.
For agencies concentrated in Barton, Forrest and the parliamentary triangle, the remediation task is substantial. A cluster of mid-sized departments that went through the Shared Services consolidation program — which centralised back-office functions including some digital storage — found themselves inheriting combined image libraries with no clear chain of custody for older files. The Digital Transformation Agency, which sits on Constitution Avenue in Reid and oversees government technology uplift programs, has been offering technical assistance, but take-up has been uneven.
Procurement records show that several ACT-based agencies have turned to private contractors to run deduplication audits, with quoted project costs for a medium-sized department ranging from roughly $80,000 to well over $200,000 depending on archive size and the age of legacy systems involved. That expenditure is avoidable for agencies that move quickly and use the DTA's own toolkit, but doing so requires staff capacity at a moment when efficiency dividend pressures are already squeezing technology teams.
Agencies that miss the September deadline should expect to engage directly with the National Archives on a remediation timeline. Those that have already begun audits are advised to document the methodology used — including whether the deduplication tool distinguishes visually similar images from byte-identical ones — because that distinction will matter if compliance is later tested. For public servants working through the process in offices scattered from Woden to Civic, the practical starting point is a full inventory of where image files currently live before any deletion occurs.
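As an illustration of that starting point, the following Python sketch builds a read-only inventory of image files and groups byte-identical copies by SHA-256 digest. It is a hypothetical example, not part of any DTA or National Archives toolkit. The extension list and function names are assumptions, and the script deletes nothing.

```python
import hashlib
import os
from collections import defaultdict

# Assumed set of image extensions; adjust to the formats an agency holds.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff", ".bmp"}


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks so large files stay out of memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def inventory(roots):
    """Map each content digest to every image path that carries it."""
    groups = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                if os.path.splitext(name)[1].lower() in IMAGE_EXTS:
                    path = os.path.join(dirpath, name)
                    groups[file_digest(path)].append(path)
    return dict(groups)


def exact_duplicates(groups):
    """Keep only digests shared by more than one file."""
    return {d: sorted(paths) for d, paths in groups.items() if len(paths) > 1}
```

Running `exact_duplicates(inventory([...]))` over a list of storage locations yields the byte-identical groups only. Visually similar files would need a separate perceptual comparison, which is the distinction an audit methodology should record.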