The National Archives of Australia confirmed this week that duplicate digital images now account for an estimated 23 percent of scanned records across its Parkes repository — a figure that has ballooned since the federal government's 2022 Digital Continuity Policy mandated the conversion of all physical Commonwealth records by December 2027. For a city whose economy runs almost entirely on public administration, the problem is not abstract. It is clogging the workflows of roughly 170,000 public servants spread across Barton, Civic and the suburban campuses in Tuggeranong and Belconnen.
The timing matters because procurement cycles are tightening. The Australian Public Service Commission's central IT panel, refreshed in March 2026, now lists duplicate-detection software as a priority capability — a category that barely existed on government tender lists three years ago. Agencies are being asked to fix the problem while simultaneously meeting the 2027 deadline, and the budget for doing both is not, by most accounts, generous.
What Canberra is actually doing
Since late 2025, the Department of Finance has been piloting a deduplication tool developed in part with the Australian National University's 3A Institute in Acton. The pilot covers approximately 4.2 million scanned images drawn from legacy files held at the Archives' Mitchell Repository in the ACT. Early internal benchmarking, obtained through a freedom-of-information request lodged by this masthead, shows the tool correctly flags duplicate images at a rate of about 91 percent: promising, but short of the 98 percent threshold Finance set as its go-live standard. The University of Canberra's Centre for Creative and Cultural Research in Bruce has also been contracted to audit metadata standards across smaller ACT government agencies, where the duplication problem is, if anything, worse per record than at the federal level.
The ACT's own Digital Strategy, updated in February 2026, commits the territory government to eliminating redundant data assets by mid-2027, though the strategy document does not specify a dollar figure for compliance. Industry sources put the realistic remediation cost for ACT government holdings alone at somewhere between $8 million and $14 million, depending on how aggressively manual review is replaced by automation.
How that compares to Wellington, Ottawa and Edinburgh
Wellington is probably the clearest comparator. New Zealand's capital completed a whole-of-government deduplication program for Archives New Zealand in 2024 at a cost of NZ$11.2 million over three years, achieving a 97 percent clean-rate across 6.8 million records. The key difference: Wellington standardised its scanning metadata protocols in 2019, years before the bulk digitisation push began. Canberra did not, which is why agencies are now correcting metadata and chasing duplicates simultaneously.
Ottawa's Treasury Board Secretariat tackled a similar backlog through its 2021–2024 Digital Government Strategy, spending CA$34 million across fifteen departments. Results were uneven — the Canada Revenue Agency hit its targets ahead of schedule, while three smaller departments missed their 2024 deadlines entirely. The lesson Ottawa drew was that centralised tooling only works when agencies surrender control of their own scanning workflows, a political concession that proved harder than the technical fix. Canberra faces the same tension: Finance can build the tool, but it cannot force Home Affairs or the Australian Taxation Office to abandon their proprietary scanning pipelines without a Cabinet directive.
Edinburgh is the outlier worth watching. Scotland's national records body chose in 2023 to contract the deduplication work entirely to the private sector, awarding a five-year deal to a Glasgow-based data services firm. The contract cost £6.4 million. Eighteen months in, the Scottish Government published an independent review that rated delivery as adequate but flagged concerns about data sovereignty — a sensitivity that would make a comparable outsourcing model politically toxic for Canberra, given the national security classifications attached to much of what the Archives holds.
For public servants on the wrong end of this problem, trying to retrieve a 1998 ministerial brief and finding seventeen near-identical scans with conflicting metadata, the practical advice is to check the Archives' RecordSearch portal directly rather than relying on agency intranet links, which often point to unverified copies. Finance expects the ANU-developed tool to enter limited production by October 2026. A broader rollout, if the 98 percent threshold is met, is pencilled in for the first quarter of 2027, leaving the government roughly nine months to clear the backlog before the Digital Continuity Policy's December 2027 deadline falls due.