The Daily Canberra

Canberra's Digital Archives Have a Duplicate Image Problem — and Officials Are Finally Talking About It

Government agencies, ANU researchers and archivists say duplicate digital images are wasting storage, distorting records and costing taxpayers real money.

By Canberra News Desk · Published 5 July 2026, 4:35 am

How we reported this

This article was generated by AI from the linked public sources. The Daily Canberra is independently owned and covers Canberra news free from advertiser or sponsor influence.

Canberra's public sector is sitting on enormous repositories of duplicate digital images — redundant scans, copied photographs and replicated graphics spread across agency servers — and the people responsible for managing those collections say the problem has quietly grown into a significant administrative and financial burden.

The issue sits at the intersection of two forces pressing down on the federal government simultaneously: an aggressive push to digitise legacy paper records, and a chronic failure to establish consistent deduplication standards before those digitisation programs began. Agencies that started scanning in earnest between 2018 and 2022 now find themselves holding multiple identical or near-identical files with no automated system to reconcile them.

Why Archivists and IT Managers Are Raising the Alarm Now

The National Archives of Australia, based on Queen Victoria Terrace in Parkes, manages millions of digital files on behalf of the Commonwealth. The Archives has publicly acknowledged the challenge of managing exponential growth in born-digital and digitised records. Specialists in the records management community say duplicate image accumulation is a material contributor to that growth: it inflates storage costs, complicates retrieval and, in some cases, creates legal uncertainty about which version of a document constitutes the authoritative record.

At the Australian National University in Acton, researchers working on computational archiving and digital humanities have spent several years developing tools specifically designed to identify duplicate and near-duplicate images within large institutional collections. The work draws on hash-matching algorithms and perceptual similarity techniques. It is precisely the kind of applied research that federal agencies need but have historically been slow to procure or adopt at scale.
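To make the two techniques concrete, here is a minimal Python sketch. It is not the ANU researchers' tooling, which the sources do not describe in detail. The function names, file names and pixel grids are all hypothetical. Exact duplicates are found by comparing SHA-256 digests of the file bytes. Near-duplicates are compared with a simple "average hash", one common perceptual technique. Real tools would first shrink each image to a small greyscale grid, a step this sketch skips by starting from tiny grids directly.

```python
import hashlib
from collections import defaultdict


def exact_duplicate_groups(files):
    """Group file names whose bytes are identical, using SHA-256 digests."""
    by_digest = defaultdict(list)
    for name, data in files.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


def average_hash(pixels):
    """Perceptual 'average hash' of a small greyscale grid (rows of 0-255 ints).

    Each bit records whether a pixel is brighter than the grid's mean.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Hypothetical inputs, for illustration only.
files = {
    "scan_001.tif": b"page one bytes",
    "scan_001_copy.tif": b"page one bytes",
    "scan_002.tif": b"page two bytes",
}
print(exact_duplicate_groups(files))

original = [[10, 200], [220, 30]]
rescanned = [[14, 190], [215, 41]]  # same picture, slightly different pixel values
different = [[200, 10], [30, 220]]  # a different picture
print(hamming_distance(average_hash(original), average_hash(rescanned)))
print(hamming_distance(average_hash(original), average_hash(different)))
```

A cryptographic digest only catches byte-for-byte copies, so a rescan of the same page would slip past it. The perceptual hash tolerates small pixel differences, at the cost of needing a distance threshold that someone has to choose and defend.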

The University of Canberra's Faculty of Arts and Design, located on Kirinari Street in Bruce, has also been active in digital preservation questions, particularly as they relate to government photography collections and cultural heritage materials held by ACT institutions. Faculty members involved in that work have pointed to a gap between what the technology can now do and what agencies are actually deploying.

Several smaller ACT government bodies, including those managing property and land records within the territory's planning directorate, are grappling with the same issue at a local level, particularly as the territory's planning reform process has generated large volumes of newly scanned legacy documents since 2023.

What the Experts Say Should Happen Next

Records management professionals who work across the federal precinct broadly agree on three practical steps. First, agencies need a mandatory deduplication audit before any new digitisation tranche begins — not after. Second, procurement standards for digitisation contracts should require vendors to deliver deduplicated outputs as a baseline condition, not an optional extra. Third, the National Archives needs a formal policy instrument, not just guidance, that defines which file is the authoritative copy when duplicates exist.

The cost dimension is not trivial. Cloud storage for government data is priced on volume, and independent analysis of similar programs in comparable jurisdictions has suggested that duplicate image accumulation can inflate storage footprints by anywhere from 20 to 40 per cent in mature digitisation programs. Applied to a federal estate the size of Australia's, that represents a recurring annual expenditure measured in millions of dollars.
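The arithmetic behind that range can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the function name, the 1,200 TB estate and the $30 per terabyte per month price are placeholders, not figures from any agency or vendor. The sketch also assumes "inflated by 20 to 40 per cent" means the stored total equals the unique data multiplied by one plus that rate.

```python
def duplicate_cost_per_year(total_tb, inflation_rate, price_per_tb_month):
    """Annual storage spend attributable to duplicates.

    Assumes total_tb = unique_tb * (1 + inflation_rate), so duplicates make up
    inflation_rate / (1 + inflation_rate) of everything stored.
    """
    duplicate_tb = total_tb * inflation_rate / (1 + inflation_rate)
    return duplicate_tb * price_per_tb_month * 12


# Hypothetical estate size and price, chosen only to show the calculation.
TOTAL_TB = 1200
PRICE_PER_TB_MONTH = 30.0

for rate in (0.20, 0.40):
    cost = duplicate_cost_per_year(TOTAL_TB, rate, PRICE_PER_TB_MONTH)
    print(f"inflation {rate:.0%}: about ${cost:,.0f} a year spent storing duplicates")
```

Under this reading, a 20 per cent inflation means one sixth of what is stored is redundant, not one fifth, which is why the duplicate share is divided by one plus the rate.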

For the ACT government, the stakes are somewhat different but no less pressing. The territory's transition to the Planning Act 2023 framework from March 2023 created an immediate surge in document digitisation across the Environment, Planning and Sustainable Development Directorate. Staff handling those records have had to make day-to-day decisions about which scanned files to retain without clear central guidance.

The National Archives is understood to be working on updated digital continuity policy guidance, though no release date has been publicly confirmed. In the meantime, records managers across the Russell and Barton administrative precincts say the only practical workaround is manual review, which is time-consuming, expensive and unsustainable as digitisation volumes continue to climb. The conversation has started. The policy has not caught up.

About this article

Published by The Daily Canberra

Covering news in Canberra. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
