The Daily Canberra

Canberra's Digital Archives Race to Purge Duplicate Images — Here's What Changed This Week

A push to clean up redundant digital records across ACT government databases has accelerated, with local institutions adopting automated detection tools to cut storage costs and improve public access.

By Canberra News Desk · Published 5 July 2026, 4:43 am

Updated 5 July 2026, 12:17 pm

How we reported this

This article was generated by AI from the linked public sources. The Daily Canberra is independently owned and covers Canberra news free from advertiser or sponsor influence.

ACT government agencies and Canberra's two major universities moved this week to accelerate the removal of duplicate digital images from public-facing archives and internal records systems, responding to a growing storage bill and long-standing complaints from researchers who rely on clean datasets for policy and academic work.

The shift matters now because ACT digital infrastructure budgets are under pressure heading into the 2026–27 financial year, and duplicate image clutter has become a documented drag on systems used daily by public servants in Civic, Barton and across the broader federal precinct. Redundant files inflate cloud storage costs, slow retrieval times and, in some cases, have caused errors in public-records requests processed through the ACT Government's Access Canberra portal.

What Happened This Week

The Australian National University's Scholarly Information Services division, based on the Acton campus, confirmed it began deploying a perceptual hashing tool across its digitised photographic collection on Tuesday. The method compares compact fingerprints derived from each image's visual content rather than raw file sizes, so it catches near-identical duplicates, such as resized or recompressed copies, that older rule-based filters miss. The library's digitised holdings, which include historical Canberra town planning photographs dating to the early 1960s, are estimated to have accumulated tens of thousands of redundant files across multiple migration cycles over the past decade.
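To illustrate the general idea of perceptual hashing, here is a minimal sketch of an "average hash", one of the simplest variants. The sources do not say which algorithm or tool ANU is using, so this is not a description of its system. The function names, the 8×8 grid and the distance threshold are all illustrative assumptions, and images are represented as plain grids of grayscale values so the example needs no imaging library.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grayscale pixel values


def shrink(img: Gray, size: int = 8) -> List[List[float]]:
    """Downsample to size x size by averaging the pixels in each block.

    Assumes the image is at least `size` pixels wide and tall.
    """
    h, w = len(img), len(img[0])
    out = []
    for by in range(size):
        y0, y1 = by * h // size, (by + 1) * h // size
        row = []
        for bx in range(size):
            x0, x1 = bx * w // size, (bx + 1) * w // size
            block = [img[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def average_hash(img: Gray, size: int = 8) -> int:
    """Build a fingerprint with one bit per cell: 1 if the cell is brighter than the mean."""
    cells = [v for row in shrink(img, size) for v in row]
    mean = sum(cells) / len(cells)
    bits = 0
    for v in cells:
        bits = (bits << 1) | int(v > mean)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two fingerprints differ."""
    return bin(a ^ b).count("1")


# Hypothetical threshold: pairs at or below this distance are flagged as likely duplicates.
THRESHOLD = 5

# Synthetic test images (not real archive data).
original = [[x * 8 for x in range(32)] for _ in range(32)]   # left-to-right gradient
resized = [[x * 4 for x in range(64)] for _ in range(64)]    # same picture at double the size
unrelated = [[y * 8 for _ in range(32)] for y in range(32)]  # top-to-bottom gradient

if __name__ == "__main__":
    d_copy = hamming(average_hash(original), average_hash(resized))
    d_other = hamming(average_hash(original), average_hash(unrelated))
    print("original vs resized copy:", d_copy, "likely duplicate:", d_copy <= THRESHOLD)
    print("original vs unrelated:", d_other, "likely duplicate:", d_other <= THRESHOLD)
```

The resized copy has a different pixel count and would have a different file size, yet it produces a matching fingerprint, while the unrelated image does not. In practice the threshold is a trade-off: a looser value catches more near-duplicates but risks flagging distinct images, which is one reason flagged pairs are often reviewed before deletion.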

Separately, the ACT State Records office in Dickson notified internal government clients that a deduplication audit scheduled for late June had wrapped up ahead of schedule, with results distributed to relevant directorates on Wednesday. The audit covered image assets held in the Territory Records system, focusing on files ingested during the COVID-era digitisation push of 2020 and 2021, when speed of upload took priority over quality control.

At the University of Canberra's Bruce campus, the library's digital collections team said it had completed the first phase of a similar project in late June, removing redundant files from the institution's regional photographic archive — a collection used by urban planning students and ACT heritage researchers. A second phase, covering born-digital assets acquired since 2018, is scheduled to begin this month.

Why Storage Costs Are Driving Urgency

Cloud storage pricing has shifted the calculus for public institutions. Australian government bulk storage rates under the Digital Transformation Agency's whole-of-government cloud agreements have not insulated agencies from the volume problem: the more files they hold, the higher the bill, regardless of whether half of them are copies. Industry benchmarks published by the Australian Government Information Management Office suggest that duplicate and redundant data can account for between 20 and 40 percent of an organisation's total data volume, though individual agency figures vary widely.

For Canberra's public-sector-heavy economy, the downstream effects are practical. Public servants at agencies clustered around London Circuit and Constitution Avenue routinely access shared document management systems where image duplicates can cause versioning confusion and slow search returns. The ACT Government's Digital Strategy 2025–2028, published last year, nominates data quality as a priority action area, which has given directorates some budget cover to pursue deduplication projects that might previously have been deferred.

The timing also intersects with broader national conversations about data quality. The Australian Bureau of Statistics, headquartered on Benjamin Way in Belconnen, has published guidance this year encouraging government data custodians to treat deduplication as a prerequisite for AI-readiness, a framing that has resonated with agencies eager to position themselves for machine-learning applications.

For Canberra residents who use Access Canberra services or access ACT heritage collections online, the practical upshot should be faster load times and more accurate search results over coming months. Researchers at ANU and UC who query institutional repositories can expect cleaner result sets from image searches by the end of the third quarter of this year. Anyone who spots duplicate entries persisting in public-facing systems can report them through the Access Canberra feedback form or contact the relevant university library's digital collections desk. Both universities have said they are treating user reports as part of their ongoing quality assurance process.

About this article

Published by The Daily Canberra

Covering news in Canberra. This article was generated by AI from the linked sources and was not reviewed by a human editor before publishing. See our editorial standards.
