Canberra's public sector has a storage problem it rarely talks about publicly. Across ACT government departments, the Australian National University, and the University of Canberra, digital asset libraries have grown bloated with duplicate images — the same photograph stored two, three, sometimes a dozen times under different file names, in different folders, on different servers. The duplication isn't trivial. Analysis of comparable federal agency workflows in Australia has found that duplicate image files can account for between 20 and 40 percent of total digital storage consumption in organisations that lack automated deduplication protocols.
That range matters in a city where the federal and territory public service is the dominant employer and the dominant generator of digital content. With the ACT government's Digital Strategy 2025–2028 pushing agencies toward cloud-first infrastructure, the cost of redundant data is now measured in real dollars rather than dusty hard drives in a server room on Northbourne Avenue.
What Duplication Actually Costs
Cloud storage is not free. Enterprise-grade cloud contracts for mid-sized government agencies in Australia, the kind used by ACT Health or the ACT Education Directorate, typically price a terabyte of actively managed storage well above consumer-grade alternatives once licensing, security compliance, and redundancy tiers are factored in. When a communications team at a Civic-based agency uploads the same hero image from a ministerial announcement 15 times across a six-month campaign cycle, the library holds 14 redundant copies, and that overhead accumulates invisibly in budget line items labelled simply as "ICT infrastructure."
The University of Canberra's Bruce campus and ANU's Acton precinct both run substantial digital asset management systems for research publications, marketing, and internal records. In higher education broadly, a 2023 industry report from EDUCAUSE — a US-based nonprofit that tracks technology in higher education — found that duplicate and near-duplicate files represented roughly 30 percent of institutional image library volume at universities with more than 10,000 students. Neither UC nor ANU has publicly released its own deduplication audit figures, but both institutions are large enough that the EDUCAUSE benchmark offers a plausible order of magnitude.
Storage cost is not the only reason the issue is getting renewed attention. The other is the growing use of AI-powered content management tools across the ACT public sector. When duplicate images sit in a library, machine-learning cataloguing systems, increasingly used to tag and retrieve assets, train on redundant data, which skews metadata outputs and slows retrieval. An image of the new Woden Town Centre development stored 18 times under variant file names, for instance, tells a cataloguing algorithm very little about what is actually new in the collection.
What Agencies and Institutions Can Do Now
Deduplication is not a new concept, but uptake has been inconsistent. The Australian Signals Directorate's Information Security Manual, last updated in its 2026 edition, includes guidance on data minimisation that implicitly covers redundant file stores, though it stops short of mandating specific deduplication schedules for non-sensitive imagery. Several ACT-based agencies have begun trialling perceptual hashing tools — software that generates a fingerprint for each image and flags near-identical files regardless of file name or format — as part of broader data governance reviews.
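For readers curious what such a fingerprint looks like in practice, the following is a minimal sketch of one common perceptual hashing approach, the "average hash". It is an illustration only, not the method used by any particular agency or product. It assumes images have already been decoded into grids of grayscale pixel values (a real tool would do that with an imaging library), and the function names and the threshold of 5 differing bits are hypothetical choices made for the example.

```python
from typing import List

Grid = List[List[int]]  # grayscale pixels (0-255), one inner list per row


def shrink(pixels: Grid, size: int = 8) -> Grid:
    """Downsample to size x size by averaging rectangular blocks of pixels."""
    rows, cols = len(pixels), len(pixels[0])
    out = []
    for r in range(size):
        r0 = r * rows // size
        r1 = max((r + 1) * rows // size, r0 + 1)  # always at least one row
        row = []
        for c in range(size):
            c0 = c * cols // size
            c1 = max((c + 1) * cols // size, c0 + 1)  # at least one column
            block = [pixels[y][x] for y in range(r0, r1) for x in range(c0, c1)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def average_hash(pixels: Grid, size: int = 8) -> int:
    """Return a size*size-bit fingerprint: 1 where a cell is brighter than the mean."""
    flat = [v for row in shrink(pixels, size) for v in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for v in flat:
        bits = (bits << 1) | (1 if v > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def is_near_duplicate(a: Grid, b: Grid, threshold: int = 5) -> bool:
    """Flag two images as near-identical if few fingerprint bits differ.

    The threshold is an illustrative assumption and would need tuning
    against a real library.
    """
    return hamming(average_hash(a), average_hash(b)) <= threshold
```

Because the fingerprint is computed from pixel content rather than from the file's bytes or name, a resized or slightly brightened copy of a photograph tends to produce the same or nearly the same bits, while an unrelated image differs in many of them.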
The Gungahlin and Belconnen community hubs, both of which house ACT government service delivery offices with active social media and community communications teams, represent exactly the kind of operational environment where deduplication workflows would show rapid returns. A single communications officer managing image libraries for multiple suburbs, uploading event photography from the Gungahlin town centre market or the Belconnen Arts Centre, can inadvertently create hundreds of duplicate files in a single financial year without a systematic check in place.
For institutions starting from scratch, the recommended first step is a baseline audit: running a perceptual hash scan across the full image library to produce a duplication rate, the share of files that duplicate or nearly duplicate another file. That number, whatever it turns out to be, becomes the benchmark. From there, automated ingestion rules that reject exact duplicates at the point of upload cost relatively little to implement and stop the problem from compounding. Duplication, in other words, is easier to stop than to reverse. Canberra's agencies would do well to start counting before the bill gets any larger.
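As a rough illustration of the two steps described above, the sketch below counts a baseline duplication rate and then rejects repeat uploads. It makes a simplifying assumption: it compares SHA-256 digests of file contents, so it catches only byte-identical copies and will understate the rate a perceptual scan would report. The names `duplication_rate` and `IngestGate` are hypothetical, and files are represented as in-memory (name, bytes) pairs rather than read from any particular storage system.

```python
import hashlib
from typing import Dict, Iterable, Tuple


def content_digest(data: bytes) -> str:
    """Fingerprint a file by its bytes, ignoring its name and location."""
    return hashlib.sha256(data).hexdigest()


def duplication_rate(files: Iterable[Tuple[str, bytes]]) -> float:
    """Return the share of files whose bytes match an earlier file in the scan."""
    seen = set()
    total = 0
    duplicates = 0
    for _name, data in files:
        total += 1
        digest = content_digest(data)
        if digest in seen:
            duplicates += 1
        else:
            seen.add(digest)
    return duplicates / total if total else 0.0


class IngestGate:
    """Rejects an upload when identical bytes are already in the library."""

    def __init__(self) -> None:
        self._by_digest: Dict[str, str] = {}  # digest -> name of first copy

    def submit(self, name: str, data: bytes) -> Tuple[bool, str]:
        """Return (accepted, name); on rejection, name is the existing copy."""
        digest = content_digest(data)
        existing = self._by_digest.get(digest)
        if existing is not None:
            return False, existing
        self._by_digest[digest] = name
        return True, name


# Example: three of these four files share the same bytes.
library = [
    ("hero.jpg", b"AAA"),
    ("market.jpg", b"BBB"),
    ("hero_copy.jpg", b"AAA"),
    ("hero_final.jpg", b"AAA"),
]
print(f"Baseline duplication rate: {duplication_rate(library):.0%}")
```

Returning the name of the existing copy on rejection lets an upload form point the user to the file already in the library instead of simply refusing the upload.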