Federal agencies based in the ACT collectively manage tens of millions of digital image files — and a significant portion of them are exact or near-exact duplicates. That is the central finding driving a quiet but growing push inside Canberra's public sector to audit, deduplicate and replace redundant image assets before storage and licensing costs compound further.
The timing matters. With the federal government's digital investment framework under review and light rail Stage 2 construction squeezing ACT infrastructure budgets, every department looking to trim operational overhead is being pushed to examine its data estate. Image duplication — mundane as it sounds — has emerged as a measurable, fixable cost centre that sits largely unexamined inside communication, records management and web publishing teams.
What the Data Actually Shows
Industry benchmarks from digital asset management research published in recent years suggest that between 20 and 40 percent of images stored in large organisational repositories are redundant — either identical copies or visually near-identical variants created during editing workflows. For an agency maintaining 500,000 image files, that translates to potentially 100,000 to 200,000 files consuming storage, requiring backup cycles and, where stock photography is involved, generating ongoing licensing liability.
The Australian National University's digital collections team and the National Archives of Australia, based a few kilometres apart in the Acton and Parkes precincts respectively, operate image repositories that run well into the hundreds of thousands of assets. Neither institution has publicly disclosed a deduplication audit result, but the scale makes them obvious candidates for the kind of systematic review that comparable institutions overseas have undertaken.
Cloud storage is not free. Enterprise-grade cloud object storage, the kind used by federal agencies compliant with the Australian Signals Directorate's cloud security guidelines, typically runs at several cents per gigabyte per month at scale. At those rates, a repository carrying 10 terabytes of unnecessary duplicate image data can quietly accumulate thousands of dollars in annual storage costs, and more once backup copies and replicas are counted, before anyone flags it as a line item worth scrutinising.
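The arithmetic behind that kind of estimate is simple enough to sketch. The rate of $0.03 per gigabyte per month and the number of stored copies used here are illustrative assumptions, not quoted prices, and the function name is hypothetical. Raw storage for a single copy of 10 terabytes comes to a few thousand dollars a year at that rate, and the total scales linearly with every additional replica or backup.

```python
def annual_storage_cost(duplicate_tb, rate_per_gb_month, copies=1):
    """Estimate yearly spend on redundant data.

    duplicate_tb      -- terabytes of duplicate files (decimal TB, 1 TB = 1000 GB)
    rate_per_gb_month -- assumed storage price in dollars per GB per month
    copies            -- assumed number of stored copies (primary plus backups)
    """
    gigabytes = duplicate_tb * 1000
    return gigabytes * rate_per_gb_month * 12 * copies


# Illustrative figures only: 10 TB of duplicates at an assumed $0.03/GB/month.
single_copy = annual_storage_cost(10, 0.03)
with_backups = annual_storage_cost(10, 0.03, copies=3)

print(f"Single copy:  ${single_copy:,.0f} per year")
print(f"Three copies: ${with_backups:,.0f} per year")
```

Substituting an agency's own contracted rate and retention policy for the assumed values gives a first-pass figure to put in front of a finance team.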
Canberra's Specific Exposure
The ACT government's own digital communications teams, spread across offices in Civic, Dickson and the Hume administrative precincts, face the same structural problem. Communications units routinely receive images from multiple sources — ministerial photographers, contracted agencies, stock libraries and internal staff — and file them across shared drives, content management systems and social media scheduling tools with limited deduplication controls in place.
A practical complication: many image management platforms flag duplicates only when files are byte-for-byte identical. A photo resaved at a slightly different compression ratio, or cropped by a single pixel, will evade basic hash-matching checks. More sophisticated perceptual hashing tools — which compare the visual content of images rather than their file fingerprints — are available but require deliberate procurement and implementation decisions that smaller teams often defer.
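The difference between the two approaches can be shown with a minimal sketch. It uses a toy 8x8 greyscale grid in place of a real photograph and a simple "average hash", one of several perceptual hashing schemes; production tools first shrink and greyscale the image before hashing, and none of the function names below come from any particular platform. The "resaved" image simulates recompression by nudging pixel values slightly.

```python
import hashlib


def exact_fingerprint(pixels):
    """Byte-level fingerprint: any changed pixel value changes the digest."""
    raw = bytes(value for row in pixels for value in row)
    return hashlib.sha256(raw).hexdigest()


def average_hash(pixels):
    """Perceptual fingerprint: one bit per pixel, set if brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(bits_a, bits_b):
    """Number of positions where two bit lists differ."""
    return sum(a != b for a, b in zip(bits_a, bits_b))


# Toy 8x8 greyscale gradient standing in for a downscaled photo.
original = [[(row * 8 + col) * 4 for col in range(8)] for row in range(8)]

# Simulated resave: every pixel shifts by 0, 1 or 2 brightness levels.
resaved = [
    [value + ((r + c) % 3) for c, value in enumerate(row)]
    for r, row in enumerate(original)
]

same_bytes = exact_fingerprint(original) == exact_fingerprint(resaved)
distance = hamming_distance(average_hash(original), average_hash(resaved))

print("Exact hashes match:", same_bytes)
print("Perceptual hash distance (out of 64 bits):", distance)
```

The exact hashes disagree even though the two grids are visually indistinguishable, while the perceptual hashes differ in few or no bits. In practice a team would pick a distance threshold below which two images are treated as near-duplicates and sent for human review.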
The University of Canberra's Centre for Creative and Cultural Research has examined digital preservation practices in government contexts, and the broader literature points to a consistent pattern: duplication compounds fastest in the first three years after a new content management system is deployed, as staff migrate legacy files without systematic deduplication. ACT government departments that moved platforms between 2020 and 2023 would, on that pattern, now be sitting at peak redundancy exposure.
For public servants and communications managers looking at this problem practically, the first step is a storage audit using open-source tools such as dupeGuru or platform-native duplicate detection features in systems like SharePoint or Adobe Experience Manager. The second is establishing a clear image naming and versioning convention before new assets enter any repository. The third, and most consequential for agencies with licensing exposure, is cross-referencing stored stock imagery against active licence agreements, since retaining images beyond a licence term creates legal risk on top of storage cost.
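The licence cross-check in the third step can be sketched in a few lines, assuming an agency can export an asset list and a licence register. The field names (`path`, `licence_id`), the sample records and the function name are all hypothetical; real exports will look different.

```python
from datetime import date


def assets_without_active_licence(assets, licence_expiry, today):
    """Return paths of stock images whose licence is missing or has expired.

    assets         -- list of dicts with 'path' and optional 'licence_id'
    licence_expiry -- dict mapping licence_id to its expiry date
    today          -- date to check against
    """
    flagged = []
    for asset in assets:
        licence_id = asset.get("licence_id")
        if licence_id is None:
            continue  # not stock imagery, so no licence to check
        expiry = licence_expiry.get(licence_id)
        if expiry is None or expiry < today:
            flagged.append(asset["path"])
    return flagged


# Hypothetical sample data for illustration only.
assets = [
    {"path": "web/banner-lake.jpg", "licence_id": "LIC-001"},
    {"path": "web/banner-lake-copy.jpg", "licence_id": "LIC-001"},
    {"path": "events/minister-visit.jpg"},
    {"path": "reports/cover-skyline.jpg", "licence_id": "LIC-002"},
    {"path": "social/tile-tram.jpg", "licence_id": "LIC-404"},
]
licence_expiry = {
    "LIC-001": date(2023, 6, 30),
    "LIC-002": date(2030, 6, 30),
}

for path in assets_without_active_licence(assets, licence_expiry, date(2025, 1, 1)):
    print("Review:", path)
```

Duplicates matter here because one expired licence can be attached to several stored copies of the same image, each of which needs to be found and removed.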
Budget estimates for a mid-sized agency deduplication project, covering audit tooling, staff time and remediation, typically run between $15,000 and $60,000 depending on repository size. Ongoing savings in storage and licensing can repay that within two financial years. For a public sector under sustained pressure to do more with less, that arithmetic is becoming harder to ignore.