Today I did some more cleanup work on the /opt/rtcds file system following yesterday's file-system-full errors.
We perform hourly zfs snapshots on this file system, and zfs-sync them to the backup machine h1fs1 at the same rate. h1fs0 had hourly snapshots going back to May 2016.
Yesterday I deleted all of the May snapshots and thinned June down to one per day. Today we decided that, since all the files are backed up to tape, we can delete all snapshots older than 30 days. This ensures that the disk space allocated to a deleted file is recovered once the last snapshot referencing it is destroyed, which happens at most 30 days after the deletion. I destroyed all snapshots up to 26th September 2016.
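As a rough illustration of the 30-day retention rule, here is a minimal Python sketch that picks out which snapshots fall outside the window. The snapshot names, the pool name and the "now" date are hypothetical, chosen only so that the cutoff lands on 26th September 2016; this is not the procedure actually used on h1fs0. In practice the name and creation-time pairs could come from something like `zfs list -H -p -t snapshot -o name,creation`.

```python
from datetime import datetime, timedelta


def snapshots_to_destroy(snapshots, now, keep_days=30):
    """Return the names of snapshots created more than keep_days before now.

    snapshots is an iterable of (name, creation_datetime) pairs.
    """
    cutoff = now - timedelta(days=keep_days)
    return [name for name, created in snapshots if created < cutoff]


# Hypothetical snapshot names and creation times (not the real h1fs0 names).
snapshots = [
    ("pool/rtcds@2016-09-20-1200", datetime(2016, 9, 20, 12)),
    ("pool/rtcds@2016-09-27-1200", datetime(2016, 9, 27, 12)),
    ("pool/rtcds@2016-10-20-1200", datetime(2016, 10, 20, 12)),
]

# Assumed "today", 30 days after the 26th September cutoff.
now = datetime(2016, 10, 26, 12)

old = snapshots_to_destroy(snapshots, now)
for name in old:
    # Only print the candidates; actually destroying them is left to the operator.
    print("would destroy:", name)
```

The sketch deliberately only lists candidates rather than destroying anything, so the list can be reviewed before any snapshot is removed.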
After the snapshot cleanup, the 928G file system is using 728G (157G used by snapshots and 571G used by the file system itself). This is a usage of 78%, which matches what df reports.
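The usage figures above can be cross-checked with a few lines of Python. The variable names are just illustrative; the numbers are the ones quoted in this entry.

```python
total_g = 928  # size of the file system, in G
snap_g = 157   # space held by snapshots, in G
live_g = 571   # space used by the file system itself, in G

used_g = snap_g + live_g
pct = 100 * used_g / total_g

print(f"{used_g}G used of {total_g}G = {pct:.0f}%")
```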