We present a technique to automatically minimise re-computation when a data analysis program is iteratively changed or extended, as is often the case in exploratory data analysis in astronomy.
A typical example is flagging and calibration of demanding or unusual observations where visual inspection suggests improvement to the processing strategy. The technique is based on memoization and referentially transparent tasks. We describe the implementation of this technique for the CASA radio astronomy data reduction package.
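As a rough illustration of memoizing referentially transparent tasks, the sketch below keys each task result by a hash of the task name, its parameters and the keys of its inputs, so that a task is re-run only when something upstream of it has changed. It is not the CASA implementation: the names `run`, `task_key`, `flag`, `calibrate` and `pipeline` are hypothetical, and the cache is held in memory where a real system would store data products on disk.

```python
import hashlib
import json

_cache = {}   # hypothetical in-memory store: key -> task result
calls = []    # names of the tasks that were actually executed


def task_key(name, inputs, params):
    """Hash the task name, the keys of its inputs and its parameters."""
    payload = json.dumps([name, inputs, params], sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def run(name, func, inputs=(), **params):
    """Execute func only if this (task, inputs, parameters) is new.

    Returns the cache key, which stands in for the data product and
    can be passed on as an input to later tasks.
    """
    key = task_key(name, list(inputs), params)
    if key not in _cache:
        calls.append(name)
        _cache[key] = func(*[_cache[k] for k in inputs], **params)
    return key


# Toy stand-ins for flagging and calibration (assumptions, not CASA tasks).
def flag(threshold):
    data = [1.0, 2.0, 50.0, 3.0]
    return [x for x in data if x < threshold]


def calibrate(data, gain):
    return [gain * x for x in data]


def pipeline(gain):
    flagged = run("flag", flag, threshold=10.0)
    calibrated = run("calibrate", calibrate, inputs=[flagged], gain=gain)
    return _cache[calibrated]


first = pipeline(gain=2.0)    # runs both tasks
second = pipeline(gain=3.0)   # flagging is reused; only calibration re-runs
print(calls)
```

Because the key of `calibrate` includes the key of its input, changing the flagging threshold would also invalidate the calibration result, while changing only the gain leaves the flagging result reusable.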
We also propose a technique for optimising the storage efficiency of memoized intermediate data products using copy-on-write and block-level de-duplication, and measure their practical efficiency. We find the minimal recomputation technique improves the efficiency of data analysis while reducing the possibility of user error and improving the reproducibility of the final result. It also aids exploratory data analysis on batch-scheduled cluster computer systems.
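The idea behind block-level de-duplication can be sketched with a small content-addressed store: each data product is split into fixed-size blocks, and a block is stored only once regardless of how many products contain it. This is an in-memory illustration under assumed names (`BlockStore`, `BLOCK_SIZE`) and an assumed 4096-byte block size; in practice this work would normally be done by the storage layer rather than by application code.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen for illustration only


class BlockStore:
    """Toy content-addressed store that keeps each distinct block once."""

    def __init__(self):
        self.blocks = {}  # digest -> block bytes
        self.files = {}   # product name -> ordered list of digests

    def put(self, name, data):
        digests = []
        for start in range(0, len(data), BLOCK_SIZE):
            block = data[start:start + BLOCK_SIZE]
            digest = hashlib.sha256(block).digest()
            self.blocks.setdefault(digest, block)  # store only if unseen
            digests.append(digest)
        self.files[name] = digests

    def get(self, name):
        return b"".join(self.blocks[d] for d in self.files[name])

    def logical_size(self):
        """Total size of all products as the user sees them."""
        return sum(len(self.blocks[d]) for ds in self.files.values() for d in ds)

    def physical_size(self):
        """Size actually stored after de-duplication."""
        return sum(len(b) for b in self.blocks.values())


# Two intermediate products that differ in a single block.
original = b"".join(bytes([i]) * BLOCK_SIZE for i in range(8))
modified = (original[:3 * BLOCK_SIZE]
            + bytes([255]) * BLOCK_SIZE
            + original[4 * BLOCK_SIZE:])

store = BlockStore()
store.put("original", original)
store.put("modified", modified)
print(store.logical_size(), store.physical_size())
```

Here the second product adds only one new block to the store, which is the situation that arises when a processing step changes a small part of a large data set.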
Recipe is described in "Minimal Re-computation for Exploratory Data Analysis in Astronomy", Nikolic, Small and Kettenis, proceedings of ADASS 2017, arXiv:1711.06124, and in "Minimal Re-computation for Exploratory Data Analysis in Astronomy", Nikolic, Small and Kettenis, Astronomy and Computing, 2018A&C….25..133N, arXiv:1809.01945.