Friday, July 1st, 2022
Ensemble-based Data Assimilation for Large Scale Simulations
Prediction of chaotic and non-linear systems like weather or the groundwater cycle relies on a continuous fusion of sensor data (observations) with numerical models to decide on good system trajectories and to compensate for non-linear feedback effects. Ensemble-based data assimilation (DA) is a major method for this task. It relies on the propagation of an ensemble of perturbed model realizations (members) that is enriched by the integration of observation data. Performing DA at large scale to capture continental up to global geospatial effects, while running at high resolution to accurately predict impacts from small scales, is computationally demanding. This requires supercomputers leveraging hundreds of thousands of compute nodes, interconnected via high-speed networks. Efficiently scaling DA algorithms to such machines requires carefully designed, highly parallelized workflows that avoid overloading shared resources. Fault tolerance is important too, since the probability of hardware and numerical faults increases with the amount of resources and the number of ensemble members. Existing DA frameworks either use the file system as intermediate storage to provide a fault-tolerant and elastic workflow, which, at large scale, is slowed down by file system overload, or run large monolithic jobs that suffer from intrinsic load imbalance and are very sensitive to numerical and hardware faults. This thesis elaborates a highly parallel, load-balanced, elastic, and fault-tolerant solution that efficiently runs statistical, ensemble-based DA at large scale. We investigate two classes of DA algorithms, the ensemble Kalman filter (EnKF) and the particle filter with sequential importance resampling (SIR), and validate our framework under realistic conditions. Groundwater sensor data is assimilated using a regional hydrological simulation leveraging the ParFlow model. We efficiently run EnKF with up to 16,384 members on 16,240 compute cores for this purpose.
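To fix ideas, the EnKF analysis step mentioned above can be sketched as follows. This is a minimal, generic stochastic EnKF with perturbed observations in NumPy, not the thesis framework's implementation; all function and variable names here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def enkf_analysis(ensemble, obs, H, obs_cov):
    """Stochastic EnKF analysis step with perturbed observations.

    ensemble: (N, n) array of N member state vectors
    obs:      (m,) observation vector
    H:        (m, n) linear observation operator
    obs_cov:  (m, m) observation error covariance R
    """
    N = ensemble.shape[0]
    # Ensemble anomalies around the ensemble mean
    mean = ensemble.mean(axis=0)
    A = ensemble - mean                           # (N, n)
    # Sample covariances estimated from the ensemble
    HA = A @ H.T                                  # anomalies in obs space, (N, m)
    P_xy = A.T @ HA / (N - 1)                     # state/obs cross covariance, (n, m)
    P_yy = HA.T @ HA / (N - 1) + obs_cov          # innovation covariance, (m, m)
    K = np.linalg.solve(P_yy, P_xy.T).T           # Kalman gain, (n, m)
    # Perturb observations per member (stochastic EnKF) and update
    obs_pert = rng.multivariate_normal(obs, obs_cov, size=N)   # (N, m)
    innovations = obs_pert - ensemble @ H.T       # (N, m)
    return ensemble + innovations @ K.T

# Toy usage: 64 members, 3-dimensional state, observing the first component
ens = rng.normal(size=(64, 3)) + np.array([1.0, 0.0, -1.0])
H = np.array([[1.0, 0.0, 0.0]])
analysis = enkf_analysis(ens, np.array([2.0]), H, np.eye(1) * 0.1)
```

The update pulls the ensemble mean of the observed component toward the observation, weighted by the relative confidence in prior and observation; in a production setting this step is where the observation data integration of each DA cycle happens.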
A comparison with an existing state-of-the-art solution on the same domain, running 2,500 members on 20,000 cores, shows that our approach is about 50 % faster. We also present performance improvements running the particle filter with SIR at large scale. These experiments assimilate cloud coverage observations into 2,555 members, i.e., particles, running the Weather Research and Forecasting (WRF) model over the European domain. To manage the many experiments performed on various supercomputers, we developed a specific setup that we also present.
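The resampling stage of SIR can likewise be illustrated with a standard systematic resampling scheme: particles that explain the observations well (high weight) are duplicated, poor ones are dropped. This is a generic textbook sketch with illustrative names, not the framework's code:

```python
import numpy as np

def systematic_resample(weights, rng):
    """Systematic resampling: return N particle indices drawn
    proportionally to the normalized weights, using a single
    random offset and N evenly spaced positions."""
    N = len(weights)
    positions = (rng.random() + np.arange(N)) / N
    cumulative = np.cumsum(weights)
    cumulative[-1] = 1.0  # guard against floating-point rounding
    return np.searchsorted(cumulative, positions)

rng = np.random.default_rng(1)
# Toy normalized weights: particle 2 explains the observation far better
w = np.array([0.05, 0.05, 0.8, 0.1])
idx = systematic_resample(w, rng)
```

After resampling, all particles carry equal weight again and propagation continues; at scale, the duplicated and dropped members are what make load balancing and elasticity relevant for SIR runs.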
Updated on 7 June 2022