Donating spare computing capacity to help with COVID-19
As COVID-19 has reached pandemic proportions, we have been exploring ways to leverage our resources to contribute to research for lifesaving therapies.
We found we could donate idle compute time to Folding@home, a computing project that simulates protein dynamics, including the process of protein folding associated with a variety of diseases.
With the rise of COVID-19, Folding@home has become the largest distributed supercomputer network in the world. Volunteers run Folding@home's simulations on their personal computers, and the results could lead to new drug therapies.
We’re dedicating all our spare stress testing capacity to Folding@home. It’s a small contribution to this global effort, and we hope it helps find a cure.
During the first three months of 2020, we processed more than 25 million hours of work. If all of our clients agreed to donate their idle resources, we would be able to provide Folding@home with 5 million hours of computing resources to run more simulations as scientists look for COVID-19 therapies.
In 2012, Milliman became one of the first companies to operate a hyperscale global distributed computing service in the cloud, which runs on Microsoft Azure. It’s typical for us to run actuarial models on over 100,000 cores daily for our clients’ risk management needs.
Actuaries use mathematical models to calculate insurance risks. Actuarial models are proxies for the real world that deal with uncertainty and risk—which we have much of today. The models require an immense amount of computational power in the same way protein folding analyses do.
Stress testing our platform
Stress testing the modelling platform is complex, and we spend in excess of $1 million annually to ensure our software meets the demands of operating at that scale. Part of this stress test is to evaluate the impact of running computationally intensive workloads on our infrastructure. Since Folding@home workloads and actuarial workloads share very similar computing characteristics, Folding@home is a natural fit for our scale-testing needs.
The team decided that, during stress testing runs, the virtual machines could be loaded with Folding@home, which would run in the idle time between stress jobs.
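As a rough illustration of the idea, and not our platform's actual code, the Python sketch below shows a worker that always gives priority to its own jobs and only runs a slice of background work when no job is waiting. The names `run_worker` and `background_step`, and the use of `None` as a shutdown signal, are hypothetical. A real deployment would start and pause the Folding@home client rather than call a Python function.

```python
import queue


def run_worker(jobs, background_step):
    """Run primary jobs from `jobs`; when the queue is empty, run one short
    slice of background work instead of sitting idle.

    A `None` item is a (hypothetical) shutdown signal.
    Returns (primary_jobs_completed, background_slices_run).
    """
    completed = 0
    background_slices = 0
    while True:
        try:
            job = jobs.get_nowait()
        except queue.Empty:
            # No primary work is waiting, so donate this slice of idle time.
            background_step()
            background_slices += 1
            continue
        if job is None:
            return completed, background_slices
        job()  # primary work always takes priority over background work
        completed += 1


if __name__ == "__main__":
    jobs = queue.Queue()
    jobs.put(lambda: print("stress job finished"))
    idle_slices = []

    def donate_slice():
        # Stand-in for a short burst of donated computation.
        idle_slices.append(1)
        if len(idle_slices) == 3:  # pretend the run ends after three idle slices
            jobs.put(None)

    print(run_worker(jobs, donate_slice))
```

Keeping each background slice short is the key design choice in this sketch: the worker checks for new primary work between slices, so donated computation never delays a waiting job by more than one slice.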
The benefit of cloud computing is that you rent the resources you need—as little or as much as you require—a property referred to as elasticity. A challenge with elasticity is that it takes time to expand and contract. We optimise our workloads to maximise utilisation, and a big part of this is limiting how often we expand and contract.
Typically, our utilisation rates are in excess of 80%, which we are quite proud of; the remainder, up to roughly 20%, is the overhead of elasticity. Another overhead is data movement, the time it takes to move information around the cloud resources. We have optimised our platform to reduce this overhead, which accounts for 4% of our modelling cost.
To put these numbers in perspective, we typically consume around 1,800 core years (the equivalent of leaving your home computer turned on for 1,800 years) each quarter. The overhead of elasticity is therefore roughly 360 core years (20%) and the data movement overhead is 72 core years (4%).
This adds up to 432 core years of overhead each quarter, some of which we could put to other uses, although it is hard to know exactly how much is usable because much of this overhead sits within Azure. The team saw that we could use that overhead for good, so they adapted the platform to run Folding@home in the background while it was waiting for new work or moving data.
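The overhead figures above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the 20% elasticity share is an assumption, taken as the complement of the 80% utilisation figure, and the variable names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored

quarterly_core_years = 1_800   # core years consumed each quarter
elasticity_percent = 20        # assumed: the complement of ~80% utilisation
data_movement_percent = 4      # data movement share of modelling cost


def overhead_core_years(total_core_years, percent):
    """Core years lost to an overhead that takes `percent` of the total."""
    return total_core_years * percent / 100


elasticity_overhead = overhead_core_years(quarterly_core_years, elasticity_percent)
data_movement_overhead = overhead_core_years(quarterly_core_years, data_movement_percent)
total_overhead = elasticity_overhead + data_movement_overhead
total_overhead_core_hours = total_overhead * HOURS_PER_YEAR

print(f"Elasticity overhead:    {elasticity_overhead:,.0f} core years")
print(f"Data movement overhead: {data_movement_overhead:,.0f} core years")
print(f"Total overhead:         {total_overhead:,.0f} core years")
print(f"Total overhead:         {total_overhead_core_hours:,.0f} core hours")
```

Because utilisation is typically above 80%, the elasticity figure is best read as an upper estimate, and only part of the total is actually usable.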
Leveraging idle time
Since the COVID-19 crisis began, actuaries have been doing a considerable amount of extra analysis to assess its impact on the insurance industry, which presents an opportunity to contribute idle computing time towards finding lifesaving therapies. We're offering our clients the opportunity to opt in to allow Folding@home to run in the background when they're running their models.
For more information, contact Tom Peplow.
Learn more about Folding@home at https://foldingathome.org.