Met Office asks Cloudreach to purposefully break its systems

The planned 'Chaos Day' training session was designed to expose gaps in developer knowledge

Cloudreach has revealed how it was recruited to destroy critical parts of the Met Office's cloud application infrastructure to help the organisation work out how well equipped its CloudOps team is to investigate and fix problems if they occur.

For the exercise, dubbed "Cloud Chaos Day", Cloudreach was invited to break the Met Office's systems to help expose skills gaps and determine whether the current team could be trained to cover the missing areas of expertise or whether new resources were needed. It would also reveal whether any documentation was required to help the CloudOps team respond to a failure, Cloudreach explained.

“The adoption of public cloud technology for operational service delivery is a big step for any organisation, introducing new practices that in some cases are quite different from those used for traditional on-premise delivery,” Richard Bevan, head of operational technology at the Met Office, said.

“The ‘cloud chaos day’ has enabled us to test our operating procedures in a safe environment, giving the Met Office the confidence that we are suitably prepared for the launch of our new services.”

Rather than automate the process, as other businesses including Netflix have done, Cloudreach thought an iterative model would fit better with its Agile development processes.

It spoke to the Met Office to work out what it could 'break' without a knock-on impact on production, using AWS CloudFormation and Sceptre, the tool it has developed to manage CloudFormation environments.

“This marked a milestone for the Met Office CloudOps team, a practical validation of the development journey undertaken so far,” Jon Sams, cloud operations lead at the Met Office, added.

“The CloudOps team consists of senior engineers from a number of traditional on-premises operational support teams. Joining in partnership to support agile development teams to exploit the best of public cloud, the hard work and open relationships between Software Development, CloudOps & Cloudreach was demonstrated practically in the Chaos Day.”

Cloudreach said the day was a success, with each team member approaching the challenges differently and ending the session with a stack of notes about missing documentation and skills. These notes will help the Met Office decide where it needs to invest in future.
