For the last 18 months I’ve been contracting to the Department for Education (DfE) as a Senior DevOps Engineer. This post demonstrates the work I’ve been contributing and my experience using Infrastructure-as-Code (IaC) with Azure.
This post was originally featured on the dxw blog.
Identifying areas of improvement
When we started out on this project, DfE was juggling infrastructure deployed on both the GDS Government Platform as a Service (GPaaS) and an ageing Azure tenancy that had been left undocumented and poorly configured. At the time, each service was working in its own silo, with its own deployment strategy and infrastructure configuration.
Looking at the existing infrastructure stacks, we started noticing similarities in architecture patterns. Most of the apps were written in .NET and launched either as Web App Services in Azure, or as containers in GPaaS. Some of them used Redis, and others depended on SQL Server.
If all the services were using the same patterns for architecture topology, we could save a lot of time by re-using the same Infrastructure-as-Code. It would also simplify the approach, meaning less documentation would be needed, and training could be provided to DfE engineers.
Introducing Terraform
We started by writing Terraform that covered the basic resources that the services needed. Knowing that some of the services were running in containers meant that we could use the newly available Azure Container Apps product from Microsoft. This is a serverless, scalable platform that would enable us to quickly and efficiently deploy the services into Azure. Containerising the other .NET apps was fairly straightforward, and we were quickly able to come up with a template Dockerfile that the other services could use.
At the time, Terraform didn’t have much traction in DfE. Only a few engineers were using it; others were using Bicep, and most were not using any sort of Infrastructure-as-Code. At dxw our Technical Operations team are experts with Terraform, so we deemed it the most suitable option, not only for us to develop with, but also to seed further adoption within DfE.
After some time, our initial Terraform could deploy a Container Registry, the Container Apps and, optionally, a SQL Server and Redis.
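Optional components such as Redis are commonly toggled in Terraform with the `count` meta-argument. The following is a minimal sketch of that pattern, not the code from the published module: the variable names (`resource_prefix`, `location`, `enable_redis`), the resource group and the SKU choices are all assumptions made for illustration.

```hcl
terraform {
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
  }
}

provider "azurerm" {
  features {}
}

# Hypothetical inputs; the real module's variables may differ.
variable "resource_prefix" {
  type        = string
  description = "Short alphanumeric prefix used to name resources"
}

variable "location" {
  type    = string
  default = "uksouth"
}

variable "enable_redis" {
  type        = bool
  default     = false
  description = "Set to true to deploy a Redis cache alongside the service"
}

resource "azurerm_resource_group" "default" {
  name     = "${var.resource_prefix}-rg"
  location = var.location
}

# Always deployed: the registry that holds the service's container images.
# Registry names must be alphanumeric, hence no separator after the prefix.
resource "azurerm_container_registry" "default" {
  name                = "${var.resource_prefix}acr"
  resource_group_name = azurerm_resource_group.default.name
  location            = azurerm_resource_group.default.location
  sku                 = "Basic"
}

# Optional: created only when enable_redis is true (count is 1 or 0).
resource "azurerm_redis_cache" "default" {
  count = var.enable_redis ? 1 : 0

  name                = "${var.resource_prefix}-redis"
  resource_group_name = azurerm_resource_group.default.name
  location            = azurerm_resource_group.default.location
  capacity            = 0
  family              = "C"
  sku_name            = "Basic"
}
```

A service that doesn’t need Redis leaves `enable_redis` at its default, and the same configuration then deploys without the cache.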
At this point we were happy that we had a strong foundation that we could re-use across all the other services within RSD. So we decided to convert our work into a reusable module and published it on GitHub for other engineers to use.
Iterations and improvements
Inspired by the work we were contributing to DfE, we reflected on the Terraform we had been using across the internal hosting platform at dxw. This prompted us to revisit some of our older code, rewriting and refactoring it into a series of modules to improve the interoperability of our infrastructure-as-code. We’ve also opted to publish a draft of our own Terraform Playbook and Terraform Module template that we continue to develop.
We bundled monitoring, alerting, diagnostic and application logging to provide more comprehensive visibility across the infrastructure out-of-the-box. We made sure to follow Microsoft-recommended best practices where available, and relied on our own experience where we needed to.
Aligning to a single set of standard deliverables meant that we could:
- foster consistency across all of the services
- make rapid iterations
- improve overall efficiency for the wider programme
This standardisation of infrastructure brought a lot of value to DfE and was a strong start in normalising the approach to Azure across the department. Having the Terraform module meant that we could make iterative changes to the configuration of infrastructure and propagate them quickly across all services.
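One common way to propagate module changes in a controlled manner is for each service to pin the module to a release tag and bump that tag when it is ready to upgrade. The sketch below illustrates the idea only: the source URL, the tag and every input name are hypothetical placeholders, not the real module’s interface.

```hcl
module "service_hosting" {
  # Hypothetical source and tag; substitute the real module repository and release.
  source = "git::https://github.com/example-org/terraform-azurerm-example-hosting.git?ref=v1.2.0"

  # Hypothetical inputs, shown only to illustrate per-service configuration.
  environment  = "dev"
  project_name = "my-service"
  enable_redis = true
}
```

With this layout, picking up a new release of the shared module is a one-line change to `ref`, followed by `terraform init -upgrade` and a reviewed `terraform plan`.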
A security-first approach
We had a strong focus on network security within Azure when building the module, so that implementers would not need to take any further steps to make their infrastructure secure. Web Application Firewall (WAF), Azure Front Door CDN, Network Security Groups, Network Firewalls and Defender for Cloud were all great additions to include in the module.
Defender for Cloud is a cloud security posture management (CSPM) tool that was already established within the Cloud Platform team at DfE, so having the ability to enrol from the module was a sensible choice.
Getting noticed
Other engineers within DfE soon became aware of the published Terraform module. Adoption of Terraform across other programmes grew and dxw were proud to be the first to have established a ‘best-practice’ approach for others to follow. We established an unofficial Terraform support channel within the Slack Organisation to help break down silos and promote communication across engineering teams.
Over the next few months, we continued to develop the Terraform module. We added new features, tweaked configurations and tightened up security. We listened to the DfE community and worked with engineers from a number of other teams to implement extra infrastructure in the module, such as PostgreSQL Server and custom sidecar containers (for example, ClamAV).
In recent months, the module has achieved a high level of maturity. We’ve received lots of positive feedback detailing how much time and effort other programmes have saved by picking up the Terraform module, in some cases cutting their time to go live in half. We continue to be led by the existing Cloud Platform, Networking and Infrastructure Operations teams within DfE to ensure that our infrastructure patterns align with DfE’s wider governance.
Where are things now?
A year later, we’re proud to have supported DfE through this adoption phase, and to have published a number of Terraform modules that have been adopted across DfE.
These modules cover a number of other use cases:
- deploying more traditional Web App Services
- ingesting alerts and routing them to Slack using a Logic App Workflow
- escrowing Terraform Variable files into Azure Key Vault
And we’ve launched a Web Application Firewall for Application Gateway.
If you’re curious about what other Terraform modules we have been working on for our own hosting platform, you can check out our GitHub.