The Use of Infrastructure as Code in Regulated Companies
IT infrastructure has traditionally been provisioned using a combination of scripts and manual processes. This manual approach was slow and introduced the risk of human error, resulting in inconsistency between environments or even leaving the infrastructure in an unqualified state. In this article, we investigate some fundamental advantages of using Infrastructure as Code (IaC) for provisioning IT infrastructure.
Historically, scripts were stored in version control systems or documented step-by-step in installation guides. Often, the person writing the installation guide was not the same person following it or executing the scripts. The cloud introduced IaC as a provisioning method. IaC is a means of provisioning and deploying infrastructure using development and operations (DevOps) processes. In combination with version control and automation, IaC enhances quality as it relates to compliance and operational stability.
As stated in ISPE GAMP® 5 Guide: A Risk-Based Approach to Compliant GxP Computerized Systems (Second Edition), “IaC enables organizations to automate the provisioning of infrastructure, reducing the risk of human errors. Infrastructure code is subject to configuration management ensuring that all code changes are traceable. Infrastructure code development is subject to risk-based software development practices that ensure code is developed in accordance with a life cycle approach including verification prior to deployment.”
What is IaC?
According to the National Institute of Standards and Technology (NIST), IaC is “the process of managing and provisioning an organization’s IT infrastructure using machine-readable configuration files, rather than employing physical hardware configuration or interactive configuration tools.”
Templates or Code for Implementing IaC
There are template-based and code-based options available for implementing and managing IaC. Templates enable developers to describe and create resources in an orderly and predictable fashion. Resources are written in static text files using JavaScript Object Notation (JSON) or YAML (YAML Ain’t Markup Language) format (see Figure 1).
The templates require a specific syntax and structure that depends on the types of resources being created and managed. The programmer describes the resources in JSON or YAML with any code editor, checks the template into a version control system, and then provides it to a service that interprets the template and provisions the specified resources in a safe, repeatable manner.
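As a rough illustration of the kind of static description meant here, the following sketch builds a small template and serializes it to JSON with Python's standard library. The top-level keys, the resource type name `Example::Storage::Bucket`, and the property names are hypothetical and do not follow any particular cloud provider's schema.

```python
import json

# Hypothetical template: the top-level keys and resource type names are
# illustrative and do not follow any real provider's schema.
TEMPLATE = {
    "Description": "Storage for an example regulated application",
    "Resources": {
        "AuditLogBucket": {
            "Type": "Example::Storage::Bucket",
            "Properties": {"Encrypted": True, "Versioning": True},
        },
    },
}


def render(template: dict) -> str:
    """Serialize with sorted keys so the text is stable under version control."""
    return json.dumps(template, indent=2, sort_keys=True)


if __name__ == "__main__":
    print(render(TEMPLATE))
```

Sorting the keys is a design choice rather than a requirement: it keeps the stored text identical for identical content, which makes reviews and diffs in the version control system easier to read.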
If a developer needs to make changes to the running resources, they update their template and trigger a redeployment. Optionally, before changes are applied to the resources, they can generate a change set, which is a summary of their proposed changes. Change sets enable a programmer to see how their changes might impact the running resources, especially for critical resources, before implementing them.
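The idea of a change set can be sketched as a comparison between the running resource definitions and the proposed ones. This is a simplified assumption of how such a summary might be computed, not a description of any provider's implementation. The function name `change_set` and the resource names are hypothetical.

```python
def change_set(running: dict, proposed: dict) -> list:
    """Summarize proposed changes as (action, resource_name) pairs, ordered by name."""
    changes = []
    for name in sorted(set(running) | set(proposed)):
        if name not in running:
            changes.append(("add", name))
        elif name not in proposed:
            changes.append(("remove", name))
        elif running[name] != proposed[name]:
            changes.append(("modify", name))
    return changes


# Hypothetical resource definitions, keyed by logical name.
running = {
    "AppServer": {"Size": "small"},
    "OldQueue": {"Retention": 7},
}
proposed = {
    "AppServer": {"Size": "large"},
    "AuditLogBucket": {"Encrypted": True},
}

for action, name in change_set(running, proposed):
    print(f"{action:7} {name}")
```

A reviewer would see that one resource is modified, one added, and one removed before anything is applied, which is the point at which a removal of a critical resource could be caught.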
A programmer can use a single template to create and update an entire environment or separate templates to manage multiple layers within an environment. This enables templates to be modularized and provides a layer of governance that is important to many organizations. When a programmer creates or updates resources, events are generated showing the status of the configuration. If an error occurs, resources can be rolled back to the previous state. In addition, some cloud providers offer a software development framework to model and provision the cloud application resources using familiar programming languages such as TypeScript, Python, Java, and .NET.
The code in Figure 2 generates the same kind of template seen in Figure 1. These development kits are popular with programmers and leverage the same cloud resource provisioning engine used by the template approach, meaning infrastructure resources are provisioned in the same safe, repeatable manner.
Developers can often use features of their existing integrated development environment, such as autocomplete and inline documentation, to accelerate development of IT infrastructure. With the code-based approach, a programmer’s IT infrastructure can be as testable as any other code they write, and unit tests can be created before any deployment. The main difference is that a template is a static description of the required resources, whereas code can include logic to control the resources requested.
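The difference between a static template and code with logic can be sketched as follows. The function `build_resources`, the environment names, and the resource type `Example::Compute::Server` are assumptions made for illustration and are not taken from any real development kit. The two test functions show how such logic might be unit tested before any deployment.

```python
def build_resources(environment: str) -> dict:
    """Return resource definitions; the logic adds a second server for production."""
    resources = {
        "AppServer1": {"Type": "Example::Compute::Server", "Size": "small"},
    }
    if environment == "production":
        resources["AppServer1"]["Size"] = "large"
        resources["AppServer2"] = {"Type": "Example::Compute::Server", "Size": "large"}
    return resources


def test_production_has_two_large_servers():
    resources = build_resources("production")
    assert sorted(resources) == ["AppServer1", "AppServer2"]
    assert all(r["Size"] == "large" for r in resources.values())


def test_development_has_one_small_server():
    resources = build_resources("development")
    assert list(resources) == ["AppServer1"]
    assert resources["AppServer1"]["Size"] == "small"


if __name__ == "__main__":
    # A test runner such as pytest would normally discover these functions.
    test_production_has_two_large_servers()
    test_development_has_one_small_server()
    print("all checks passed")
```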
Regulatory Background for How to Manage IaC
Regulations do not explicitly mention IaC. The primary regulatory requirement for IT infrastructure is stated in the European Medicines Agency’s Concept Paper on the revision of Annex 11 of the guidelines on Good Manufacturing Practice for Medicinal Products – Computerised Systems: “IT infrastructure should be qualified.”
Implementation of IaC Provisioning Method
In the traditional provisioning method, command line instructions are written in a step-by-step installation guide. With the introduction of IaC, that guide becomes a code-based, automated process that can be used repeatedly. Consequently, it is essential to define, from an overall perspective, the responsibilities and principles for managing IaC.
Shifting Responsibility
Introducing IaC might require a shift in responsibility between the IT infrastructure provisioning and software development departments. If they are separate departments, merging them should be considered to enable the use of DevOps processes. Cloud services enable programmers to provision resources on demand. There is no longer a need to create a ticket requesting that infrastructure be provisioned by another team and then wait weeks or months for it to be made available. Self-service is the new normal.
Cloud adoption is an opportunity for digital transformation, but that must include revisiting these old organizational structures, operating models, and standard operating procedures and introducing a shift in responsibility. Too often companies retain their old, familiar ways of working and just apply them to the cloud. Some organizations still implement ticketing processes even to provision cloud resources, as this is seen as a way to demonstrate control.
In addition, more companies are moving from project-based to product-based operating models and adopting an agile methodology instead of a waterfall methodology. The same control objectives still exist, but IaC facilitates new ways to achieve them. For example, rather than writing, reviewing, and approving an installation and configuration test script for manual execution, a programmer writes, reviews, tests, and approves an IaC template for automated provisioning. All the changes mentioned previously should only be made in a controlled manner in accordance with a defined procedure, supported and enforced using appropriate tools.
IaC Competencies
When organizations shift internal responsibilities, it also becomes necessary to update employee roles and responsibilities, which requires staff to learn new competencies. These organizational implications are relevant when using cloud in general—but even more essential to consider as part of introducing IaC. Thus, staff training is needed.
All involved employees must have appropriate qualifications in both the technologies used and quality. Thus, qualifications should consist of a combination of education, experience, and continuous training. Engineering teams will obviously be trained in IaC, but quality management roles also need at least a high-level understanding of the technology and how control objectives are achieved through automation.
IaC Coding Principles
To ensure both operational stability and quality, it is recommended that organizations prepare some general principles for implementing, using, and operating IaC, such as:
- IaC scripts should be versioned, tested, reviewed, and approved based on criticality. Information is maintained in tools, and controls are defined in workflows.
- Once approved, IaC should provision the respective environments in the same way.
- The procedure for removing deprecated components should be defined.
- Handling of confidential information should be considered.
- Repeatability should be ensured.
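One way the principle on confidential information might be supported by tooling is a check that flags secrets written directly into a template. The sketch below is an assumption about how such a check could look. The list of sensitive key fragments, the `Ref` convention for pointing at a secret store, and the `vault:` prefix are all hypothetical.

```python
# Hypothetical key fragments that suggest a value is confidential.
SENSITIVE_HINTS = ("password", "secret", "token", "apikey")


def find_plaintext_secrets(node, path=""):
    """Return paths of string values stored directly under sensitive-looking keys."""
    findings = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}/{key}"
            is_sensitive = any(hint in key.lower() for hint in SENSITIVE_HINTS)
            if is_sensitive and isinstance(value, str):
                findings.append(child)
            else:
                findings.extend(find_plaintext_secrets(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            findings.extend(find_plaintext_secrets(item, f"{path}/{index}"))
    return findings


# Hypothetical template: one literal password, one reference to a secret store.
template = {
    "Resources": {
        "Database": {"Properties": {"AdminPassword": "example-only", "Port": 5432}},
        "Api": {"Properties": {"AccessToken": {"Ref": "vault:api-token"}}},
    }
}

print(find_plaintext_secrets(template))
```

Here only the literal `AdminPassword` is reported. The `AccessToken` entry passes because it holds a reference rather than the value itself. A check like this could run as part of the review workflow before a template is approved.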
Building Block Qualification
The IaC building block concept, as mentioned in GAMP® 5 Second Edition, is an approach to qualifying individual components or combinations of components, which can then be put together to build the IT infrastructure and thus use a “one qualification, many deployments” approach.
The benefit of this approach is that a programmer can qualify an instance of a building block once and assume all the other instances will perform the same way, reducing the overall effort across applications. This approach also enables a programmer to change a building block and requalify it without needing to requalify all other building blocks. Using IaC templates to provision infrastructure components and implementing these templates as building blocks ensures consistency.
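The "one qualification, many deployments" idea can be sketched by giving each building block an identity derived from its content, recording that identity once the block is qualified, and checking that every instance in an environment matches a qualified identity. The class `QualifiedBlocks`, the use of a SHA-256 fingerprint, and the block contents are illustrative assumptions and are not prescribed by GAMP® 5.

```python
import hashlib
import json


def fingerprint(block: dict) -> str:
    """Hash a canonical serialization so identical definitions share one identity."""
    canonical = json.dumps(block, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class QualifiedBlocks:
    """Register of building-block fingerprints that have passed qualification."""

    def __init__(self):
        self._fingerprints = set()

    def qualify(self, block: dict) -> None:
        # Assumed to be called only after qualification evidence is approved.
        self._fingerprints.add(fingerprint(block))

    def unqualified(self, environment: dict) -> list:
        """Return names of blocks in the environment that are not qualified."""
        return sorted(
            name for name, block in environment.items()
            if fingerprint(block) not in self._fingerprints
        )


# Hypothetical building block, qualified once.
database_block = {"Type": "Example::Database", "Encrypted": True, "Backups": "daily"}
register = QualifiedBlocks()
register.qualify(database_block)

# Two applications reuse the block; a third uses an altered copy.
environment = {
    "LabSystemDb": dict(database_block),
    "BatchRecordDb": dict(database_block),
    "TrainingDb": {**database_block, "Encrypted": False},
}
print(register.unqualified(environment))
```

Only the altered copy is reported, so the two unchanged instances inherit the original qualification while the changed block is singled out for requalification.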
Automation
By using automation, a programmer can set up IT infrastructure environments and components more rapidly in a standardized and repeatable manner. With IaC, the same tooling used for continuous integration/continuous deployment of application code can now be used to automate the deployment of IT infrastructure.
The use of automation is critical to realizing the full benefits of the cloud. Manual processes are error prone, unreliable, and inadequate to support an agile business. Frequently, an organization may tie up highly skilled staff to provide manual configuration when time could be better spent supporting other, more critical, and higher-value activities within the business.
Modern operating environments commonly rely on full automation to release software, configure machines, patch operating systems, troubleshoot, and fix bugs to eliminate manual intervention or restrict access to production environments. Automation provides the ability to make rapid changes, improve productivity, repeat configurations, reproduce environments, leverage elasticity, leverage automatic scaling, and automate testing. Many levels of automation practices can be used together to provide a higher-level end-to-end automated process.
Regulators want to see that regulated companies have control over their applications and the environment within which they run. Automation is a good way to demonstrate such control. The regulated company needs to demonstrate evidence that the automated deployment of IaC is performed according to the specification.
In Appendix D5 of GAMP® 5 Second Edition, Table 25.1 outlines how to provide evidence, using a risk-based approach, that the automated deployment of IaC is performed according to specification for the key activities of the life cycle, and how these principles might also be applied to the testing of IT infrastructure.
Installation Testing of IT Infrastructure
Organizations are usually familiar with how to perform installation testing on premises, but may be unsure how to do so in the cloud. Creating and executing a verification plan has traditionally been a manual, labor-intensive process that produces a static snapshot of the environment. That same process works in the cloud, too; but with IaC, it is now possible to automate the process.
With cloud technology, the whole purpose of the service responsible for deploying resources is the consistent and repeatable deployment of the resources exactly as described in the input template. This service can be tested and verified to demonstrate that it always provides the resources as requested. Therefore, as long as the input template is controlled and approved, confidence that the resources are deployed as expected is high and the need to verify the output is reduced. This makes a review-by-exception approach viable, and it can replace many of the static verification activities.
Let’s look at how this might work in practice. First, let’s consider the “approved specification.” The IT infrastructure is specified with an IaC template. This template describes the required resources and their configuration and should be deployed by continuous deployment pipelines. These templates are controlled in a similar way as source code. Storing them in a source code repository enables a programmer to version the template and keep a complete history of its evolution over time.
Another key part of the previously mentioned phrase is “approved.” There are many ways to handle the approval. For example, programmers can use a Jira workflow or a pull request approval in the source code repository. Whichever method is used will be vetted and acknowledged by the IT quality and/or compliance team in accordance with a quality management system (QMS). The net result is a specific version of the template in the source code repository being recorded as approved.
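Recording a specific template version as approved, and letting the pipeline deploy only that version, can be sketched as below. The `Approval` record, the `ApprovalRegister` class, and the `deploy_gate` function are hypothetical. In practice the approval would live in the workflow tool or the source code repository rather than in memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Approval:
    template_path: str
    version: str       # for example, a commit identifier from the repository
    approved_by: str


class ApprovalRegister:
    """In-memory stand-in for wherever approvals are actually recorded."""

    def __init__(self):
        self._records = {}

    def record(self, approval: Approval) -> None:
        self._records[(approval.template_path, approval.version)] = approval

    def is_approved(self, template_path: str, version: str) -> bool:
        return (template_path, version) in self._records


def deploy_gate(register: ApprovalRegister, template_path: str, version: str) -> str:
    """Refuse to proceed unless this exact version was recorded as approved."""
    if not register.is_approved(template_path, version):
        raise PermissionError(f"{template_path}@{version} is not approved for deployment")
    return f"deploying {template_path}@{version}"


register = ApprovalRegister()
register.record(Approval("network.yaml", "a1b2c3", "qa.reviewer"))
print(deploy_gate(register, "network.yaml", "a1b2c3"))
```

Because the approval is tied to a version identifier, a later edit to the same file is not covered by the earlier approval and would be stopped at the gate.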
The result is an approved specification describing the resources to be deployed. Once the specification is approved, the automated pipeline is triggered to deploy the resources. The next requirement is to demonstrate that the installation was correct. The service that takes the template as input and performs the deployment will go through its own qualification as defined in the regulated company’s QMS.
This qualification will show that deployed resources are always consistent with the template provided. Therefore, performing additional testing and reporting to confirm this after every deployment adds unnecessary time and overhead and should only be done in case of an exception. Any automation should continuously be monitored to ensure it is operating as expected and that action is taken should there be a problem.
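One possible reading of review by exception is that an automated comparison runs quietly and a person becomes involved only when it reports a deviation. The sketch below assumes the deployed state is available as a dictionary (in practice it would be read from the provider's interfaces). The function name and resource names are hypothetical.

```python
def deployment_exceptions(template_resources: dict, deployed_resources: dict) -> list:
    """List deviations between the approved template and the deployed state."""
    exceptions = []
    for name, expected in sorted(template_resources.items()):
        actual = deployed_resources.get(name)
        if actual is None:
            exceptions.append(f"{name}: missing from deployment")
            continue
        for prop, value in sorted(expected.items()):
            if actual.get(prop) != value:
                exceptions.append(
                    f"{name}.{prop}: expected {value!r}, found {actual.get(prop)!r}"
                )
    for name in sorted(set(deployed_resources) - set(template_resources)):
        exceptions.append(f"{name}: deployed but not in template")
    return exceptions


# Hypothetical approved template and observed deployment.
approved = {
    "AppServer": {"Size": "large"},
    "AuditLogBucket": {"Encrypted": True, "Versioning": True},
}
observed = {
    "AuditLogBucket": {"Encrypted": True, "Versioning": False},
    "DebugServer": {"Size": "small"},
}

for line in deployment_exceptions(approved, observed):
    print(line)
```

An empty list means there is nothing to review. Any entry becomes the exception that triggers further investigation.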
Monitoring and Alerting
When creating IaC templates, it is also important to define controls that will help maintain the compliant state of resources once deployed, i.e., configuration changes that would negatively impact a security or compliance posture should be detected, alerted, and remediated.
Although any change to the IT infrastructure should go through the previously mentioned controlled automation, there is still a risk of changes happening by mistake or through malicious intent. It is therefore important to monitor the configuration of the IT infrastructure. This was difficult to accomplish with physical infrastructure but is straightforward with IaC.
Monitoring services exist that will detect any change and trigger an assessment. Should the change violate any defined controls, an alert can be raised immediately to trigger remediation. In addition, automated remediation may be possible to revert the configuration change and even revoke the permissions of the individual who made the change.
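The detect, assess, alert, and remediate sequence can be sketched as follows, assuming a monitoring service calls a handler whenever a configuration change is detected. The control names, the configuration keys, and the function `on_configuration_change` are hypothetical. Revoking permissions is left out because it depends on the identity system in use.

```python
# Hypothetical controls: each maps a name to a check on a resource's configuration.
CONTROLS = {
    "encryption_enabled": lambda config: config.get("Encrypted") is True,
    "no_public_access": lambda config: config.get("PublicAccess") is False,
}


def violated_controls(config: dict) -> list:
    """Return the names of the controls that this configuration violates."""
    return sorted(name for name, check in CONTROLS.items() if not check(config))


def on_configuration_change(resource, new_config, approved_config, alerts):
    """Assess a detected change; on violation, record an alert and revert."""
    violations = violated_controls(new_config)
    if not violations:
        return new_config
    alerts.append({"resource": resource, "violations": violations})
    return dict(approved_config)  # automated remediation: restore the approved state


approved_config = {"Encrypted": True, "PublicAccess": False}
alerts = []

# A change that opens public access violates a control and is reverted.
changed = {"Encrypted": True, "PublicAccess": True}
current = on_configuration_change("AuditLogBucket", changed, approved_config, alerts)
print(current)
print(alerts)
```

Changes that violate no control pass through unchanged in this sketch. Whether such changes should also be reverted because they bypassed the controlled automation is a policy decision for the regulated company.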
Conclusion
As the name implies, IaC is code, but it is code for IT infrastructure management and hence is considered to be category 1 (IT infrastructure) for the regulated company according to GAMP® 5 Second Edition.
The cloud service providers are expected to follow good engineering practice and are thus expected to specify, verify, and keep their services in continuous control because these are used as building blocks by the regulated companies. This is supported by supplier assessments, quality agreements, and service level agreements where appropriate, with the management of the associated suppliers supported by the recommendations in section D9 of GAMP® 5 Second Edition.