A team migrated to Terraform. It went well for six months. Then they deleted a resource manually in the AWS console. Terraform's state file still thought it existed. Every subsequent apply tried to reference it. The state drifted further. Eventually, a plan that was supposed to make a small change would have deleted and recreated their production database. They stopped running Terraform entirely. Their infrastructure is now managed through a mix of Terraform and console changes that nobody fully understands.
This is not an indictment of Terraform. It is a real pattern I see when teams adopt it without understanding the commitment it requires. Terraform is excellent. It also has real failure modes that bite teams who are not ready for them.
What Terraform actually solves
Infrastructure as code means your infrastructure is reproducible, reviewable, and version-controlled. The problems it genuinely solves:
- Environments that drift apart. Production and staging diverge over time when people make console changes. With Terraform, both environments are defined from the same code with different variable files. Drift becomes visible in the plan.
- Rebuilding from scratch. When you need to spin up a new environment (staging for a client, disaster recovery, a region migration), you apply the same code. That takes hours instead of days.
- Review and audit trail. Infrastructure changes go through pull requests, code review, and git history. Someone knows why a security group rule was added.
- Detecting configuration drift. Running terraform plan against existing infrastructure shows you what has changed outside of Terraform. It is an infrastructure audit on demand.
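The "same code, different variable files" idea can be sketched roughly as follows. This is a minimal illustration, not a recommended layout: the resource, the variable names, and the values in the comments are all hypothetical.

```hcl
# variables.tf: inputs that differ between environments (names are hypothetical)
variable "environment" {
  type        = string
  description = "Environment name, e.g. staging or production"
}

variable "instance_type" {
  type        = string
  description = "EC2 instance type for the app server"
}

variable "ami_id" {
  type        = string
  description = "AMI to launch; supply your own image ID"
}

# main.tf: one definition shared by every environment
resource "aws_instance" "app" {
  ami           = var.ami_id
  instance_type = var.instance_type

  tags = {
    Name        = "app-${var.environment}"
    Environment = var.environment
  }
}

# staging.tfvars would then set (illustrative values):
#   environment   = "staging"
#   instance_type = "t3.small"
#   ami_id        = "<your AMI ID>"
#
# Selected at plan time with:
#   terraform plan -var-file=staging.tfvars
```

Each environment still needs its own state, for example a separate backend key per environment, so that applying staging never touches production's state file.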
The state problem
Terraform maintains a state file that maps the resources in your code to the resources in the cloud. This state file is the source of truth. When the state file diverges from reality, problems follow.
Common ways state drifts:
- Someone modifies a resource in the console without updating Terraform code
- A resource is deleted outside Terraform (console, CLI, API)
- A cloud provider changes a resource's properties automatically
- Two people run terraform apply simultaneously without a remote backend lock
When drift happens, the next plan may propose changes that are surprising or destructive. This is why the cardinal rule of Terraform adoption is: once you manage something in Terraform, never touch it outside Terraform.
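One guard against the destructive case is Terraform's lifecycle block. The sketch below assumes a PostgreSQL RDS instance; the resource name, identifier, sizing, and the db_password variable are hypothetical placeholders.

```hcl
variable "db_password" {
  type      = string
  sensitive = true
}

resource "aws_db_instance" "production" {
  identifier        = "production-db" # hypothetical identifier
  engine            = "postgres"
  instance_class    = "db.t3.medium"
  allocated_storage = 100
  username          = "app"
  password          = var.db_password

  lifecycle {
    # Any plan that would destroy this resource (including a
    # destroy-and-recreate replacement) fails with an error
    # instead of being offered for apply.
    prevent_destroy = true
  }
}

# To see what has changed outside Terraform without proposing
# any infrastructure changes, run:
#   terraform plan -refresh-only
```

This does not prevent drift; it only turns one class of surprise into a hard stop. It also offers no protection if the resource block itself is removed from the configuration, so it complements the rule above rather than replacing it.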
The remote backend requirement
Storing the state file locally is how small teams get burned. The file gets deleted, corrupted, or falls out of sync when two developers work on infrastructure. The state must live in a shared remote backend with locking.
# backend.tf: store state in S3 with DynamoDB locking
terraform {
  backend "s3" {
    bucket         = "yourcompany-terraform-state"
    key            = "production/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-state-lock" # prevents concurrent applies
  }
}

Create the S3 bucket and DynamoDB table before running any Terraform: they need to exist before the backend can be configured. This is a one-time bootstrap step.
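The bootstrap step could itself be written as a small, separate Terraform configuration that is applied once with local state. The following is a sketch under that assumption: the bootstrap/ directory and resource labels are hypothetical, and bucket versioning is an addition of mine rather than something the backend requires.

```hcl
# bootstrap/main.tf: applied once, with local state, before the backend exists
provider "aws" {
  region = "us-east-1"
}

resource "aws_s3_bucket" "state" {
  bucket = "yourcompany-terraform-state"
}

# Optional addition: versioning lets you recover an earlier
# state file after a bad write.
resource "aws_s3_bucket_versioning" "state" {
  bucket = aws_s3_bucket.state.id

  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_dynamodb_table" "lock" {
  name         = "terraform-state-lock"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID" # the S3 backend expects this exact key name

  attribute {
    name = "LockID"
    type = "S"
  }
}
```

Creating the same two resources by hand in the console works just as well; the point is only that they exist, with a string partition key named LockID on the table, before terraform init runs against the backend.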
HashiCorp's license change and OpenTofu
In 2023, HashiCorp changed Terraform's license from Mozilla Public License to the Business Source License, restricting commercial use by competitors. In early 2025, they locked certain state management commands behind the paid Terraform Cloud tier.
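OpenTofu is the open-source fork that emerged in response, maintained under the Linux Foundation. It was forked from the last MPL-licensed Terraform code, so configurations written for Terraform 1.5.x generally work unchanged; features added to either tool after the fork may not carry over. For teams that want to avoid HashiCorp's commercial requirements, OpenTofu is close to a drop-in replacement.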
# Switch from Terraform to OpenTofu:
# 1. Install OpenTofu (https://opentofu.org/docs/intro/install/)
# 2. All existing .tf files work unchanged
# 3. Initialize with tofu instead of terraform
tofu init
tofu plan
tofu apply

When NOT to use Terraform
Terraform is not the right tool for every situation:
- One-person projects or early-stage MVPs. The overhead of writing and maintaining Terraform code is not worth it if you are the only person touching the infrastructure and the whole thing is two servers. Use the console. Document what you created. Move to Terraform when you have environments to manage or a team to coordinate with.
- Rapid prototyping. When you need to try things and change them hourly, the Terraform plan/apply cycle slows you down. Build with the console first, then codify the final architecture.
- Teams that will not maintain the discipline. Terraform requires that all changes go through Terraform. If developers sometimes use the console, the state drifts and you get the worst of both worlds: code that lies and infrastructure nobody fully understands.
The practical adoption path
Start Terraform on new resources, not existing ones. Importing existing infrastructure into state is possible but painful. When you add a new RDS database or create a new VPC, do it in Terraform from the start. Over time, the proportion of Terraform-managed infrastructure grows as you build new things.
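For the cases where an existing resource does have to come under management, Terraform 1.5 and later support declarative import blocks. A minimal sketch, assuming an existing S3 bucket (the bucket name and resource label are hypothetical):

```hcl
# Requires Terraform 1.5 or later
import {
  to = aws_s3_bucket.uploads
  id = "yourcompany-uploads" # hypothetical: name of the bucket that already exists
}

# The resource block must describe the real bucket. After the import,
# a plan should show no changes; any diff means the code and the
# real resource disagree.
resource "aws_s3_bucket" "uploads" {
  bucket = "yourcompany-uploads"
}
```

Because the import goes through the normal plan and apply cycle, it can be reviewed in a pull request like any other change. The painful part remains writing resource blocks that match reality, which is why starting with new resources is the easier path.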
If you want Terraform set up without hitting the state drift trap, I can do the initial setup, remote backend, and bring existing resources under management.