Infrastructure · 6 min read

Terraform Best Practices: Structuring Your IaC for Scale

Learn how to structure Terraform projects for maintainability, team collaboration, and production-grade infrastructure at scale.

Why Terraform Structure Matters

Every infrastructure team starts the same way: a single main.tf file that provisions a few resources. It works fine for a proof of concept. Then the project grows, more engineers join, and suddenly that monolithic file has 2,000 lines, no one knows what depends on what, and every terraform plan takes eight minutes.

Sound familiar? The way you structure your Terraform code has a direct impact on how fast your team can ship, how safely you can make changes, and how easily new engineers can onboard. In this guide, we share the patterns we use at DevOpsVibe across dozens of production environments.

The Module-Based Architecture

The single most important decision you can make is to adopt a module-based architecture early. Modules are reusable, testable units of infrastructure that encapsulate a logical grouping of resources.

Directory Layout

Here is the structure we recommend for medium-to-large projects:

infrastructure/
├── modules/
│   ├── networking/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── outputs.tf
│   │   └── README.md
│   ├── compute/
│   ├── database/
│   └── monitoring/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   ├── terraform.tfvars
│   │   └── backend.tf
│   ├── staging/
│   └── production/
├── global/
│   ├── iam/
│   └── dns/
└── terragrunt.hcl  # optional

Each environment directory composes modules together with environment-specific variables. The modules themselves contain no hardcoded values.
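As a sketch of that composition (module input names match the networking module shown later in this guide; the database module's inputs are illustrative assumptions), an environment's root configuration might look like this:

```hcl
# environments/production/main.tf (illustrative sketch)
module "networking" {
  source = "../../modules/networking"

  vpc_cidr           = var.vpc_cidr
  environment        = var.environment
  availability_zones = var.availability_zones
}

module "database" {
  source = "../../modules/database"

  environment = var.environment
  # Wire module outputs together instead of hardcoding resource IDs
  subnet_ids  = module.networking.private_subnet_ids
}
```

All environment-specific values arrive through variables, so `dev/`, `staging/`, and `production/` differ only in their `terraform.tfvars`.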

Writing a Reusable Module

A well-structured module has clear inputs, outputs, and a single responsibility:

# modules/networking/variables.tf
variable "vpc_cidr" {
  description = "CIDR block for the VPC"
  type        = string

  validation {
    condition     = can(cidrnetmask(var.vpc_cidr))
    error_message = "Must be a valid CIDR block."
  }
}

variable "environment" {
  description = "Environment name (dev, staging, production)"
  type        = string
}

variable "availability_zones" {
  description = "List of AZs to use"
  type        = list(string)
}

# modules/networking/main.tf
resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name        = "${var.environment}-vpc"
    Environment = var.environment
    ManagedBy   = "terraform"
  }
}

resource "aws_subnet" "private" {
  count             = length(var.availability_zones)
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(var.vpc_cidr, 8, count.index)
  availability_zone = var.availability_zones[count.index]

  tags = {
    Name        = "${var.environment}-private-${var.availability_zones[count.index]}"
    Environment = var.environment
    Type        = "private"
  }
}

# modules/networking/outputs.tf
output "vpc_id" {
  description = "ID of the created VPC"
  value       = aws_vpc.main.id
}

output "private_subnet_ids" {
  description = "List of private subnet IDs"
  value       = aws_subnet.private[*].id
}

State Management Strategy

Remote state is non-negotiable for teams. Use S3 with DynamoDB locking on AWS, or a GCS bucket on GCP (GCS supports state locking natively):

# environments/production/backend.tf
terraform {
  backend "s3" {
    bucket         = "mycompany-terraform-state"
    key            = "production/networking/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

Key rules for state management:

  • One state file per component per environment. Never put your entire infrastructure in a single state file. If your VPC state gets corrupted, you do not want it to take your database with it.
  • Enable encryption at rest. State files contain sensitive data including passwords and private keys.
  • Use state locking. Without it, two engineers running terraform apply simultaneously can corrupt your state.
  • Never commit state files to version control. Add *.tfstate and *.tfstate.backup to your .gitignore.
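With state split per component, a downstream component can read an upstream component's outputs through the terraform_remote_state data source. This sketch reuses the bucket and key from the backend example above; adjust them to your own layout:

```hcl
# environments/production/compute/data.tf (illustrative sketch)
data "terraform_remote_state" "networking" {
  backend = "s3"

  config = {
    bucket = "mycompany-terraform-state"
    key    = "production/networking/terraform.tfstate"
    region = "us-east-1"
  }
}

resource "aws_instance" "app" {
  # Consume the networking component's published output
  subnet_id = data.terraform_remote_state.networking.outputs.private_subnet_ids[0]
  # ...
}
```

Only outputs declared in the upstream component are visible this way, which keeps the coupling between components explicit and reviewable.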

Variable Management

Avoid hardcoding values. Use a layered approach to variables:

# environments/production/terraform.tfvars
environment        = "production"
vpc_cidr           = "10.0.0.0/16"
availability_zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
instance_type      = "m6i.xlarge"
min_capacity       = 3
max_capacity       = 10

For secrets, never store them in .tfvars files. Instead, reference them from a secrets manager:

data "aws_secretsmanager_secret_version" "db_password" {
  secret_id = "production/database/master-password"
}

resource "aws_db_instance" "main" {
  password = data.aws_secretsmanager_secret_version.db_password.secret_string
  # ...
}

Tagging Strategy

Consistent tagging is essential for cost allocation, security auditing, and resource management. Define a common tagging module:

# modules/tags/main.tf
variable "environment" { type = string }
variable "project" { type = string }
variable "team" { type = string }

locals {
  common_tags = {
    Environment = var.environment
    Project     = var.project
    Team        = var.team
    ManagedBy   = "terraform"
    Repository  = "github.com/mycompany/infrastructure"
  }
}

output "tags" {
  value = local.common_tags
}

Then use it everywhere:

module "tags" {
  source      = "../../modules/tags"
  environment = "production"
  project     = "platform"
  team        = "infrastructure"
}

resource "aws_instance" "app" {
  # ...
  tags = merge(module.tags.tags, {
    Name = "app-server"
    Role = "application"
  })
}

CI/CD Integration

Terraform should never be run from a developer's laptop in production. Set up a pipeline:

  1. Pull request opens: terraform fmt -check and terraform validate run automatically
  2. PR approved: terraform plan runs and the output is posted as a PR comment
  3. PR merged to main: terraform apply -auto-approve executes in the pipeline

Use tools like Atlantis or Spacelift for this workflow, or build your own with GitHub Actions:

# .github/workflows/terraform.yml
name: Terraform
on:
  pull_request:
    paths: ['infrastructure/**']

jobs:
  plan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init
        working-directory: infrastructure/environments/production
      - run: terraform plan -no-color -out=tfplan
        working-directory: infrastructure/environments/production
      # Render the binary plan file as text so it can be posted to the PR
      - run: terraform show -no-color tfplan > tfplan.txt
        working-directory: infrastructure/environments/production
      - uses: actions/github-script@v7
        with:
          script: |
            const output = require('fs').readFileSync('infrastructure/environments/production/tfplan.txt', 'utf8');
            github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## Terraform Plan\n\`\`\`\n${output}\n\`\`\``
            });

Testing Your Infrastructure

Use Terratest or terraform test (built-in since Terraform 1.6) to validate your modules:

# modules/networking/tests/vpc.tftest.hcl
run "creates_vpc_with_correct_cidr" {
  command = plan

  variables {
    vpc_cidr           = "10.0.0.0/16"
    environment        = "test"
    availability_zones = ["us-east-1a"]
  }

  assert {
    condition     = aws_vpc.main.cidr_block == "10.0.0.0/16"
    error_message = "VPC CIDR block did not match expected value"
  }
}

Common Anti-Patterns to Avoid

  • Monolithic state files. Split by component and environment.
  • Using count when for_each is more appropriate. for_each with maps gives you stable resource addresses that do not shift when items are added or removed.
  • Ignoring drift. Run terraform plan on a schedule to detect manual changes.
  • Skipping terraform fmt. Enforce formatting in CI. Inconsistent formatting creates noisy diffs.
  • Leaving provider versions unpinned. Pin them explicitly, but review updates regularly.
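Two of these fixes can be sketched concretely. for_each keyed on a set keeps resource addresses stable when items are added or removed, and a required_providers block pins versions (the version constraints here are illustrative):

```hcl
# for_each keyed on AZ name: removing one AZ destroys only that subnet,
# unlike count, where indices shift and unrelated subnets get recreated.
resource "aws_subnet" "private" {
  for_each          = toset(var.availability_zones)
  vpc_id            = aws_vpc.main.id
  availability_zone = each.value
  cidr_block        = cidrsubnet(var.vpc_cidr, 8, index(var.availability_zones, each.value))
}

# Pin provider versions explicitly; bump them deliberately, not by accident.
terraform {
  required_version = ">= 1.6"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
```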

Conclusion

Structuring Terraform well from the start pays compounding dividends later. The patterns above -- module-based architecture, isolated state, layered variables, CI/CD automation, and testing -- form the foundation of every scalable infrastructure project we deliver.

At DevOpsVibe, we help teams design and implement Terraform architectures that scale from a handful of resources to thousands. Whether you are starting fresh or untangling an existing codebase, our engineers can get your infrastructure on solid footing. Reach out to us to learn more.

filed under: terraform, iac, infrastructure, aws, devops, automation