Manual infrastructure management was becoming a nightmare. Here’s how we adopted Terraform to manage our entire AWS infrastructure as code, reducing deployment time from hours to minutes.

The Pain of Manual Infrastructure

Before Terraform:

  • 3 environments (dev, staging, prod)
  • 50+ AWS resources per environment
  • Manual console clicks for every change
  • No version control for infrastructure
  • Inconsistencies between environments
  • 4-6 hours to provision a new environment

Problems:

  • Configuration drift
  • No audit trail
  • Error-prone manual processes
  • Difficult to replicate environments
  • No disaster recovery plan

Getting Started with Terraform

Installation

# macOS
brew install terraform

# Linux
wget https://releases.hashicorp.com/terraform/1.3.0/terraform_1.3.0_linux_amd64.zip
unzip terraform_1.3.0_linux_amd64.zip
sudo mv terraform /usr/local/bin/

# Verify
terraform version

Project Structure

terraform/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── terraform.tfvars
│   ├── staging/
│   └── prod/
├── modules/
│   ├── vpc/
│   ├── ec2/
│   ├── rds/
│   └── s3/
├── global/
│   └── iam/
└── README.md

Basic Terraform Configuration

Provider Setup

# main.tf
terraform {
  required_version = ">= 1.0"
  
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
  
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

provider "aws" {
  region = var.aws_region
  
  default_tags {
    tags = {
      Environment = var.environment
      ManagedBy   = "Terraform"
      Project     = var.project_name
    }
  }
}

Variables

# variables.tf
variable "aws_region" {
  description = "AWS region"
  type        = string
  default     = "us-east-1"
}

variable "environment" {
  description = "Environment name"
  type        = string
  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Environment must be dev, staging, or prod."
  }
}

variable "project_name" {
  description = "Project name, used in resource names and tags"
  type        = string
}

variable "vpc_cidr" {
  description = "VPC CIDR block"
  type        = string
  default     = "10.0.0.0/16"
}

Terraform Variables File

# terraform.tfvars
aws_region   = "us-east-1"
environment  = "prod"
project_name = "myapp"
vpc_cidr     = "10.0.0.0/16"

Creating Reusable Modules

VPC Module

# modules/vpc/main.tf
resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true
  enable_dns_support   = true
  
  tags = {
    Name = "${var.project_name}-${var.environment}-vpc"
  }
}

resource "aws_subnet" "public" {
  count             = length(var.public_subnet_cidrs)
  vpc_id            = aws_vpc.main.id
  cidr_block        = var.public_subnet_cidrs[count.index]
  availability_zone = var.availability_zones[count.index]
  
  map_public_ip_on_launch = true
  
  tags = {
    Name = "${var.project_name}-${var.environment}-public-${count.index + 1}"
    Type = "public"
  }
}

resource "aws_subnet" "private" {
  count             = length(var.private_subnet_cidrs)
  vpc_id            = aws_vpc.main.id
  cidr_block        = var.private_subnet_cidrs[count.index]
  availability_zone = var.availability_zones[count.index]
  
  tags = {
    Name = "${var.project_name}-${var.environment}-private-${count.index + 1}"
    Type = "private"
  }
}

resource "aws_internet_gateway" "main" {
  vpc_id = aws_vpc.main.id
  
  tags = {
    Name = "${var.project_name}-${var.environment}-igw"
  }
}

resource "aws_nat_gateway" "main" {
  count         = length(var.public_subnet_cidrs)
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id
  
  tags = {
    Name = "${var.project_name}-${var.environment}-nat-${count.index + 1}"
  }

  # A NAT gateway needs the VPC's internet gateway to exist first
  depends_on = [aws_internet_gateway.main]
}

resource "aws_eip" "nat" {
  count = length(var.public_subnet_cidrs)
  vpc   = true # AWS provider 4.x syntax (pinned above); on 5.x use domain = "vpc"
  
  tags = {
    Name = "${var.project_name}-${var.environment}-eip-${count.index + 1}"
  }
}
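
The module above creates subnets and gateways but shows no route tables, and without routes the subnets cannot reach the gateways. The original post may have kept them in another file. A minimal sketch, assuming one shared public route table and one private route table per private subnet (the file name and resource names are hypothetical):

```hcl
# modules/vpc/routes.tf (hypothetical file, not shown in the original module)

# Public subnets send internet-bound traffic to the internet gateway
resource "aws_route_table" "public" {
  vpc_id = aws_vpc.main.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.main.id
  }
}

resource "aws_route_table_association" "public" {
  count          = length(var.public_subnet_cidrs)
  subnet_id      = aws_subnet.public[count.index].id
  route_table_id = aws_route_table.public.id
}

# Each private subnet routes outbound traffic through a NAT gateway.
# The modulo keeps the index valid if there are more private than public subnets.
resource "aws_route_table" "private" {
  count  = length(var.private_subnet_cidrs)
  vpc_id = aws_vpc.main.id

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = aws_nat_gateway.main[count.index % length(var.public_subnet_cidrs)].id
  }
}

resource "aws_route_table_association" "private" {
  count          = length(var.private_subnet_cidrs)
  subnet_id      = aws_subnet.private[count.index].id
  route_table_id = aws_route_table.private[count.index].id
}
```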

EC2 Module

# modules/ec2/main.tf
data "aws_ami" "amazon_linux_2" {
  most_recent = true
  owners      = ["amazon"]
  
  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}

resource "aws_instance" "app" {
  count         = var.instance_count
  ami           = data.aws_ami.amazon_linux_2.id
  instance_type = var.instance_type
  subnet_id     = var.subnet_ids[count.index % length(var.subnet_ids)]
  
  vpc_security_group_ids = [aws_security_group.app.id]
  key_name              = var.key_name
  
  user_data = templatefile("${path.module}/user_data.sh", {
    environment = var.environment
  })
  
  root_block_device {
    volume_size = var.root_volume_size
    volume_type = "gp3"
    encrypted   = true
  }
  
  tags = {
    Name = "${var.project_name}-${var.environment}-app-${count.index + 1}"
  }
}

resource "aws_security_group" "app" {
  name        = "${var.project_name}-${var.environment}-app-sg"
  description = "Security group for application servers"
  vpc_id      = var.vpc_id
  
  ingress {
    from_port   = 80
    to_port     = 80
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  
  ingress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }
  
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
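
The EC2 module reads several input variables that are not shown. A possible variables.tf for it is sketched here; the defaults for instance_count, instance_type, and root_volume_size are assumptions chosen for illustration, not values from the original setup:

```hcl
# modules/ec2/variables.tf (sketch; defaults are illustrative assumptions)
variable "project_name" {
  description = "Project name used in resource names"
  type        = string
}

variable "environment" {
  description = "Environment name (dev, staging, prod)"
  type        = string
}

variable "vpc_id" {
  description = "VPC in which the security group is created"
  type        = string
}

variable "subnet_ids" {
  description = "Subnets across which instances are spread"
  type        = list(string)

  validation {
    # The module computes count.index % length(var.subnet_ids),
    # which would fail on an empty list
    condition     = length(var.subnet_ids) > 0
    error_message = "At least one subnet ID is required."
  }
}

variable "instance_count" {
  description = "Number of application instances"
  type        = number
  default     = 1
}

variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.micro"
}

variable "key_name" {
  description = "Name of an existing EC2 key pair"
  type        = string
}

variable "root_volume_size" {
  description = "Root volume size in GiB"
  type        = number
  default     = 20
}
```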

Using Modules

# environments/prod/main.tf
module "vpc" {
  source = "../../modules/vpc"
  
  project_name          = var.project_name
  environment           = var.environment
  vpc_cidr              = var.vpc_cidr
  public_subnet_cidrs   = ["10.0.1.0/24", "10.0.2.0/24"]
  private_subnet_cidrs  = ["10.0.10.0/24", "10.0.11.0/24"]
  availability_zones    = ["us-east-1a", "us-east-1b"]
}

module "ec2" {
  source = "../../modules/ec2"
  
  project_name   = var.project_name
  environment    = var.environment
  vpc_id         = module.vpc.vpc_id
  subnet_ids     = module.vpc.private_subnet_ids
  instance_count = 3
  instance_type  = "t3.medium"
  key_name       = "prod-key"
}

module "rds" {
  source = "../../modules/rds"
  
  project_name       = var.project_name
  environment        = var.environment
  vpc_id             = module.vpc.vpc_id
  subnet_ids         = module.vpc.private_subnet_ids
  instance_class     = "db.t3.large"
  allocated_storage  = 100
  engine_version     = "14.5"
}
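
The references module.vpc.vpc_id and module.vpc.private_subnet_ids only resolve if the VPC module declares matching outputs. The original outputs file is not shown, so the following is a sketch of what it would need to contain (public_subnet_ids is an extra, assumed output):

```hcl
# modules/vpc/outputs.tf (sketch)
output "vpc_id" {
  description = "ID of the VPC"
  value       = aws_vpc.main.id
}

output "public_subnet_ids" {
  description = "IDs of the public subnets, in the order of var.public_subnet_cidrs"
  value       = aws_subnet.public[*].id
}

output "private_subnet_ids" {
  description = "IDs of the private subnets, in the order of var.private_subnet_cidrs"
  value       = aws_subnet.private[*].id
}
```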

State Management

Remote State with S3

# Create S3 bucket for state
resource "aws_s3_bucket" "terraform_state" {
  bucket = "my-terraform-state"
  
  lifecycle {
    prevent_destroy = true
  }
}

resource "aws_s3_bucket_versioning" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id
  
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id
  
  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "AES256"
    }
  }
}

# DynamoDB for state locking
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"
  
  attribute {
    name = "LockID"
    type = "S"
  }
}
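
State files can contain sensitive values, so it is worth making sure the state bucket can never become public. One way to do that, sketched here as an addition to the resources above rather than something the original setup is known to include:

```hcl
# Block every form of public access to the state bucket
resource "aws_s3_bucket_public_access_block" "terraform_state" {
  bucket = aws_s3_bucket.terraform_state.id

  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```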

Workflow and Best Practices

Development Workflow

# 1. Initialize
terraform init

# 2. Format code
terraform fmt -recursive

# 3. Validate
terraform validate

# 4. Plan changes
terraform plan -out=tfplan

# 5. Review plan
terraform show tfplan

# 6. Apply changes
terraform apply tfplan

# 7. Verify
terraform show

Using Workspaces

# Create workspace
terraform workspace new dev
terraform workspace new staging
terraform workspace new prod

# List workspaces
terraform workspace list

# Switch workspace
terraform workspace select prod

# Show current workspace
terraform workspace show
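
Inside the configuration, the selected workspace is available as terraform.workspace. A minimal sketch of using it to size an environment, assuming the workspace names created above; the per-workspace counts are illustrative:

```hcl
locals {
  # Hypothetical per-workspace sizing
  instance_count_by_workspace = {
    dev     = 1
    staging = 2
    prod    = 3
  }

  # Fall back to 1 for any other workspace, including "default"
  instance_count = lookup(local.instance_count_by_workspace, terraform.workspace, 1)
}
```

Workspaces share one configuration and one backend, so they are an alternative to the directory-per-environment layout shown earlier rather than something to layer on top of it.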

Import Existing Resources

# Import existing VPC
terraform import aws_vpc.main vpc-12345678

# Import EC2 instance
# Quote the address so the shell (zsh in particular) does not expand the brackets
terraform import 'aws_instance.app[0]' i-1234567890abcdef0
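
On Terraform 1.5 or later (newer than the 1.3.0 pinned in this post), the same imports can be declared in configuration and reviewed through the normal plan and apply cycle. A sketch, assuming matching resource blocks for aws_vpc.main and aws_instance.app already exist:

```hcl
# Requires Terraform >= 1.5
import {
  to = aws_vpc.main
  id = "vpc-12345678"
}

import {
  to = aws_instance.app[0]
  id = "i-1234567890abcdef0"
}
```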

Advanced Patterns

Dynamic Blocks

resource "aws_security_group" "app" {
  name   = "app-sg"
  vpc_id = var.vpc_id
  
  dynamic "ingress" {
    for_each = var.ingress_rules
    content {
      from_port   = ingress.value.from_port
      to_port     = ingress.value.to_port
      protocol    = ingress.value.protocol
      cidr_blocks = ingress.value.cidr_blocks
    }
  }
}
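
The dynamic block iterates over var.ingress_rules, whose declaration is not shown. A plausible definition is sketched below; the default rules mirror the HTTP and HTTPS rules from the EC2 module and are otherwise an assumption:

```hcl
variable "ingress_rules" {
  description = "Ingress rules to render into the security group"
  type = list(object({
    from_port   = number
    to_port     = number
    protocol    = string
    cidr_blocks = list(string)
  }))

  default = [
    { from_port = 80, to_port = 80, protocol = "tcp", cidr_blocks = ["0.0.0.0/0"] },
    { from_port = 443, to_port = 443, protocol = "tcp", cidr_blocks = ["0.0.0.0/0"] },
  ]
}
```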

Conditional Resources

resource "aws_instance" "bastion" {
  count = var.environment == "prod" ? 1 : 0
  
  ami           = data.aws_ami.amazon_linux_2.id
  instance_type = "t3.micro"
  subnet_id     = var.public_subnet_id
}
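
Because count turns the bastion into a list of zero or one instances, anything that refers to it has to handle the empty case. A sketch of an output that does so with the built-in one() function (the output name is hypothetical):

```hcl
output "bastion_public_ip" {
  description = "Public IP of the bastion host, or null outside prod"
  # one() returns the single element of the list, or null if the list is empty
  value = one(aws_instance.bastion[*].public_ip)
}
```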

For Each

resource "aws_s3_bucket" "buckets" {
  for_each = toset(var.bucket_names)
  
  bucket = "${var.project_name}-${each.value}"
  
  tags = {
    Name = each.value
  }
}
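
With for_each, each bucket is addressed by its key rather than by a numeric index, so removing one name from the list does not shift the others. A sketch of the input variable and an output built from the resulting map; the bucket names and output name are illustrative assumptions:

```hcl
variable "bucket_names" {
  description = "Suffixes of the buckets to create"
  type        = list(string)
  default     = ["logs", "assets", "backups"]
}

# Map of bucket suffix => ARN, e.g. bucket_arns["logs"]
output "bucket_arns" {
  value = { for name, bucket in aws_s3_bucket.buckets : name => bucket.arn }
}
```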

CI/CD Integration

GitHub Actions

# .github/workflows/terraform.yml
name: Terraform

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  terraform:
    runs-on: ubuntu-latest

    # Credentials are set once for the whole job instead of per step
    env:
      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}

    steps:
      - uses: actions/checkout@v4

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.3.0

      - name: Terraform Init
        run: terraform init

      - name: Terraform Format
        run: terraform fmt -check -recursive

      - name: Terraform Validate
        run: terraform validate

      # Save the plan so the apply step executes exactly what was planned
      - name: Terraform Plan
        run: terraform plan -no-color -out=tfplan

      # Apply only on pushes to main, never on pull requests
      - name: Terraform Apply
        if: github.event_name == 'push' && github.ref == 'refs/heads/main'
        run: terraform apply -input=false tfplan

Results

Before vs After

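| Metric              | Before    | After            | Improvement |
|---------------------|-----------|------------------|-------------|
| Environment Setup   | 4-6 hours | 15 minutes       | 16-24x      |
| Configuration Drift | Common    | None             | 100%        |
| Audit Trail         | None      | Full Git history |             |
| Disaster Recovery   | Days      | Hours            | 8-24x       |
| Team Onboarding     | 2 weeks   | 2 days           | 7x          |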

Cost Savings

  • Time saved: 20 hours/month
  • Reduced errors: 90% fewer misconfigurations
  • Faster deployments: 95% reduction in deployment time

Lessons Learned

1. Start Small

Don’t try to Terraform everything at once. Start with non-critical resources.

2. Use Modules

Reusable modules save time and ensure consistency.

3. State is Critical

Protect your state file: it is Terraform’s source of truth for what it manages, and it can contain secrets in plain text.

4. Plan Before Apply

Always review terraform plan output carefully.

5. Version Everything

Pin provider versions and module versions.
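
Provider pinning is shown in the provider setup earlier. For modules kept in a separate Git repository, the equivalent is to pin the source to a tag. A sketch, where the repository URL and the v1.2.0 tag are hypothetical:

```hcl
module "vpc" {
  # "//vpc" selects a subdirectory of the repository; "ref" pins a tag
  source = "git::https://github.com/example-org/terraform-modules.git//vpc?ref=v1.2.0"

  project_name         = var.project_name
  environment          = var.environment
  vpc_cidr             = var.vpc_cidr
  public_subnet_cidrs  = ["10.0.1.0/24", "10.0.2.0/24"]
  private_subnet_cidrs = ["10.0.10.0/24", "10.0.11.0/24"]
  availability_zones   = ["us-east-1a", "us-east-1b"]
}
```

Modules referenced by a local path, as in the environments shown earlier, are versioned together with the repository that contains them.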

Common Pitfalls

1. Not Using Remote State

Local state lives on a single machine, so a team can neither share it nor lock it.

2. Hardcoding Values

Use variables and data sources instead.
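
As an illustration, the availability zones hardcoded in the module call earlier could be looked up instead. A sketch using the aws_availability_zones data source, assuming the VPC module's inputs are unchanged and that two zones are wanted:

```hcl
# Zones currently available in the provider's region
data "aws_availability_zones" "available" {
  state = "available"
}

module "vpc" {
  source = "../../modules/vpc"

  project_name         = var.project_name
  environment          = var.environment
  vpc_cidr             = var.vpc_cidr
  public_subnet_cidrs  = ["10.0.1.0/24", "10.0.2.0/24"]
  private_subnet_cidrs = ["10.0.10.0/24", "10.0.11.0/24"]

  # First two zones returned for the region, instead of literal names
  availability_zones = slice(data.aws_availability_zones.available.names, 0, 2)
}
```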

3. No State Locking

Always use state locking so that two concurrent applies cannot corrupt the state.

4. Ignoring Drift

Run terraform plan on a schedule to detect drift. With the -detailed-exitcode flag, an exit status of 2 means the plan contains changes.

Conclusion

Terraform transformed our infrastructure management:

Benefits:

  • Version-controlled infrastructure
  • Consistent environments
  • Fast provisioning
  • Easy disaster recovery
  • Better collaboration

Challenges:

  • Learning curve
  • State management complexity
  • Occasional provider bugs

ROI: Terraform paid for itself in the first month through time savings alone.

Recommendation: If you’re managing cloud infrastructure manually, adopt Terraform now. The initial investment is worth it.