Creating a Google Kubernetes Engine (GKE) Cluster with Terraform in a Custom VPC: A Comprehensive Guide

In the rapidly evolving landscape of cloud computing and containerization, Google Kubernetes Engine (GKE) stands out as a robust, managed Kubernetes service. When combined with the power of infrastructure as code through Terraform and the flexibility of a custom Virtual Private Cloud (VPC), it offers a formidable solution for modern, scalable applications. This comprehensive guide will walk you through the process of creating a GKE cluster within a custom VPC using Terraform, providing in-depth insights and best practices along the way.

Understanding the Core Components

Before we delve into the implementation details, it's crucial to have a solid grasp of the key components we'll be working with:

Google Kubernetes Engine (GKE)

GKE is Google Cloud's managed Kubernetes service, allowing you to deploy, manage, and scale containerized applications using Google's infrastructure. It provides a highly available and secure platform for running your workloads, with automatic upgrades and built-in security features.

Virtual Private Cloud (VPC)

A VPC is a secure, isolated private cloud computing environment hosted within a public cloud. It provides network isolation, custom IP address ranges, and fine-grained network control, allowing you to define your network topology to suit your specific requirements.

Terraform

Terraform is an open-source infrastructure as code tool created by HashiCorp. It allows you to define and provision infrastructure using a declarative language, enabling version control, collaboration, and reproducibility in your infrastructure management.

Custom VPC

A custom VPC in the context of this guide refers to a VPC tailored to your specific needs, as opposed to using the default VPC provided by Google Cloud. This gives you greater control over your network design and security posture.

Setting Up Your Environment

To embark on this journey of creating a GKE cluster with Terraform in a custom VPC, you'll need to set up your environment properly. Here's what you'll need:

  1. A Google Cloud Platform (GCP) account: If you don't have one, you can sign up for a free trial that includes $300 in credits.

  2. Google Cloud SDK: This set of tools allows you to manage resources and applications hosted on Google Cloud. Install and initialize it on your local machine.

  3. Terraform: Install the latest version of Terraform on your local machine. As of this writing, the latest version is 1.5.x, but be sure to check for the most recent release.

  4. Basic understanding of GKE, VPC, and Terraform concepts: While this guide will walk you through the process, having a foundational knowledge of these technologies will be beneficial.

Creating a GCP Project and Service Account

Before we start writing Terraform code, we need to set up a few things in GCP:

  1. Create a new GCP project through the Google Cloud Console. This will be the project where we'll deploy our resources.

  2. Create a service account with the necessary permissions. This service account will be used by Terraform to interact with GCP APIs. Assign the following roles to this service account:

    • Kubernetes Engine Admin
    • Service Account User
    • Compute Network Admin
  3. Generate and download a JSON key for this service account. This key will be used to authenticate Terraform with GCP.
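
If you prefer the command line, the gcloud commands below sketch these three steps. The project ID, service account name, and key file name are illustrative, so substitute your own values:

# Create the project and the service account Terraform will use
gcloud projects create my-gke-project
gcloud iam service-accounts create terraform-gke \
  --project my-gke-project \
  --display-name "Terraform GKE"

# Grant the three roles listed above
for role in roles/container.admin roles/iam.serviceAccountUser roles/compute.networkAdmin; do
  gcloud projects add-iam-policy-binding my-gke-project \
    --member "serviceAccount:terraform-gke@my-gke-project.iam.gserviceaccount.com" \
    --role "$role"
done

# Generate a JSON key for Terraform to authenticate with
gcloud iam service-accounts keys create terraform-gke-key.json \
  --iam-account terraform-gke@my-gke-project.iam.gserviceaccount.com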

Structuring Your Terraform Project

A well-organized Terraform project is crucial for maintainability and scalability. Here's a recommended structure for your project:

my-gke-tf/
├── modules/
│   ├── vpc/
│   │   ├── main.tf
│   │   ├── variables.tf
│   │   └── outputs.tf
│   └── gke/
│       ├── main.tf
│       ├── variables.tf
│       └── outputs.tf
├── main.tf
├── variables.tf
├── outputs.tf
└── terraform.tfvars

This structure separates our VPC and GKE configurations into modules, promoting reusability and making our code easier to manage.

Implementing the VPC Module

Let's start by implementing the VPC module. In the modules/vpc/main.tf file, we'll define our custom VPC and subnet:

resource "google_compute_network" "vpc" {
  name                    = var.vpc_name
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "subnet" {
  name          = var.subnet_name
  region        = var.region
  network       = google_compute_network.vpc.self_link
  ip_cidr_range = var.subnet_cidr

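  # Secondary ranges GKE uses for pod and service IPs (VPC-native networking)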
  secondary_ip_range {
    range_name    = "pod-range"
    ip_cidr_range = var.pod_cidr
  }

  secondary_ip_range {
    range_name    = "service-range"
    ip_cidr_range = var.service_cidr
  }
}

This configuration creates a VPC with a custom subnet. We've also added secondary IP ranges for pods and services; GKE uses these ranges for VPC-native (alias IP) networking, assigning pod and service IPs directly from the VPC.

In the modules/vpc/variables.tf file, define the corresponding variables:

variable "vpc_name" {
  description = "Name of the VPC"
  type        = string
}

variable "subnet_name" {
  description = "Name of the subnet"
  type        = string
}

variable "region" {
  description = "Region for the subnet"
  type        = string
}

variable "subnet_cidr" {
  description = "CIDR range for the subnet"
  type        = string
}

variable "pod_cidr" {
  description = "CIDR range for pods"
  type        = string
}

variable "service_cidr" {
  description = "CIDR range for services"
  type        = string
}

Finally, in modules/vpc/outputs.tf, we'll define outputs that the GKE module will consume:

output "vpc_self_link" {
  value = google_compute_network.vpc.self_link
}

output "subnet_self_link" {
  value = google_compute_subnetwork.subnet.self_link
}

Creating the GKE Module

Now, let's implement the GKE module. In the modules/gke/main.tf file:

resource "google_container_cluster" "primary" {
  name     = var.cluster_name
  location = var.region

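  # Remove the default node pool right after cluster creation; a dedicated pool is defined below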
  remove_default_node_pool = true
  initial_node_count       = 1

  network    = var.vpc_self_link
  subnetwork = var.subnet_self_link

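  # Use the subnet's secondary ranges for pod and service IPs (VPC-native / alias IP networking)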
  ip_allocation_policy {
    cluster_secondary_range_name  = "pod-range"
    services_secondary_range_name = "service-range"
  }

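  # Nodes receive only private IP addresses; the control plane keeps a public endpoint, restricted below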
  private_cluster_config {
    enable_private_nodes    = true
    enable_private_endpoint = false
    master_ipv4_cidr_block  = var.master_ipv4_cidr_block
  }

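  # Only the CIDR block supplied here may reach the Kubernetes API server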
  master_authorized_networks_config {
    cidr_blocks {
      cidr_block   = var.authorized_ipv4_cidr_block
      display_name = "External Access"
    }
  }

  addons_config {
    http_load_balancing {
      disabled = false
    }
    horizontal_pod_autoscaling {
      disabled = false
    }
  }

  master_auth {
    client_certificate_config {
      issue_client_certificate = false
    }
  }
}

resource "google_container_node_pool" "primary_nodes" {
  name       = "${var.cluster_name}-node-pool"
  location   = var.region
  cluster    = google_container_cluster.primary.name
  node_count = var.node_count

  node_config {
    oauth_scopes = [
      "https://www.googleapis.com/auth/logging.write",
      "https://www.googleapis.com/auth/monitoring",
    ]

    labels = {
      env = var.project_id
    }

    machine_type = "n1-standard-2"
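    # Preemptible VMs reduce cost but can be reclaimed by GCP at any time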
    preemptible  = true
    metadata = {
      disable-legacy-endpoints = "true"
    }
  }

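  # With autoscaling enabled, consider lifecycle { ignore_changes = [node_count] } so Terraform does not fight the autoscaler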
  autoscaling {
    min_node_count = 1
    max_node_count = 5
  }

  management {
    auto_repair  = true
    auto_upgrade = true
  }
}

This configuration creates a private GKE cluster with a node pool. It uses the VPC and subnet we created earlier, sets up IP aliasing for pods and services, and configures authorized networks for accessing the Kubernetes API server.

Define the corresponding variables in variables.tf and outputs in outputs.tf.
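
For reference, here's a minimal sketch of what modules/gke/variables.tf and modules/gke/outputs.tf might look like; the output names are illustrative:

# modules/gke/variables.tf
variable "project_id" {
  description = "GCP project ID"
  type        = string
}

variable "cluster_name" {
  description = "Name of the GKE cluster"
  type        = string
}

variable "region" {
  description = "Region for the cluster"
  type        = string
}

variable "vpc_self_link" {
  description = "Self link of the VPC"
  type        = string
}

variable "subnet_self_link" {
  description = "Self link of the subnet"
  type        = string
}

variable "node_count" {
  description = "Number of nodes per zone in the node pool"
  type        = number
}

variable "master_ipv4_cidr_block" {
  description = "CIDR block for the GKE control plane"
  type        = string
}

variable "authorized_ipv4_cidr_block" {
  description = "CIDR block allowed to reach the Kubernetes API server"
  type        = string
}

# modules/gke/outputs.tf
output "cluster_name" {
  value = google_container_cluster.primary.name
}

output "cluster_endpoint" {
  value = google_container_cluster.primary.endpoint
}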

Bringing It All Together

In the root main.tf, we'll use these modules to create our infrastructure:

module "vpc" {
  source       = "./modules/vpc"
  vpc_name     = var.vpc_name
  subnet_name  = var.subnet_name
  region       = var.region
  subnet_cidr  = var.subnet_cidr
  pod_cidr     = var.pod_cidr
  service_cidr = var.service_cidr
}

module "gke" {
  source                     = "./modules/gke"
  project_id                 = var.project_id
  cluster_name               = var.cluster_name
  region                     = var.region
  vpc_self_link              = module.vpc.vpc_self_link
  subnet_self_link           = module.vpc.subnet_self_link
  node_count                 = var.node_count
  master_ipv4_cidr_block     = var.master_ipv4_cidr_block
  authorized_ipv4_cidr_block = var.authorized_ipv4_cidr_block
}
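
The root module also needs a provider configuration and concrete values for its variables. Here's a minimal sketch, assuming the service account key you downloaded earlier and purely illustrative names and CIDR ranges; remember to declare a matching credentials_file variable, along with the others, in the root variables.tf:

# provider.tf (or at the top of the root main.tf)
provider "google" {
  credentials = file(pathexpand(var.credentials_file))
  project     = var.project_id
  region      = var.region
}

# terraform.tfvars (example values only)
project_id                 = "my-gke-project"
credentials_file           = "~/terraform-gke-key.json"
region                     = "us-central1"
vpc_name                   = "gke-vpc"
subnet_name                = "gke-subnet"
subnet_cidr                = "10.10.0.0/16"
pod_cidr                   = "10.20.0.0/16"
service_cidr               = "10.30.0.0/16"
cluster_name               = "my-gke-cluster"
node_count                 = 1
master_ipv4_cidr_block     = "172.16.0.0/28"
authorized_ipv4_cidr_block = "203.0.113.0/32"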

Deploying Your Infrastructure

With our Terraform configuration complete, we're ready to deploy our infrastructure:

  1. Initialize Terraform:

    terraform init
    
  2. Plan your deployment:

    terraform plan
    
  3. Apply the configuration:

    terraform apply
    

Connecting to Your GKE Cluster

After successful deployment, you'll need to configure kubectl to interact with your new cluster. Run the following command:

gcloud container clusters get-credentials [CLUSTER_NAME] --region [REGION] --project [PROJECT_ID]

Validating Your Deployment

To ensure your cluster is working correctly, deploy a simple application:

kubectl create deployment hello-server --image=gcr.io/google-samples/hello-app:1.0
kubectl expose deployment hello-server --type LoadBalancer --port 80 --target-port 8080
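
Once the LoadBalancer service has been provisioned (this can take a minute or two), retrieve its external IP and browse to it on port 80:

kubectl get service hello-server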

Best Practices and Considerations

As you work with GKE and Terraform, keep these best practices in mind:

  1. Use Terraform workspaces for managing multiple environments (e.g., development, staging, production).

  2. Implement state locking and remote state storage to enable team collaboration and prevent concurrent modifications (see the backend sketch after this list).

  3. Regularly update your GKE and node versions to benefit from the latest features and security patches.

  4. Implement proper IAM and security measures, following the principle of least privilege.

  5. Consider using preemptible nodes for cost optimization, but be aware of their limitations.

  6. Implement comprehensive monitoring and logging solutions to gain visibility into your cluster's performance and health.

  7. Use node auto-provisioning and cluster autoscaler for efficient resource utilization.

  8. Implement network policies to control traffic flow between pods.

  9. Use secrets management solutions like HashiCorp Vault or Google Secret Manager for sensitive data.

  10. Regularly perform security scans and audits of your cluster and applications.
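
For the second point, a Google Cloud Storage backend provides both remote state and state locking. A minimal sketch, assuming a pre-existing, versioned bucket whose name is illustrative:

terraform {
  backend "gcs" {
    bucket = "my-terraform-state-bucket"
    prefix = "gke/state"
  }
}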

Conclusion

Creating a GKE cluster within a custom VPC using Terraform is a powerful approach to infrastructure management. It combines the flexibility and control of a custom network environment with the declarative, version-controlled nature of infrastructure as code.

This guide has walked you through the process of setting up your environment, structuring your Terraform project, implementing VPC and GKE modules, and bringing it all together. By following these steps and best practices, you're well on your way to creating robust, scalable, and maintainable Kubernetes environments on Google Cloud.

Remember, while this guide provides a solid foundation, production environments often require additional considerations around security, scalability, and compliance. Always stay updated with the latest GCP and Kubernetes best practices, and don't hesitate to consult official documentation or seek expert advice when needed.

As you continue your journey with GKE and Terraform, you'll discover even more ways to optimize and enhance your infrastructure. The cloud-native landscape is constantly evolving, offering new tools and techniques to improve your deployments. Embrace continuous learning, and you'll be well-equipped to tackle the challenges of modern application deployment and management.

Happy Terraforming, and may your clusters always be healthy and your deployments smooth!
