aws-eks-infra

AWS CDK application written in Java that provisions an Amazon EKS (Elastic Kubernetes Service) cluster with managed addons, custom Helm charts, observability integration, and node groups for production Kubernetes workloads.


Overview

This CDK application provisions a production-ready Amazon EKS cluster with enterprise-grade features for running containerized workloads. The infrastructure integrates with the Fastish platform services for automated deployment and observability. The architecture follows EKS Best Practices Guide recommendations.

Key Features

| Feature | Description | Reference |
|---|---|---|
| EKS Cluster | Managed Kubernetes control plane with RBAC configuration | EKS User Guide |
| AWS Managed Addons | VPC CNI, EBS CSI, CoreDNS, Kube Proxy, Pod Identity Agent, CloudWatch Container Insights | EKS Add-ons |
| Helm Chart Addons | cert-manager, AWS Load Balancer Controller, Karpenter, CSI Secrets Store | Helm |
| Grafana Cloud Integration | Full observability stack with metrics, logs, and traces | Grafana Cloud |
| Managed Node Groups | Bottlerocket AMIs for enhanced security | Managed Node Groups |
| Node Interruption Handling | SQS queue for graceful node termination | Karpenter Interruption |

Architecture

System Overview

flowchart TB
    subgraph "Control Plane (AWS Managed)"
        API[EKS API Server]
        ETCD[(etcd)]
        CTRL[Controllers]
    end

    subgraph "Data Plane"
        subgraph "Managed Node Group"
            NODE1[Worker Node 1<br/>Bottlerocket]
            NODE2[Worker Node 2<br/>Bottlerocket]
        end

        subgraph "Karpenter Nodes"
            KNODE1[Spot Node]
            KNODE2[On-Demand Node]
        end
    end

    subgraph "Networking"
        VPCCNI[VPC CNI]
        ALB[ALB Controller]
        SG[Security Groups]
    end

    subgraph "Observability"
        OTEL[OpenTelemetry Collector]
        GRAF[Grafana Cloud]
        CW[CloudWatch]
    end

    API --> NODE1
    API --> NODE2
    API --> KNODE1
    API --> KNODE2

    NODE1 --> VPCCNI
    NODE2 --> VPCCNI
    VPCCNI --> ALB

    NODE1 --> OTEL
    NODE2 --> OTEL
    OTEL --> GRAF
    OTEL --> CW

Pod Deployment Flow

sequenceDiagram
    participant User
    participant API as EKS API Server
    participant Scheduler
    participant Karpenter
    participant Node as Worker Node
    participant Pod

    User->>API: kubectl apply -f deployment.yaml
    API->>API: Validate & Store
    API->>Scheduler: Schedule Pods

    alt Capacity Available
        Scheduler->>Node: Bind Pod to Node
    else No Capacity
        Scheduler->>Karpenter: Trigger Provisioning
        Karpenter->>Karpenter: Select Instance Type
        Karpenter->>Node: Launch New Node
        Node->>API: Node Ready
        Scheduler->>Node: Bind Pod to Node
    end

    Node->>Pod: Start Container
    Pod->>Pod: Run Workload
    Pod-->>API: Report Status

Stack Structure

The EKS infrastructure uses a layered architecture with CloudFormation nested stacks:

flowchart TB
    subgraph "DeploymentStack (main)"
        MAIN[Main Stack]
    end

    subgraph "Nested Stacks"
        VPC[VpcNestedStack]
        EKS[EksNestedStack]
        HELM[HelmAddonsNestedStack]
        OBS[ObservabilityNestedStack]
    end

    MAIN --> VPC
    MAIN --> EKS
    MAIN --> HELM
    MAIN --> OBS

    EKS -.->|depends on| VPC
    HELM -.->|depends on| EKS
    OBS -.->|depends on| HELM

Dependency Chain:

  1. VPC is created first (network foundation)
  2. EKS cluster is provisioned with managed addons
  3. Helm addons are deployed after cluster is ready
  4. Observability stack configures telemetry collection
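
The ordering above can be sketched as a small dependency graph. This is an illustration only, not code from this repository: the class `StackOrder` and its methods are hypothetical, and only the stack names come from the diagram. In a CDK app the same ordering would typically be declared on the constructs themselves (for example with `addDependency`) rather than computed by hand.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: models the nested-stack ordering as a dependency graph.
final class StackOrder {

    // Each key depends on the stacks listed in its value (names taken from the diagram).
    static Map<String, List<String>> dependencies() {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("ObservabilityNestedStack", List.of("HelmAddonsNestedStack"));
        deps.put("HelmAddonsNestedStack", List.of("EksNestedStack"));
        deps.put("EksNestedStack", List.of("VpcNestedStack"));
        deps.put("VpcNestedStack", List.of());
        return deps;
    }

    // Depth-first traversal that emits a stack only after everything it depends on.
    static List<String> deploymentOrder(Map<String, List<String>> deps) {
        Set<String> done = new LinkedHashSet<>();
        Set<String> inProgress = new LinkedHashSet<>();
        for (String stack : deps.keySet()) {
            visit(stack, deps, done, inProgress);
        }
        return new ArrayList<>(done);
    }

    private static void visit(String stack, Map<String, List<String>> deps,
                              Set<String> done, Set<String> inProgress) {
        if (done.contains(stack)) {
            return;
        }
        // Seeing a stack again while it is still being resolved means a cycle.
        if (!inProgress.add(stack)) {
            throw new IllegalStateException("Dependency cycle at " + stack);
        }
        for (String dependency : deps.getOrDefault(stack, List.of())) {
            visit(dependency, deps, done, inProgress);
        }
        inProgress.remove(stack);
        done.add(stack);
    }

    public static void main(String[] args) {
        // Prints the stacks in creation order, VPC first.
        System.out.println(deploymentOrder(dependencies()));
    }
}
```

Listing the stacks in reverse in `dependencies()` is deliberate: the result depends on the declared edges, not on insertion order.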

AWS Managed EKS Addons

| Addon | Purpose | Reference |
|---|---|---|
| VPC CNI | Native AWS networking for pods with VPC IP addresses | amazon-vpc-cni-k8s |
| EBS CSI Driver | Persistent volume support using Amazon EBS | aws-ebs-csi-driver |
| CoreDNS | Cluster DNS for Kubernetes service discovery | CoreDNS |
| Kube Proxy | Network proxy for Kubernetes Services | kube-proxy |
| Pod Identity Agent | IAM roles for Kubernetes service accounts | EKS Pod Identity |
| CloudWatch Container Insights | Container metrics and logs collection | Container Insights |

Helm Chart Addons

| Addon | Purpose | Reference |
|---|---|---|
| cert-manager | Automated TLS certificate management | cert-manager Docs |
| AWS Load Balancer Controller | ALB/NLB provisioning for Kubernetes Services | AWS LB Controller |
| Karpenter | Just-in-time node provisioning and autoscaling | Karpenter Docs |
| CSI Secrets Store | Mount secrets from external stores | Secrets Store CSI Driver |

Observability Stack

The cluster integrates with Grafana Cloud for comprehensive observability:

| Component | Purpose | Reference |
|---|---|---|
| Prometheus | Metrics collection and storage | Grafana Mimir |
| Loki | Log aggregation and querying | Grafana Loki |
| Tempo | Distributed tracing | Grafana Tempo |
| Pyroscope | Continuous profiling | Grafana Pyroscope |
| OpenTelemetry Collector | Telemetry data collection and export | OpenTelemetry |

Platform Integration

When deployed through the Fastish platform, this infrastructure integrates with internal platform services:

| Platform Component | Integration Point | Purpose |
|---|---|---|
| Orchestrator | Release pipeline automation | Automated CDK synthesis and deployment via CodePipeline |
| Portal | Subscriber management | Tenant provisioning, cluster access control |
| Network | Shared VPC infrastructure | Cross-stack connectivity for platform services |
| Reporting | Usage metering | Pipeline execution tracking and cost attribution |

These integrations are managed automatically when deploying via the platform's release workflows.


Prerequisites

| Requirement | Version | Installation |
|---|---|---|
| Java | 21+ | SDKMAN |
| Maven | 3.8+ | Maven Download |
| AWS CLI | 2.x | AWS CLI Install |
| AWS CDK CLI | 2.221.0+ | CDK Getting Started |
| kubectl | 1.28+ | kubectl Install |
| Helm | 3.x | Helm Install |
| GitHub CLI | Latest | GitHub CLI |

AWS CDK Bootstrap:

cdk bootstrap aws://<account-id>/<region>

Replace <account-id> with your AWS account ID and <region> with your desired AWS region (e.g., us-west-2). This sets up necessary resources for CDK deployments including an S3 bucket for assets and CloudFormation execution roles. See: CDK Bootstrapping | Bootstrap CLI Reference


Deployment

Step 1: Clone Repositories

gh repo clone fast-ish/cdk-common
gh repo clone fast-ish/aws-eks-infra

Step 2: Build Projects

mvn -f cdk-common/pom.xml clean install
mvn -f aws-eks-infra/pom.xml clean install

Step 3: Configure Deployment

Copy aws-eks-infra/cdk.context.template.json to aws-eks-infra/cdk.context.json, then fill in the parameters listed below.

Required Configuration Parameters:

| Parameter | Description | Example |
|---|---|---|
| :account | AWS account ID (12-digit number) | 123456789012 |
| :region | AWS region for deployment | us-west-2 |
| :domain | Registered domain name (optional for SES) | example.com |
| :environment | Environment name (do not change) | prototype |
| :version | Resource version identifier | v1 |

Notes:

  • :environment and :version map to resource files at aws-eks-infra/src/main/resources/prototype/v1
  • These values determine which configuration templates are loaded during CDK synthesis
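
As a rough illustration of that mapping, the sketch below builds the template directory from the two values. The class `ResourcePaths` and the method `templateRoot` are hypothetical names chosen for this example; the actual lookup is handled by the build process and may differ.

```java
import java.nio.file.Path;

// Illustrative only: shows how :environment and :version select a template directory.
final class ResourcePaths {

    // Returns src/main/resources/<environment>/<version>, relative to the project root.
    static Path templateRoot(String environment, String version) {
        if (environment == null || environment.isBlank()) {
            throw new IllegalArgumentException(":environment must not be empty");
        }
        if (version == null || version.isBlank()) {
            throw new IllegalArgumentException(":version must not be empty");
        }
        return Path.of("src", "main", "resources", environment, version);
    }

    public static void main(String[] args) {
        // With the documented defaults this points at the prototype/v1 templates.
        Path root = templateRoot("prototype", "v1");
        System.out.println(root.resolve("conf.mustache"));
    }
}
```

Changing either value without a matching directory under src/main/resources would leave synthesis with no templates to load, which is presumably why :environment is marked "do not change".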

Step 4: Configure Cluster Access

Add IAM role mappings in cdk.context.json for EKS access entries:

{
  "deployment:eks:administrators": [
    {
      "username": "administrator",
      "role": "arn:aws:iam::000000000000:role/AWSReservedSSO_AdministratorAccess_abc",
      "email": "<administrator-email>"
    }
  ],
  "deployment:eks:users": [
    {
      "username": "user",
      "role": "arn:aws:iam::000000000000:role/AWSReservedSSO_DeveloperAccess_abc",
      "email": "<user-email>"
    }
  ]
}

| Parameter | Description | Reference |
|---|---|---|
| administrators | IAM roles with full cluster admin access | Cluster Admin |
| users | IAM roles with read-only cluster access | RBAC Authorization |
| username | Identifier for the user in Kubernetes RBAC | User Mapping |
| role | AWS IAM role ARN (typically from AWS IAM Identity Center) | IAM Roles |
| email | For identification and traceability | - |
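
A malformed role ARN is an easy mistake to make in these entries, so a pre-deployment check can be useful. The following is a minimal sketch under stated assumptions: `AccessEntryCheck`, `AccessEntry`, and `problems` are hypothetical names, and the regular expression is a simplified approximation of the IAM role ARN format, not an official validator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative only: sanity-checks access entries before they go into cdk.context.json.
final class AccessEntryCheck {

    // Mirrors the three fields used in the JSON example.
    record AccessEntry(String username, String role, String email) {}

    // Simplified shape: arn:<partition>:iam::<12-digit account>:role/<path and name>
    private static final Pattern ROLE_ARN =
            Pattern.compile("arn:aws[a-z-]*:iam::\\d{12}:role/.+");

    // Returns one message per problem found; an empty list means the entries look plausible.
    static List<String> problems(List<AccessEntry> entries) {
        List<String> problems = new ArrayList<>();
        for (AccessEntry entry : entries) {
            if (entry.username() == null || entry.username().isBlank()) {
                problems.add("entry is missing a username");
            }
            if (entry.role() == null || !ROLE_ARN.matcher(entry.role()).matches()) {
                problems.add("not an IAM role ARN: " + entry.role());
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<AccessEntry> entries = List.of(
                new AccessEntry("administrator",
                        "arn:aws:iam::000000000000:role/AWSReservedSSO_AdministratorAccess_abc",
                        null),
                new AccessEntry("user", "arn:aws:iam::123:user/bob", null));
        // Only the second entry is reported: short account ID and a user, not a role.
        problems(entries).forEach(System.out::println);
    }
}
```

The email field is not checked here because it is described as informational only.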

Step 5: Configure Grafana Cloud (Optional)

For observability integration, add Grafana Cloud configuration:

{
  "hosted:eks:grafana:instanceId": "000000",
  "hosted:eks:grafana:key": "glc_xyz",
  "hosted:eks:grafana:lokiHost": "https://logs-prod-000.grafana.net",
  "hosted:eks:grafana:lokiUsername": "000000",
  "hosted:eks:grafana:prometheusHost": "https://prometheus-prod-000-prod-us-west-0.grafana.net",
  "hosted:eks:grafana:prometheusUsername": "0000000",
  "hosted:eks:grafana:tempoHost": "https://tempo-prod-000-prod-us-west-0.grafana.net/tempo",
  "hosted:eks:grafana:tempoUsername": "000000",
  "hosted:eks:grafana:pyroscopeHost": "https://profiles-prod-000.grafana.net:443"
}

Grafana Cloud Setup:

  1. Create Account: Sign up at grafana.com
  2. Create Stack: Navigate to your stack settings
  3. Generate API Key: Create key with required permissions

| Parameter | Location | Description |
|---|---|---|
| instanceId | Stack details page | Unique identifier for your Grafana instance |
| key | API keys section | API key (starts with glc_) |
| lokiHost | Logs > Data Sources > Loki | Endpoint URL for logs |
| prometheusHost | Metrics > Data Sources > Prometheus | Endpoint URL for metrics |
| tempoHost | Traces > Data Sources > Tempo | Endpoint URL for traces |
| pyroscopeHost | Profiles > Connect a Data Source | Endpoint URL for profiling |

Required API Key Permissions: metrics, logs, traces, profiles, alerts, rules (read/write)
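
Before deploying, it can help to confirm that the Grafana values have the expected shape. The sketch below is an assumption-laden example, not part of this project: `GrafanaContextCheck` and `problems` are hypothetical names, and the only rules it applies are the ones stated above (the key starts with glc_, and each host is an HTTPS endpoint URL).

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative only: checks the shape of the hosted:eks:grafana:* context values.
final class GrafanaContextCheck {

    private static final String PREFIX = "hosted:eks:grafana:";
    private static final List<String> HOST_KEYS =
            List.of("lokiHost", "prometheusHost", "tempoHost", "pyroscopeHost");

    // Returns one message per problem; an empty list means nothing obvious is wrong.
    static List<String> problems(Map<String, String> context) {
        List<String> problems = new ArrayList<>();

        String apiKey = context.get(PREFIX + "key");
        if (apiKey == null || !apiKey.startsWith("glc_")) {
            problems.add(PREFIX + "key should start with glc_");
        }

        for (String name : HOST_KEYS) {
            String value = context.get(PREFIX + name);
            if (value == null) {
                problems.add(PREFIX + name + " is missing");
                continue;
            }
            try {
                URI uri = URI.create(value);
                if (!"https".equals(uri.getScheme()) || uri.getHost() == null) {
                    problems.add(PREFIX + name + " should be an https URL");
                }
            } catch (IllegalArgumentException e) {
                problems.add(PREFIX + name + " is not a valid URL");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // An empty context reports the key and all four hosts.
        problems(Map.of()).forEach(System.out::println);
    }
}
```

This only catches formatting slips; it cannot tell whether the key actually carries the permissions listed above.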

See: Grafana Cloud Kubernetes Monitoring

Step 6: Deploy Infrastructure

cd aws-eks-infra

# Preview changes
cdk synth

# Deploy all stacks
cdk deploy

See: CDK Deploy Command | CDK Synth Command

What Gets Deployed:

| Resource Type | Count | Description | Reference |
|---|---|---|---|
| CloudFormation Stacks | 4+ | 1 main + nested stacks | Nested Stacks |
| VPC | 1 | Multi-AZ with public/private subnets | VPC Documentation |
| EKS Cluster | 1 | Kubernetes 1.28+ control plane | EKS Clusters |
| Managed Node Groups | 1+ | Bottlerocket-based worker nodes | Managed Node Groups |
| SQS Queue | 1 | Node interruption handling | SQS Developer Guide |
| IAM Roles | Multiple | Service accounts and node roles | EKS IAM |

Step 7: Access the Cluster

# Update kubeconfig
aws eks update-kubeconfig --name <cluster-name> --region <region>

# Verify connectivity
kubectl get nodes
kubectl get pods -A

See: Connecting to EKS


Configuration Reference

CDK Context Variables

The build process uses Mustache templating to inject context variables into configuration files. See cdk-common for the complete build process documentation.

| Variable | Type | Description |
|---|---|---|
| {{account}} | String | AWS account ID |
| {{region}} | String | AWS region |
| {{environment}} | String | Environment name |
| {{version}} | String | Resource version |
| {{hosted:id}} | String | Unique deployment identifier |
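
To make the substitution step concrete, here is a minimal stand-in for it. It is not the Mustache engine used by the build (see cdk-common for that): `ContextTemplate` and `render` are hypothetical names, and the sketch handles only plain `{{name}}` variables, with no sections, partials, or escaping.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: a tiny {{variable}} substitution, not a full Mustache implementation.
final class ContextTemplate {

    // Matches {{name}} with optional whitespace; names may contain ':' as in hosted:id.
    private static final Pattern VARIABLE =
            Pattern.compile("\\{\\{\\s*([^{}\\s]+)\\s*\\}\\}");

    // Replaces every {{name}} with its context value; fails fast on unknown names.
    static String render(String template, Map<String, String> context) {
        Matcher matcher = VARIABLE.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = context.get(name);
            if (value == null) {
                throw new IllegalArgumentException("No value for {{" + name + "}}");
            }
            // quoteReplacement keeps '$' and '\' in values from being treated specially.
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> context = Map.of(
                "account", "123456789012",
                "region", "us-west-2",
                "hosted:id", "demo");
        System.out.println(
                render("arn:aws:eks:{{region}}:{{account}}:cluster/{{hosted:id}}", context));
    }
}
```

Failing on an unknown variable is a choice made for this sketch; a real Mustache engine typically renders a missing value as an empty string.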

Template Structure

src/main/resources/
└── prototype/
    └── v1/
        ├── conf.mustache           # Main configuration
        ├── eks/
        │   ├── cluster.mustache    # EKS cluster configuration
        │   ├── addons.mustache     # Managed addons
        │   └── nodegroups.mustache # Node group configuration
        ├── helm/
        │   ├── karpenter.mustache  # Karpenter values
        │   └── monitoring.mustache # Grafana stack values
        └── iam/
            └── roles.mustache      # IAM role definitions

Node Management with Karpenter

Karpenter provides just-in-time node provisioning for optimal cost and performance:

| Feature | Description | Reference |
|---|---|---|
| NodePools | Define node requirements and constraints (called Provisioners in older Karpenter releases) | Provisioners |
| Consolidation | Automatically right-size cluster capacity | Consolidation |
| Spot Instances | Cost optimization with Spot capacity | Spot Best Practices |
| Interruption Handling | Graceful node draining on Spot termination | Interruption |

See: Karpenter Best Practices


Security Considerations

| Aspect | Implementation | Reference |
|---|---|---|
| Node AMI | Bottlerocket for minimal attack surface | Bottlerocket |
| Pod Identity | IAM roles for service accounts | Pod Identity |
| Network Policies | VPC CNI for pod-level network isolation | Network Policies |
| Secrets Management | CSI Secrets Store with AWS Secrets Manager | Secrets Store |
| Cluster Access | RBAC with IAM integration | Access Management |

See: EKS Best Practices Guide - Security


Troubleshooting

For common deployment issues and resolutions, see the Troubleshooting Guide.

Quick Diagnostics

# Update kubeconfig
aws eks update-kubeconfig --name <cluster-name> --region <region>

# Check node status
kubectl get nodes -o wide

# Check system pods
kubectl get pods -n kube-system

# Check EKS add-on status
aws eks list-addons --cluster-name <cluster-name>

# Check Karpenter
kubectl get nodepools
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter --tail=50

# Test pod scheduling
kubectl run test --image=nginx --restart=Never && kubectl wait --for=condition=Ready pod/test && kubectl delete pod test

Related Documentation

Platform Documentation

| Resource | Description |
|---|---|
| Fastish Documentation | Main documentation hub |
| Troubleshooting | Common issues and resolutions |
| Deployment Validation | Post-deployment health checks |
| Upgrade Guide | Version upgrade procedures |
| Capacity Planning | Sizing recommendations |
| IAM Permissions | Required IAM policies |
| Network Requirements | CIDR and port requirements |

AWS Documentation

| Resource | Description |
|---|---|
| cdk-common | Shared CDK constructs library |
| EKS User Guide | Official EKS documentation |
| EKS Best Practices | AWS EKS best practices guide |
| EKS Workshop | Hands-on EKS tutorials |
| Kubernetes Documentation | Official Kubernetes docs |
| Karpenter Documentation | Karpenter autoscaler docs |
| Bottlerocket Documentation | Container-optimized OS |
| AWS CDK EKS Module | CDK EKS construct reference |
| Grafana Cloud Docs | Grafana Cloud documentation |
| OpenTelemetry Documentation | Telemetry collection framework |

License

MIT License

For your convenience, you can find the full MIT license text at:
