🏗️

Production Kubernetes Platform on Bare Metal

Built a complete cloud infrastructure from scratch

infrastructure

October 2024 - Present

The Problem

YUKLID needed a production-grade platform, but cloud costs were prohibitive and I wanted to understand how everything actually works under the hood—not just follow tutorials.

The Solution

Set up a complete Kubernetes cluster on physical servers. Implemented a service mesh (Istio), distributed storage (Ceph), a CI/CD pipeline (Jenkins), monitoring (Prometheus + Grafana), and a security layer with automatic TLS certificates (cert-manager) and centralized auth (Keycloak). This is not a managed service: I built every layer, from networking to observability.

What I Learned

Building infrastructure isn't about memorizing kubectl commands; it's about understanding how distributed systems fail. When MetalLB wouldn't assign IPs, I had to debug networking at the packet level. When Ceph storage crashed, I learned about consensus algorithms the hard way. Best lesson: always monitor what you can't see.
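A minimal sketch of the majority-quorum arithmetic behind the Ceph lesson. Ceph monitors need a strict majority of their members to agree before the cluster can make progress. The function names here are hypothetical helpers for illustration, not part of any Ceph tooling, and the monitor counts are example values rather than the actual cluster layout.

```python
def quorum_size(monitors: int) -> int:
    """Smallest strict majority of `monitors` voting members."""
    if monitors < 1:
        raise ValueError("need at least one monitor")
    return monitors // 2 + 1


def tolerated_failures(monitors: int) -> int:
    """How many members can be lost while a majority still remains."""
    return monitors - quorum_size(monitors)


if __name__ == "__main__":
    # Example monitor counts only; not taken from the real cluster.
    for n in (1, 2, 3, 4, 5):
        print(f"{n} monitors: quorum={quorum_size(n)}, "
              f"can lose {tolerated_failures(n)}")
```

Going from 3 to 4 monitors does not buy any extra fault tolerance, which is why odd monitor counts are the usual recommendation.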

Key Metrics:

  • Deployment Time: reduced from 30 min to 5 min
  • Infrastructure Cost: reduced by 60%
  • System Uptime: 99.95%
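To put the uptime figure in perspective, here is a rough sketch that converts an availability percentage into a downtime budget. The function name and the 30-day and 365-day windows are my own assumptions for illustration; the measurement window behind the 99.95% figure is not stated here.

```python
def downtime_budget_minutes(uptime_percent: float, period_hours: float) -> float:
    """Minutes of allowed downtime for a given availability over a period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return period_hours * 60 * (100 - uptime_percent) / 100


if __name__ == "__main__":
    # Assumed windows: a 30-day month and a 365-day year.
    monthly = downtime_budget_minutes(99.95, 30 * 24)
    yearly = downtime_budget_minutes(99.95, 365 * 24)
    print(f"per 30 days:  {monthly:.1f} min")   # 21.6 min
    print(f"per 365 days: {yearly:.1f} min")    # 262.8 min, about 4.4 hours
```

At 99.95%, a single half-hour outage already exceeds the budget for a 30-day month.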

Key Achievements:

  • Planned and implemented complete networking layer with MetalLB and Calico
  • Set up distributed storage with Ceph for persistent data
  • Built CI/CD pipeline from scratch—developers push, system deploys
  • Implemented service mesh for security, observability, and traffic control
  • Created a monitoring stack that actually helps debug issues

Tech Stack:

Kubernetes, Docker, Istio, Ceph/Rook, Jenkins, Prometheus, Grafana, MetalLB, Calico, Keycloak, cert-manager