The Journey from Monolith to Microservices: A Platform Engineer's Tale

How I transformed a legacy monolithic application into a scalable microservices architecture using Kubernetes, Docker, and modern DevOps practices

After three years of maintaining a monolithic application that was growing increasingly complex and difficult to scale, I made the decision to embark on a complete architectural transformation. This is the story of how I broke down a legacy system into a modern, scalable microservices architecture using Kubernetes, Docker, and cloud-native technologies.

The Legacy System: A Monolith’s Struggles

Our monolithic application had served us well initially, but as the team and user base grew, cracks began to appear:

  • Deployment nightmares - A single bug could bring down the entire system
  • Technology lock-in - We were stuck with outdated frameworks
  • Scaling bottlenecks - Different parts had different resource needs
  • Team conflicts - Multiple teams stepping on each other’s toes
  • Testing complexity - Integration tests were slow and brittle

The breaking point came when a simple feature request required changes across 15 different files and took three weeks to implement safely.

The Migration Strategy: Strangler Fig Pattern

Instead of a big-bang rewrite, I adopted the Strangler Fig pattern to gradually replace the monolith:

Phase 1: Containerization

First, I containerized the existing monolith to establish a baseline:

# Dockerfile for the monolith
FROM node:18-alpine

WORKDIR /app
COPY package*.json ./
# Install all dependencies first; the build step usually needs devDependencies
RUN npm ci

COPY . .
RUN npm run build
# Drop devDependencies from the final image
RUN npm prune --omit=dev

EXPOSE 3000
CMD ["npm", "start"]

Phase 2: Extract User Service

I started by extracting the user management functionality:

// user-service/main.go
package main

import (
    "encoding/json"
    "net/http"
    "time"

    "github.com/gin-gonic/gin"
    "github.com/go-redis/redis/v8"
    "gorm.io/gorm"
)

// User mirrors the users table owned by this service.
type User struct {
    ID        string    `json:"id" gorm:"primaryKey"`
    Email     string    `json:"email"`
    Name      string    `json:"name"`
    CreatedAt time.Time `json:"created_at"`
    UpdatedAt time.Time `json:"updated_at"`
}

type UserService struct {
    db    *gorm.DB
    redis *redis.Client
}

func (s *UserService) GetUser(c *gin.Context) {
    userID := c.Param("id")

    // Check the cache first
    cached, err := s.redis.Get(c.Request.Context(), "user:"+userID).Result()
    if err == nil {
        var user User
        if json.Unmarshal([]byte(cached), &user) == nil {
            c.JSON(http.StatusOK, user)
            return
        }
    }

    // Fall back to the database; the primary key is a UUID string,
    // so pass an explicit condition rather than an inline primary key
    var user User
    if err := s.db.First(&user, "id = ?", userID).Error; err != nil {
        c.JSON(http.StatusNotFound, gin.H{"error": "User not found"})
        return
    }

    // Cache the result for an hour
    if userJSON, err := json.Marshal(user); err == nil {
        s.redis.Set(c.Request.Context(), "user:"+userID, userJSON, time.Hour)
    }

    c.JSON(http.StatusOK, user)
}
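
The handler above is only a fragment; for context, a minimal wiring sketch might look like the following. The connection strings and addresses are placeholders rather than the real configuration, and the route and port are chosen to line up with the gateway config shown below.

// main.go, continued (illustrative wiring for the handler above;
// in addition to the imports shown, this needs "log" and "gorm.io/driver/postgres")
func main() {
    // Connect to the service's own Postgres database (placeholder DSN)
    db, err := gorm.Open(postgres.Open("postgres://users:secret@users-db:5432/users"), &gorm.Config{})
    if err != nil {
        log.Fatalf("failed to connect to database: %v", err)
    }

    // Redis acts as a read-through cache for user lookups (placeholder address)
    rdb := redis.NewClient(&redis.Options{Addr: "redis:6379"})

    svc := &UserService{db: db, redis: rdb}

    r := gin.Default()
    r.GET("/api/v1/users/:id", svc.GetUser)

    if err := r.Run(":8080"); err != nil {
        log.Fatal(err)
    }
}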

Phase 3: API Gateway Implementation

To route traffic between the monolith and new services:

# api-gateway.yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: api-gateway
spec:
  hosts:
  - api.example.com
  http:
  - match:
    - uri:
        prefix: /api/v1/users
    route:
    - destination:
        host: user-service
        port:
          number: 8080
  - match:
    - uri:
        prefix: /api/v1/orders
    route:
    - destination:
        host: order-service
        port:
          number: 8080
  - route:
    - destination:
        host: monolith
        port:
          number: 3000

Service Discovery and Communication

Service Mesh with Istio

I implemented Istio for service-to-service communication:

# istio-config.yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: user-service
spec:
  host: user-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 10
        maxRequestsPerConnection: 2
    outlierDetection:        # Istio's circuit breaker: eject hosts that keep failing
      consecutive5xxErrors: 3
      interval: 30s
      baseEjectionTime: 30s

Event-Driven Architecture

For loose coupling between services:

// event-publisher.go
type EventPublisher struct {
    producer *kafka.Producer // confluent-kafka-go producer
}

func (p *EventPublisher) PublishUserCreated(user *User) error {
    event := UserCreatedEvent{
        UserID:    user.ID,
        Email:     user.Email,
        CreatedAt: user.CreatedAt,
    }

    payload, err := json.Marshal(event)
    if err != nil {
        return err
    }

    // Go can't take the address of a string literal, so bind the topic to a variable
    topic := "user.events"
    return p.producer.Produce(&kafka.Message{
        TopicPartition: kafka.TopicPartition{Topic: &topic, Partition: kafka.PartitionAny},
        Value:          payload,
    }, nil)
}

// event-consumer.go
func (s *OrderService) HandleUserCreated(event UserCreatedEvent) error {
    // Create user profile in order service
    profile := UserProfile{
        UserID: event.UserID,
        Email:  event.Email,
    }
    
    return s.db.Create(&profile).Error
}
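
The subscription loop on the consumer side isn't shown above. Here is a minimal sketch, assuming the same confluent-kafka-go client as the producer; the broker address and consumer group ID are illustrative.

// event-consumer-loop.go (sketch; broker address and group ID are placeholders)
import (
    "encoding/json"
    "log"

    "github.com/confluentinc/confluent-kafka-go/v2/kafka" // or .../confluent-kafka-go/kafka for v1
)

func runUserEventsConsumer(svc *OrderService) error {
    consumer, err := kafka.NewConsumer(&kafka.ConfigMap{
        "bootstrap.servers": "kafka:9092",
        "group.id":          "order-service",
        "auto.offset.reset": "earliest",
    })
    if err != nil {
        return err
    }
    defer consumer.Close()

    if err := consumer.SubscribeTopics([]string{"user.events"}, nil); err != nil {
        return err
    }

    for {
        // Block until the next message arrives
        msg, err := consumer.ReadMessage(-1)
        if err != nil {
            log.Printf("consumer error: %v", err)
            continue
        }

        var event UserCreatedEvent
        if err := json.Unmarshal(msg.Value, &event); err != nil {
            log.Printf("skipping malformed event: %v", err)
            continue
        }

        if err := svc.HandleUserCreated(event); err != nil {
            log.Printf("failed to handle user.created event: %v", err)
        }
    }
}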

Database Strategy: Database per Service

Each service got its own database to ensure data isolation:

User Service Database

-- user-service/schema.sql
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email VARCHAR(255) UNIQUE NOT NULL,
    name VARCHAR(255) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX idx_users_email ON users(email);

Order Service Database

-- order-service/schema.sql
CREATE TABLE orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL,
    total DECIMAL(10,2) NOT NULL,
    status VARCHAR(50) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX idx_orders_user_id ON orders(user_id);

Monitoring and Observability

Distributed Tracing with Jaeger

// tracing.go
import (
    "io"
    "log"

    "github.com/opentracing/opentracing-go"
    "github.com/uber/jaeger-client-go"
    jaegerconfig "github.com/uber/jaeger-client-go/config"
)

func initTracing() io.Closer {
    cfg := jaegerconfig.Configuration{
        ServiceName: "user-service",
        Sampler: &jaegerconfig.SamplerConfig{
            Type:  jaeger.SamplerTypeConst,
            Param: 1,
        },
        Reporter: &jaegerconfig.ReporterConfig{
            LogSpans: true,
        },
    }

    tracer, closer, err := cfg.NewTracer()
    if err != nil {
        log.Fatalf("failed to initialize Jaeger tracer: %v", err)
    }
    opentracing.SetGlobalTracer(tracer)

    // Return the closer so the caller can flush spans on shutdown;
    // deferring Close() here would tear the tracer down immediately.
    return closer
}

func (s *UserService) GetUser(c *gin.Context) {
    span, ctx := opentracing.StartSpanFromContext(c.Request.Context(), "get_user")
    defer span.Finish()
    
    // Add tags for better trace visibility
    span.SetTag("user.id", c.Param("id"))
    span.SetTag("service.name", "user-service")
    
    // ... rest of the implementation, passing ctx to the db and cache calls
    // so their child spans join this trace
}
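
One piece the snippets above don't show is how a trace crosses service boundaries. A minimal sketch of propagating the span context on an outgoing HTTP call follows; the helper name and URL parameter are illustrative, not from the original code.

// http-client.go (sketch: carrying the trace context to a downstream service)
import (
    "context"
    "net/http"

    "github.com/opentracing/opentracing-go"
)

func callDownstream(ctx context.Context, url string) (*http.Response, error) {
    span, ctx := opentracing.StartSpanFromContext(ctx, "call_downstream")
    defer span.Finish()

    req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
    if err != nil {
        return nil, err
    }

    // Inject the span context into the request headers so the callee
    // can continue the same trace in Jaeger.
    if err := opentracing.GlobalTracer().Inject(
        span.Context(),
        opentracing.HTTPHeaders,
        opentracing.HTTPHeadersCarrier(req.Header),
    ); err != nil {
        span.LogKV("inject_error", err.Error())
    }

    return http.DefaultClient.Do(req)
}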

Metrics Collection

// metrics.go
import (
    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promauto"
)

var (
    httpRequestsTotal = promauto.NewCounterVec(
        prometheus.CounterOpts{
            Name: "http_requests_total",
            Help: "Total number of HTTP requests",
        },
        []string{"method", "endpoint", "status_code", "service"},
    )
    
    httpRequestDuration = promauto.NewHistogramVec(
        prometheus.HistogramOpts{
            Name:    "http_request_duration_seconds",
            Help:    "Duration of HTTP requests in seconds",
            Buckets: prometheus.DefBuckets,
        },
        []string{"method", "endpoint", "service"},
    )
)
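
These collectors still have to be wired into the request path. A minimal sketch of Gin middleware that records both metrics, assuming the same Gin stack as the user service; the /metrics route shown is the standard Prometheus scrape endpoint.

// metrics-middleware.go (sketch; assumes the Gin router from the user service)
import (
    "strconv"
    "time"

    "github.com/gin-gonic/gin"
    "github.com/prometheus/client_golang/prometheus/promhttp"
)

func PrometheusMiddleware(service string) gin.HandlerFunc {
    return func(c *gin.Context) {
        start := time.Now()
        c.Next()

        // Use the route template (e.g. /api/v1/users/:id) to keep label cardinality low
        endpoint := c.FullPath()
        status := strconv.Itoa(c.Writer.Status())

        httpRequestsTotal.WithLabelValues(c.Request.Method, endpoint, status, service).Inc()
        httpRequestDuration.WithLabelValues(c.Request.Method, endpoint, service).
            Observe(time.Since(start).Seconds())
    }
}

// Wiring it up alongside the scrape endpoint:
//   r.Use(PrometheusMiddleware("user-service"))
//   r.GET("/metrics", gin.WrapH(promhttp.Handler()))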

Testing Strategy

Contract Testing with Pact

// user-service/contract_test.go
// Consumer-side test using the pact-go v1 dsl package
func TestUserServiceContract(t *testing.T) {
    // Spin up a mock provider that records the contract
    pact := &dsl.Pact{
        Consumer: "order-service",
        Provider: "user-service",
        Host:     "127.0.0.1",
    }
    defer pact.Teardown()

    pact.
        AddInteraction().
        Given("user exists").
        UponReceiving("a request for user details").
        WithRequest(dsl.Request{
            Method: "GET",
            Path:   dsl.String("/api/v1/users/123"),
        }).
        WillRespondWith(dsl.Response{
            Status: 200,
            Headers: dsl.MapMatcher{
                "Content-Type": dsl.String("application/json"),
            },
            Body: dsl.Match(userResponse),
        })

    err := pact.Verify(func() error {
        // Point the real client at the Pact mock server
        client := NewUserServiceClient(fmt.Sprintf("http://localhost:%d", pact.Server.Port))
        user, err := client.GetUser("123")
        if err != nil {
            return err
        }

        assert.Equal(t, "123", user.ID)
        return nil
    })

    assert.NoError(t, err)
}
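
The test above produces a pact file describing what order-service expects; the other half is replaying it against a running user-service. A minimal provider-side sketch, assuming pact-go v1's dsl and types packages; the pact file path and the seedTestUser helper are illustrative.

// user-service/provider_test.go (sketch; pact path and seedTestUser are assumptions)
func TestUserServiceProvider(t *testing.T) {
    pact := dsl.Pact{Provider: "user-service"}

    _, err := pact.VerifyProvider(t, types.VerifyRequest{
        // The service is assumed to be running locally for the test
        ProviderBaseURL: "http://localhost:8080",
        // Pact file written by the consumer test above
        PactURLs: []string{"./pacts/order-service-user-service.json"},
        StateHandlers: types.StateHandlers{
            // Arrange the "user exists" provider state before the interaction is replayed
            "user exists": func() error {
                return seedTestUser("123")
            },
        },
    })

    assert.NoError(t, err)
}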

Integration Testing

// integration_test.go
func TestUserServiceIntegration(t *testing.T) {
    // Setup test environment
    testDB := setupTestDB(t)
    testRedis := setupTestRedis(t)
    defer cleanup(t, testDB, testRedis)
    
    // Create service instance
    service := NewUserService(testDB, testRedis)
    
    // Test user creation
    user := &User{
        Email: "test@example.com",
        Name:  "Test User",
    }
    
    err := service.CreateUser(user)
    assert.NoError(t, err)
    assert.NotEmpty(t, user.ID)
    
    // Test user retrieval via the domain-level lookup
    // (named GetUserByID here to avoid clashing with the HTTP handler shown earlier)
    retrieved, err := service.GetUserByID(user.ID)
    assert.NoError(t, err)
    assert.Equal(t, user.Email, retrieved.Email)
}

Deployment Pipeline

GitOps with ArgoCD

# argocd-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: user-service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/company/k8s-manifests
    targetRevision: HEAD
    path: user-service
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
    - CreateNamespace=true

CI/CD Pipeline

# .github/workflows/user-service.yml
name: User Service CI/CD

on:
  push:
    branches: [main]
    paths: ['user-service/**']

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v4
        with:
          go-version: '1.21'
      
      - name: Run tests
        run: |
          cd user-service
          go test ./...
      
      - name: Run contract tests
        run: |
          cd user-service
          go test -tags=contract ./...

  build:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - name: Build Docker image
        run: |
          cd user-service
          docker build -t user-service:${{ github.sha }} .
      
      - name: Push to registry
        # Assumes the runner has already authenticated to the registry
        # (e.g. via a docker/login-action step, omitted here for brevity)
        run: |
          docker tag user-service:${{ github.sha }} ${{ secrets.REGISTRY }}/user-service:${{ github.sha }}
          docker push ${{ secrets.REGISTRY }}/user-service:${{ github.sha }}

  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      # Check out the GitOps repo that ArgoCD watches, not the application repo
      - uses: actions/checkout@v4
        with:
          repository: company/k8s-manifests
          token: ${{ secrets.GITOPS_TOKEN }}  # placeholder: a token with push access to the manifests repo
          path: k8s-manifests

      - name: Update image tag
        # NOTE: a literal IMAGE_TAG placeholder only works for the first deploy;
        # in practice, key the substitution on the image name or use 'kustomize edit set image'
        run: |
          sed -i "s|IMAGE_TAG|${{ github.sha }}|g" k8s-manifests/user-service/deployment.yaml

      - name: Commit changes
        working-directory: k8s-manifests
        run: |
          git config --local user.email "action@github.com"
          git config --local user.name "GitHub Action"
          git add user-service/deployment.yaml
          git commit -m "Update user-service image to ${{ github.sha }}"
          git push

The Results: A Modern Architecture

After 8 months of gradual migration, here’s what we achieved:

Performance Improvements

  • 60% reduction in average response time
  • 99.9% uptime with automatic failover
  • Independent scaling of each service
  • Sub-second deployments with zero downtime

Developer Experience

  • Faster feature development with team autonomy
  • Easier testing with isolated services
  • Technology diversity - each team can choose their stack
  • Reduced conflicts with clear service boundaries

Operational Benefits

  • Granular monitoring with service-level metrics
  • Easier debugging with distributed tracing
  • Faster incident response with isolated failures
  • Cost optimization with right-sized resources

Lessons Learned

1. Start Small, Think Big

Don’t try to migrate everything at once. Start with the least coupled service and build momentum.

2. Data Consistency is Hard

Choose the right consistency model for each use case. Not everything needs to be strongly consistent.
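
One pattern that fits this lesson (a general technique, not something from the original write-up) is the transactional outbox: write the domain row and its event in the same local transaction, and let a relay publish the outbox rows to Kafka, so downstream services converge eventually without distributed transactions. A rough sketch, with the OutboxEvent table and the relay process left as assumptions:

// outbox.go (illustrative sketch; OutboxEvent and the relay are assumptions)
type OutboxEvent struct {
    ID      uint `gorm:"primaryKey"`
    Topic   string
    Payload []byte
    Sent    bool
}

func (s *UserService) CreateUserWithOutbox(user *User) error {
    return s.db.Transaction(func(tx *gorm.DB) error {
        if err := tx.Create(user).Error; err != nil {
            return err
        }

        payload, err := json.Marshal(UserCreatedEvent{
            UserID:    user.ID,
            Email:     user.Email,
            CreatedAt: user.CreatedAt,
        })
        if err != nil {
            return err
        }

        // A separate relay process polls unsent rows and publishes them to Kafka
        return tx.Create(&OutboxEvent{Topic: "user.events", Payload: payload}).Error
    })
}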

3. Observability is Critical

Invest in monitoring, logging, and tracing from day one. You’ll need it to debug distributed systems.

4. Team Communication is Key

Microservices require excellent communication between teams. Invest in API contracts and documentation.

5. Testing is More Complex

Contract testing and integration testing become crucial in a distributed system.

What’s Next?

The migration journey never truly ends. Current focus areas:

  • Service mesh optimization with Istio advanced features
  • Event sourcing for better audit trails
  • Multi-region deployment for global availability
  • Machine learning integration for intelligent scaling

The transformation from monolith to microservices has been challenging but incredibly rewarding. The system is now more resilient, scalable, and maintainable than ever before.

Interested in microservices architecture? I’d love to hear about your experiences and challenges. Connect with me on GitHub or LinkedIn!