The Journey from Monolith to Microservices: A Platform Engineer's Tale
How I transformed a legacy monolithic application into a scalable microservices architecture using Kubernetes, Docker, and modern DevOps practices
After three years of maintaining a monolithic application that had grown increasingly complex and difficult to scale, I decided to embark on a complete architectural transformation. This is the story of how I broke down a legacy system into a modern, scalable microservices architecture using Kubernetes, Docker, and cloud-native technologies.
The Legacy System: A Monolith’s Struggles
Our monolithic application had served us well initially, but as the team and user base grew, cracks began to appear:
- Deployment nightmares - A single bug could bring down the entire system
- Technology lock-in - We were stuck with outdated frameworks
- Scaling bottlenecks - Different parts had different resource needs
- Team conflicts - Multiple teams stepping on each other’s toes
- Testing complexity - Integration tests were slow and brittle
The breaking point came when a simple feature request required changes across 15 different files and took three weeks to implement safely.
The Migration Strategy: Strangler Fig Pattern
Instead of a big-bang rewrite, I adopted the Strangler Fig pattern to gradually replace the monolith:
Phase 1: Containerization
First, I containerized the existing monolith to establish a baseline:
```dockerfile
# Dockerfile for the monolith
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY . .
RUN npm run build
EXPOSE 3000
CMD ["npm", "start"]
```
Phase 2: Extract User Service
I started by extracting the user management functionality:
```go
// user-service/main.go
package main

import (
	"context"
	"encoding/json"
	"net/http"
	"time"

	"github.com/gin-gonic/gin"
	"github.com/go-redis/redis/v8"
	"gorm.io/gorm"
)

type UserService struct {
	db    *gorm.DB
	redis *redis.Client
}

func (s *UserService) GetUser(c *gin.Context) {
	userID := c.Param("id")

	// Check the cache first
	cached, err := s.redis.Get(context.Background(), "user:"+userID).Result()
	if err == nil {
		var user User
		if err := json.Unmarshal([]byte(cached), &user); err == nil {
			c.JSON(http.StatusOK, user)
			return
		}
	}

	// Fall back to the database (string/UUID primary keys need an explicit condition in GORM)
	var user User
	if err := s.db.First(&user, "id = ?", userID).Error; err != nil {
		c.JSON(http.StatusNotFound, gin.H{"error": "User not found"})
		return
	}

	// Cache the result for subsequent reads
	if userJSON, err := json.Marshal(user); err == nil {
		s.redis.Set(context.Background(), "user:"+userID, userJSON, time.Hour)
	}

	c.JSON(http.StatusOK, user)
}
```
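For context, here's a minimal sketch of how the service wires together: open the GORM and Redis connections, register the handler with Gin, and listen on port 8080 (the port the gateway config below routes to). The DSN, Redis address, and the exact shape of the User model are placeholders rather than the real production values.

```go
// user-service/main.go (wiring sketch; connection details and the User model are placeholders)
package main

import (
	"log"
	"time"

	"github.com/gin-gonic/gin"
	"github.com/go-redis/redis/v8"
	"gorm.io/driver/postgres"
	"gorm.io/gorm"
)

// User mirrors the users table owned by this service.
type User struct {
	ID        string    `gorm:"type:uuid;primaryKey;default:gen_random_uuid()" json:"id"`
	Email     string    `json:"email"`
	Name      string    `json:"name"`
	CreatedAt time.Time `json:"created_at"`
	UpdatedAt time.Time `json:"updated_at"`
}

func main() {
	// Postgres via GORM; the DSN is a placeholder.
	db, err := gorm.Open(postgres.Open("host=localhost user=app dbname=users sslmode=disable"), &gorm.Config{})
	if err != nil {
		log.Fatalf("failed to connect to database: %v", err)
	}

	// Redis client used as the read-through cache.
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

	svc := &UserService{db: db, redis: rdb}

	r := gin.Default()
	r.GET("/api/v1/users/:id", svc.GetUser)

	if err := r.Run(":8080"); err != nil {
		log.Fatal(err)
	}
}
```

Keeping the model definition next to the handler makes the service's data ownership explicit: no other service touches the users table directly.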
Phase 3: API Gateway Implementation
To route traffic between the monolith and new services:
```yaml
# api-gateway.yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: api-gateway
spec:
  hosts:
    - api.example.com
  http:
    - match:
        - uri:
            prefix: /api/v1/users
      route:
        - destination:
            host: user-service
            port:
              number: 8080
    - match:
        - uri:
            prefix: /api/v1/orders
      route:
        - destination:
            host: order-service
            port:
              number: 8080
    - route:
        - destination:
            host: monolith
            port:
              number: 3000
```
Service Discovery and Communication
Service Mesh with Istio
I implemented Istio for service-to-service communication:
```yaml
# istio-config.yaml
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: user-service
spec:
  host: user-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        http1MaxPendingRequests: 10
        maxRequestsPerConnection: 2
    # Circuit breaking in Istio is expressed as outlier detection
    outlierDetection:
      consecutiveErrors: 3
      interval: 30s
      baseEjectionTime: 30s
```
Event-Driven Architecture
For loose coupling between services:
```go
// event-publisher.go
import "github.com/confluentinc/confluent-kafka-go/kafka"

type EventPublisher struct {
	producer *kafka.Producer
}

func (p *EventPublisher) PublishUserCreated(user *User) error {
	event := UserCreatedEvent{
		UserID:    user.ID,
		Email:     user.Email,
		CreatedAt: user.CreatedAt,
	}

	topic := "user.events"
	return p.producer.Produce(&kafka.Message{
		TopicPartition: kafka.TopicPartition{Topic: &topic, Partition: kafka.PartitionAny},
		Value:          event.ToJSON(),
	}, nil)
}
```
```go
// event-consumer.go
func (s *OrderService) HandleUserCreated(event UserCreatedEvent) error {
	// Create a local user profile in the order service's own database
	profile := UserProfile{
		UserID: event.UserID,
		Email:  event.Email,
	}
	return s.db.Create(&profile).Error
}
```
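On the consuming side, HandleUserCreated needs a loop that actually pulls messages off the topic. A minimal sketch with the confluent-kafka-go consumer looks like this (imports as in the publisher; the broker address and consumer group id are placeholders):

```go
// event-consumer.go (consumption loop sketch; broker address and group id are placeholders)
func (s *OrderService) ConsumeUserEvents() error {
	consumer, err := kafka.NewConsumer(&kafka.ConfigMap{
		"bootstrap.servers": "kafka:9092",
		"group.id":          "order-service",
		"auto.offset.reset": "earliest",
	})
	if err != nil {
		return err
	}
	defer consumer.Close()

	if err := consumer.SubscribeTopics([]string{"user.events"}, nil); err != nil {
		return err
	}

	for {
		msg, err := consumer.ReadMessage(-1) // block until a message arrives
		if err != nil {
			log.Printf("consumer error: %v", err)
			continue
		}

		var event UserCreatedEvent
		if err := json.Unmarshal(msg.Value, &event); err != nil {
			log.Printf("skipping malformed event: %v", err)
			continue
		}

		if err := s.HandleUserCreated(event); err != nil {
			log.Printf("failed to handle user.created event: %v", err)
		}
	}
}
```

Running this loop in its own goroutine at startup keeps event handling decoupled from the HTTP path.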
Database Strategy: Database per Service
Each service got its own database to ensure data isolation:
User Service Database
```sql
-- user-service/schema.sql
CREATE TABLE users (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    email VARCHAR(255) UNIQUE NOT NULL,
    name VARCHAR(255) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX idx_users_email ON users(email);
```
Order Service Database
```sql
-- order-service/schema.sql
CREATE TABLE orders (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL,
    total DECIMAL(10,2) NOT NULL,
    status VARCHAR(50) NOT NULL,
    created_at TIMESTAMP DEFAULT NOW()
);

CREATE INDEX idx_orders_user_id ON orders(user_id);
```
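On the Go side, these tables map onto GORM models in the order service, together with the user_profiles table that the UserCreated handler writes into. The struct tags below are a sketch of how that mapping might look, not the project's exact definitions:

```go
// order-service/models.go (sketch; struct tags and the user_profiles shape are illustrative)
import "time"

type Order struct {
	ID        string    `gorm:"type:uuid;primaryKey;default:gen_random_uuid()" json:"id"`
	UserID    string    `gorm:"type:uuid;not null;index" json:"user_id"`
	Total     float64   `gorm:"type:decimal(10,2);not null" json:"total"`
	Status    string    `gorm:"type:varchar(50);not null" json:"status"`
	CreatedAt time.Time `json:"created_at"`
}

// UserProfile is the order service's local, read-only copy of user data,
// kept up to date by consuming user.events rather than by calling the user service.
type UserProfile struct {
	UserID string `gorm:"type:uuid;primaryKey" json:"user_id"`
	Email  string `json:"email"`
}
```

There is deliberately no cross-database foreign key from orders to users; the event stream is what keeps the two services' views of a user consistent.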
Monitoring and Observability
Distributed Tracing with Jaeger
```go
// tracing.go
import (
	"io"

	"github.com/opentracing/opentracing-go"
	"github.com/uber/jaeger-client-go"
	jaegerconfig "github.com/uber/jaeger-client-go/config"
)

// initTracing configures Jaeger and registers it as the global OpenTracing tracer.
// The returned closer should be closed on shutdown, not inside this function.
func initTracing() (io.Closer, error) {
	cfg := jaegerconfig.Configuration{
		ServiceName: "user-service",
		Sampler: &jaegerconfig.SamplerConfig{
			Type:  jaeger.SamplerTypeConst,
			Param: 1,
		},
		Reporter: &jaegerconfig.ReporterConfig{
			LogSpans: true,
		},
	}

	tracer, closer, err := cfg.NewTracer()
	if err != nil {
		return nil, err
	}
	opentracing.SetGlobalTracer(tracer)
	return closer, nil
}

func (s *UserService) GetUser(c *gin.Context) {
	span, ctx := opentracing.StartSpanFromContext(c.Request.Context(), "get_user")
	defer span.Finish()

	// Add tags for better trace visibility
	span.SetTag("user.id", c.Param("id"))
	span.SetTag("service.name", "user-service")

	// Pass ctx to downstream calls (cache, database) so their spans become children of get_user
	_ = ctx
	// ... rest of the implementation
}
```
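The ctx returned by StartSpanFromContext is what actually stitches a trace together: downstream calls need to receive it so their spans show up as children of get_user. As a rough sketch (imports as in main.go; the helper name is illustrative, not the service's real code):

```go
// Child span around the cache lookup; ctx carries the parent "get_user" span.
func (s *UserService) getUserFromCache(ctx context.Context, userID string) (*User, error) {
	span, ctx := opentracing.StartSpanFromContext(ctx, "cache.get_user")
	defer span.Finish()

	cached, err := s.redis.Get(ctx, "user:"+userID).Result()
	if err != nil {
		return nil, err
	}

	var user User
	if err := json.Unmarshal([]byte(cached), &user); err != nil {
		return nil, err
	}
	return &user, nil
}
```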
Metrics Collection
```go
// metrics.go
import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

var (
	httpRequestsTotal = promauto.NewCounterVec(
		prometheus.CounterOpts{
			Name: "http_requests_total",
			Help: "Total number of HTTP requests",
		},
		[]string{"method", "endpoint", "status_code", "service"},
	)

	httpRequestDuration = promauto.NewHistogramVec(
		prometheus.HistogramOpts{
			Name:    "http_request_duration_seconds",
			Help:    "Duration of HTTP requests in seconds",
			Buckets: prometheus.DefBuckets,
		},
		[]string{"method", "endpoint", "service"},
	)
)
```
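These collectors only become useful once something observes them on every request. A small Gin middleware along the following lines records both metrics and exposes them on /metrics; it's a sketch rather than the exact middleware the services run:

```go
// metrics-middleware.go (sketch; the service name constant is illustrative)
package main

import (
	"strconv"
	"time"

	"github.com/gin-gonic/gin"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

const serviceName = "user-service"

// prometheusMiddleware records a request counter and a latency histogram per request.
func prometheusMiddleware() gin.HandlerFunc {
	return func(c *gin.Context) {
		start := time.Now()
		c.Next()

		endpoint := c.FullPath() // route template, e.g. /api/v1/users/:id
		status := strconv.Itoa(c.Writer.Status())

		httpRequestsTotal.WithLabelValues(c.Request.Method, endpoint, status, serviceName).Inc()
		httpRequestDuration.WithLabelValues(c.Request.Method, endpoint, serviceName).
			Observe(time.Since(start).Seconds())
	}
}

func registerMetrics(r *gin.Engine) {
	r.Use(prometheusMiddleware())
	r.GET("/metrics", gin.WrapH(promhttp.Handler()))
}
```

Using c.FullPath() keeps the endpoint label bounded to route templates instead of raw URLs, which avoids blowing up Prometheus cardinality.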
Testing Strategy
Contract Testing with Pact
```go
// user-service/contract_test.go
// Uses the pact-go v1 "dsl" package.
func TestUserServiceContract(t *testing.T) {
	// Pact between the order-service consumer and the user-service provider
	pact := dsl.Pact{
		Consumer: "order-service",
		Provider: "user-service",
		Host:     "127.0.0.1",
	}
	defer pact.Teardown()

	pact.
		AddInteraction().
		Given("user exists").
		UponReceiving("a request for user details").
		WithRequest(dsl.Request{
			Method: "GET",
			Path:   dsl.String("/api/v1/users/123"),
		}).
		WillRespondWith(dsl.Response{
			Status: 200,
			Headers: dsl.MapMatcher{
				"Content-Type": dsl.String("application/json"),
			},
			Body: dsl.Match(userResponse),
		})

	err := pact.Verify(func() error {
		// Run the consumer client against the Pact mock server
		client := NewUserServiceClient(fmt.Sprintf("http://localhost:%d", pact.Server.Port))
		user, err := client.GetUser("123")
		assert.NoError(t, err)
		assert.Equal(t, "123", user.ID)
		return nil
	})
	assert.NoError(t, err)
}
```
Integration Testing
```go
// integration_test.go
func TestUserServiceIntegration(t *testing.T) {
	// Setup test environment
	testDB := setupTestDB(t)
	testRedis := setupTestRedis(t)
	defer cleanup(t, testDB, testRedis)

	// Create service instance
	service := NewUserService(testDB, testRedis)

	// Test user creation
	user := &User{
		Email: "test@example.com",
		Name:  "Test User",
	}
	err := service.CreateUser(user)
	assert.NoError(t, err)
	assert.NotEmpty(t, user.ID)

	// Test user retrieval
	retrieved, err := service.GetUser(user.ID)
	assert.NoError(t, err)
	assert.Equal(t, user.Email, retrieved.Email)
}
```
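The setupTestDB and setupTestRedis helpers are elided above. Roughly, they could look like this, using an in-memory SQLite database through GORM and miniredis as a lightweight Redis stand-in (swap in real Postgres and Redis containers if you need full fidelity; the project's actual fixtures may differ):

```go
// testhelpers_test.go (sketch; SQLite and miniredis are assumptions, not the project's actual fixtures)
package main

import (
	"testing"

	"github.com/alicebob/miniredis/v2"
	"github.com/go-redis/redis/v8"
	"gorm.io/driver/sqlite"
	"gorm.io/gorm"
)

// setupTestDB opens an in-memory SQLite database and migrates the User schema.
func setupTestDB(t *testing.T) *gorm.DB {
	t.Helper()
	db, err := gorm.Open(sqlite.Open("file::memory:?cache=shared"), &gorm.Config{})
	if err != nil {
		t.Fatalf("failed to open test database: %v", err)
	}
	if err := db.AutoMigrate(&User{}); err != nil {
		t.Fatalf("failed to migrate schema: %v", err)
	}
	return db
}

// setupTestRedis starts an in-process miniredis server and returns a client for it.
func setupTestRedis(t *testing.T) *redis.Client {
	t.Helper()
	mr := miniredis.RunT(t) // stopped automatically when the test ends
	return redis.NewClient(&redis.Options{Addr: mr.Addr()})
}

func cleanup(t *testing.T, db *gorm.DB, rdb *redis.Client) {
	t.Helper()
	if err := rdb.Close(); err != nil {
		t.Logf("closing redis client: %v", err)
	}
}
```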
Deployment Pipeline
GitOps with ArgoCD
```yaml
# argocd-app.yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: user-service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/company/k8s-manifests
    targetRevision: HEAD
    path: user-service
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
```
CI/CD Pipeline
```yaml
# .github/workflows/user-service.yml
name: User Service CI/CD

on:
  push:
    branches: [main]
    paths: ['user-service/**']

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-go@v4
        with:
          go-version: '1.21'
      - name: Run tests
        run: |
          cd user-service
          go test ./...
      - name: Run contract tests
        run: |
          cd user-service
          go test -tags=contract ./...

  build:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build Docker image
        run: |
          cd user-service
          docker build -t user-service:${{ github.sha }} .
      - name: Push to registry
        # Assumes the runner is already authenticated to the registry
        # (e.g. via a docker/login-action step, omitted here).
        run: |
          docker tag user-service:${{ github.sha }} ${{ secrets.REGISTRY }}/user-service:${{ github.sha }}
          docker push ${{ secrets.REGISTRY }}/user-service:${{ github.sha }}

  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      # Check out the GitOps manifests repo (not the service repo), since the image
      # tag is updated there and picked up by ArgoCD. Pushing to it assumes a token
      # with write access supplied to the checkout action (omitted here).
      - uses: actions/checkout@v4
        with:
          repository: company/k8s-manifests
      - name: Update image tag
        run: |
          sed -i "s|IMAGE_TAG|${{ github.sha }}|g" user-service/deployment.yaml
      - name: Commit changes
        run: |
          git config --local user.email "action@github.com"
          git config --local user.name "GitHub Action"
          git add user-service/deployment.yaml
          git commit -m "Update user-service image to ${{ github.sha }}"
          git push
```
The Results: A Modern Architecture
After 8 months of gradual migration, here’s what we achieved:
Performance Improvements
- 60% reduction in average response time
- 99.9% uptime with automatic failover
- Independent scaling of each service
- Sub-second deployments with zero downtime
Developer Experience
- Faster feature development with team autonomy
- Easier testing with isolated services
- Technology diversity - each team can choose their stack
- Reduced conflicts with clear service boundaries
Operational Benefits
- Granular monitoring with service-level metrics
- Easier debugging with distributed tracing
- Faster incident response with isolated failures
- Cost optimization with right-sized resources
Lessons Learned
1. Start Small, Think Big
Don’t try to migrate everything at once. Start with the least coupled service and build momentum.
2. Data Consistency is Hard
Choose the right consistency model for each use case. Not everything needs to be strongly consistent.
3. Observability is Critical
Invest in monitoring, logging, and tracing from day one. You’ll need it to debug distributed systems.
4. Team Communication is Key
Microservices require excellent communication between teams. Invest in API contracts and documentation.
5. Testing is More Complex
Contract testing and integration testing become crucial in a distributed system.
What’s Next?
The migration journey never truly ends. Current focus areas:
- Service mesh optimization with Istio advanced features
- Event sourcing for better audit trails
- Multi-region deployment for global availability
- Machine learning integration for intelligent scaling
The transformation from monolith to microservices has been challenging but incredibly rewarding. The system is now more resilient, scalable, and maintainable than ever before.
Interested in microservices architecture? I’d love to hear about your experiences and challenges. Connect with me on GitHub or LinkedIn!