About This Project
This project is a Kubernetes Site Reliability Engineering (SRE) platform that demonstrates microservices deployment, observability, testing, and GitOps practices. It showcases cloud-native technologies and practices for running production workloads on Kubernetes.
The platform includes a complete microservices e-commerce application (Online Boutique), monitoring and logging, continuous delivery via GitOps, and automated testing frameworks for reliability assurance.
Online Boutique Microservices
Online Boutique is a Google Cloud microservices demo: a 12-tier e-commerce application whose services are written in different languages (Java, Go, Python, Node.js, C#) and communicate via gRPC and HTTP.
Namespace: online-boutique
Services: frontend, cartservice, checkoutservice, productcatalogservice, recommendationservice, currencyservice, paymentservice, shippingservice, emailservice, adservice, redis-cart
Testing Frameworks
Sanity Test
Automated health check framework that continuously validates the health endpoints of all microservices in the online-boutique namespace. For each microservice it checks connectivity, response time, and availability over the protocol that service exposes (gRPC, HTTP, or TCP).
Namespace: sanity-test
Tests: All microservices in online-boutique namespace
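The connectivity and response-time part of such a check can be sketched as follows. This is a minimal illustration using only the Python standard library, not the project's actual test code; the names CheckResult, check_tcp, and summarize are hypothetical, and the service address in the trailing comment is an assumption about how the services are exposed.

```python
import socket
import time
from dataclasses import dataclass


@dataclass
class CheckResult:
    target: str        # "host:port" that was probed
    ok: bool           # True if the TCP connection was established
    latency_ms: float  # time spent on the attempt, in milliseconds
    error: str = ""    # error text when the attempt failed


def check_tcp(host: str, port: int, timeout: float = 2.0) -> CheckResult:
    """Open a TCP connection and record how long the attempt took."""
    target = f"{host}:{port}"
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass  # connecting is the whole check; close immediately
        elapsed = (time.monotonic() - start) * 1000.0
        return CheckResult(target, True, elapsed)
    except OSError as exc:  # refused, timed out, DNS failure, ...
        elapsed = (time.monotonic() - start) * 1000.0
        return CheckResult(target, False, elapsed, str(exc))


def summarize(results):
    """Return (passed, total) for a batch of check results."""
    return sum(1 for r in results if r.ok), len(results)


# Example call from inside the cluster (address and port are assumptions):
# check_tcp("redis-cart.online-boutique.svc.cluster.local", 6379)
```

A TCP connect only proves that something is listening. gRPC and HTTP services would normally also be probed through their own health endpoints, which this sketch leaves out.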
Availability Test
Continuous availability testing framework that runs periodic tests (every 5 minutes) on critical services (Cart Service and Frontend Service). It provides a dashboard showing test history, success rates, and failure analysis.
Namespace: availability-test
Test Interval: 300 seconds (5 minutes)
Tests: Cart Service, Frontend Service
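The periodic schedule and the success rate shown on the dashboard could be computed along these lines. This is a sketch under stated assumptions: run_periodically and success_rates are hypothetical names, the history format is invented for illustration, and the real checks would be whatever probes the framework runs against the Cart and Frontend services.

```python
import time
from typing import Callable, Dict, List

TEST_INTERVAL_SECONDS = 300  # from the text: one run every 5 minutes


def run_periodically(checks: Dict[str, Callable[[], bool]],
                     runs: int,
                     interval: float = TEST_INTERVAL_SECONDS,
                     sleep: Callable[[float], None] = time.sleep) -> List[dict]:
    """Run every check `runs` times, pausing `interval` seconds between runs."""
    history = []
    for i in range(runs):
        for name, check in checks.items():
            try:
                ok = bool(check())
            except Exception:
                ok = False  # a crashing check counts as a failed test
            history.append({"run": i, "service": name, "ok": ok})
        if i < runs - 1:
            sleep(interval)  # no pause after the final run
    return history


def success_rates(history: List[dict]) -> Dict[str, float]:
    """Return the percentage of passed tests per service."""
    totals: Dict[str, int] = {}
    passed: Dict[str, int] = {}
    for entry in history:
        service = entry["service"]
        totals[service] = totals.get(service, 0) + 1
        passed[service] = passed.get(service, 0) + (1 if entry["ok"] else 0)
    return {s: 100.0 * passed[s] / totals[s] for s in totals}
```

Passing the sleep function as a parameter keeps the loop testable without waiting five minutes per iteration.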
Continuous Delivery (GitOps)
ArgoCD is used for GitOps-based continuous delivery, automatically syncing applications from Git repositories to Kubernetes clusters. All deployments are managed declaratively through Git, ensuring version control, audit trails, and consistent deployments.
ArgoCD UI
GitOps continuous delivery platform
Access: Port-forward required (kubectl port-forward -n argocd svc/argocd-server 8080:443)
URL: https://localhost:8080
Internal URL: https://argocd-server.argocd.svc.cluster.local:443
Namespace: argocd
Applications Managed:
- Microservices Demo (online-boutique)
- Monitoring Stack (Prometheus, Grafana, Loki)
- Availability Test Framework
- Sanity Test Framework
Features:
- Automatic sync from Git repositories
- Self-healing deployments
- Application health monitoring
- Rollback capabilities
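Automatic sync and self-healing are configured per application in an Argo CD Application resource. The sketch below builds such a manifest as a Python dictionary and renders it as JSON, which kubectl also accepts. The field names follow the argoproj.io/v1alpha1 Application schema; the repository URL, path, and the helper names build_application and render are placeholders, not values taken from this project.

```python
import json


def build_application(name, repo_url, path, dest_namespace, revision="HEAD"):
    """Build an Argo CD Application manifest with automated sync enabled."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,  # Git repository holding the manifests
                "path": path,         # directory inside that repository
                "targetRevision": revision,
            },
            "destination": {
                "server": "https://kubernetes.default.svc",  # the local cluster
                "namespace": dest_namespace,
            },
            "syncPolicy": {
                # prune: delete resources that were removed from Git
                # selfHeal: revert manual changes made in the cluster
                "automated": {"prune": True, "selfHeal": True},
                "syncOptions": ["CreateNamespace=true"],
            },
        },
    }


def render(app):
    """Serialize the manifest as JSON (valid input for `kubectl apply -f -`)."""
    return json.dumps(app, indent=2)


# Placeholder repository URL and path, for illustration only.
app = build_application(
    "online-boutique",
    "https://example.com/your-org/your-repo.git",
    "k8s/online-boutique",
    "online-boutique",
)
print(render(app))
```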
Monitoring & Observability
Observability stack providing metrics, logs, and visualization for the entire Kubernetes platform: Prometheus collects metrics, Loki aggregates logs, and Grafana visualizes both.
Prometheus Metrics
Time-series metrics database and monitoring system
Features: Metrics collection, alerting, PromQL queries
Grafana Visualization
Analytics and visualization platform
Features: Dashboards, metrics visualization, log queries
Loki Logs
Log aggregation system (Prometheus-inspired)
Features: Log collection, LogQL queries, log visualization
Access via Grafana Explore (no direct UI)
Prometheus
Namespace: monitoring
What it monitors:
- Kubernetes cluster metrics (nodes, pods, services)
- Application metrics from microservices
- Kube State Metrics (cluster state)
- Node Exporter (host metrics)
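Metrics collected this way can be read back through the Prometheus HTTP API with a PromQL expression. The following sketch builds an instant-query URL and parses a vector response. The /api/v1/query endpoint and the response shape are part of the Prometheus API; the base URL and the helper names instant_query_url and parse_vector are assumptions, since the source does not say how Prometheus is exposed.

```python
import json
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build the URL for a Prometheus instant query."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_vector(body):
    """Extract (labels, value) pairs from an instant-vector response body."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error', 'unknown error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector, got {data['resultType']}")
    # Each sample is {"metric": {...labels}, "value": [timestamp, "value"]}
    return [(s["metric"], float(s["value"][1])) for s in data["result"]]


# Assumed address, e.g. after port-forwarding the Prometheus service locally.
url = instant_query_url("http://localhost:9090", "up")
print(url)
```

The `up` metric is generated by Prometheus for every scrape target, which makes it a convenient first query when checking that targets are being scraped.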
Grafana
Namespace: monitoring
Data Sources: Prometheus (metrics), Loki (logs)
Features:
- Pre-configured dashboards for Kubernetes monitoring
- Custom dashboards for microservices
- Log exploration with LogQL
- Metrics visualization with PromQL
- Alerting and notification rules
Loki
Namespace: monitoring
Log Collection: Promtail (DaemonSet) collects logs from all pods
Access: Via Grafana Explore (Loki data source)
Features:
- Centralized log aggregation from all namespaces
- LogQL query language (similar to PromQL)
- Label-based log indexing
- Log retention and storage
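Label-based indexing means a LogQL query starts with a label selector and then optionally filters line contents. The sketch below builds such a query and the matching URL for Loki's query_range endpoint. The endpoint path and its query, start, end, and limit parameters are part of the Loki HTTP API; the base URL, the `namespace` label (a common Promtail label for Kubernetes pods, assumed here), and the helper names are illustrative.

```python
from urllib.parse import urlencode


def _quote(text):
    """Escape backslashes and double quotes for a LogQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def logql_selector(labels, contains=""):
    """Build a LogQL query: a label selector plus an optional line filter."""
    matchers = ", ".join(f'{k}="{_quote(v)}"' for k, v in sorted(labels.items()))
    query = "{" + matchers + "}"
    if contains:
        query += f' |= "{_quote(contains)}"'  # keep only lines containing the text
    return query


def query_range_url(base_url, logql, start_ns, end_ns, limit=100):
    """Build a Loki range-query URL; start and end are Unix nanoseconds."""
    params = {"query": logql, "start": start_ns, "end": end_ns, "limit": limit}
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{urlencode(params)}"


# Assumed label name and value; the real labels depend on the Promtail config.
query = logql_selector({"namespace": "online-boutique"}, contains="error")
print(query)
```

The same query string can be pasted into Grafana Explore with the Loki data source selected, which is the access path the section describes.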