Anushreebasics/Store-Orchestration

Store Provisioning Platform

A Kubernetes-native platform for automatically provisioning and managing multiple ecommerce stores (WooCommerce/MedusaJS).

Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│                       User's Browser                          │
└───────────────────────────┬───────────────────────────────────┘
                            │
                            ▼
┌─────────────────────────────────────────────────────────────┐
│                    React Dashboard                            │
│                 (Port 3000 / Ingress)                         │
└───────────────────────────┬───────────────────────────────────┘
                            │
                            ▼
┌─────────────────────────────────────────────────────────────┐
│                     Backend API                               │
│          (Node.js/Express + PostgreSQL)                       │
│                                                               │
│  - Store CRUD operations                                      │
│  - Status tracking                                            │
│  - Validation & rate limiting                                 │
└───────────────────────────┬───────────────────────────────────┘
                            │
                            ▼
┌─────────────────────────────────────────────────────────────┐
│              Provisioning Orchestrator                        │
│           (Kubernetes Controller/Operator)                    │
│                                                               │
│  - Watches store creation requests                            │
│  - Creates namespaces + resources                             │
│  - Manages lifecycle (create/update/delete)                   │
│  - Health checks & status updates                             │
└───────────────────────────┬───────────────────────────────────┘
                            │
                            ▼
┌─────────────────────────────────────────────────────────────┐
│                   Per-Store Resources                         │
│                  (Isolated Namespaces)                        │
│                                                               │
│  Each store gets:                                             │
│  - Namespace: store-{id}                                      │
│  - Deployment: WordPress+WooCommerce or MedusaJS              │
│  - StatefulSet: MySQL/PostgreSQL                              │
│  - PVC: Persistent storage                                    │
│  - Service: Internal routing                                  │
│  - Ingress: External access                                   │
│  - Secrets: DB credentials, API keys                          │
│  - ConfigMap: Store configuration                             │
│  - ResourceQuota: CPU/Memory limits                           │
└─────────────────────────────────────────────────────────────┘

Technology Stack

Platform Components

  • Dashboard: React + Vite + TailwindCSS
  • Backend API: Node.js + Express + TypeScript
  • Database: PostgreSQL (stores metadata)
  • Provisioning: Node.js controller using k8s client
  • Container Registry: Docker Hub (or private registry)

Store Engines

  • WooCommerce: WordPress + WooCommerce + MySQL
  • MedusaJS: Node.js + PostgreSQL + Redis

Infrastructure

  • Local: Kind/k3d/Minikube
  • Production: k3s on VPS
  • Ingress: NGINX Ingress Controller
  • Storage: Local Path Provisioner (local) / Longhorn/NFS (prod)
  • Deployment: Helm 3

System Components

1. Backend API

Responsibilities:

  • Store CRUD operations via REST API
  • Authentication & authorization
  • Rate limiting & abuse prevention
  • Store status tracking
  • Database operations (PostgreSQL)

API Endpoints

Store Management

POST   /api/stores              - Create new store
GET    /api/stores              - List all stores (requires x-user-id header)
GET    /api/stores/:id          - Get store details
DELETE /api/stores/:id          - Delete store (requires x-user-id header)
GET    /api/health              - Health check
GET    /api/metrics             - Platform metrics (stores count, status breakdown, etc.)

Example Requests

# Create store
curl -X POST http://api.127.0.0.1.nip.io/api/stores \
  -H "Content-Type: application/json" \
  -H "x-user-id: user123" \
  -d '{"name": "my-store", "engine": "woocommerce"}'

# List stores
curl http://api.127.0.0.1.nip.io/api/stores \
  -H "x-user-id: user123"

# Get metrics
curl http://api.127.0.0.1.nip.io/api/metrics

# Delete store
curl -X DELETE http://api.127.0.0.1.nip.io/api/stores/{store-id} \
  -H "x-user-id: user123"

2. Provisioning Orchestrator

Responsibilities:

  • Watch for new store creation requests
  • Generate Kubernetes resources from templates
  • Apply resources using k8s API
  • Monitor provisioning status
  • Handle failures & retries
  • Cleanup on deletion

Flow:

  1. Backend creates store record (status: Provisioning)
  2. Orchestrator picks up request
  3. Creates namespace store-{id}
  4. Applies ResourceQuota & LimitRange
  5. Deploys database (StatefulSet + PVC)
  6. Deploys store application (Deployment)
  7. Creates Service & Ingress
  8. Runs init Jobs (DB setup, sample data)
  9. Polls readiness probes
  10. Updates status to Ready/Failed
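
The flow above can be sketched as a sequence of idempotent "ensure" steps. Everything below (the ResourceClient interface, the ensure helper, the resource names) is illustrative, not the orchestrator's actual API; the real controller drives the Kubernetes API via the Node.js client.

```typescript
// Illustrative sketch of the orchestrator's reconcile flow.
type StoreStatus = "Provisioning" | "Ready" | "Failed";

interface ResourceClient {
  exists(kind: string, name: string): boolean;
  create(kind: string, name: string): void;
  isReady(name: string): boolean;
}

// Idempotent helper: create only if absent ("checks if resources
// exist before creating").
function ensure(client: ResourceClient, kind: string, name: string): void {
  if (!client.exists(kind, name)) client.create(kind, name);
}

function reconcileStore(client: ResourceClient, id: string): StoreStatus {
  const ns = `store-${id}`;
  ensure(client, "Namespace", ns);
  ensure(client, "ResourceQuota", `${ns}/quota`);
  ensure(client, "StatefulSet", `${ns}/db`);   // database + PVC
  ensure(client, "Deployment", `${ns}/app`);   // store application
  ensure(client, "Service", `${ns}/svc`);
  ensure(client, "Ingress", `${ns}/ingress`);
  ensure(client, "Job", `${ns}/init`);         // DB setup, sample data
  return client.isReady(ns) ? "Ready" : "Provisioning";
}

// In-memory fake, just to show the idempotency property.
class FakeClient implements ResourceClient {
  created = new Set<string>();
  exists(kind: string, name: string) { return this.created.has(`${kind}:${name}`); }
  create(kind: string, name: string) { this.created.add(`${kind}:${name}`); }
  isReady(_name: string) { return true; }
}

const client = new FakeClient();
const status = reconcileStore(client, "abc123");
console.log(status, client.created.size); // Ready 7
```

Running reconcileStore a second time creates nothing new, which is what makes retries after partial failures safe.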

3. React Dashboard

Features:

  • View all stores in a table/grid
  • Create new store form (select engine: WooCommerce/Medusa)
  • Store status indicators
  • Store URLs with direct links
  • Delete store with confirmation
  • Real-time status updates (polling/WebSocket)
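
The polling variant of the status updates can be sketched as a loop that stops once the store leaves "Provisioning". The fetcher here is injected so the loop logic is visible on its own; the real dashboard would call GET /api/stores/:id over HTTP.

```typescript
// Illustrative dashboard polling sketch (names are not the actual
// dashboard API).
type Status = "Provisioning" | "Ready" | "Failed";

function pollUntilSettled(fetchStatus: () => Status, maxPolls: number): Status {
  let status: Status = "Provisioning";
  for (let i = 0; i < maxPolls && status === "Provisioning"; i++) {
    status = fetchStatus(); // real dashboard: HTTP call every few seconds
  }
  return status;
}

// Stubbed backend: two "Provisioning" responses, then "Ready".
const responses: Status[] = ["Provisioning", "Provisioning", "Ready"];
let call = 0;
const final = pollUntilSettled(() => responses[call++], 10);
console.log(final, call); // Ready 3
```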

Getting Started

Prerequisites

  • Docker Desktop with Kubernetes enabled, OR
  • Kind/k3d/Minikube installed
  • Helm 3.x
  • kubectl
  • Node.js 20+
  • Git

Quick Start (Local)

# 1. Clone repository
git clone <repo-url>
cd urumi

# 2. Start local Kubernetes cluster using Kind
./scripts/setup-local-cluster.sh
# This creates a Kind cluster named 'store-platform' (kubectl context: kind-store-platform)

# 3. Install NGINX Ingress Controller
kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/kind/deploy.yaml

# Wait for ingress controller to be ready
kubectl wait --namespace ingress-nginx \
  --for=condition=ready pod \
  --selector=app.kubernetes.io/component=controller \
  --timeout=90s

# 4. Set up local DNS (using nip.io - no hosts file needed)
# nip.io automatically resolves *.127.0.0.1.nip.io to 127.0.0.1
# No /etc/hosts entries required!

# 5. Install platform using Helm
helm install store-platform ./helm/store-platform \
  -f ./helm/store-platform/values-local.yaml \
  --namespace platform \
  --create-namespace

# Note: the Kind cluster is named 'store-platform'; Kind prefixes contexts,
# so access it with: kubectl config use-context kind-store-platform

# 6. Wait for platform to be ready
kubectl wait --namespace platform \
  --for=condition=ready pod \
  --selector=app=backend \
  --timeout=300s

# 7. Access dashboard
# Open http://dashboard.127.0.0.1.nip.io in browser
# API available at: http://api.127.0.0.1.nip.io

Production Deployment (k3s on VPS)

# 1. Install k3s on VPS
curl -sfL https://get.k3s.io | sh -

# 2. Copy kubeconfig to your workstation
sudo cat /etc/rancher/k3s/k3s.yaml
# In the copied file, replace the server address (127.0.0.1) with your VPS IP

# 3. Install platform
helm install store-platform ./helm/store-platform \
  -f ./helm/store-platform/values-prod.yaml \
  --namespace platform \
  --create-namespace \
  --set domain=yourdomain.com \
  --set ingress.tls.enabled=true

# 4. Set up DNS records
# Point *.yourdomain.com to your VPS IP

# 5. Install cert-manager (optional but recommended)
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.14.0/cert-manager.yaml

Creating a Store

Via Dashboard

  1. Open http://dashboard.127.0.0.1.nip.io
  2. Click "Create New Store"
  3. Enter store name
  4. Select engine (WooCommerce/Medusa)
  5. Click "Create"
  6. Wait for status to become "Ready" (~2-3 minutes)
  7. Click store URL to access storefront

Via API

curl -X POST http://api.127.0.0.1.nip.io/api/stores \
  -H "Content-Type: application/json" \
  -H "x-user-id: test-user" \
  -d '{
    "name": "my-store",
    "engine": "woocommerce"
  }'

Placing an Order (Definition of Done)

WooCommerce

  1. Open storefront URL: http://store-{id}.127.0.0.1.nip.io
  2. Browse products (sample products pre-loaded: Blue Shirt $29.99, Red Hat $14.99)
  3. Click "Add to Cart" on any product
  4. Go to Cart → Proceed to Checkout
  5. Fill in billing details (test data)
  6. Select "Cash on Delivery" payment method (enabled by default)
  7. Click "Place Order"
  8. Verify order created in admin: http://store-{id}.127.0.0.1.nip.io/wp-admin
    • Username: admin
    • Password: admin

MedusaJS

  1. Open storefront URL: http://store-{id}.127.0.0.1.nip.io
  2. Browse products (sample products pre-loaded)
  3. Add product to cart
  4. Proceed to checkout
  5. Fill in shipping/payment details (test mode)
  6. Complete order
  7. Verify in admin: http://store-{id}.127.0.0.1.nip.io/admin
    • Login with default credentials

Deleting a Store

Via Dashboard

  1. Click "Delete" button on store card
  2. Confirm deletion
  3. All resources cleaned up within 30s

Via API

curl -X DELETE http://api.127.0.0.1.nip.io/api/stores/{store-id} \
  -H "x-user-id: test-user"

Cleanup includes:

  • Namespace deletion (cascading delete of all resources)
  • PVC removal
  • Database record deletion
  • Ingress rule cleanup

Project Structure

urumi/
├── README.md
├── helm/
│   ├── store-platform/          # Main platform chart
│   │   ├── Chart.yaml
│   │   ├── values.yaml
│   │   ├── values-local.yaml
│   │   ├── values-prod.yaml
│   │   ├── templates/
│   │   │   ├── backend/
│   │   │   ├── orchestrator/
│   │   │   ├── dashboard/
│   │   │   ├── postgresql/
│   │   │   └── ingress/
│   │   └── charts/              # Sub-charts
│   └── store-instance/          # Store instance chart
│       ├── Chart.yaml
│       ├── values.yaml
│       └── templates/
│           ├── woocommerce/
│           └── medusa/
├── backend/                     # Node.js API
│   ├── src/
│   │   ├── controllers/
│   │   ├── models/
│   │   ├── routes/
│   │   ├── services/
│   │   └── index.ts
│   ├── package.json
│   ├── tsconfig.json
│   └── Dockerfile
├── orchestrator/                # Provisioning controller
│   ├── src/
│   │   ├── controllers/
│   │   ├── templates/
│   │   ├── reconciler.ts
│   │   └── index.ts
│   ├── package.json
│   └── Dockerfile
├── dashboard/                   # React frontend
│   ├── src/
│   │   ├── components/
│   │   ├── pages/
│   │   ├── services/
│   │   └── App.tsx
│   ├── package.json
│   ├── vite.config.ts
│   └── Dockerfile
├── scripts/
│   ├── setup-local-cluster.sh
│   ├── test-store-creation.sh
│   ├── upgrade-platform.sh          # Automated platform upgrade with checks
│   ├── rollback-platform.sh         # Automated platform rollback
│   ├── upgrade-store-image.sh       # Upgrade individual store images
│   └── demo-upgrade-rollback.sh     # Interactive demo of upgrade/rollback
└── docs/
    ├── ARCHITECTURE.md              # System design & trade-offs
    ├── IMPLEMENTATION_DETAILS.md    # Comprehensive implementation guide

Development Workflow

1. Local Development Setup

# Backend
cd backend
npm install
npm run dev

# Orchestrator
cd orchestrator
npm install
npm run dev

# Dashboard
cd dashboard
npm install
npm run dev

2. Build Docker Images

# Backend
docker build -t store-platform-backend:latest ./backend

# Orchestrator
docker build -t store-platform-orchestrator:latest ./orchestrator

# Dashboard
docker build -t store-platform-dashboard:latest ./dashboard

3. Load Images to Kind (local testing)

kind load docker-image store-platform-backend:latest --name store-platform
kind load docker-image store-platform-orchestrator:latest --name store-platform
kind load docker-image store-platform-dashboard:latest --name store-platform

4. Upgrade Helm Release

helm upgrade store-platform ./helm/store-platform \
  -f ./helm/store-platform/values-local.yaml \
  --namespace platform

Key Design Decisions

Namespace-per-Store Isolation

  • Each store gets its own namespace: store-{uuid}
  • Provides resource isolation, security boundaries
  • Enables easy cleanup via namespace deletion
  • Allows per-store ResourceQuota & NetworkPolicy
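
Deriving the namespace from a store id has one constraint worth noting: Kubernetes namespace names must be valid DNS-1123 labels (lowercase alphanumerics and hyphens, at most 63 characters). A minimal sketch of that derivation (the function name and exact sanitization rules are illustrative):

```typescript
// Illustrative: map a store id to a DNS-1123-safe namespace name.
function storeNamespace(id: string): string {
  const label = `store-${id.toLowerCase().replace(/[^a-z0-9-]/g, "-")}`;
  // Enforce the 63-char label limit and strip any trailing hyphen.
  return label.slice(0, 63).replace(/-+$/, "");
}

console.log(storeNamespace("3F9A-01")); // store-3f9a-01
```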

Helm for Deployment

  • Platform chart deploys API/orchestrator/dashboard/DB
  • Store instance chart used by orchestrator to provision stores
  • Values files handle local vs prod differences
  • Enables version control and rollback

Idempotency & Failure Handling

  • Each store is assigned a UUID before any K8s resources are created
  • Orchestrator checks if resources exist before creating
  • Failed provisioning marked with error status
  • Retry mechanism with exponential backoff
  • Timeout after 10 minutes → marked as Failed
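
The backoff-plus-timeout policy above can be expressed in a few lines. The base delay and cap below are illustrative constants, not the orchestrator's actual tuning; only the 10-minute overall timeout comes from the design.

```typescript
// Illustrative retry/backoff policy sketch.
const BASE_DELAY_MS = 1_000;              // first retry after 1s (assumed)
const MAX_DELAY_MS = 60_000;              // cap individual delays at 60s (assumed)
const PROVISION_TIMEOUT_MS = 10 * 60_000; // give up after 10 minutes

// Delay before retry attempt n (0-based), doubling each time.
function backoffDelay(attempt: number): number {
  return Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
}

// Has the whole provisioning run exceeded its budget?
function timedOut(startedAtMs: number, nowMs: number): boolean {
  return nowMs - startedAtMs >= PROVISION_TIMEOUT_MS;
}

console.log(backoffDelay(0), backoffDelay(3), backoffDelay(10));
// 1000 8000 60000
```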

Database Choices

  • Platform DB (PostgreSQL): Stores metadata, user data, audit logs
  • WooCommerce DB (MySQL): Standard for WordPress
  • MedusaJS DB (PostgreSQL): Required by Medusa + Redis for cache

Security Considerations

  • RBAC: ✅ Orchestrator ServiceAccount with ClusterRole (least-privilege permissions)
    • Namespace management, Deployment/StatefulSet/Job operations
    • NetworkPolicy creation and management
    • Pod log access for debugging
  • Secrets: ✅ Generated per-store, stored in Kubernetes Secrets
  • NetworkPolicy: ✅ Implemented - Deny-by-default, allow only required traffic
    • Backend: Ingress from nginx, egress to PostgreSQL/Redis/DNS
    • Dashboard: Ingress from nginx, egress to DNS
    • Orchestrator: Ingress from backend, egress to K8s API/Redis/DNS
  • Container security: ✅ Implemented
    • Non-root users (UID 1000 for backend/orchestrator, UID 101 for dashboard)
    • Read-only root filesystem (dashboard with tmpfs volumes)
    • Dropped all capabilities
    • Seccomp profile: RuntimeDefault
  • No secrets in source code: ✅ All secrets in Kubernetes Secret objects or environment variables

Local vs Production Differences (Helm Values)

Aspect       Local (values-local.yaml)      Production (values-prod.yaml)
Domain       *.127.0.0.1.nip.io (nip.io)    *.yourdomain.com (DNS)
Ingress      HTTP only                      HTTPS with cert-manager
Storage      Kind local storage             Longhorn/NFS/Cloud PV
Replicas     Min 1 (HPA enabled)            Min 2+ for HA
Resources    Minimal limits                 Production-grade limits
Secrets      Auto-generated                 External secrets manager
Monitoring   Optional                       Prometheus + Grafana
Backup       None                           Velero/Restic

Testing the Platform

End-to-End Test

# Run automated test
./scripts/test-store-creation.sh

# This script will:
# 1. Create a WooCommerce store via API
# 2. Wait for Ready status
# 3. Access storefront
# 4. Simulate adding product to cart (curl)
# 5. Verify checkout page accessible
# 6. Delete store
# 7. Verify cleanup complete

Upgrade/Rollback Demo

# Interactive demo showing upgrade and rollback workflow
./scripts/demo-upgrade-rollback.sh

# This demo will:
# 1. Show current platform state
# 2. Simulate an upgrade (change backend image)
# 3. Roll back to previous version
# 4. Show revision history

Monitoring & Observability

Platform Metrics

  • Stores created (total, by engine)
  • Provisioning duration (avg, p95, p99)
  • Failed provisioning count
  • Active stores
  • Resource usage per store

Store-Level Events

  • Audit log: store created/deleted, by whom, when
  • Provisioning logs: step-by-step progress
  • Health check failures
  • Resource quota violations

Abuse Prevention

Rate Limiting

  • Max 5 store creations per hour per user/IP
  • Max 10 concurrent stores per user
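
The per-user creation limit can be implemented as a sliding window over creation timestamps. This in-memory sketch only shows the logic; a multi-replica backend would back it with Redis, and the function names here are illustrative.

```typescript
// Illustrative sliding-window limiter: max 5 store creations per
// hour per user.
const WINDOW_MS = 60 * 60 * 1000;
const MAX_CREATES = 5;

const createTimes = new Map<string, number[]>();

function allowCreate(userId: string, nowMs: number): boolean {
  // Keep only timestamps inside the last hour.
  const recent = (createTimes.get(userId) ?? []).filter(
    (t) => nowMs - t < WINDOW_MS,
  );
  if (recent.length >= MAX_CREATES) return false;
  recent.push(nowMs);
  createTimes.set(userId, recent);
  return true;
}

const t0 = 0;
const results = Array.from({ length: 6 }, (_, i) => allowCreate("user123", t0 + i));
console.log(results); // [ true, true, true, true, true, false ]
```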

Resource Quotas

  • Per-store CPU: 2 cores max
  • Per-store memory: 4Gi max
  • Per-store storage: 10Gi max
  • Max PVCs per namespace: 5

Timeouts

  • Provisioning timeout: 10 minutes
  • Store idle timeout: Optional auto-delete after 7 days

Audit Trail

  • All API calls logged with user/IP/timestamp
  • Store creation/deletion events stored in DB
  • Failed attempts tracked for security analysis

Scaling Plan

  • Backend API: ✅ HPA Configured (1-10 replicas, CPU/Memory targets)

    • Stateless, scales horizontally behind Service
    • Multiple replicas handle concurrent requests
  • Dashboard: ✅ HPA Configured (1-5 replicas)

    • Static files, scales via multiple pods
    • All pods serve same content from persistent volume
  • Orchestrator: ✅ HPA Configured (1-5 replicas) + ✅ Queue-based concurrency

    • Leader election (single active provisioner)
    • Queue: Redis + Bull (configurable concurrent provisions)
    • Currently limited to 3 concurrent provisions (configurable via MAX_CONCURRENT_PROVISIONS)
    • Multiple workers in pool share the queue

Handling High Load:

  • ✅ Queue-based provisioning (Bull + Redis)
  • ✅ Worker concurrency control (max 3 concurrent jobs)
  • ✅ Caching ready (Redis available)
  • Per-store isolation ensures one slow store doesn't block others

Upgrade & Rollback

Automated Scripts

# Upgrade platform (with pre-checks and validation)
./scripts/upgrade-platform.sh --local   # For local development
./scripts/upgrade-platform.sh --prod    # For production

# Rollback platform
./scripts/rollback-platform.sh          # Rollback to previous revision
./scripts/rollback-platform.sh 2        # Rollback to specific revision

# Upgrade individual store images
./scripts/upgrade-store-image.sh store-abc123 wordpress wordpress:6.5-php8.2-apache

# Upgrade store instances
./scripts/upgrade-store-image.sh <store-namespace> <deployment> <new-image>

Zero-Downtime Strategy

  • ✅ Implemented: RollingUpdate deployment strategy
  • ✅ Configured readinessProbe and livenessProbe on all components
  • maxUnavailable: 0, maxSurge: 1 ensures zero downtime
  • ✅ Automated scripts validate health before/after upgrades
  • Test upgrades with continuous traffic to verify no dropped requests

Troubleshooting

Store stuck in "Provisioning"

# Check orchestrator logs
kubectl logs -n platform -l app=orchestrator

# Check store namespace events
kubectl get events -n store-{id} --sort-by='.lastTimestamp'

# Check pod status
kubectl get pods -n store-{id}

Store creation fails immediately

  • Check API logs for validation errors
  • Verify ResourceQuota not exceeded
  • Check RBAC permissions for orchestrator
  • Ensure storage class available
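
The validation errors in the first bullet typically come from request checks like the sketch below. This is an assumed shape, not the backend's actual validator: the exact name rules and the accepted engine strings ("woocommerce", "medusa") should be confirmed against the backend source.

```typescript
// Hypothetical request validation sketch for POST /api/stores.
const SUPPORTED_ENGINES = ["woocommerce", "medusa"] as const; // assumed values

function validateStoreRequest(body: { name?: string; engine?: string }): string[] {
  const errors: string[] = [];
  // Store names must survive as DNS-safe namespace/hostname parts.
  if (!body.name || !/^[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?$/.test(body.name)) {
    errors.push("name must be a lowercase DNS label (a-z, 0-9, '-')");
  }
  if (!SUPPORTED_ENGINES.includes(body.engine as any)) {
    errors.push(`engine must be one of: ${SUPPORTED_ENGINES.join(", ")}`);
  }
  return errors;
}

console.log(validateStoreRequest({ name: "my-store", engine: "woocommerce" })); // []
console.log(validateStoreRequest({ name: "My Store", engine: "shopify" }).length); // 2
```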

Cannot access store URL

  • Verify Ingress controller running
  • Check Ingress resource created: kubectl get ingress -n store-{id}
  • Verify DNS/hosts file configured
  • Check Service endpoints: kubectl get endpoints -n store-{id}

Implemented Features ✅

  • WooCommerce store engine (fully functional)
  • NetworkPolicies (deny-by-default security)
  • RBAC for orchestrator (ClusterRole with least-privilege)
  • Horizontal Pod Autoscaling (HPA) for all components
  • Prometheus metrics export (/api/metrics)
  • Security contexts (non-root, read-only filesystem, dropped capabilities)
  • Concurrency controls (max 3 concurrent provisions)
  • Automated product initialization for WooCommerce stores
  • Upgrade & Rollback scripts with automated validation
  • Zero-downtime deployments (RollingUpdate strategy with preStop hooks)
  • Graceful shutdown with connection draining

Next Steps / Roadmap

  • Add second store engine (MedusaJS - template exists but needs testing)
  • Implement TLS with cert-manager
  • Add user authentication (OAuth2/OIDC for dashboard/API)
  • Implement WebSocket for real-time status updates
  • Add backup/restore functionality (Velero)
  • Create Terraform modules for VPS setup
  • Implement store version upgrades (WordPress/WooCommerce updates)
  • Add custom domain mapping per store
  • Implement metrics-server for HPA metric collection
  • Add Grafana dashboards for observability

License

MIT

Contact

For questions, reach out via GitHub issues.
