Skip to content

paperclipinc/paperclip-operator

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

84 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Paperclip Kubernetes Operator

CI Go Report Card License Kubernetes Go

Deploy and manage Paperclip AI agent orchestration instances on Kubernetes with production-grade security, observability, and lifecycle management.

Paperclip is an open-source AI agent orchestration platform. While you can deploy it manually, production Kubernetes deployments involve more than a Deployment and a Service -- you need database provisioning, secret management, persistent storage, health monitoring, network isolation, scaling, backup, and config rollouts, all wired correctly. This operator encodes those concerns into a single Instance custom resource so you can go from zero to production in minutes:

apiVersion: paperclip.inc/v1alpha1
kind: Instance
metadata:
  name: my-paperclip
spec:
  deployment:
    mode: authenticated
  database:
    mode: managed
  auth:
    secretRef:
      name: paperclip-auth
      key: BETTER_AUTH_SECRET
  adapters:
    apiKeysSecretRef:
      name: paperclip-api-keys
  storage:
    persistence:
      enabled: true
      size: 5Gi

The operator reconciles this into a fully managed stack of Kubernetes resources: secured, monitored, and self-healing.


Features

Feature Details
Declarative Single CRD One resource defines the entire stack: StatefulSet, Service, ConfigMap, PVC, ServiceAccount, NetworkPolicy, Ingress, HPA, PDB, and more
Database Managed PostgreSQL Provisions PostgreSQL 17 with auto-generated credentials, data checksums, and graceful shutdown -- or connect to an external database, or use embedded PGlite
Auth Full auth lifecycle Better Auth with OAuth providers (Google, Apple), email verification via Resend, and automatic admin user bootstrap
Secure Hardened by default Non-root, all capabilities dropped, seccomp RuntimeDefault, default-deny NetworkPolicy, minimal RBAC
Observable Built-in metrics 7 Prometheus metrics, ServiceMonitor integration, configurable log levels
Scalable Auto-scaling HPA with CPU/memory targets, PodDisruptionBudgets, topology spread constraints
Smart Probes Mode-aware health checks Automatically uses TCP probes in authenticated mode (where /api/health returns 403)
Storage S3 + Redis S3/MinIO/R2 for multi-replica file storage, managed or external Redis for rate limiting
Backup S3-backed snapshots Scheduled backups with configurable retention, point-in-time restore into new instances
Secrets Encrypted secrets Paperclip's built-in secrets management with master key support and strict mode
Connections OAuth integrations GitHub, GitLab, Slack, and more via the Paperclip connections system
Cloud Sandbox Isolated execution Agent runtimes in isolated Kubernetes pods with persistent workspaces, inference metering proxy, resource tiers, and multi-namespace isolation
Extensible Sidecars & init containers Add custom sidecar containers, init containers, extra volumes, and volume mounts
Auto-Update Registry polling Opt-in digest-based image update detection with automatic rollouts
Plugins Declarative install Install Paperclip plugins via spec.plugins

Architecture

+--------------------------------------------------------------+
|  Instance CR                                                  |
|  (your declarative config)                                    |
+--------------+-----------------------------------------------+
               | watch
               v
+--------------------------------------------------------------+
|  Paperclip Operator                                          |
|  +----------+  +-----------+  +---------------------------+  |
|  | Reconciler|  | Finalizer |  |   Prometheus Metrics      |  |
|  |           |  | (backup   |  |  (reconcile count,        |  |
|  | creates  -->  |  on delete)|  |   duration, phases)      |  |
|  +----------+  +-----------+  +---------------------------+  |
+--------------+-----------------------------------------------+
               | manages
               v
+--------------------------------------------------------------+
|  Managed Resources (per instance)                            |
|                                                              |
|  ServiceAccount    ConfigMap       NetworkPolicy             |
|  PVC               Ingress         PDB                       |
|  HPA               ServiceMonitor  CronJob (backup)          |
|                                                              |
|  StatefulSet                                                 |
|  +--------------------------------------------------------+  |
|  | Paperclip Container (Node.js, port 3100)               |  |
|  +--------------------------------------------------------+  |
|  + custom init containers + custom sidecars                  |
|                                                              |
|  Service (ClusterIP/LoadBalancer/NodePort)                   |
|                                                              |
|  [Managed PostgreSQL StatefulSet + Service + PVC] (optional) |
|  [Managed Redis StatefulSet + Service + PVC]      (optional) |
+--------------------------------------------------------------+

Quick Start

Prerequisites

  • Kubernetes 1.28+
  • Helm 3 (recommended) or kubectl

1. Install the operator

# Via Helm (recommended)
helm install paperclip-operator \
  oci://ghcr.io/paperclipinc/charts/paperclip-operator \
  --namespace paperclip-operator-system \
  --create-namespace
Alternative: install with kubectl
kubectl apply -f https://github.com/paperclipinc/paperclip-operator/releases/latest/download/install.yaml
Alternative: install with Kustomize
make install   # Install CRDs
make deploy IMG=ghcr.io/paperclipinc/paperclip-operator:latest

2. Create required Secrets

# Auth secret (required for authenticated mode)
kubectl create secret generic paperclip-auth \
  --from-literal=BETTER_AUTH_SECRET="$(openssl rand -hex 32)"

# LLM API keys (optional)
kubectl create secret generic paperclip-api-keys \
  --from-literal=ANTHROPIC_API_KEY="sk-ant-..." \
  --from-literal=OPENAI_API_KEY="sk-..."

3. Deploy a Paperclip instance

apiVersion: paperclip.inc/v1alpha1
kind: Instance
metadata:
  name: my-paperclip
spec:
  image:
    tag: latest
  deployment:
    mode: authenticated
  database:
    mode: managed
  auth:
    secretRef:
      name: paperclip-auth
      key: BETTER_AUTH_SECRET
  adapters:
    apiKeysSecretRef:
      name: paperclip-api-keys
  storage:
    persistence:
      enabled: true
      size: 5Gi
kubectl apply -f my-paperclip.yaml

4. Verify

kubectl get instances
# or use the shorthand:
kubectl get pci
NAME           PHASE     ENDPOINT                                              AGE
my-paperclip   Running   http://my-paperclip.default.svc.cluster.local:3100    5m
kubectl get pods
# NAME              READY   STATUS    AGE
# my-paperclip-0    1/1     Running   5m
# my-paperclip-db-0 1/1     Running   5m   (managed PostgreSQL)

Configuration

Deployment Modes

Control authentication and network exposure:

spec:
  deployment:
    mode: authenticated        # "open", "authenticated", or "single-tenant"
    exposure: private          # "private" (ClusterIP) or "public" (Ingress/LB)
    publicURL: https://paperclip.example.com   # required when exposure is "public"
    allowedHostnames:
      - paperclip.example.com  # CORS allowed hostnames
Mode Description
authenticated (default) Login required via Better Auth. Requires BETTER_AUTH_SECRET.
open No authentication. The operator binds to loopback (HOST=127.0.0.1) for safety.
single-tenant Single-user mode with authentication.
Exposure Description
private (default) ClusterIP Service only. Access via port-forward or internal DNS.
public Enables Ingress/LoadBalancer. Set publicURL for the external-facing URL.

Database

Three database modes for different deployment scenarios:

spec:
  database:
    mode: managed   # "embedded", "external", or "managed"
Mode Use Case
managed (default) Operator provisions PostgreSQL 17 as a StatefulSet with PVC and auto-generated credentials. Suitable for development and small deployments.
external Connect to an existing PostgreSQL instance. Recommended for production HA deployments (e.g., Amazon RDS, Cloud SQL, Azure Database for PostgreSQL).
embedded Uses PGlite (in-process SQLite-compatible storage). Single-node only, good for local development and testing.

Managed PostgreSQL

spec:
  database:
    mode: managed
    managed:
      image: postgres:17-alpine   # default
      storageSize: 10Gi           # default
      storageClass: gp3           # optional
      resources:
        requests:
          cpu: 250m
          memory: 256Mi
        limits:
          cpu: "1"
          memory: 1Gi

The operator provisions a dedicated PostgreSQL StatefulSet, Service, and PVC. Credentials are auto-generated and stored in a managed Secret. Data checksums are enabled and stop_mode is set to fast for graceful shutdown.

External database

spec:
  database:
    mode: external
    # Option 1: connection string (stored in etcd -- avoid if it contains credentials)
    externalURL: "postgresql://user:pass@host:5432/paperclip?sslmode=require"
    # Option 2: Secret reference (recommended for credentials)
    externalURLSecretRef:
      name: paperclip-database
      key: DATABASE_URL

Security: Prefer externalURLSecretRef over externalURL. The CRD spec is stored in etcd -- plaintext connection strings containing passwords are visible to anyone with read access to the custom resource.

Authentication

Better Auth secret

Required for authenticated and single-tenant modes:

spec:
  auth:
    secretRef:
      name: paperclip-auth
      key: BETTER_AUTH_SECRET

Automatic admin user bootstrap

Skip the manual setup screen by configuring an initial admin user. The operator creates a bootstrap Job that registers the admin account on first deployment:

spec:
  auth:
    adminUser:
      email: admin@example.com
      name: Admin                     # default: "Admin"
      passwordSecretRef:
        name: paperclip-admin
        key: password

OAuth providers

Enable social sign-in via Google or Apple. Each provider's Secret must contain the corresponding client ID and client secret keys:

spec:
  auth:
    google:
      credentialsSecretRef:
        name: google-oauth
        # Secret must contain GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET
    apple:
      credentialsSecretRef:
        name: apple-oauth
        # Secret must contain APPLE_CLIENT_ID and APPLE_CLIENT_SECRET

Email verification

Configure email delivery for verification and password reset via Resend:

spec:
  auth:
    email:
      resendAPIKeySecretRef:
        name: resend-api-key
        key: RESEND_API_KEY
      from: "Paperclip <noreply@example.com>"
      verificationRequired: true

Secrets Management

Paperclip includes a built-in encrypted secrets system. The operator injects the master encryption key:

spec:
  secrets:
    masterKeySecretRef:
      name: paperclip-secrets
      key: MASTER_KEY
    strictMode: true    # require all sensitive values to use encrypted references

LLM API Keys

Inject API keys for Anthropic, OpenAI, and other LLM providers from a Kubernetes Secret:

spec:
  adapters:
    apiKeysSecretRef:
      name: paperclip-api-keys
      # Secret should contain: ANTHROPIC_API_KEY, OPENAI_API_KEY, etc.

Managed Inference

For platform-managed LLM access with per-provider API keys:

spec:
  adapters:
    managedInferenceSecretRef:
      name: paperclip-managed-keys
      # Secret keys (one or more):
      #   PAPERCLIP_MANAGED_ANTHROPIC_API_KEY
      #   PAPERCLIP_MANAGED_OPENAI_API_KEY
      #   PAPERCLIP_MANAGED_GEMINI_API_KEY
      #   PAPERCLIP_MANAGED_OPENROUTER_API_KEY
    managedInferenceProvider: anthropic       # default provider for legacy single-key mode
    managedInferenceModel: claude-sonnet-4-6  # default model

Cloud Sandbox

Run agent runtimes in isolated Kubernetes pods with resource limits, persistent workspaces, and an optional inference metering proxy:

spec:
  adapters:
    cloudSandbox:
      enabled: true
      defaultImage: ghcr.io/paperclipinc/agent-multi:latest
      namespace: paperclip-sandboxes   # defaults to instance namespace
      idleTimeoutMin: 30               # reap idle pods after 30 minutes
      multiNamespace: true             # per-company namespace isolation
      resources:
        requests:
          cpu: 500m
          memory: 512Mi
        limits:
          cpu: "2"
          memory: 2Gi
      persistence:
        enabled: true
        storageClass: gp3
        size: 10Gi
      resourceTiers:
        small:
          requests:
            cpu: 250m
            memory: 256Mi
        large:
          requests:
            cpu: "2"
            memory: 4Gi
      inferenceProxy:
        enabled: true
        image: ghcr.io/paperclipinc/inference-proxy:latest
        port: 8090
Feature Description
Persistent workspaces PVC-backed workspaces that survive pod restarts
Multi-namespace Per-company namespace isolation for sandbox pods
Resource tiers Named presets (small, medium, large) for sandbox resource limits
Inference proxy Transparent metering proxy sidecar for API usage tracking
Idle reaping Automatic cleanup of idle sandbox pods

Connections (OAuth Integrations)

Enable Paperclip's connections system for third-party OAuth integrations (GitHub, GitLab, Slack, etc.):

spec:
  connections:
    credentialsSecretRef:
      name: paperclip-oauth-credentials
    credentialsKey: PAPERCLIP_OAUTH_CREDENTIALS   # default key name
    providersConfigRef:
      name: custom-providers   # optional: extend built-in provider catalog

The credentials Secret must contain a JSON object mapping provider IDs to OAuth client credentials:

apiVersion: v1
kind: Secret
metadata:
  name: paperclip-oauth-credentials
type: Opaque
stringData:
  PAPERCLIP_OAUTH_CREDENTIALS: |
    {
      "github": {
        "clientId": "Iv1.xxxxxxxxxxxxxxxx",
        "clientSecret": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
      },
      "slack": {
        "clientId": "1234567890.1234567890",
        "clientSecret": "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
      }
    }

Set the OAuth callback URL to https://<your-domain>/api/connections/callback.

Plugins

Install Paperclip plugins declaratively:

spec:
  plugins:
    - name: "@paperclip/analytics"
      version: "1.2.0"
    - name: "some-other-plugin"

S3 Object Storage

Required for multi-replica deployments where all replicas need access to the same files. Supports AWS S3, MinIO, and Cloudflare R2:

spec:
  objectStorage:
    provider: s3           # "s3", "minio", or "r2"
    bucket: my-paperclip-storage
    region: us-east-1      # optional for S3
    endpoint: ""           # required for MinIO/R2
    credentialsSecretRef:
      name: paperclip-s3
      # Secret must contain S3_ACCESS_KEY_ID and S3_SECRET_ACCESS_KEY

Redis

Required for rate limiting and caching in multi-replica deployments:

spec:
  redis:
    mode: managed   # "managed" or "external"

Managed Redis

spec:
  redis:
    mode: managed
    managed:
      image: redis:7-alpine   # default
      storageSize: 1Gi        # default
      storageClass: gp3       # optional
      resources:
        requests:
          cpu: 100m
          memory: 128Mi

The operator provisions a dedicated Redis StatefulSet, Service, and PVC.

External Redis

spec:
  redis:
    mode: external
    # Option 1: connection string (avoid if it contains credentials)
    externalURL: "redis://host:6379"
    # Option 2: Secret reference (recommended)
    externalURLSecretRef:
      name: redis-credentials
      key: REDIS_URL

Heartbeat Scheduler

Paperclip runs a heartbeat scheduler for periodic agent tasks. In multi-replica deployments, only pod-0 (ordinal 0) runs the scheduler to prevent duplicate execution:

spec:
  heartbeat:
    enabled: true        # default: true
    intervalMS: 60000    # default: 60000 (1 minute)

Persistent Storage

By default, the operator creates a 5Gi PVC mounted at /paperclip:

spec:
  storage:
    persistence:
      enabled: true          # default: true
      size: 5Gi              # default
      storageClass: gp3      # optional
      accessModes:
        - ReadWriteOnce      # optional

Networking

Service

spec:
  networking:
    service:
      type: ClusterIP          # "ClusterIP", "LoadBalancer", or "NodePort"
      port: 3100               # default: 3100
      annotations:
        service.beta.kubernetes.io/aws-load-balancer-type: nlb

Ingress

Full Ingress support with TLS and WebSocket annotations:

spec:
  networking:
    ingress:
      enabled: true
      ingressClassName: nginx
      hosts:
        - paperclip.example.com
      tls:
        - hosts:
            - paperclip.example.com
          secretName: paperclip-tls
      annotations:
        nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
        nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"
        nginx.ingress.kubernetes.io/proxy-http-version: "1.1"
        nginx.ingress.kubernetes.io/proxy-set-headers: "Upgrade"

WebSocket support: Paperclip uses WebSockets for real-time UI updates. Add appropriate timeout annotations for your ingress controller to prevent WebSocket disconnections.

Scaling

Manual replicas

spec:
  availability:
    replicas: 3

When running multiple replicas, use database.mode: external with a production-grade PostgreSQL service and configure objectStorage for shared file access. The operator ensures only pod-0 runs the heartbeat scheduler.

Horizontal Pod Autoscaler

spec:
  availability:
    autoScaling:
      enabled: true
      minReplicas: 1              # default: 1
      maxReplicas: 3              # default: 3
      targetCPUUtilizationPercentage: 80          # default: 80
      targetMemoryUtilizationPercentage: 70       # optional

When auto-scaling is enabled, the HPA manages the replica count and the StatefulSet's replicas field is set to nil.

Pod Disruption Budget

spec:
  availability:
    podDisruptionBudget:
      enabled: true
      minAvailable: 1
      # or: maxUnavailable: 1

Topology Spread Constraints

Spread pods across zones or nodes for improved availability:

spec:
  availability:
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            app.kubernetes.io/instance: my-paperclip

Node Scheduling

spec:
  availability:
    nodeSelector:
      kubernetes.io/arch: amd64
    tolerations:
      - key: dedicated
        operator: Equal
        value: paperclip
        effect: NoSchedule
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: node-type
                  operator: In
                  values: [compute]

Health Probes

The operator configures liveness, readiness, and startup probes automatically:

spec:
  probes:
    type: auto   # "auto" (default), "http", or "tcp"
    liveness:
      initialDelaySeconds: 30
      periodSeconds: 10
      timeoutSeconds: 5
      failureThreshold: 3
    readiness:
      periodSeconds: 5
    startup:
      failureThreshold: 60
      periodSeconds: 5
Probe Type Behavior
auto (default) HTTP probes (GET /api/health) in open mode, TCP probes (port 3100) in authenticated/single-tenant mode
http Always use HTTP probes against /api/health
tcp Always use TCP probes against port 3100

Why auto mode? In authenticated mode, /api/health returns 403 without credentials, causing HTTP probes to fail. The operator automatically switches to TCP probes in these modes.

Image Configuration

spec:
  image:
    repository: ghcr.io/paperclipinc/paperclip   # default
    tag: latest                                   # default
    digest: sha256:abc123...                      # optional, overrides tag
    pullPolicy: IfNotPresent                      # "Always", "Never", or "IfNotPresent"
    pullSecrets:
      - name: my-registry-secret
    autoUpdate:
      enabled: true
      interval: 5m    # polling interval (minimum: 1m)

When autoUpdate is enabled, the operator polls the container registry for new digests matching the configured tag and triggers a rolling update when a new digest is detected. Auto-update is a no-op for digest-pinned images.

Backup and Restore

Scheduled backups

spec:
  backup:
    schedule: "0 2 * * *"    # cron expression (daily at 2 AM UTC)
    retentionDays: 30        # default: 30
    s3:
      bucket: my-paperclip-backups
      path: backups/my-instance
      region: us-east-1
      endpoint: ""           # for MinIO/R2
      credentialsSecretRef:
        name: backup-s3-credentials
        # Secret must contain AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY

If backup.s3 is not set, the operator falls back to the objectStorage configuration.

Restore from backup

spec:
  restoreFrom: "backups/my-instance/2026-01-15T10:30:00Z"

The operator runs a restore Job to populate the PVC before starting the StatefulSet, then clears restoreFrom automatically. This works on both existing and brand-new instances -- you can clone an instance by creating a new Instance CR with restoreFrom pointing to an existing backup.

Custom Sidecars and Init Containers

spec:
  sidecars:
    - name: cloud-sql-proxy
      image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.14.3
      args: ["--structured-logs", "my-project:us-central1:my-db"]
      ports:
        - containerPort: 5432
  initContainers:
    - name: fetch-models
      image: curlimages/curl:8.5.0
      command: ["sh", "-c", "curl -o /data/model.bin https://..."]
      volumeMounts:
        - name: data
          mountPath: /data

Extra Volumes and Volume Mounts

Mount additional ConfigMaps, Secrets, or PVCs into the Paperclip container:

spec:
  extraVolumes:
    - name: shared-data
      persistentVolumeClaim:
        claimName: shared-pvc
  extraVolumeMounts:
    - name: shared-data
      mountPath: /shared

Environment Variables

Inject additional environment variables directly or from ConfigMaps/Secrets:

spec:
  env:
    - name: MY_CUSTOM_VAR
      value: "my-value"
    - name: SECRET_VAR
      valueFrom:
        secretKeyRef:
          name: my-secret
          key: secret-key
  envFrom:
    - configMapRef:
        name: my-configmap
    - secretRef:
        name: my-secret

Compute Resources

spec:
  resources:
    requests:
      cpu: 500m
      memory: 512Mi
    limits:
      cpu: "2"
      memory: 2Gi

Pod Annotations

spec:
  podAnnotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
    prometheus.io/scrape: "true"

Security

The operator follows a secure-by-default philosophy. Every instance ships with hardened settings out of the box.

Defaults

  • Non-root execution: containers run as non-root by default
  • All capabilities dropped: no ambient Linux capabilities
  • Seccomp RuntimeDefault: syscall filtering enabled
  • Read-only root filesystem: writable only at the PVC mount point (/paperclip) and /tmp
  • Default-deny NetworkPolicy: only DNS (53) and HTTPS (443) egress allowed; ingress limited to the service port from the same namespace
  • Minimal RBAC: each instance gets its own ServiceAccount; automountServiceAccountToken is disabled
  • No wildcard RBAC: operator uses minimum required verbs with no wildcards

Network Policies

spec:
  security:
    networkPolicy:
      enabled: true          # default: true
      allowIngressCIDRs:     # additional CIDR blocks allowed to reach the service
        - 10.0.0.0/8
      allowEgressCIDRs:      # additional CIDR blocks the pod can reach
        - 172.16.0.0/12

When enabled, the operator creates a NetworkPolicy with a deny-all baseline and selective allow rules for DNS, HTTPS egress, and same-namespace ingress on the service port. The managed PostgreSQL and Redis pods get their own allow rules.

Pod and Container Security Context

spec:
  security:
    podSecurityContext:
      runAsNonRoot: true
      fsGroup: 1000
    containerSecurityContext:
      readOnlyRootFilesystem: true
      allowPrivilegeEscalation: false
      capabilities:
        drop: [ALL]

RBAC and ServiceAccount

spec:
  security:
    rbac:
      create: true   # default: true
      serviceAccountAnnotations:
        # AWS IRSA
        eks.amazonaws.com/role-arn: "arn:aws:iam::123456789012:role/paperclip"
        # GCP Workload Identity
        # iam.gke.io/gcp-service-account: "paperclip@project.iam.gserviceaccount.com"

Observability

Prometheus Metrics

The operator exposes 7 Prometheus metrics:

Metric Type Description
paperclip_reconcile_total Counter Total reconciliations by instance, namespace, and result (success/error)
paperclip_reconcile_duration_seconds Histogram Reconciliation latency in seconds
paperclip_instance_phase Gauge Current phase per instance (1 = active for given phase)
paperclip_instance_info Gauge Instance metadata (always 1, use for PromQL joins); labels: version, image
paperclip_instance_ready Gauge Whether the instance pod is ready (1/0)
paperclip_managed_instances Gauge Total number of managed instances across the cluster
paperclip_resource_creation_failures_total Counter Resource creation failures by resource type

ServiceMonitor

spec:
  observability:
    metrics:
      enabled: true
      serviceMonitor:
        enabled: true
        interval: 30s       # default: 30s

Logging

spec:
  observability:
    logging:
      level: info   # "debug", "info", "warn", or "error"

Status and Lifecycle

Phases

Phase Description
Pending CR accepted, reconciliation not yet started
Provisioning Creating managed resources (StatefulSet, Service, database, etc.)
Running All resources healthy, pods ready
Updating Rolling update in progress
BackingUp Backup operation in progress
Restoring Restore operation in progress
Degraded Some resources unhealthy but recoverable
Failed Unrecoverable error
Terminating Finalizer running, cleaning up resources

Inspecting status

# Check phase and endpoint
kubectl get pci my-paperclip

# View conditions
kubectl get instance my-paperclip -o jsonpath='{.status.conditions}' | jq .

# View managed resources
kubectl get instance my-paperclip -o jsonpath='{.status.managedResources}' | jq .

# View auto-update status
kubectl get instance my-paperclip -o jsonpath='{.status.autoUpdate}' | jq .

# View backup status
kubectl get instance my-paperclip -o jsonpath='{.status.backup}' | jq .

What the operator manages automatically

These behaviors are always applied -- no configuration needed:

Behavior Details
HOST=0.0.0.0 Always set so Paperclip binds to all interfaces in the container
SERVE_UI=true Always set so the web UI is served
Heartbeat leader election Only pod-0 runs the heartbeat scheduler in multi-replica deployments
Config hash rollouts Environment/config changes trigger rolling updates via SHA-256 hash annotation
Owner references All managed resources have owner references for automatic garbage collection
Finalizer Runs backup (if configured) and cleanup on CR deletion
Status tracking Phase, conditions, endpoint, and managed resource names are continuously updated

Production Deployment Example

A full production deployment with external database, S3 storage, Redis, OAuth, Ingress with TLS, and monitoring:

apiVersion: paperclip.inc/v1alpha1
kind: Instance
metadata:
  name: paperclip-prod
  namespace: paperclip
spec:
  image:
    tag: v1.2.3
    pullPolicy: IfNotPresent

  deployment:
    mode: authenticated
    exposure: public
    publicURL: https://paperclip.example.com
    allowedHostnames:
      - paperclip.example.com

  database:
    mode: external
    externalURLSecretRef:
      name: paperclip-database
      key: DATABASE_URL

  auth:
    secretRef:
      name: paperclip-auth
      key: BETTER_AUTH_SECRET
    adminUser:
      email: admin@example.com
      passwordSecretRef:
        name: paperclip-admin
        key: password
    google:
      credentialsSecretRef:
        name: google-oauth
    email:
      resendAPIKeySecretRef:
        name: resend-key
        key: RESEND_API_KEY
      from: "Paperclip <noreply@example.com>"
      verificationRequired: true

  secrets:
    masterKeySecretRef:
      name: paperclip-secrets
      key: MASTER_KEY
    strictMode: true

  storage:
    persistence:
      enabled: true
      size: 20Gi
      storageClass: gp3

  objectStorage:
    provider: s3
    bucket: paperclip-storage
    region: us-east-1
    credentialsSecretRef:
      name: paperclip-s3

  redis:
    mode: external
    externalURLSecretRef:
      name: redis-credentials
      key: REDIS_URL

  adapters:
    apiKeysSecretRef:
      name: paperclip-api-keys

  connections:
    credentialsSecretRef:
      name: paperclip-oauth-credentials

  security:
    networkPolicy:
      enabled: true
    rbac:
      create: true
      serviceAccountAnnotations:
        eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/paperclip

  networking:
    service:
      type: ClusterIP
      port: 3100
    ingress:
      enabled: true
      ingressClassName: nginx
      hosts:
        - paperclip.example.com
      tls:
        - hosts:
            - paperclip.example.com
          secretName: paperclip-tls
      annotations:
        nginx.ingress.kubernetes.io/proxy-read-timeout: "3600"
        nginx.ingress.kubernetes.io/proxy-send-timeout: "3600"

  observability:
    metrics:
      enabled: true
      serviceMonitor:
        enabled: true
        interval: 30s
    logging:
      level: info

  availability:
    replicas: 3
    podDisruptionBudget:
      enabled: true
      minAvailable: 1
    topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: topology.kubernetes.io/zone
        whenUnsatisfiable: DoNotSchedule

  probes:
    startup:
      failureThreshold: 60
      periodSeconds: 5

  backup:
    schedule: "0 2 * * *"
    retentionDays: 30
    s3:
      bucket: paperclip-backups
      path: backups/prod
      region: us-east-1
      credentialsSecretRef:
        name: backup-s3-credentials

  resources:
    requests:
      cpu: "1"
      memory: 1Gi
    limits:
      cpu: "4"
      memory: 4Gi

Full CRD Specification

For the complete list of configurable fields, see the Instance CRD types or run:

kubectl explain instance.spec
kubectl explain instance.spec.database
kubectl explain instance.spec.auth

See config/samples/ for additional examples.


Development

Prerequisites

  • Go 1.24+
  • Docker
  • kubectl
  • A Kubernetes cluster (Kind, minikube, or remote)

Build and run locally

git clone https://github.com/paperclipinc/paperclip-operator.git
cd paperclip-operator
go mod download

make install      # Install CRDs into current cluster
make run          # Run operator locally against current kubeconfig

Run tests

make test                              # Unit + integration tests (envtest)
go test ./internal/resources/ -v       # Fast unit tests (no envtest needed)
make bench                             # Benchmarks for resource builders
make test-e2e                          # E2E tests (requires Kind cluster)
make scorecard                         # Operator SDK scorecard tests

Lint and vet

make lint          # golangci-lint
go vet ./...       # Go vet

After changing CRD types

make generate          # Regenerate deepcopy methods
make manifests         # Regenerate CRD YAML and RBAC
make sync-chart-crds   # Sync CRDs into Helm chart

Build Docker image

make docker-build IMG=my-registry/paperclip-operator:dev

Project structure

api/v1alpha1/          CRD types (Instance)
internal/controller/   Reconciliation logic (single controller + metrics)
internal/resources/    Pure resource builder functions (StatefulSet, Service, etc.)
config/crd/bases/      Generated CRD YAML (committed to git)
config/samples/        Example Instance CRs
charts/                Helm chart (CRDs as templates in templates/crds/)
bundle/                OLM bundle for OperatorHub submissions
hack/                  Build/sync scripts
.github/workflows/     CI/CD pipelines

The operator follows a clean separation of concerns: the controller orchestrates reconciliation, while all Kubernetes resource construction happens in pure functions inside internal/resources/. This makes builders easy to unit test without envtest.

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feat/my-feature)
  3. Commit using conventional commits (feat:, fix:, docs:, etc.)
  4. Push and open a pull request

All PRs require passing CI checks (lint, test, security scan, reconcile guard, Helm sync, E2E) and one approval.

License

Apache License 2.0

About

Kubernetes operator for managing Paperclip instances - the open-source AI agent orchestration platform

Topics

Resources

License

Contributing

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors