Blueberry IDP Component Architecture

Overview

This document describes the component architecture of the Blueberry Internal Developer Platform (IDP) and shows how its parts interact to provide ephemeral Kubernetes environments for testing pull requests.

System Component Diagram

The following diagram shows all major components of the Blueberry IDP and their relationships:

graph TB
    subgraph "External Systems"
        GIT[Git Repository<br/>GitLab/GitHub]
        GCP[Google Cloud Platform]
        USER[Users/Developers]
        CI[CI/CD Systems]
    end

    subgraph "Frontend Layer"
        UI[Web UI<br/>HTMX + Alpine.js]
        STATIC[Static Assets<br/>CSS/JS/Images]
    end

    subgraph "API Layer"
        API[FastAPI Backend<br/>Python]
        AUTH[Authentication<br/>Firebase Auth]
        MIDDLEWARE[Middleware<br/>Auth/Logging/Proxy]
    end

    subgraph "Business Logic"
        ENV_SVC[Environment Service]
        CONFIG_SVC[Config Service]
        TOKEN_SVC[API Token Service]
        WEBHOOK_SVC[Webhook Service]
        AUDIT_SVC[Audit Service]
        COST_SVC[Cost Tracking Service]
        READINESS[Readiness Checker]
    end

    subgraph "Data Layer"
        FIRESTORE[(Firestore<br/>Metadata Storage)]
        GCS[(Google Cloud Storage<br/>Artifacts/Logs)]
        REDIS[(Redis Cache<br/>Session/Temp Data)]
        SECRET_MGR[Secret Manager<br/>API Tokens/Secrets]
    end

    subgraph "Infrastructure Layer"
        K8S[GKE Autopilot<br/>Kubernetes Cluster]
        ARGOCD[ArgoCD<br/>GitOps Controller]
        CROSSPLANE[Crossplane<br/>Cloud Resources]
        INGRESS[Ingress Controller<br/>+ Cert Manager]
    end

    subgraph "Deployed Environments"
        ENV1[PR Environment 1<br/>Namespace: pr-123]
        ENV2[PR Environment 2<br/>Namespace: pr-456]
        ENVN[Environment N<br/>Namespace: feature-xyz]
    end

    %% User interactions
    USER --> UI
    CI --> API

    %% Frontend to API
    UI --> API
    UI --> STATIC

    %% API Authentication
    API --> AUTH
    AUTH --> FIRESTORE

    %% API to Services
    API --> ENV_SVC
    API --> CONFIG_SVC
    API --> TOKEN_SVC
    API --> WEBHOOK_SVC
    API --> AUDIT_SVC
    API --> COST_SVC

    %% Service interactions
    ENV_SVC --> ARGOCD
    ENV_SVC --> FIRESTORE
    ENV_SVC --> GCS
    ENV_SVC --> READINESS
    CONFIG_SVC --> FIRESTORE
    TOKEN_SVC --> SECRET_MGR
    TOKEN_SVC --> FIRESTORE
    WEBHOOK_SVC --> ENV_SVC
    WEBHOOK_SVC --> GIT
    AUDIT_SVC --> FIRESTORE
    COST_SVC --> GCP
    COST_SVC --> FIRESTORE
    READINESS --> K8S
    READINESS --> REDIS

    %% Infrastructure interactions
    ARGOCD --> GIT
    ARGOCD --> K8S
    ARGOCD --> CROSSPLANE
    K8S --> ENV1
    K8S --> ENV2
    K8S --> ENVN
    CROSSPLANE --> GCP
    INGRESS --> ENV1
    INGRESS --> ENV2
    INGRESS --> ENVN

    %% Data layer connections
    ENV_SVC --> REDIS
    CONFIG_SVC --> REDIS

    %% Styling
    classDef external fill:#f9f,stroke:#333,stroke-width:2px
    classDef frontend fill:#bbf,stroke:#333,stroke-width:2px
    classDef api fill:#bfb,stroke:#333,stroke-width:2px
    classDef service fill:#fbb,stroke:#333,stroke-width:2px
    classDef data fill:#ff9,stroke:#333,stroke-width:2px
    classDef infra fill:#9ff,stroke:#333,stroke-width:2px
    classDef env fill:#f96,stroke:#333,stroke-width:2px

    class GIT,GCP,USER,CI external
    class UI,STATIC frontend
    class API,AUTH,MIDDLEWARE api
    class ENV_SVC,CONFIG_SVC,TOKEN_SVC,WEBHOOK_SVC,AUDIT_SVC,COST_SVC,READINESS service
    class FIRESTORE,GCS,REDIS,SECRET_MGR data
    class K8S,ARGOCD,CROSSPLANE,INGRESS infra
    class ENV1,ENV2,ENVN env

Component Descriptions

External Systems

  • Git Repository: Source code repositories (GitLab/GitHub) containing application code and Kubernetes manifests
  • Google Cloud Platform: Cloud provider for infrastructure services
  • Users/Developers: End users accessing the platform through the web UI
  • CI/CD Systems: Automated systems that trigger environment creation via API

Frontend Layer

  • Web UI: Server-side rendered interface using HTMX for dynamic updates and Alpine.js for client-side interactivity
  • Static Assets: CSS, JavaScript, and image files served directly

API Layer

  • FastAPI Backend: Main API service handling all business logic and orchestration
  • Authentication: Firebase Auth integration for user authentication
  • Middleware: Cross-cutting concerns like authentication, logging, and proxy handling

Business Logic Services

  • Environment Service: Manages environment lifecycle (create, update, delete)
  • Config Service: Handles configuration management and overrides
  • API Token Service: Manages API tokens for programmatic access
  • Webhook Service: Processes webhooks from Git providers
  • Audit Service: Records all actions for compliance and debugging
  • Cost Tracking Service: Monitors and reports on resource costs
  • Readiness Checker: Continuously monitors environment health and readiness
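
As an illustration of what the Webhook Service has to do before acting on a Git provider event, the sketch below verifies a GitHub-style `X-Hub-Signature-256` header (an HMAC-SHA256 of the raw request body, prefixed with `sha256=`). The function name and the example secret are hypothetical, and the document does not say how Blueberry actually validates webhooks, so treat this as one plausible approach rather than the platform's implementation.

```python
import hashlib
import hmac


def verify_github_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check a GitHub-style X-Hub-Signature-256 header against the raw body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    try:
        # compare_digest avoids leaking where the strings differ via timing.
        return hmac.compare_digest(expected, signature_header)
    except TypeError:
        # Raised for non-ASCII header text; treat it as a failed check.
        return False


if __name__ == "__main__":
    secret = b"example-secret"  # hypothetical; a real secret would not be hard-coded
    payload = b'{"action": "opened", "number": 123}'
    header = "sha256=" + hmac.new(secret, payload, hashlib.sha256).hexdigest()
    print(verify_github_signature(secret, payload, header))
```

The check must run against the raw request bytes, not a re-serialized JSON object, because any change in whitespace or key order changes the digest.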

Data Layer

  • Firestore: NoSQL database for persistent metadata storage
  • Google Cloud Storage: Object storage for build artifacts and logs
  • Redis: In-memory cache for session data and temporary storage
  • Secret Manager: Secure storage for API tokens and sensitive data

Infrastructure Layer

  • GKE Autopilot: Serverless Kubernetes cluster for running workloads
  • ArgoCD: GitOps controller that syncs desired state from Git to cluster
  • Crossplane: Kubernetes-native infrastructure provisioning
  • Ingress Controller: Routes external traffic to environments with automatic TLS via Cert Manager

Deployed Environments

Individual Kubernetes namespaces containing deployed applications, each with:

  • Unique namespace (e.g., pr-123, feature-xyz)
  • Isolated resources and network policies
  • Automatic cleanup on PR merge/close
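
The namespace examples above (pr-123, feature-xyz) suggest a simple naming scheme. The following sketch shows one way such names could be derived; the helper names are hypothetical, and the sanitization rule is an assumption based on the Kubernetes requirement that namespace names be valid DNS-1123 labels (lowercase alphanumerics and hyphens, at most 63 characters).

```python
import re

MAX_LABEL_LEN = 63  # DNS-1123 label limit that applies to namespace names


def namespace_for_pr(pr_number: int) -> str:
    """Build a namespace name such as 'pr-123' from a pull request number."""
    if pr_number <= 0:
        raise ValueError("pull request number must be positive")
    return f"pr-{pr_number}"


def namespace_for_branch(branch: str) -> str:
    """Turn a branch name into a DNS-1123 compatible namespace name."""
    # Collapse every run of disallowed characters into a single hyphen.
    name = re.sub(r"[^a-z0-9]+", "-", branch.lower()).strip("-")
    # Truncate, then drop any hyphen left dangling at the cut point.
    name = name[:MAX_LABEL_LEN].rstrip("-")
    if not name:
        raise ValueError(f"branch {branch!r} has no usable characters")
    return name


if __name__ == "__main__":
    print(namespace_for_pr(123))
    print(namespace_for_branch("feature/XYZ"))
```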

Environment Creation Flow

The following sequence diagram shows the detailed flow of creating a new environment:

sequenceDiagram
    participant User/CI
    participant Web UI
    participant FastAPI
    participant Auth
    participant Env Service
    participant Firestore
    participant ArgoCD
    participant Git Repo
    participant K8s Cluster
    participant Crossplane
    participant GCS
    participant Redis

    User/CI->>Web UI: Request new environment
    Web UI->>FastAPI: POST /environments
    FastAPI->>Auth: Validate Firebase token
    Auth->>Firestore: Check user permissions
    Auth-->>FastAPI: User authenticated

    FastAPI->>Env Service: Create environment
    Env Service->>Firestore: Store environment metadata
    Env Service->>Redis: Cache environment data

    Note over Env Service,ArgoCD: GitOps Flow Begins
    Env Service->>ArgoCD: Create Application CR<br/>(not API call)
    ArgoCD->>Git Repo: Fetch app manifests
    ArgoCD->>K8s Cluster: Apply manifests

    K8s Cluster->>K8s Cluster: Create namespace
    K8s Cluster->>K8s Cluster: Deploy Helm charts

    alt If cloud resources needed
        K8s Cluster->>Crossplane: Create resource claims
        Crossplane->>GCS: Provision cloud resources
    end

    K8s Cluster-->>ArgoCD: Resources created

    Note over Env Service,Redis: Status Monitoring
    loop Readiness checks
        Env Service->>K8s Cluster: Check pod status
        Env Service->>Redis: Update status cache
        Env Service->>Firestore: Update environment status
    end

    Env Service-->>FastAPI: Environment ready
    FastAPI-->>Web UI: Return environment details
    Web UI-->>User/CI: Display URL & status
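
The "Create Application CR" step in the flow above can be pictured as building an ArgoCD `Application` manifest and submitting it to the cluster. The sketch below only builds the manifest as a plain dictionary. The field names follow the ArgoCD `argoproj.io/v1alpha1` Application schema, but the function name, repository URL, path, the `default` project, the `argocd` namespace, and the automated sync settings are all assumptions chosen for illustration, not values taken from Blueberry.

```python
import json


def build_application(name: str, repo_url: str, revision: str, path: str,
                      argocd_namespace: str = "argocd") -> dict:
    """Build an ArgoCD Application manifest for one ephemeral environment."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": name, "namespace": argocd_namespace},
        "spec": {
            "project": "default",  # assumed project name
            "source": {
                "repoURL": repo_url,
                "targetRevision": revision,
                "path": path,
            },
            "destination": {
                # In-cluster API server address used by ArgoCD.
                "server": "https://kubernetes.default.svc",
                "namespace": name,
            },
            "syncPolicy": {
                "automated": {"prune": True, "selfHeal": True},
                # Lets ArgoCD create the environment namespace on first sync.
                "syncOptions": ["CreateNamespace=true"],
            },
        },
    }


if __name__ == "__main__":
    app = build_application(
        name="pr-123",
        repo_url="https://example.com/org/app.git",  # hypothetical repository
        revision="refs/pull/123/head",               # hypothetical revision
        path="deploy/chart",                         # hypothetical path
    )
    print(json.dumps(app, indent=2))
```

Using the environment name for both the Application and the destination namespace keeps cleanup simple: deleting the Application with pruning enabled removes what it deployed.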

Key Architecture Decisions

GitOps-First Approach

  • All deployments go through ArgoCD; no direct kubectl commands are used
  • Git repository is the single source of truth
  • Declarative configuration enables easy rollbacks and auditing

Stateless Application Design

  • FastAPI backend is completely stateless
  • All persistent data is stored in Firestore
  • Redis used only for caching and temporary data
  • Enables horizontal scaling and zero-downtime deployments

Security Boundaries

  • Each environment runs in its own Kubernetes namespace
  • Network policies enforce isolation between environments
  • Firebase Auth provides user authentication
  • Workload Identity for GCP service authentication
  • No cluster-admin permissions for the application
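
To make the namespace isolation point concrete, the sketch below builds a Kubernetes `NetworkPolicy` that admits ingress only from pods in the same namespace. The schema is the standard `networking.k8s.io/v1` one; the policy name and the function are hypothetical, and the document does not specify which policies Blueberry actually applies.

```python
def same_namespace_only_policy(namespace: str) -> dict:
    """Build a NetworkPolicy allowing ingress only from the same namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "allow-same-namespace", "namespace": namespace},
        "spec": {
            # An empty podSelector selects every pod in the namespace.
            "podSelector": {},
            "policyTypes": ["Ingress"],
            # A bare podSelector in "from" matches pods in this namespace only,
            # so traffic from other environments is not admitted.
            "ingress": [{"from": [{"podSelector": {}}]}],
        },
    }


if __name__ == "__main__":
    print(same_namespace_only_policy("pr-123"))
```

A policy like this would also block the ingress controller unless a further rule admits traffic from its namespace, so a real setup would need at least one additional `from` entry.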

Cost Optimization

  • GKE Autopilot for serverless Kubernetes (billed per pod, based on the resources each pod requests)
  • Automatic environment cleanup reduces resource waste
  • Resource limits prevent runaway costs
  • Cost tracking service provides visibility
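
Because pay-per-pod billing scales with requested resources and running time, a per-environment estimate reduces to simple arithmetic. The sketch below is a hypothetical helper, not the Cost Tracking Service's actual logic, and the rates in the example are placeholders rather than real GCP prices.

```python
def estimate_cost(vcpu: float, memory_gib: float, hours: float,
                  vcpu_hour_rate: float, gib_hour_rate: float) -> float:
    """Estimate cost as hours * (vCPU * vCPU rate + GiB * memory rate)."""
    if min(vcpu, memory_gib, hours, vcpu_hour_rate, gib_hour_rate) < 0:
        raise ValueError("all inputs must be non-negative")
    return hours * (vcpu * vcpu_hour_rate + memory_gib * gib_hour_rate)


if __name__ == "__main__":
    # Placeholder rates for illustration only; look up current pricing.
    cost = estimate_cost(vcpu=0.5, memory_gib=2.0, hours=10.0,
                         vcpu_hour_rate=0.04, gib_hour_rate=0.005)
    print(f"{cost:.2f}")
```

With these placeholder inputs the hourly figure is 0.5 × 0.04 + 2 × 0.005 = 0.03, so ten hours comes to 0.30 in whatever currency the rates use.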

Data Flow Patterns

Synchronous Operations

  1. User authentication via Firebase
  2. API requests to create/update/delete environments
  3. Configuration management
  4. Direct status queries

Asynchronous Operations

  1. Environment provisioning via ArgoCD
  2. Readiness checking and status updates
  3. Webhook processing from Git providers
  4. Cost calculation and reporting
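
The readiness checking listed above is naturally expressed as an asynchronous polling loop. The sketch below assumes a hypothetical `check` coroutine that reports whether an environment is ready; in Blueberry this would presumably query pod status in the cluster, but the document does not describe that interface, so the example uses a stand-in.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_until_ready(check: Callable[[], Awaitable[bool]],
                           interval: float = 2.0,
                           timeout: float = 300.0) -> bool:
    """Poll `check` until it returns True or the timeout would be exceeded."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        if await check():
            return True
        # Stop if another sleep would carry us past the deadline.
        if loop.time() + interval > deadline:
            return False
        await asyncio.sleep(interval)


if __name__ == "__main__":
    attempts = {"count": 0}

    async def fake_check() -> bool:
        # Stand-in for a real pod status query: ready on the third poll.
        attempts["count"] += 1
        return attempts["count"] >= 3

    ready = asyncio.run(wait_until_ready(fake_check, interval=0.01, timeout=1.0))
    print(ready, attempts["count"])
```

Because the loop awaits between polls, many environments can be monitored concurrently on one event loop without blocking API requests.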

Caching Strategy

  • Redis caches frequently accessed data: environment list and status, user sessions, and configuration overrides
  • TTL-based expiration ensures data freshness
  • Cache invalidation on updates
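
The combination of TTL-based expiration and invalidation on update can be sketched with a small in-memory cache. This is a stand-in for Redis used purely to show the behavior; the class name, keys, and TTL value are hypothetical, and a real deployment would rely on Redis key expiry rather than this code.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in for a cache with per-entry expiry."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        # Store the value together with its absolute expiry time.
        self._items[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are dropped lazily on read.
            del self._items[key]
            return None
        return value

    def invalidate(self, key: str) -> None:
        # Called when the underlying record changes.
        self._items.pop(key, None)


if __name__ == "__main__":
    cache = TTLCache(ttl_seconds=30.0)
    cache.set("env:pr-123:status", "provisioning")
    print(cache.get("env:pr-123:status"))
    cache.invalidate("env:pr-123:status")
    print(cache.get("env:pr-123:status"))
```

Invalidation covers writes made through the API, while the TTL bounds staleness for changes the API never sees, such as pods restarting in the cluster.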

Scalability Considerations

Horizontal Scaling

  • Stateless FastAPI instances can scale based on load
  • Redis can be clustered for high availability
  • Multiple ArgoCD controllers for large deployments

Performance Optimizations

  • Async Python for non-blocking I/O
  • Database query optimization with indexes
  • Pagination for large result sets
  • Server-sent events (SSE) instead of polling
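
For the server-sent events item, the wire format is simple enough to show directly: each event is a block of `field: value` lines terminated by a blank line. The helper below formats one such event. The function name, the `status` event name, and the payload shape are hypothetical, and how Blueberry wires this into FastAPI responses is not covered by the document.

```python
import json
from typing import Optional


def format_sse(data: dict, event: Optional[str] = None) -> str:
    """Serialize one server-sent event in text/event-stream format."""
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps without indent yields a single line, so one data field suffices.
    lines.append(f"data: {json.dumps(data)}")
    # A blank line terminates the event.
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    print(format_sse({"name": "pr-123", "status": "ready"}, event="status"), end="")
```

With SSE the server pushes a status change once when it happens, so the browser does not have to re-request the environment list on a timer.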

Monitoring and Observability

Metrics Collection

  • Application metrics exported to Prometheus
  • GCP Cloud Monitoring for infrastructure metrics
  • Custom metrics for business KPIs

Logging

  • Structured JSON logging from all components
  • Centralized log aggregation in GCP Cloud Logging
  • Correlation IDs for request tracing
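
One way to combine structured JSON logging with correlation IDs in Python is a custom `logging.Formatter` that reads the current request's ID from a context variable. The sketch below uses only the standard library; the field names and the `correlation_id` variable are illustrative assumptions, not Blueberry's actual log schema.

```python
import contextvars
import json
import logging

# Holds the ID of the request being handled; "-" when outside a request.
correlation_id: contextvars.ContextVar[str] = contextvars.ContextVar(
    "correlation_id", default="-"
)


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "severity": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "correlation_id": correlation_id.get(),
        })


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    log = logging.getLogger("blueberry.example")  # hypothetical logger name
    log.addHandler(handler)
    log.setLevel(logging.INFO)

    correlation_id.set("req-42")  # middleware would set this per request
    log.info("environment %s created", "pr-123")
```

Context variables are isolated per asyncio task, so concurrent requests handled by the same process do not overwrite each other's IDs.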

Health Checks

  • Kubernetes liveness and readiness probes
  • Application-level health endpoints
  • Dependency health monitoring

Security Architecture

Authentication & Authorization

  • Firebase Auth for user authentication
  • Role-based access control (RBAC)
  • API token authentication for CI/CD
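
Role-based access control usually comes down to a mapping from roles to permitted actions plus a lookup at request time. The sketch below illustrates that shape; the role names and permission strings are entirely hypothetical, since the document does not list Blueberry's actual roles.

```python
from typing import Dict, FrozenSet

# Hypothetical roles and permissions, for illustration only.
ROLE_PERMISSIONS: Dict[str, FrozenSet[str]] = {
    "viewer": frozenset({"environment:read"}),
    "developer": frozenset({
        "environment:read",
        "environment:create",
        "environment:delete",
    }),
    "admin": frozenset({
        "environment:read",
        "environment:create",
        "environment:delete",
        "token:manage",
    }),
}


def is_allowed(role: str, permission: str) -> bool:
    """Return True if the role grants the permission; unknown roles get nothing."""
    return permission in ROLE_PERMISSIONS.get(role, frozenset())


if __name__ == "__main__":
    print(is_allowed("developer", "environment:create"))
    print(is_allowed("viewer", "environment:delete"))
```

Denying by default for unknown roles means a misconfigured user or token loses access rather than gaining it.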

Network Security

  • Private GKE cluster with authorized networks
  • Network policies for pod-to-pod communication
  • TLS encryption for all external traffic

Secrets Management

  • Google Secret Manager for sensitive data
  • External Secrets Operator for Kubernetes integration
  • Workload Identity for service authentication

Disaster Recovery

Backup Strategy

  • Firestore automatic backups
  • Git repository as source of truth for configurations
  • GCS versioning for artifacts

Recovery Procedures

  • ArgoCD can recreate all environments from Git
  • Firestore point-in-time recovery
  • Infrastructure as Code enables full rebuild

Future Considerations

Planned Enhancements

  • Multi-cluster support for geographic distribution
  • Advanced cost allocation and chargeback
  • Integration with more Git providers
  • Enhanced monitoring and alerting

Scalability Path

  • Federation of multiple Blueberry instances
  • Cross-region deployment capabilities
  • Advanced traffic routing with service mesh

Document ID: architecture/diagrams/component-architecture