Data Flow Architecture

This document illustrates how data flows through the Blueberry IDP system, including user data, environment metadata, configuration data, and operational metrics.

System Data Flow Overview

graph TB
    subgraph "Data Sources"
        USER_INPUT[User Input<br/>Web Forms/API]
        GIT_REPOS[Git Repositories<br/>Code & Configs]
        CI_EVENTS[CI/CD Events<br/>Webhooks]
        MONITORING[System Monitoring<br/>Metrics & Logs]
    end

    subgraph "Data Processing Layer"
        AUTH_SERVICE[Authentication<br/>Service]
        VALIDATION[Data Validation<br/>& Sanitization]
        CONFIG_MERGE[Configuration<br/>Merging Engine]
        TEMPLATE_ENGINE[Template<br/>Generation]
    end

    subgraph "Core Data Stores"
        FIRESTORE[(Firestore<br/>Primary Database)]
        REDIS[(Redis<br/>Cache Layer)]
        SECRET_MGR[Secret Manager<br/>Sensitive Data]
        GCS[(Cloud Storage<br/>Artifacts & Logs)]
    end

    subgraph "External Systems"
        ARGOCD[ArgoCD<br/>GitOps State]
        K8S_API[Kubernetes API<br/>Cluster State]
        FIREBASE_AUTH[Firebase Auth<br/>User Identity]
        GCP_APIS[GCP APIs<br/>Cloud Resources]
    end

    subgraph "Data Consumers"
        WEB_UI[Web UI<br/>Real-time Updates]
        API_CLIENTS[API Clients<br/>Programmatic Access]
        MONITORING_SYS[Monitoring Systems<br/>Observability]
        COST_TRACKING[Cost Tracking<br/>Analytics]
    end

    %% Input flows
    USER_INPUT --> AUTH_SERVICE
    GIT_REPOS --> CONFIG_MERGE
    CI_EVENTS --> VALIDATION
    MONITORING --> MONITORING_SYS

    %% Processing flows
    AUTH_SERVICE --> VALIDATION
    VALIDATION --> CONFIG_MERGE
    CONFIG_MERGE --> TEMPLATE_ENGINE
    TEMPLATE_ENGINE --> FIRESTORE

    %% Data storage flows
    AUTH_SERVICE --> FIREBASE_AUTH
    VALIDATION --> FIRESTORE
    CONFIG_MERGE --> REDIS
    TEMPLATE_ENGINE --> ARGOCD

    %% Cache flows
    FIRESTORE -.->|"Read Cache"| REDIS
    REDIS -.->|"Cache Miss"| FIRESTORE

    %% External system flows
    TEMPLATE_ENGINE --> K8S_API
    AUTH_SERVICE --> SECRET_MGR
    ARGOCD --> GIT_REPOS
    K8S_API --> GCP_APIS

    %% Output flows
    FIRESTORE --> WEB_UI
    REDIS --> WEB_UI
    FIRESTORE --> API_CLIENTS
    MONITORING --> COST_TRACKING
    GCS --> MONITORING_SYS

    classDef source fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    classDef process fill:#fff3e0,stroke:#f57c00,stroke-width:2px
    classDef store fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
    classDef external fill:#fce4ec,stroke:#c2185b,stroke-width:2px
    classDef consumer fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px

    class USER_INPUT,GIT_REPOS,CI_EVENTS,MONITORING source
    class AUTH_SERVICE,VALIDATION,CONFIG_MERGE,TEMPLATE_ENGINE process
    class FIRESTORE,REDIS,SECRET_MGR,GCS store
    class ARGOCD,K8S_API,FIREBASE_AUTH,GCP_APIS external
    class WEB_UI,API_CLIENTS,MONITORING_SYS,COST_TRACKING consumer

Environment Data Lifecycle

sequenceDiagram
    participant User
    participant API
    participant Validation
    participant Environment Service
    participant Firestore
    participant Redis
    participant ArgoCD
    participant K8s
    participant Monitoring

    Note over User,Monitoring: Environment Data Lifecycle

    %% Creation Phase
    User->>API: Create environment request
    API->>Validation: Validate input data
    Validation->>Environment Service: Validated data
    Environment Service->>Firestore: Store environment metadata
    Environment Service->>Redis: Cache environment data
    Environment Service->>ArgoCD: Generate K8s manifests

    %% Deployment Phase
    ArgoCD->>K8s: Deploy resources
    K8s->>ArgoCD: Resource status
    ArgoCD->>Environment Service: Sync status

    %% Status Updates
    loop Every 30 seconds
        Environment Service->>K8s: Check resource health
        Environment Service->>Firestore: Update status
        Environment Service->>Redis: Update cache
        Environment Service->>Monitoring: Send metrics
    end

    %% User Queries
    User->>API: Get environment status
    API->>Redis: Check cache
    alt Cache Hit
        Redis->>API: Cached data
    else Cache Miss
        API->>Firestore: Query database
        Firestore->>API: Fresh data
        API->>Redis: Update cache
    end
    API->>User: Environment data
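
The "User Queries" branch of the sequence above (check the cache, fall back to the database on a miss, then repopulate the cache) can be sketched as follows. This is a minimal illustration, not the Blueberry implementation: plain dictionaries stand in for Redis and Firestore, and the `EnvironmentReader` class and its method names are hypothetical.

```python
from typing import Dict, Optional


class EnvironmentReader:
    """Cache-aside read path: try the cache first, then the database."""

    def __init__(self, cache: Dict[str, dict], database: Dict[str, dict]) -> None:
        # Assumption: plain dicts stand in for Redis (cache) and Firestore (database).
        self.cache = cache
        self.database = database

    def get_environment(self, env_id: str) -> Optional[dict]:
        cached = self.cache.get(env_id)
        if cached is not None:
            return cached  # Cache hit: return the cached data.

        fresh = self.database.get(env_id)  # Cache miss: query the database.
        if fresh is not None:
            self.cache[env_id] = fresh  # Update the cache for later reads.
        return fresh
```

Once an entry is cached, later reads are served from the cache until it is refreshed, which in the diagram is the job of the 30-second status loop.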

Configuration Data Flow

graph LR
    subgraph "Configuration Sources"
        DEFAULTS[Helm Chart<br/>Default Values]
        CONFIG_SETS[User-defined<br/>Config Sets]
        ENV_OVERRIDES[Environment<br/>Overrides]
        RUNTIME[Runtime<br/>Computed Values]
    end

    subgraph "Processing Pipeline"
        LOADER[Config<br/>Loader]
        MERGER[Value<br/>Merger]
        VALIDATOR[Schema<br/>Validator]
        GENERATOR[Template<br/>Generator]
    end

    subgraph "Storage & Distribution"
        FIRESTORE_CONFIG[(Firestore<br/>Config Storage)]
        REDIS_CACHE[(Redis<br/>Config Cache)]
        ARGOCD_APPS[ArgoCD<br/>Applications]
        K8S_RESOURCES[Kubernetes<br/>Resources]
    end

    %% Source to processing
    DEFAULTS --> LOADER
    CONFIG_SETS --> LOADER
    ENV_OVERRIDES --> LOADER
    RUNTIME --> LOADER

    %% Processing pipeline
    LOADER --> MERGER
    MERGER --> VALIDATOR
    VALIDATOR --> GENERATOR

    %% Storage and distribution
    GENERATOR --> FIRESTORE_CONFIG
    GENERATOR --> REDIS_CACHE
    GENERATOR --> ARGOCD_APPS
    ARGOCD_APPS --> K8S_RESOURCES

    %% Cache invalidation
    FIRESTORE_CONFIG -.->|"Invalidate"| REDIS_CACHE

    classDef source fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    classDef process fill:#fff3e0,stroke:#f57c00,stroke-width:2px
    classDef storage fill:#e8f5e9,stroke:#388e3c,stroke-width:2px

    class DEFAULTS,CONFIG_SETS,ENV_OVERRIDES,RUNTIME source
    class LOADER,MERGER,VALIDATOR,GENERATOR process
    class FIRESTORE_CONFIG,REDIS_CACHE,ARGOCD_APPS,K8S_RESOURCES storage
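
To make the Value Merger step concrete, here is a rough sketch of a layered deep merge. It assumes a precedence order that the diagram does not state explicitly: Helm chart defaults are lowest, then config sets, then environment overrides, with runtime computed values winning. The function names `deep_merge` and `merge_config` are illustrative only.

```python
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where values in `override` win over `base`.

    Nested dicts are merged key by key; any other value type is replaced.
    Neither input is mutated.
    """
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def merge_config(
    defaults: Dict[str, Any],
    config_sets: Dict[str, Any],
    env_overrides: Dict[str, Any],
    runtime: Dict[str, Any],
) -> Dict[str, Any]:
    """Apply the four sources in order; later layers take precedence (assumed)."""
    result: Dict[str, Any] = {}
    for layer in (defaults, config_sets, env_overrides, runtime):
        result = deep_merge(result, layer)
    return result
```

For example, a config set that only changes `image.tag` leaves `image.repo` from the chart defaults intact, because nested dictionaries are merged rather than replaced wholesale.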

User Authentication Data Flow

graph TB
    subgraph "User Actions"
        LOGIN[User Login<br/>Google OAuth]
        API_REQUEST[API Request<br/>with Token]
        TOKEN_CREATE[Create API Token]
    end

    subgraph "Authentication Layer"
        FIREBASE[Firebase Auth<br/>Token Validation]
        TOKEN_SERVICE[Token Service<br/>API Token Validation]
        AUTH_MIDDLEWARE[Auth Middleware<br/>Request Processing]
    end

    subgraph "User Data Storage"
        FIRESTORE_USERS[(Firestore<br/>User Profiles)]
        SECRET_TOKENS[Secret Manager<br/>API Tokens]
        REDIS_SESSIONS[(Redis<br/>Session Cache)]
    end

    subgraph "Authorization"
        RBAC[Role-Based<br/>Access Control]
        SCOPES[API Token<br/>Scopes]
        PERMISSIONS[Permission<br/>Checks]
    end

    %% Login flow
    LOGIN --> FIREBASE
    FIREBASE --> FIRESTORE_USERS
    FIREBASE --> REDIS_SESSIONS

    %% API request flow
    API_REQUEST --> AUTH_MIDDLEWARE
    AUTH_MIDDLEWARE --> FIREBASE
    AUTH_MIDDLEWARE --> TOKEN_SERVICE
    TOKEN_SERVICE --> SECRET_TOKENS
    TOKEN_SERVICE --> FIRESTORE_USERS

    %% Token creation flow
    TOKEN_CREATE --> TOKEN_SERVICE
    TOKEN_SERVICE --> SECRET_TOKENS
    TOKEN_SERVICE --> FIRESTORE_USERS

    %% Authorization flow
    AUTH_MIDDLEWARE --> RBAC
    RBAC --> SCOPES
    SCOPES --> PERMISSIONS

    %% Cache flows
    FIRESTORE_USERS -.->|"Cache"| REDIS_SESSIONS
    REDIS_SESSIONS -.->|"Refresh"| FIRESTORE_USERS

    classDef user fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    classDef auth fill:#fff3e0,stroke:#f57c00,stroke-width:2px
    classDef storage fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
    classDef authz fill:#fce4ec,stroke:#c2185b,stroke-width:2px

    class LOGIN,API_REQUEST,TOKEN_CREATE user
    class FIREBASE,TOKEN_SERVICE,AUTH_MIDDLEWARE auth
    class FIRESTORE_USERS,SECRET_TOKENS,REDIS_SESSIONS storage
    class RBAC,SCOPES,PERMISSIONS authz
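
The authorization chain in the diagram (role-based access control, then API token scopes, then the final permission check) might look roughly like the sketch below. Everything named here is an assumption made for illustration: the `Principal` type, the role names, and the `environments:read` style permission strings do not come from the Blueberry codebase.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Principal:
    """Authenticated caller, as an auth middleware might produce it (hypothetical)."""

    user_id: str
    role: str
    via_api_token: bool = False
    scopes: FrozenSet[str] = frozenset()  # Only meaningful for API tokens.


# Hypothetical role-to-permission table; real roles are not given in this document.
ROLE_PERMISSIONS: Dict[str, FrozenSet[str]] = {
    "viewer": frozenset({"environments:read"}),
    "developer": frozenset({"environments:read", "environments:write"}),
}


def is_allowed(principal: Principal, permission: str) -> bool:
    """RBAC first, then token scopes narrow what the role already grants."""
    role_grants = ROLE_PERMISSIONS.get(principal.role, frozenset())
    if permission not in role_grants:
        return False  # The role itself does not grant this permission.
    if principal.via_api_token:
        return permission in principal.scopes  # Scopes can only restrict.
    return True  # Interactive sessions are limited by role alone.
```

In this sketch a scope never widens access: an API token for a `viewer` cannot write even if its scopes list `environments:write`.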

Monitoring and Metrics Data Flow

graph TB
    subgraph "Metric Sources"
        APP_METRICS[Application<br/>Metrics]
        K8S_METRICS[Kubernetes<br/>Metrics]
        GCP_METRICS[GCP<br/>Metrics]
        COST_DATA[Cost<br/>Data]
    end

    subgraph "Collection Layer"
        PROMETHEUS[Prometheus<br/>Metrics Collection]
        TELEMETRY[Telemetry<br/>Service]
        COST_SERVICE[Cost Tracking<br/>Service]
        LOG_COLLECTOR[Log<br/>Collector]
    end

    subgraph "Storage & Processing"
        TSDB[(Time Series<br/>Database)]
        CLOUD_MONITORING[Cloud Monitoring<br/>Metrics]
        FIRESTORE_COSTS[(Firestore<br/>Cost Data)]
        GCS_LOGS[(GCS<br/>Log Storage)]
    end

    subgraph "Visualization"
        GRAFANA[Grafana<br/>Dashboards]
        COST_DASHBOARD[Cost<br/>Dashboard]
        ALERTS[Alert<br/>Manager]
        API_METRICS[Metrics<br/>API]
    end

    %% Collection flows
    APP_METRICS --> PROMETHEUS
    K8S_METRICS --> PROMETHEUS
    GCP_METRICS --> CLOUD_MONITORING
    COST_DATA --> COST_SERVICE

    %% Processing flows
    PROMETHEUS --> TELEMETRY
    COST_SERVICE --> FIRESTORE_COSTS
    TELEMETRY --> CLOUD_MONITORING
    LOG_COLLECTOR --> GCS_LOGS

    %% Storage flows
    TELEMETRY --> TSDB
    CLOUD_MONITORING --> TSDB

    %% Visualization flows
    TSDB --> GRAFANA
    FIRESTORE_COSTS --> COST_DASHBOARD
    CLOUD_MONITORING --> ALERTS
    TSDB --> API_METRICS

    classDef source fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    classDef collect fill:#fff3e0,stroke:#f57c00,stroke-width:2px
    classDef storage fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
    classDef visual fill:#fce4ec,stroke:#c2185b,stroke-width:2px

    class APP_METRICS,K8S_METRICS,GCP_METRICS,COST_DATA source
    class PROMETHEUS,TELEMETRY,COST_SERVICE,LOG_COLLECTOR collect
    class TSDB,CLOUD_MONITORING,FIRESTORE_COSTS,GCS_LOGS storage
    class GRAFANA,COST_DASHBOARD,ALERTS,API_METRICS visual

Webhook Data Processing

sequenceDiagram
    participant Git Provider
    participant Webhook Endpoint
    participant Validation
    participant Event Processor
    participant Environment Service
    participant Firestore
    participant Notification

    Note over Git Provider,Notification: Webhook Event Processing

    Git Provider->>Webhook Endpoint: POST webhook payload
    Webhook Endpoint->>Validation: Validate signature & payload

    alt Valid Webhook
        Validation->>Event Processor: Process event
        Event Processor->>Event Processor: Parse event type

        alt Environment Creation Event
            Event Processor->>Environment Service: Create environment
            Environment Service->>Firestore: Store environment data
            Environment Service->>Event Processor: Environment created
        else Status Update Event
            Event Processor->>Environment Service: Update status
            Environment Service->>Firestore: Update environment
        else Cleanup Event
            Event Processor->>Environment Service: Delete environment
            Environment Service->>Firestore: Mark for deletion
        end

        Event Processor->>Notification: Send notification
        Event Processor->>Webhook Endpoint: Success response
    else Invalid Webhook
        Validation->>Webhook Endpoint: Reject request
    end

    Webhook Endpoint->>Git Provider: HTTP response
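
The "validate signature & payload" and "parse event type" steps above could be sketched as follows. The document does not say which signature scheme Blueberry uses, so this example assumes an HMAC-SHA256 hex digest over the raw request body, which is a common convention among Git providers. The event type strings and the `handle_webhook` function are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Tuple

# Hypothetical mapping from event type to the Environment Service action.
EVENT_ACTIONS = {
    "environment.create": "create",
    "environment.status": "update",
    "environment.cleanup": "delete",
}


def verify_signature(secret: bytes, payload: bytes, signature_hex: str) -> bool:
    """Compare the expected HMAC-SHA256 digest with the one that was sent."""
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected, signature_hex)


def handle_webhook(secret: bytes, payload: bytes, signature_hex: str) -> Tuple[int, str]:
    """Return an (HTTP status, action or error message) pair."""
    if not verify_signature(secret, payload, signature_hex):
        return 401, "invalid signature"  # Reject before parsing anything.
    try:
        event = json.loads(payload)
    except ValueError:
        return 400, "invalid payload"
    if not isinstance(event, dict):
        return 400, "invalid payload"
    action = EVENT_ACTIONS.get(event.get("type"))
    if action is None:
        return 400, "unsupported event"
    return 200, action
```

Verifying the signature against the raw bytes before parsing keeps untrusted input away from the JSON parser and the event dispatch logic.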

Caching Strategy

graph LR
    subgraph "Cache Layers"
        L1[Application<br/>Memory Cache]
        L2[Redis<br/>Distributed Cache]
        L3[Database<br/>Persistent Storage]
    end

    subgraph "Cache Patterns"
        READ_THROUGH[Read-Through<br/>Pattern]
        WRITE_THROUGH[Write-Through<br/>Pattern]
        CACHE_ASIDE[Cache-Aside<br/>Pattern]
        TTL[Time-To-Live<br/>Expiration]
    end

    subgraph "Cache Data Types"
        ENVIRONMENT_LIST[Environment<br/>Lists]
        USER_SESSIONS[User<br/>Sessions]
        CONFIG_DATA[Configuration<br/>Data]
        API_RESPONSES[API<br/>Responses]
    end

    %% Cache hierarchy
    L1 --> L2
    L2 --> L3

    %% Pattern applications
    READ_THROUGH --> ENVIRONMENT_LIST
    WRITE_THROUGH --> USER_SESSIONS
    CACHE_ASIDE --> CONFIG_DATA
    TTL --> API_RESPONSES

    %% Data type mapping
    ENVIRONMENT_LIST --> L2
    USER_SESSIONS --> L2
    CONFIG_DATA --> L1
    API_RESPONSES --> L1

    classDef cache fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    classDef pattern fill:#fff3e0,stroke:#f57c00,stroke-width:2px
    classDef data fill:#e8f5e9,stroke:#388e3c,stroke-width:2px

    class L1,L2,L3 cache
    class READ_THROUGH,WRITE_THROUGH,CACHE_ASIDE,TTL pattern
    class ENVIRONMENT_LIST,USER_SESSIONS,CONFIG_DATA,API_RESPONSES data
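
The L1, L2, L3 hierarchy with time-to-live expiration can be illustrated with a small in-process model. This is only a sketch under stated assumptions: both cache layers are simulated with the same hypothetical `TTLCache` class (in the real system L2 would be Redis), the TTL values are arbitrary, and `load_from_db` is a placeholder for the persistent store.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny key-value cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # Injectable so expiry can be tested deterministically.
        self._items: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._items[key]  # Lazily evict the expired entry.
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._items[key] = (self.clock() + self.ttl, value)


def layered_get(
    key: str,
    l1: TTLCache,
    l2: TTLCache,
    load_from_db: Callable[[str], Optional[Any]],
) -> Optional[Any]:
    """Look in L1, then L2, then the database, filling upper layers on the way back."""
    value = l1.get(key)
    if value is not None:
        return value
    value = l2.get(key)
    if value is None:
        value = load_from_db(key)  # L3: persistent storage.
        if value is None:
            return None
        l2.set(key, value)
    l1.set(key, value)
    return value
```

Giving L1 a shorter TTL than L2 is a common choice, since process-local memory cannot be invalidated by other instances and so should go stale quickly.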

Data Security and Encryption

Data Classification

| Data Type | Classification | Storage | Encryption |
|-----------|----------------|---------|------------|
| User Profiles | PII | Firestore | At-rest + TLS |
| API Tokens | Secret | Secret Manager | AES-256 + TLS |
| Environment Configs | Internal | Firestore | At-rest + TLS |
| Application Logs | Internal | GCS | At-rest + TLS |
| Metrics Data | Internal | Cloud Monitoring | At-rest + TLS |
| Session Data | Temporary | Redis | TLS in-transit |

Encryption in Transit

graph LR
    subgraph "External Communication"
        USER[User Browser]
        CI[CI/CD Systems]
        GIT[Git Providers]
    end

    subgraph "TLS Termination"
        LB[Load Balancer<br/>TLS 1.3]
        INGRESS[Ingress Controller<br/>TLS 1.3]
    end

    subgraph "Internal Services"
        API[Blueberry API]
        FIRESTORE[Firestore]
        REDIS[Redis]
        SECRET_MGR[Secret Manager]
    end

    USER -->|HTTPS/TLS 1.3| LB
    CI -->|HTTPS/TLS 1.3| LB
    GIT -->|HTTPS/TLS 1.3| LB

    LB -->|TLS 1.3| INGRESS
    INGRESS -->|mTLS| API

    API -->|gRPC/TLS| FIRESTORE
    API -->|TLS| REDIS
    API -->|HTTPS/TLS| SECRET_MGR

    classDef external fill:#e3f2fd,stroke:#1976d2,stroke-width:2px
    classDef termination fill:#fff3e0,stroke:#f57c00,stroke-width:2px
    classDef internal fill:#e8f5e9,stroke:#388e3c,stroke-width:2px

    class USER,CI,GIT external
    class LB,INGRESS termination
    class API,FIRESTORE,REDIS,SECRET_MGR internal

Data Retention and Cleanup

Retention Policies

| Data Type | Retention Period | Cleanup Method |
|-----------|------------------|----------------|
| Environments | TTL-based (1-168 hours) | Automatic deletion |
| Audit Logs | 90 days | Automated archival |
| Metrics | 30 days | Cloud Monitoring retention |
| User Sessions | 24 hours | Redis TTL |
| API Tokens | User-defined (max 1 year) | Manual/automatic revocation |
| Failed Environments | 7 days | Background cleanup |
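
As a worked example of the TTL-based environment policy, the sketch below computes an expiry timestamp and rejects TTLs outside the 1 to 168 hour range from the table. The function names are hypothetical, and whether the real service rejects or clamps out-of-range values is not stated here; rejection is an assumption of this sketch.

```python
from datetime import datetime, timedelta, timezone

MIN_TTL_HOURS = 1
MAX_TTL_HOURS = 168  # 7 days, the upper bound in the retention table.


def environment_expiry(created_at: datetime, ttl_hours: int) -> datetime:
    """Return the moment an environment becomes eligible for cleanup."""
    if not MIN_TTL_HOURS <= ttl_hours <= MAX_TTL_HOURS:
        # Assumption: out-of-range TTLs are rejected rather than clamped.
        raise ValueError(
            f"ttl_hours must be between {MIN_TTL_HOURS} and {MAX_TTL_HOURS}"
        )
    return created_at + timedelta(hours=ttl_hours)


def is_expired(created_at: datetime, ttl_hours: int, now: datetime) -> bool:
    """True once `now` has reached or passed the expiry time."""
    return now >= environment_expiry(created_at, ttl_hours)


created = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
print(environment_expiry(created, 24).isoformat())  # 2024-01-02T12:00:00+00:00
```

Using timezone-aware timestamps throughout avoids comparing naive and aware datetimes, which Python refuses to do.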

Cleanup Workflow

graph TB
    SCHEDULER[Cleanup Scheduler<br/>Every 6 hours] --> CHECK_EXPIRED[Check Expired<br/>Environments]
    CHECK_EXPIRED --> MARK_TERMINATED[Mark as<br/>TERMINATED]
    MARK_TERMINATED --> DELETE_K8S[Delete Kubernetes<br/>Resources]
    DELETE_K8S --> DELETE_ARGOCD[Delete ArgoCD<br/>Applications]
    DELETE_ARGOCD --> ARCHIVE_DATA[Archive to<br/>GCS]
    ARCHIVE_DATA --> DELETE_FIRESTORE[Delete from<br/>Firestore]

    SCHEDULER --> CHECK_AUDIT[Check Audit<br/>Log Age]
    CHECK_AUDIT --> ARCHIVE_LOGS[Archive Old<br/>Logs]

    SCHEDULER --> CHECK_SESSIONS[Check Session<br/>Expiry]
    CHECK_SESSIONS --> CLEAR_REDIS[Clear Expired<br/>Sessions]

    classDef process fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
    class SCHEDULER,CHECK_EXPIRED,MARK_TERMINATED,DELETE_K8S,DELETE_ARGOCD,ARCHIVE_DATA,DELETE_FIRESTORE,CHECK_AUDIT,ARCHIVE_LOGS,CHECK_SESSIONS,CLEAR_REDIS process
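
The environment branch of the cleanup workflow is strictly ordered, and the ordering matters: the Firestore record should not be deleted before the data has been archived. A minimal sketch of such an ordered runner is shown below. The step names mirror the diagram, but the `run_cleanup` function, the callable-per-step design, and the stop-on-first-failure behavior are assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Step order taken from the cleanup diagram; the identifiers are illustrative.
CLEANUP_STEPS = (
    "mark_terminated",
    "delete_k8s_resources",
    "delete_argocd_application",
    "archive_to_gcs",
    "delete_from_firestore",
)


def run_cleanup(env_id: str, actions: Dict[str, Callable[[str], None]]) -> List[str]:
    """Run each step in order and return the names of the steps that completed.

    Assumption: the run stops at the first failing step so that later,
    destructive steps never execute after an earlier one has failed.
    """
    completed: List[str] = []
    for step in CLEANUP_STEPS:
        try:
            actions[step](env_id)
        except Exception:
            break  # Leave the remaining steps for the next scheduler run.
        completed.append(step)
    return completed
```

Because the scheduler runs every 6 hours, a partially cleaned environment would be picked up again on the next pass, provided each step is safe to repeat.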

Last Updated: January 2024

Document ID: architecture/diagrams/data-flow