Configuration Workflows

This directory contains documentation for Blueberry IDP's configuration workflows and environment management processes.

Overview

Configuration workflows define how a configuration change moves through the system, from user input to a deployed environment. Each workflow validates its input, applies system-managed overrides, and checks the health of the result so that deployments stay consistent, secure, and reliable.

Core Workflows

1. Environment Creation Workflow

The environment creation workflow handles the complete lifecycle of provisioning new environments:

Input Phase

  • User submits environment creation request
  • System validates request parameters
  • Configuration set is loaded or created
  • Base Helm values are retrieved

Validation Phase

  • User overrides are validated against schema
  • Non-overridable fields are filtered out
  • Resource specifications are checked
  • Warnings are generated for each rejected field; warnings are logged and do not abort the request

Processing Phase

  • Runtime overrides are applied
  • Computed values are generated
  • Final Helm values are assembled
  • ArgoCD Application is created

Deployment Phase

  • ArgoCD syncs the application
  • Kubernetes resources are created
  • Health checks are performed
  • Environment status is updated

2. Configuration Update Workflow

The configuration update workflow handles changes to existing environments:

Change Detection

  • User submits configuration changes
  • System compares with current configuration
  • Validation is performed on changes
  • Impact analysis is conducted
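
The comparison step above can be sketched as a diff between two override mappings. This is a minimal illustration, assuming overrides are stored as flat dictionaries keyed by dotted field path (as in the runtime overrides used elsewhere in this document); `diff_overrides` is a hypothetical helper name, not part of the Blueberry codebase.

```python
from typing import Any


def diff_overrides(
    current: dict[str, Any], proposed: dict[str, Any]
) -> dict[str, dict[str, Any]]:
    """Compare two flat override mappings keyed by dotted field path."""
    # Keys present only in the proposed configuration
    added = {k: proposed[k] for k in proposed.keys() - current.keys()}
    # Keys present only in the current configuration
    removed = {k: current[k] for k in current.keys() - proposed.keys()}
    # Keys present in both whose values differ
    changed = {
        k: {"old": current[k], "new": proposed[k]}
        for k in current.keys() & proposed.keys()
        if current[k] != proposed[k]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

An empty result in all three groups means the request is a no-op, which lets the workflow skip validation and deployment entirely.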

Validation and Processing

  • Changes are validated against constraints
  • Runtime overrides are recalculated
  • Computed values are updated
  • Rollback plan is prepared

Deployment

  • ArgoCD Application is updated
  • Kubernetes resources are modified
  • Rolling update is performed
  • Health checks validate changes

3. Secret Management Workflow

The secret management workflow handles sensitive configuration data:

Secret Creation

  • Secrets are created in Google Secret Manager
  • External Secrets Operator synchronizes to Kubernetes
  • Service accounts are configured with proper permissions
  • Access is granted through Workload Identity

Secret Rotation

  • New secret versions are created
  • External Secrets Operator detects changes
  • Kubernetes Secrets are updated
  • Pods are restarted to load new secrets

Secret Access

  • Applications request secrets through lazy loading
  • Settings class manages secret caching
  • The Settings class falls back to environment variables when they are available
  • Access is logged for audit purposes
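
Lazy loading with caching and an environment-variable fallback might look like the sketch below. It is an illustration only: the class name `LazySecretSettings`, the injected `fetch_secret` callable, and the convention that `redis-password` maps to the variable `REDIS_PASSWORD` are all assumptions, and the real Settings class may order the lookups differently.

```python
import os
from collections.abc import Callable


class LazySecretSettings:
    """Fetch secrets on first access and cache them afterwards."""

    def __init__(self, fetch_secret: Callable[[str], str]) -> None:
        # fetch_secret stands in for a Secret Manager lookup (hypothetical)
        self._fetch_secret = fetch_secret
        self._cache: dict[str, str] = {}

    def get_secret(self, name: str) -> str:
        if name in self._cache:
            return self._cache[name]
        try:
            value = self._fetch_secret(name)
        except Exception:
            # Assumed naming convention: "redis-password" -> REDIS_PASSWORD
            env_name = name.upper().replace("-", "_")
            if env_name not in os.environ:
                raise  # no fallback available, surface the original error
            value = os.environ[env_name]
        self._cache[name] = value
        return value
```

Because the value is cached after the first successful lookup, a rotated secret is only picked up by a new process, which is consistent with restarting pods after rotation.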

Workflow Implementation

Environment Creation Process

async def create_environment(
    env_request: EnvironmentCreate,
    user: User,
    config_set: ConfigSet | None = None
) -> Environment:
    """Complete environment creation workflow."""

    # 1. Validate request
    if not is_valid_environment_name(env_request.name):
        raise ValueError("Invalid environment name")

    # 2. Load configuration
    if config_set:
        user_overrides = config_set.helm_overrides or {}
    else:
        user_overrides = {}

    # 3. Validate overrides
    validation_result = validate_user_overrides(user_overrides)
    if validation_result["warnings"]:
        # Log warnings but continue
        logger.warning("Configuration validation warnings",
                      warnings=validation_result["warnings"])

    # 4. Load base values
    base_values = load_base_helm_values("charts/environment/values.yaml")

    # 5. Apply runtime overrides
    runtime_overrides = {
        "environment.name": env_request.name,
        "environment.namespace": f"ephemeral-{env_request.name}",
        "environment.ttl": env_request.ttl or "72h",
        "backend.image.tag": env_request.backend_image_tag,
        "backend.gitCommitSha": env_request.git_commit_sha,
        # ... other required overrides
    }

    final_values = apply_runtime_overrides(
        base_values,
        {**validation_result["filtered_overrides"], **runtime_overrides}
    )

    # 6. Create ArgoCD Application
    app_manifest = render_argocd_application(env_request.name, final_values)
    await create_argocd_application(app_manifest)

    # 7. Save environment metadata
    environment = Environment(
        name=env_request.name,
        namespace=f"ephemeral-{env_request.name}",
        status=EnvironmentStatus.CREATING,
        created_by=user.uid,
        helm_values=final_values,
        config_set_id=config_set.id if config_set else None,
    )

    await save_environment(environment)
    return environment
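
The workflow above passes overrides as dotted paths such as "backend.image.tag". The following is a minimal sketch of how `apply_runtime_overrides` could merge such paths into nested Helm values; it assumes dots always separate nesting levels and is not taken from the actual implementation in blueberry/models/helm_values.py.

```python
import copy
from typing import Any


def apply_runtime_overrides(
    base_values: dict[str, Any], overrides: dict[str, Any]
) -> dict[str, Any]:
    """Return a copy of base_values with dotted-path overrides applied."""
    result = copy.deepcopy(base_values)  # never mutate the loaded chart values
    for path, value in overrides.items():
        *parents, leaf = path.split(".")
        node = result
        for part in parents:
            # Create intermediate mappings, replacing non-dict values if needed
            if not isinstance(node.get(part), dict):
                node[part] = {}
            node = node[part]
        node[leaf] = value
    return result
```

Since the runtime overrides are merged after the filtered user overrides in the call above, a system-managed key always wins when both mappings contain it.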

Configuration Validation Process

def validate_environment_configuration(
    overrides: dict,
    base_values: dict
) -> ValidationResult:
    """Validate configuration overrides."""

    warnings = []
    filtered_overrides = {}

    # 1. Schema validation
    for key, value in overrides.items():
        if not is_valid_field_path(key, base_values):
            warnings.append({
                "field": key,
                "type": "invalid_field",
                "message": f"Field '{key}' is not valid in the schema"
            })
            continue

        # 2. Check against non-overridable fields
        if key in NON_OVERRIDABLE_FIELDS:
            warnings.append({
                "field": key,
                "type": "non_overridable",
                "message": f"Field '{key}' is system-managed and cannot be overridden"
            })
            continue

        # 3. Check for resource specifications
        if contains_resource_spec(key, value):
            warnings.append({
                "field": key,
                "type": "resource_spec",
                "message": f"Field '{key}' contains resource specifications not allowed in GKE Autopilot"
            })
            continue

        # 4. Environment variable validation
        if key.startswith(("backend.env.", "frontend1.env.", "frontend2.env.")):
            if not is_valid_environment_variable(key, value):
                warnings.append({
                    "field": key,
                    "type": "invalid_env_var",
                    "message": f"Environment variable '{key}' is not valid"
                })
                continue

        # Field passes validation
        filtered_overrides[key] = value

    return ValidationResult(
        warnings=warnings,
        filtered_overrides=filtered_overrides,
        is_valid=not warnings
    )
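
The validation function relies on helpers that are not shown in this document. The sketch below gives one plausible shape for two of them, `is_valid_field_path` and `contains_resource_spec`. Both bodies are assumptions: the set of resource key names is a guess, and a strict existence check like this one would reject new keys under open-ended maps such as "backend.env", so the real helper presumably treats those paths specially.

```python
from typing import Any

# Assumed key names that mark a Kubernetes resource specification
RESOURCE_KEYS = {"resources", "requests", "limits"}


def is_valid_field_path(path: str, base_values: dict[str, Any]) -> bool:
    """Return True if every segment of the dotted path exists in base_values."""
    node: Any = base_values
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return False
        node = node[part]
    return True


def contains_resource_spec(path: str, value: Any) -> bool:
    """Return True if the path or a nested value names a resource block."""
    if RESOURCE_KEYS & set(path.split(".")):
        return True
    if isinstance(value, dict):
        # Recurse so that overrides supplied as nested mappings are caught too
        return any(contains_resource_spec(str(k), v) for k, v in value.items())
    return False
```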

Secret Loading Workflow

async def load_secrets_for_environment(
    environment: Environment
) -> dict[str, str]:
    """Load secrets required for environment deployment."""

    secrets = {}

    # 1. Load from Google Secret Manager
    secret_manager_client = get_secret_manager_client()

    # 2. Load required secrets
    required_secrets = [
        "mysql-root-password",
        "redis-password",
        "gitlab-token",
        "container-registry-key"
    ]

    for secret_name in required_secrets:
        try:
            secret_value = await access_secret(
                secret_manager_client,
                f"projects/{settings.project_id}/secrets/{secret_name}/versions/latest"
            )
            secrets[secret_name] = secret_value
        except Exception as e:
            logger.error(f"Failed to load secret {secret_name}: {e}")
            # Only the MySQL root password has a fallback (a freshly generated
            # value); any other failed secret is reported as missing in step 3.
            if secret_name == "mysql-root-password":
                secrets[secret_name] = generate_random_password()

    # 3. Validate all required secrets are available
    missing_secrets = [s for s in required_secrets if s not in secrets]
    if missing_secrets:
        raise ConfigurationError(f"Missing required secrets: {missing_secrets}")

    return secrets

Workflow States and Transitions

Environment States

  • CREATING: Environment is being provisioned
  • ACTIVE: Environment is running and accessible
  • UPDATING: Configuration changes are being applied
  • DELETING: Environment is being torn down
  • FAILED: Error occurred during lifecycle operation
  • EXPIRED: Environment has exceeded its TTL
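
One way to keep these states consistent is to encode the permitted transitions in a table and reject everything else. The state names below come from the list above, but the transition table itself is an assumption made for illustration; this document does not specify which transitions are legal.

```python
from enum import Enum


class EnvironmentStatus(str, Enum):
    CREATING = "creating"
    ACTIVE = "active"
    UPDATING = "updating"
    DELETING = "deleting"
    FAILED = "failed"
    EXPIRED = "expired"


S = EnvironmentStatus

# Assumed transition table: source state -> states it may move to
ALLOWED_TRANSITIONS: dict[EnvironmentStatus, set[EnvironmentStatus]] = {
    S.CREATING: {S.ACTIVE, S.FAILED},
    S.ACTIVE: {S.UPDATING, S.DELETING, S.EXPIRED, S.FAILED},
    S.UPDATING: {S.ACTIVE, S.FAILED},
    S.FAILED: {S.UPDATING, S.DELETING},
    S.EXPIRED: {S.DELETING},
    S.DELETING: {S.FAILED},
}


def transition(current: EnvironmentStatus, target: EnvironmentStatus) -> EnvironmentStatus:
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"Illegal transition: {current.name} -> {target.name}")
    return target
```

Routing every status update through a single function like `transition` gives one place to add audit logging for state changes.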

Configuration States

  • PENDING: Configuration changes are queued
  • VALIDATING: Changes are being validated
  • APPLYING: Changes are being deployed
  • ACTIVE: Configuration is successfully applied
  • FAILED: Configuration deployment failed
  • ROLLED_BACK: Previous configuration restored

Secret States

  • AVAILABLE: Secret is accessible and current
  • SYNCING: Secret is being synchronized
  • EXPIRED: Secret needs rotation
  • FAILED: Secret access failed
  • ROTATING: Secret is being rotated

Error Handling and Recovery

Validation Errors

  • Generate user-friendly error messages
  • Provide suggestions for fixing issues
  • Log detailed error information
  • Allow partial success with warnings

Deployment Errors

  • Implement retry logic for transient failures
  • Provide rollback mechanisms
  • Monitor deployment health
  • Alert on persistent failures
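
Retry logic for transient failures is commonly implemented with exponential backoff. The sketch below assumes a hypothetical `TransientError` exception that marks retryable failures (for example, a timed-out sync request); how Blueberry actually classifies errors is not stated in this document.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


async def retry_transient(
    operation: Callable[[], Awaitable[T]],
    attempts: int = 3,
    base_delay: float = 1.0,
) -> T:
    """Run operation, retrying TransientError with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except TransientError:
            if attempt == attempts:
                raise  # persistent failure: surface it so alerting can fire
            # Wait base_delay, then 2x, 4x, ... before the next attempt
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")
```

Only `TransientError` is retried; any other exception propagates immediately, so validation or permission errors are not masked by repeated attempts.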

Secret Management Errors

  • Implement fallback mechanisms
  • Provide graceful degradation
  • Log security events
  • Monitor secret access patterns

Monitoring and Observability

Workflow Metrics

  • Configuration validation success rate
  • Environment creation time
  • Secret access latency
  • Deployment success rate

Audit Logging

  • All configuration changes
  • Secret access events
  • Validation failures
  • Deployment outcomes

Health Checks

  • Environment health status
  • Configuration consistency
  • Secret availability
  • Service dependencies

Best Practices

Workflow Design

  • Implement idempotent operations
  • Use clear state transitions
  • Provide rollback mechanisms
  • Include proper error handling

Configuration Management

  • Validate all user inputs
  • Use structured logging
  • Implement audit trails
  • Monitor configuration drift

Security

  • Minimize secret exposure
  • Use least privilege access
  • Implement proper secret rotation
  • Monitor for security violations

Related Files

  • blueberry/services/environment_creator.py - Environment creation logic
  • blueberry/api/config_sets.py - Configuration set management
  • blueberry/models/helm_values.py - Validation and override logic
  • blueberry/infrastructure/secret_manager.py - Secret management
  • blueberry/services/argocd.py - ArgoCD integration

Document ID: workflows/core/configuration-management/configuration-workflows