Docker Configuration

Overview

Docker containers package applications with their dependencies into isolated, reproducible units. For Python/Django applications, Docker enables consistent deployment across development, staging, and production environments while optimizing for fast builds, small image sizes, and security.

Why Docker for Django?

Key Benefits

  • Reproducible builds - same image works everywhere
  • Dependency isolation - no conflicts with host system
  • Fast deployments - immutable images deploy quickly
  • Resource efficiency - containers share the host kernel
  • Layer caching - rebuild only what changed
  • Security - isolated execution environment

Challenges Addressed

  1. "Works on my machine" syndrome: Docker ensures identical environments
  2. Dependency management: System packages and Python packages in one place
  3. Version conflicts: Each container has its own dependencies
  4. Deployment complexity: Single artifact (image) contains everything
  5. Scaling: Container orchestrators (ECS, Kubernetes) manage instances

Dockerfile Architecture

Multi-Stage Build Pattern

Multi-stage builds separate concerns and minimize final image size:

# Stage 1: Base - Common dependencies
FROM python:3.13.5-slim AS base
# ... install system dependencies

# Stage 2: Builder - Compile/download artifacts
FROM base AS builder
# ... build wheels, compile assets

# Stage 3: Production - Minimal runtime
FROM base AS production
# ... copy only runtime artifacts

# Stage 4: Development - Include dev tools
FROM base AS development
# ... install dev dependencies and tools

Theory: Each stage can build on earlier stages, but only the target stage (the last one by default, or the one selected with --target) becomes the final image. Intermediate stages exist only at build time. This enables:

  1. Smaller production images: Exclude build tools and dev dependencies
  2. Faster builds: Change only the affected stage
  3. Clear separation: Base (shared) vs production vs development
  4. Layer reuse: Common layers shared across stages
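
For example, the --target flag selects which stage to build (image tags here are illustrative):

# Build only the production stage
docker build --target production -t myapp:prod .

# Build the development stage for local work
docker build --target development -t myapp:dev .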

Stage Breakdown

Base Stage

The base stage contains dependencies needed by all other stages:

ARG PYTHON_VERSION=3.13.5

FROM python:${PYTHON_VERSION}-slim AS base

ENV PATH="/root/.cargo/bin:/root/.local/bin:$PATH" \
    PROMPT_COMMAND='history -a' \
    TERM=xterm-color \
    VIRTUAL_ENV=/usr/local \
    PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1

WORKDIR /app

RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install --no-install-recommends -y \
        build-essential \
        ca-certificates \
        curl \
        default-libmysqlclient-dev \
        git \
        libssl3 \
        libxml2 \
        libxmlsec1 \
        pkg-config \
        && rm -rf /var/lib/apt/lists/*

# Install uv (needed by later stages to compile and install requirements)
RUN curl -LsSf https://astral.sh/uv/install.sh | sh

Key Elements:

Build Arguments:

ARG PYTHON_VERSION=3.13.5

Build args enable parameterized builds. The same Dockerfile can build multiple Python versions:

docker build --build-arg PYTHON_VERSION=3.12.7 .
docker build --build-arg PYTHON_VERSION=3.13.5 .

Theory: Build args provide flexibility without maintaining separate Dockerfiles. They're resolved at build time and don't persist in the final image.

Environment Variables:

ENV PATH="/root/.cargo/bin:/root/.local/bin:$PATH" \
    PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1

Theory:

  • PATH: Include tool installation paths (Rust cargo, Python local)
  • PYTHONUNBUFFERED: Force Python to output immediately (important for log aggregation)
  • PYTHONDONTWRITEBYTECODE: Skip .pyc file generation (not needed in containers)
  • VIRTUAL_ENV: Point uv/pip at the system interpreter in /usr/local (no separate venv needed)

Cache Mounts:

RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install ...

Theory: Cache mounts persist across builds, dramatically speeding up rebuilds. The package manager's cache survives, so packages aren't re-downloaded. sharing=locked serializes access to the mount so that concurrent builds cannot corrupt the shared cache.

Without cache: Every build downloads packages from scratch. With cache: Only new or updated packages download.

Layer Optimization:

&& rm -rf /var/lib/apt/lists/*

Theory: Combine installation and cleanup in one RUN command. Docker creates one layer per RUN instruction. Separate cleanup would add a layer but not reduce image size (previous layer still contains the files).

Production Stage

Production images prioritize size and security:

FROM base AS production

ARG PYTHON_VERSION=3.13.5

# Copy and compile requirements
COPY requirements/ /app/requirements/
RUN uv pip compile requirements/requirements-production.in \
    -o requirements/requirements-production.txt \
    --python-version=${PYTHON_VERSION}

# Install to system Python
RUN uv pip install --system -r requirements/requirements-production.txt

# Copy application code
COPY . /app/

# Create non-root user
RUN groupadd -r appuser && useradd -r -g appuser appuser
RUN chown -R appuser:appuser /app

USER appuser

EXPOSE 8000

CMD ["gunicorn", "myapp.wsgi:application", \
     "--bind", "0.0.0.0:8000", \
     "--workers", "4"]

Key Patterns:

Inline Requirement Compilation:

RUN uv pip compile requirements/requirements-production.in \
    -o requirements/requirements-production.txt

Theory: Compiling requirements inside the Dockerfile ensures they match the target Python version and platform exactly. Pre-compiled requirements might target the wrong platform (macOS vs Linux) or Python version.

System Installation:

RUN uv pip install --system -r requirements/requirements-production.txt

Theory: --system installs to /usr/local/lib/python3.x/site-packages instead of creating a virtual environment. Containers don't need virtual environments; they already provide isolation. System installation:

  • Reduces image size (no venv overhead)
  • Simplifies PATH management
  • Matches Python base image expectations

Copy Timing:

RUN uv pip install -r requirements.txt
COPY . /app/

Theory: Copy application code AFTER installing dependencies. Dependencies change less frequently than code. Docker caches layers; this ordering maximizes cache hits:

  1. Code changes → Only rerun COPY (fast)
  2. Dependency changes → Rerun install + COPY (slower)

Security Hardening:

RUN groupadd -r appuser && useradd -r -g appuser appuser
RUN chown -R appuser:appuser /app
USER appuser

Theory: Never run containers as root. Create a non-root user and switch to it. If the container is compromised, the attacker has limited privileges. The -r flag creates a system account (UID < 1000), which can't log in interactively.

Port Exposure:

EXPOSE 8000

Theory: EXPOSE is documentation; it doesn't actually publish the port. It tells container orchestrators which ports the container listens on. Publishing happens at runtime (-p 8000:8000).

Default Command:

CMD ["gunicorn", "myapp.wsgi:application", \
     "--bind", "0.0.0.0:8000", \
     "--workers", "4"]

Theory: CMD provides the default command when no command is specified; it can be overridden at runtime. Use the exec (JSON) form (["cmd", "arg"]) instead of the shell form (cmd arg) so that no intermediate shell process is spawned and signals such as SIGTERM reach the application directly.
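
For example, the same image can run one-off management commands by overriding CMD at runtime (image tag illustrative):

# Arguments after the image name replace CMD
docker run --rm myapp:prod python manage.py migrate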

Development Stage

Development images include tooling and test frameworks:

FROM base AS development

ARG PYTHON_VERSION=3.13.5

# Copy and compile development requirements
COPY requirements/ /app/requirements/
RUN uv pip compile requirements/requirements-production.in \
    -o requirements/requirements-production.txt \
    --python-version=${PYTHON_VERSION}
RUN uv pip compile requirements/requirements-dev.in \
    -o requirements/requirements-dev.txt \
    --python-version=${PYTHON_VERSION}

# Install development dependencies
RUN uv pip install --system -r requirements/requirements-dev.txt \
    pytest-playwright playwright

# Install Playwright browser dependencies
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install -y --no-install-recommends \
    libdbus-1-3 \
    libatk1.0-0 \
    libatk-bridge2.0-0 \
    libcups2 \
    libxkbcommon0 \
    libxcomposite1 \
    libxdamage1 \
    libxrandr2 \
    libgbm1 \
    libasound2 \
    && rm -rf /var/lib/apt/lists/*

# Install Playwright browsers
RUN playwright install --with-deps chromium

# Copy application code
COPY . /app/

# Create test-results directory
RUN mkdir -p /app/test-results

EXPOSE 8000

Key Differences from Production:

Development Dependencies:

RUN uv pip install --system -r requirements/requirements-dev.txt

Development stage includes:

  • Test frameworks (pytest, pytest-django)
  • Code quality tools (ruff, mypy)
  • Debugging tools (ipython, django-debug-toolbar)
  • End-to-end testing (playwright)
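
A minimal requirements-dev.in might look like the following (package list illustrative; actual contents depend on the project):

# requirements/requirements-dev.in (illustrative)
-r requirements-production.in
pytest
pytest-django
ruff
mypy
ipython
django-debug-toolbar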

Browser Testing Support:

RUN playwright install --with-deps chromium

Theory: Modern web applications need browser testing. Playwright provides cross-browser testing capabilities. Installing --with-deps includes system dependencies for running browsers headlessly. Chromium is chosen over Firefox/WebKit for smaller image size.

No User Switching:

# No USER command in development

Theory: Development containers run as root for convenience. Developers need to install packages, modify files, and debug without permission issues. This is acceptable in development but never in production.

Layer Optimization

Understanding Layers

Each Dockerfile instruction creates a layer:

FROM python:3.13.5-slim     # Layer 1: Base image
RUN apt-get update          # Layer 2: Package index
RUN apt-get install curl    # Layer 3: Install curl
COPY requirements.txt .     # Layer 4: Copy file
RUN pip install -r reqs.txt # Layer 5: Install packages

Theory: Layers are immutable and stacked. Each layer stores the differences from the previous layer. Docker caches layers; unchanged layers reuse the cache.

Cache behavior: if requirements.txt changes, layers 4-5 rebuild, but layers 1-3 come from cache.

Optimization Strategies

Inefficient:

RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y git
RUN rm -rf /var/lib/apt/lists/*

Optimized:

RUN apt-get update && \
    apt-get install -y curl git && \
    rm -rf /var/lib/apt/lists/*

Theory: Four RUN commands = four layers. One RUN command = one layer. Combining commands:

  • Reduces layer count
  • Enables cleanup in the same layer (reduces size)
  • Speeds up build (less layer management overhead)

Order by Change Frequency

Inefficient:

COPY . /app/
RUN pip install -r requirements.txt

Optimized:

COPY requirements.txt /app/
RUN pip install -r requirements.txt
COPY . /app/

Theory: Application code changes frequently; dependencies change rarely. Copying code first invalidates the cache for all subsequent layers. Copying requirements first maximizes cache hits.

Change scenarios:

  1. Code change only: Rebuild only COPY (fast)
  2. Requirement change: Rebuild RUN pip install + COPY (slower)
  3. Both change: Rebuild both (expected)

Use .dockerignore

# .dockerignore
.git/
.gitignore
.env*
!.env.example
venv/
venv-*/
__pycache__/
*.pyc
*.pyo
*.pyd
.pytest_cache/
.coverage
htmlcov/
node_modules/
.DS_Store
*.log

Theory: .dockerignore excludes files from the build context. This:

  • Reduces context size (faster upload to Docker daemon)
  • Prevents secrets from being copied into images
  • Excludes unnecessary files (tests, docs, caches)
  • Speeds up COPY operations

Pattern: Exclude everything development-related that isn't needed at runtime.

Leverage BuildKit Cache Mounts

RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

Theory: Pip downloads packages to ~/.cache/pip. Without cache mounts, this cache is lost after the RUN command. With cache mounts, the cache persists across builds, avoiding re-downloads.

Supported package managers:

  • Python: /root/.cache/pip or /root/.cache/uv
  • npm: /root/.npm
  • apt: /var/cache/apt and /var/lib/apt
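
The same pattern works with uv; a sketch, assuming uv is installed in the image and the compiled requirements file already exists:

RUN --mount=type=cache,target=/root/.cache/uv \
    uv pip install --system -r requirements/requirements-production.txt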

Multi-Stage Builds for Size Reduction

FROM python:3.13.5 AS builder
COPY requirements.txt .
RUN pip install --user -r requirements.txt

FROM python:3.13.5-slim
COPY --from=builder /root/.local /root/.local
ENV PATH=/root/.local/bin:$PATH

Theory: The builder stage uses the full Python image (includes compilers). The final stage uses the slim image. Copying only /root/.local (installed packages) excludes build tools, reducing image size by 100s of MB.

Health Checks

Application Health Check

HEALTHCHECK --interval=30s --timeout=5s --start-period=40s --retries=3 \
  CMD curl -f http://localhost:8000/health/ || exit 1

Theory: Health checks let Docker/ECS monitor container health. The orchestrator can:

  • Restart unhealthy containers
  • Remove unhealthy containers from load balancers
  • Prevent deployments of unhealthy containers

Parameters:

  • interval: How often to check (30s = every 30 seconds)
  • timeout: How long to wait for response (5s)
  • start-period: Grace period for container startup (40s)
  • retries: Failed checks before marking unhealthy (3)

Django Health Check Endpoint

Implement a health check view:

# views/health.py
from django.http import JsonResponse
from django.db import connection

def health_check(request):
    """Health check endpoint for container orchestration."""
    try:
        # Check database connectivity
        connection.ensure_connection()

        # Check cache connectivity
        from django.core.cache import cache
        cache.set('health_check', 'ok', 1)

        return JsonResponse({
            'status': 'healthy',
            'database': 'connected',
            'cache': 'connected'
        })
    except Exception as e:
        return JsonResponse({
            'status': 'unhealthy',
            'error': str(e)
        }, status=503)

Theory: Health checks should verify critical dependencies:

  1. Database: Can the app query the database?
  2. Cache: Is Redis/Memcached accessible?
  3. External APIs: Are critical external services reachable?

Return 200 for healthy, 503 for unhealthy. Keep checks fast (<1s).
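
The view also needs a URL route; a minimal sketch, with the import path assumed to match the layout above:

# urls.py (import path assumed)
from django.urls import path

from myapp.views.health import health_check

urlpatterns = [
    path('health/', health_check, name='health_check'),
]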

ECS-Specific Considerations

Logging Configuration

# No log driver setup inside the image - ECS's awslogs driver captures stdout/stderr
ENV PYTHONUNBUFFERED=1

Theory: ECS captures stdout/stderr and sends to CloudWatch. PYTHONUNBUFFERED=1 ensures logs appear immediately without buffering. Django logging should output to stdout:

import sys

LOGGING = {
    'version': 1,
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'stream': sys.stdout,
        },
    },
    'root': {
        'handlers': ['console'],
        'level': 'INFO',
    },
}

Secrets Management

Never include secrets in the image:

# ❌ WRONG - secrets in image
ENV DATABASE_PASSWORD=secretpassword

# ✅ RIGHT - inject at runtime
# (No ENV command - pass via ECS task definition)

Theory: Docker images are immutable artifacts. Anyone with access to the image can extract environment variables. ECS task definitions inject secrets from:

  • AWS Systems Manager Parameter Store
  • AWS Secrets Manager
  • Environment variables (for non-sensitive config)

Task Definition Integration

The Dockerfile complements the ECS task definition:

{
  "family": "myapp-task",
  "containerDefinitions": [
    {
      "name": "app",
      "image": "123456789.dkr.ecr.us-east-1.amazonaws.com/myapp:latest",
      "memory": 512,
      "cpu": 256,
      "essential": true,
      "portMappings": [
        {
          "containerPort": 8000,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {
          "name": "DJANGO_SETTINGS_MODULE",
          "value": "myapp.settings.production"
        }
      ],
      "secrets": [
        {
          "name": "DATABASE_PASSWORD",
          "valueFrom": "arn:aws:ssm:us-east-1:123:parameter/myapp/db-password"
        }
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/myapp",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "app"
        }
      }
    }
  ]
}

Theory: Separation of concerns:

  • Dockerfile: Application packaging (what's inside)
  • Task Definition: Runtime configuration (how it runs)

Security Best Practices

Minimal Base Images

# ✅ Use slim variants
FROM python:3.13.5-slim

# ❌ Avoid full images
FROM python:3.13.5  # 900MB vs 150MB

Theory: Slim images exclude compilers, man pages, and utilities. Smaller images:

  • Reduce attack surface (fewer packages = fewer vulnerabilities)
  • Faster pulls (less data to download)
  • Faster deployments
  • Lower storage costs

Scan for Vulnerabilities

# Scan image for vulnerabilities (Docker Scout replaces the retired docker scan)
docker scout cves myapp:latest

# Trivy scanner (more comprehensive)
trivy image myapp:latest

Theory: Images inherit vulnerabilities from base images and installed packages. Regular scanning detects CVEs. Integrate scanning into CI/CD:

# .github/workflows/build.yml
- name: Scan image
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: myapp:${{ github.sha }}
    severity: 'CRITICAL,HIGH'

Least Privilege

# Create non-root user
RUN groupadd -r appuser && useradd -r -g appuser appuser
USER appuser

Theory: Containers should never run as root in production. If compromised, the attacker inherits the container's privileges. Non-root users:

  • Prevent privilege escalation
  • Limit file system access
  • Comply with security policies (PCI-DSS, SOC2)
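
A quick runtime check confirms the effective user (image tag illustrative):

# Should report appuser, not uid=0 (root)
docker run --rm myapp:prod id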

Avoid Secrets in Layers

# ❌ WRONG - secret in layer history
RUN echo "API_KEY=secret123" > /app/.env

# ✅ RIGHT - mount secret at runtime
# (No secret in image - pass via environment variable)

Theory: Even if you delete a file in a later layer, it still exists in the layer where it was created. Docker layer history is immutable. Secrets should only be injected at runtime.
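
If a secret is genuinely needed during the build (for example, a private package index token), BuildKit can mount it for a single RUN step without writing it into any layer; a sketch, with the secret id, index URL, and file name illustrative:

# Dockerfile: the secret is available only while this RUN executes
RUN --mount=type=secret,id=pip_token \
    PIP_INDEX_URL="https://user:$(cat /run/secrets/pip_token)@pypi.example.com/simple" \
    pip install -r requirements.txt

# Build command supplying the secret from a local file (never committed)
docker build --secret id=pip_token,src=./pip_token.txt -t myapp:latest .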

Build Optimization

BuildKit Features

Enable BuildKit for advanced features (it is the default builder in recent Docker releases, but the variable makes it explicit):

export DOCKER_BUILDKIT=1
docker build -t myapp:latest .

Features:

  1. Parallel build stages: Build independent stages concurrently
  2. Cache mounts: Persist package manager caches
  3. Better caching: Smarter cache invalidation logic
  4. Secrets: Mount secrets during build without including in image

Build Cache Strategies

Remote Cache

# Push cache to registry
docker buildx build \
  --cache-to=type=registry,ref=myapp:cache \
  --tag myapp:latest \
  .

# Pull cache in CI
docker buildx build \
  --cache-from=type=registry,ref=myapp:cache \
  --tag myapp:latest \
  .

Theory: CI/CD systems don't have local cache. Remote cache stores layers in a registry, enabling cache hits across builds and runners.

Inline Cache

docker build \
  --build-arg BUILDKIT_INLINE_CACHE=1 \
  --tag myapp:latest \
  .

Theory: Inline cache embeds cache metadata in the image itself. Subsequent builds can use the image as a cache source without a separate cache artifact.
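
A subsequent build (for example, on another machine after pulling the image) can then point --cache-from at that image:

# Reuse the embedded cache metadata from the previously built image
docker build \
  --cache-from=myapp:latest \
  --tag myapp:latest \
  .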

Parallel Stage Building

FROM base AS build-backend
COPY requirements.txt .
RUN python -m venv /app/venv && /app/venv/bin/pip install -r requirements.txt

FROM base AS build-frontend
# Assumes Node.js/npm are available in the base stage
COPY . .
RUN npm install && npm run build

FROM base AS final
COPY --from=build-backend /app/venv /app/venv
COPY --from=build-frontend /app/dist /app/static

Theory: build-backend and build-frontend have no dependencies on each other. BuildKit builds them in parallel, reducing total build time.

Testing Docker Images

Testing Strategy

Test images before pushing to production:

# Build test stage
docker build --target testing -t myapp:test .

# Run tests in container
docker run --rm myapp:test pytest

# Test production image
docker build --target production -t myapp:prod .
docker run -p 8000:8000 myapp:prod
curl http://localhost:8000/health/

Theory: Separate testing stage ensures tests run in an environment identical to production. This catches environment-specific issues early.

Testing Stage

FROM development AS testing

# Copy test fixtures and data
COPY tests/ /app/tests/
COPY pytest.ini /app/

# Run tests
RUN pytest --cov=myapp --cov-report=term-missing

# Fail build if coverage is too low
RUN coverage report --fail-under=80

Theory: Making tests part of the build prevents deploying broken code. The build fails if tests fail. This enforces quality standards in the build pipeline.

Common Patterns

Django Static Files

FROM base AS production

COPY . /app/
RUN python manage.py collectstatic --noinput

# Static files now in /app/staticfiles/

Theory: collectstatic gathers static files from all apps into one directory. In production, these are served by:

  1. Nginx/Apache (direct file serving)
  2. S3/CloudFront (CDN)
  3. WhiteNoise (Django middleware)

Running collectstatic during build ensures static files are always up-to-date with the code.
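
If WhiteNoise is used, the settings change is small; a minimal sketch, assuming the whitenoise package is included in production requirements:

# settings/production.py (sketch)
MIDDLEWARE = [
    'django.middleware.security.SecurityMiddleware',
    'whitenoise.middleware.WhiteNoiseMiddleware',  # immediately after SecurityMiddleware
    # ... remaining middleware
]

STATIC_ROOT = '/app/staticfiles'  # matches the collectstatic output above

STORAGES = {
    'staticfiles': {
        'BACKEND': 'whitenoise.storage.CompressedManifestStaticFilesStorage',
    },
}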

Multiple Python Versions

ARG PYTHON_VERSION=3.13.5
FROM python:${PYTHON_VERSION}-slim AS base

Build for multiple versions:

docker build --build-arg PYTHON_VERSION=3.12.7 -t myapp:py312 .
docker build --build-arg PYTHON_VERSION=3.13.5 -t myapp:py313 .

Theory: Build args parameterize the Dockerfile. One Dockerfile supports multiple configurations. This is especially useful for testing compatibility across Python versions.

Database-Aware Builds

FROM base AS production

# Install database-specific drivers
ARG DATABASE_TYPE=mysql
RUN apt-get update && \
    if [ "$DATABASE_TYPE" = "mysql" ]; then \
        apt-get install -y --no-install-recommends default-libmysqlclient-dev; \
    elif [ "$DATABASE_TYPE" = "postgresql" ]; then \
        apt-get install -y --no-install-recommends libpq-dev; \
    fi && \
    rm -rf /var/lib/apt/lists/*

Theory: Different databases require different system packages. Conditional installation keeps images lean (no unnecessary drivers).
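
Selecting the driver at build time then looks like this (tags illustrative):

docker build --build-arg DATABASE_TYPE=postgresql -t myapp:pg .
docker build --build-arg DATABASE_TYPE=mysql -t myapp:mysql .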

Platform Considerations

Multi-Architecture Builds

docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --tag myapp:latest \
  --push \
  .

Theory: Different deployment targets use different architectures:

  • linux/amd64: Traditional servers, Intel/AMD CPUs
  • linux/arm64: ARM servers (AWS Graviton, Apple Silicon)

Multi-architecture builds create separate images for each platform, stored under the same tag. Docker automatically pulls the correct architecture.
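
Multi-platform builds need a builder backed by the docker-container driver; a one-time setup (builder name illustrative):

# Create and select a builder capable of multi-platform builds
docker buildx create --name multiarch --driver docker-container --use
docker buildx inspect --bootstrap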

Development on Apple Silicon

# Force AMD64 for consistency
FROM --platform=linux/amd64 python:3.13.5-slim

Theory: Apple Silicon Macs use ARM64. If your production environment uses AMD64, specify the platform explicitly to avoid subtle bugs from architecture differences.

Docker Compose for Development

Development Compose File

version: '3.8'

services:
  app:
    build:
      context: .
      dockerfile: docker/Dockerfile.app
      target: development
      args:
        PYTHON_VERSION: "3.13.5"
    command: python manage.py runserver 0.0.0.0:8000
    volumes:
      - .:/app
    ports:
      - "8000:8000"
    environment:
      - DJANGO_SETTINGS_MODULE=myapp.settings.development
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started

  db:
    image: mysql:8.0
    environment:
      MYSQL_ROOT_PASSWORD: root
      MYSQL_DATABASE: myapp
    volumes:
      - db-data:/var/lib/mysql
    healthcheck:
      test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
      interval: 10s
      timeout: 5s
      retries: 5

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

volumes:
  db-data:

Theory: Docker Compose orchestrates multiple services. The app service builds from the development stage, enabling live code reloading via bind mounts. Service dependencies ensure proper startup order.
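
Typical day-to-day usage:

# Build images and start all services
docker compose up --build

# Run one-off commands in the app container
docker compose run --rm app python manage.py migrate
docker compose exec app pytest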

Troubleshooting

Large Images

Symptom: Image is > 1GB

Diagnosis:

# Analyze layer sizes
docker history myapp:latest --no-trunc

# Dive into image
dive myapp:latest

Solutions:

  1. Use slim base images
  2. Combine RUN commands
  3. Clean up in the same layer
  4. Use multi-stage builds
  5. Review installed packages

Slow Builds

Symptom: Builds take > 5 minutes

Diagnosis:

# Profile build
docker build --progress=plain -t myapp:latest . 2>&1 | tee build.log

Solutions:

  1. Enable BuildKit
  2. Use cache mounts
  3. Order layers by change frequency
  4. Optimize .dockerignore
  5. Use remote cache in CI

Cache Not Working

Symptom: Layers rebuild unnecessarily

Diagnosis: Check Dockerfile order and COPY commands

Solutions:

  1. Order by change frequency (dependencies before code)
  2. Copy only what's needed at each step
  3. Use specific COPY commands (not COPY . /app/ too early)
  4. Check whether extra files in the build context are changing (tighten .dockerignore); COPY caching keys on file content, not timestamps

Next Steps