Container Building Strategies¶
Container images package your application with all its dependencies into a portable, reproducible unit. Well-designed container builds are fast, secure, and optimized for production deployment.
This guide covers Docker build strategies for Python/Django applications, from multi-stage builds to layer caching optimization.
Philosophy
Container images should be deterministic, minimal, and fast to build. Every layer adds size and build time—include only what's necessary. Cache aggressively, but invalidate correctly.
Multi-Stage Build Fundamentals¶
Why Multi-Stage Builds¶
Traditional single-stage Dockerfiles include everything: build tools, test dependencies, source code, and runtime dependencies. This creates unnecessarily large images.
Single-Stage Problems:

- Production image includes dev dependencies (pytest, black, mypy)
- Build tools remain in the final image (gcc, make, build-essential)
- Image size: 2-3 GB
- Security surface: hundreds of unnecessary packages
- Slow deployments: pulling large images takes time
Multi-Stage Solution:
```dockerfile
# Stage 1: Base with system dependencies
FROM python:3.13-slim AS base
RUN apt-get update && apt-get install -y --no-install-recommends libpq5 \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /app

# Stage 2: Build dependencies
FROM base AS builder
RUN apt-get update && apt-get install -y --no-install-recommends build-essential libpq-dev
COPY requirements.txt .
RUN pip install --user -r requirements.txt

# Stage 3: Production
FROM base AS production
COPY --from=builder /root/.local /root/.local
COPY . .
ENV PATH=/root/.local/bin:$PATH
CMD ["gunicorn", "myproject.wsgi:application"]
```
Multi-Stage Benefits:

- Production image: ~400 MB (vs 2+ GB)
- Only runtime dependencies included
- Build tools discarded automatically
- Each stage optimized for its purpose
- Development and production built from the same Dockerfile
Multi-Stage Architecture Pattern¶
A complete multi-stage Dockerfile for Django applications:
```dockerfile
# syntax=docker/dockerfile:1.4
ARG PYTHON_VERSION=3.13

# ============================================================================
# BASE STAGE: System dependencies shared across all stages
# ============================================================================
FROM python:${PYTHON_VERSION}-slim AS base

ENV PATH="/root/.local/bin:$PATH" \
    PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PIP_NO_CACHE_DIR=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=1

WORKDIR /app

# Install runtime system dependencies.
# Cache mounts speed up apt operations; package lists live in the cache
# mount, so there is no need to rm -rf /var/lib/apt/lists afterwards.
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install --no-install-recommends -y \
        ca-certificates \
        curl \
        libpq5 \
        libxml2 \
        libxmlsec1-openssl

# ============================================================================
# BUILDER STAGE: Compile dependencies and build artifacts
# ============================================================================
FROM base AS builder

# Install build dependencies (plus Node.js/npm for the asset build below)
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install --no-install-recommends -y \
        build-essential \
        gcc \
        g++ \
        libpq-dev \
        libxml2-dev \
        libxmlsec1-dev \
        nodejs \
        npm \
        pkg-config

# Copy only requirements first (better caching)
COPY requirements/requirements-production.txt .

# Install Python dependencies to the user directory
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install --user -r requirements-production.txt

# Build static assets
COPY package.json package-lock.json ./
RUN --mount=type=cache,target=/root/.npm \
    npm ci --omit=dev

COPY . .
RUN npm run build

# ============================================================================
# PRODUCTION STAGE: Minimal runtime image
# ============================================================================
FROM base AS production

# Copy Python packages from builder
COPY --from=builder /root/.local /root/.local

# Copy application code
COPY --chown=nobody:nogroup . .

# Copy built static assets
COPY --from=builder /app/static/dist /app/static/dist

# Run as non-root user
USER nobody

EXPOSE 8000
CMD ["gunicorn", "--bind", "0.0.0.0:8000", "--workers", "4", "myproject.wsgi:application"]

# ============================================================================
# DEVELOPMENT STAGE: Development tools and hot reloading
# ============================================================================
FROM base AS development

# Install all system dependencies (including dev tools)
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install --no-install-recommends -y \
        build-essential \
        gcc \
        git \
        libpq-dev \
        postgresql-client \
        vim

# Install all Python dependencies (including dev/test)
COPY requirements/requirements-dev.txt .
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements-dev.txt

# Copy application code
COPY . .

# Run the Django development server
CMD ["python", "manage.py", "runserver", "0.0.0.0:8000"]

# ============================================================================
# TESTING STAGE: Test execution environment
# ============================================================================
FROM development AS testing

# Install additional test dependencies
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install pytest-xdist pytest-cov playwright

# Install Playwright browsers
RUN playwright install --with-deps chromium

# Create test artifacts directory
RUN mkdir -p /app/test-results

# Run tests by default
CMD ["pytest", "--cov=myproject", "--cov-report=term-missing", "-v"]
```
Stage Purposes:
| Stage | Purpose | Image Size | Use Case |
|---|---|---|---|
| `base` | Shared foundation | ~200 MB | Never used directly |
| `builder` | Compile dependencies | ~1.5 GB | Build-time only |
| `production` | Serve traffic | ~400 MB | ECS deployment |
| `development` | Local development | ~1.2 GB | Devcontainer |
| `testing` | Run tests | ~1.8 GB | CI pipeline |
Build Optimization Techniques¶
Layer Caching Strategy¶
Docker caches each layer. When a layer changes, all subsequent layers rebuild.
Optimization: Order from least to most frequently changed
```dockerfile
# ❌ BAD: Code changes invalidate dependency installation
FROM python:3.13-slim
COPY . /app
RUN pip install -r /app/requirements.txt
CMD ["python", "app.py"]
```

```dockerfile
# ✅ GOOD: Dependencies cached separately from code
FROM python:3.13-slim
COPY requirements.txt /app/
RUN pip install -r /app/requirements.txt
COPY . /app
CMD ["python", "app.py"]
```
Layer Ordering Best Practices:

1. Base image and system packages (changes rarely)
2. Package manager configuration (changes rarely)
3. Application dependencies (changes occasionally)
4. Application code (changes frequently)
5. Configuration files (changes very frequently)
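The ordering above only helps if the final `COPY . .` does not drag in files that churn on every build but are never needed in the image. A `.dockerignore` file keeps them out of the build context; the entries below are a sketch for a typical Django project, not a definitive list:

```text
# .dockerignore (illustrative)
.git
.venv
__pycache__/
*.pyc
node_modules/
.pytest_cache/
.mypy_cache/
*.md
.env
```

Excluding `.env` has a security benefit too: local secrets never end up baked into an image layer.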
Practical Example:
```dockerfile
FROM python:3.13-slim
WORKDIR /app

# System packages - rarely change
RUN apt-get update && apt-get install -y --no-install-recommends libpq5

# Requirements files - change occasionally
COPY requirements/requirements-production.txt .
RUN pip install -r requirements-production.txt

# Application code - changes frequently
COPY myproject/ myproject/
COPY manage.py .

# Configuration - changes very frequently
COPY config/ config/
```
BuildKit Features¶
Docker BuildKit enables advanced build features:
Enable BuildKit:
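BuildKit has been the default builder since Docker Engine 23.0; on older versions, enable it with an environment variable:

```shell
# Enable BuildKit for the classic `docker build` command
export DOCKER_BUILDKIT=1

# Or enable it for a single invocation:
# DOCKER_BUILDKIT=1 docker build .
```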
BuildKit Syntax in Dockerfile:
```dockerfile
# syntax=docker/dockerfile:1.4
FROM python:3.13-slim

# Mount caches (BuildKit only)
RUN --mount=type=cache,target=/var/cache/apt \
    --mount=type=cache,target=/var/lib/apt \
    apt-get update && apt-get install -y libpq5

# Mount pip cache
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

# Mount secrets (not persisted in the image)
RUN --mount=type=secret,id=npmrc,target=/root/.npmrc \
    npm install
```
BuildKit Cache Mounts:
```dockerfile
# Persistent apt cache across builds
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install -y package

# Persistent pip cache
RUN --mount=type=cache,target=/root/.cache/pip \
    pip install -r requirements.txt

# Persistent npm cache
RUN --mount=type=cache,target=/root/.npm \
    npm ci
```
Cache Benefits:

- Caches persist across builds
- Multiple builds can share the same cache
- Dramatically faster package installation
- No `rm -rf` cleanup commands needed in `RUN` steps
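One caveat: Debian-based images ship an apt hook (`/etc/apt/apt.conf.d/docker-clean`) that deletes downloaded packages after install, which defeats the cache mount. The BuildKit documentation suggests disabling it before using apt cache mounts:

```dockerfile
# Keep downloaded .deb files so the cache mount actually caches them
RUN rm -f /etc/apt/apt.conf.d/docker-clean && \
    echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' \
        > /etc/apt/apt.conf.d/keep-cache

RUN --mount=type=cache,target=/var/cache/apt,sharing=locked \
    --mount=type=cache,target=/var/lib/apt,sharing=locked \
    apt-get update && apt-get install -y --no-install-recommends libpq5
```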
Dependency Compilation Optimization¶
Python packages with C extensions must be compiled during installation:
Inefficient Approach:
```dockerfile
FROM python:3.13-slim

# Install build tools
RUN apt-get update && apt-get install -y \
    gcc g++ build-essential libpq-dev

# Install packages (compiles psycopg2, lxml, etc.)
COPY requirements.txt .
RUN pip install -r requirements.txt

# Build tools remain in the final image
```
Optimized Multi-Stage Approach:
```dockerfile
# Builder: Compile dependencies
FROM python:3.13-slim AS builder
RUN apt-get update && apt-get install -y \
    gcc g++ libpq-dev libxml2-dev

# Install to the user directory
COPY requirements.txt .
RUN pip install --user -r requirements.txt

# Production: Copy compiled packages
FROM python:3.13-slim

# Only runtime libraries needed
RUN apt-get update && apt-get install -y \
    libpq5 libxml2

# Copy pre-compiled Python packages
COPY --from=builder /root/.local /root/.local
ENV PATH="/root/.local/bin:$PATH"
```
Size Comparison:

- With build tools: 1.8 GB
- Without build tools: 450 MB
- Savings: 75%
Using uv for Faster Builds¶
uv is a blazing-fast Python package installer and resolver:
```dockerfile
FROM python:3.13-slim AS base

# Install uv
ADD --chmod=755 https://astral.sh/uv/install.sh /install.sh
RUN /install.sh && rm /install.sh
ENV PATH="/root/.local/bin:$PATH"

# Generate a lockfile (run in your project, not in the Dockerfile):
#   uv pip compile requirements.in -o requirements.txt

# Install with uv (10-100x faster than pip)
COPY requirements.txt .
RUN --mount=type=cache,target=/root/.cache/uv \
    uv pip install --system -r requirements.txt
```
uv Benefits:

- 10-100x faster than pip
- Better dependency resolution
- Automatic caching
- Compatible with pip requirements files
ECR Push Strategies¶
Tagging Conventions¶
Image tags identify specific versions:
Tagging Strategies:
```bash
# Commit SHA (immutable, traceable)
docker tag myapp:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:abc123def

# Branch name (mutable)
docker tag myapp:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:main

# Semantic version (release tags)
docker tag myapp:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:v1.2.3

# Latest (convenience pointer)
docker tag myapp:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:latest

# Environment-specific
docker tag myapp:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:staging-abc123
```
Best Practice: Multiple Tags for One Image
```bash
# Build once
docker build -t myapp:build .

# Tag with multiple identifiers
docker tag myapp:build $ECR_REGISTRY/myapp:${COMMIT_SHA}
docker tag myapp:build $ECR_REGISTRY/myapp:${BRANCH_NAME}
docker tag myapp:build $ECR_REGISTRY/myapp:latest

# Push all tags
docker push $ECR_REGISTRY/myapp:${COMMIT_SHA}
docker push $ECR_REGISTRY/myapp:${BRANCH_NAME}
docker push $ECR_REGISTRY/myapp:latest
```
Tag Selection for Deployment:
- Development: Use branch tags (main, develop)
- Staging: Use commit SHA tags (immutable, traceable)
- Production: Use commit SHA tags (never use latest)
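Branch tags come with one caveat: git branch names may contain characters (such as `/`) that are invalid in image tags. A small helper can normalize them; `sanitize_tag` is an illustrative name, not an established convention:

```shell
# Normalize a git branch name into a valid Docker tag.
# Tags may contain only [A-Za-z0-9_.-], must not start with '.' or '-',
# and are limited to 128 characters.
sanitize_tag() {
  local tag="$1"
  tag="${tag//[^a-zA-Z0-9_.-]/-}"   # replace invalid characters with '-'
  tag="${tag#"${tag%%[!.-]*}"}"     # strip any leading '.' or '-' run
  printf '%s' "${tag:0:128}"
}

# Example: sanitize_tag "feature/add-login" -> feature-add-login
```

Usage would look like `BRANCH_TAG=$(sanitize_tag "$(git rev-parse --abbrev-ref HEAD)")` before tagging.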
ECR Authentication¶
Authenticate Docker to ECR:
```bash
# Get ECR login password and pipe it to docker login
aws ecr get-login-password --region us-east-1 | \
    docker login --username AWS --password-stdin \
    123456789012.dkr.ecr.us-east-1.amazonaws.com
```
ECR Authentication in Scripts:
```bash
#!/bin/bash
set -e

# Variables
AWS_REGION="us-east-1"
ECR_REGISTRY="123456789012.dkr.ecr.us-east-1.amazonaws.com"
IMAGE_NAME="myapp"
COMMIT_SHA=$(git rev-parse --short HEAD)

# Authenticate
echo "Logging into ECR..."
aws ecr get-login-password --region "$AWS_REGION" | \
    docker login --username AWS --password-stdin "$ECR_REGISTRY"

# Build
echo "Building image..."
docker build \
    --tag "${IMAGE_NAME}:${COMMIT_SHA}" \
    --tag "${IMAGE_NAME}:latest" \
    .

# Tag for ECR
echo "Tagging for ECR..."
docker tag "${IMAGE_NAME}:${COMMIT_SHA}" "${ECR_REGISTRY}/${IMAGE_NAME}:${COMMIT_SHA}"
docker tag "${IMAGE_NAME}:latest" "${ECR_REGISTRY}/${IMAGE_NAME}:latest"

# Push
echo "Pushing to ECR..."
docker push "${ECR_REGISTRY}/${IMAGE_NAME}:${COMMIT_SHA}"
docker push "${ECR_REGISTRY}/${IMAGE_NAME}:latest"

echo "Build complete: ${ECR_REGISTRY}/${IMAGE_NAME}:${COMMIT_SHA}"
```
Authentication Token Expiration:

- ECR tokens expire after 12 hours
- Re-authenticate before long build processes
- GitHub Actions automatically re-authenticates per run
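Because a token can expire mid-script, a wrapper that re-authenticates once on failure and retries is a useful pattern. `push_with_reauth` is an illustrative helper; the real re-auth command would be the `aws ecr get-login-password | docker login` pipeline shown above:

```shell
# Run a command; if it fails (e.g. due to an expired ECR token),
# run the re-auth command (first argument) once and retry.
push_with_reauth() {
  local reauth="$1"; shift
  "$@" && return 0
  echo "Command failed; re-authenticating and retrying..." >&2
  "$reauth" && "$@"
}

# Usage (illustrative):
# ecr_login() {
#   aws ecr get-login-password --region "$AWS_REGION" |
#     docker login --username AWS --password-stdin "$ECR_REGISTRY"
# }
# push_with_reauth ecr_login docker push "$ECR_REGISTRY/myapp:$COMMIT_SHA"
```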
Parallel Push Strategy¶
Push multiple images simultaneously:
```bash
# Build multiple images
docker build --target production -t app:prod .
docker build --target worker -t worker:prod .
docker build --target scheduler -t scheduler:prod .

# Tag all images
for image in app worker scheduler; do
    docker tag ${image}:prod $ECR_REGISTRY/${image}:$COMMIT_SHA
done

# Push in parallel (bash background jobs)
docker push $ECR_REGISTRY/app:$COMMIT_SHA &
docker push $ECR_REGISTRY/worker:$COMMIT_SHA &
docker push $ECR_REGISTRY/scheduler:$COMMIT_SHA &

# Wait for all pushes to complete
wait
```
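One caveat with this approach: a bare `wait` with no arguments returns 0 even when an individual push failed, so a broken push can go unnoticed in CI. Waiting on each PID individually surfaces failures; `wait_all` is an illustrative helper, not from the source:

```shell
# Wait on each background job individually so a failed job
# fails the script (a bare `wait` always returns 0).
wait_all() {
  local status=0 pid
  for pid in "$@"; do
    wait "$pid" || status=1
  done
  return "$status"
}

# Usage (illustrative):
# pids=()
# for image in app worker scheduler; do
#   docker push "$ECR_REGISTRY/${image}:$COMMIT_SHA" & pids+=($!)
# done
# wait_all "${pids[@]}" || exit 1
```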
Build Caching¶
GitHub Actions Cache¶
Cache Build Layers:
```yaml
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v3

- name: Build with cache
  uses: docker/build-push-action@v5
  with:
    context: .
    file: ./Dockerfile
    push: true
    tags: ${{ env.ECR_REGISTRY }}/myapp:${{ github.sha }}
    cache-from: type=gha,scope=build-prod
    cache-to: type=gha,mode=max,scope=build-prod
```
How GitHub Actions Cache Works:
1. First Build:
    - Builds all layers normally
    - Uploads layers to the GitHub cache
    - Cache key: `scope=build-prod`
2. Subsequent Builds:
    - Downloads cached layers from GitHub
    - Only rebuilds changed layers
    - Build time: 10 minutes → 2 minutes
Cache Scope:

- `scope`: Namespace for the cache
- Use different scopes for different images
- Example: `build-prod`, `build-worker`, `build-test`

Cache Modes:

- `mode=max`: Cache all layers (slower upload, faster builds)
- `mode=min`: Cache only final layers (faster upload, slower builds)
GitHub Cache Limits:

- 10 GB per repository
- The oldest caches are evicted automatically
- Caches unused for 7 days are deleted
Registry Cache Strategy¶
Cache in ECR:
```yaml
- name: Build with ECR cache
  uses: docker/build-push-action@v5
  with:
    context: .
    push: true
    tags: ${{ env.ECR_REGISTRY }}/myapp:${{ github.sha }}
    cache-from: |
      type=registry,ref=${{ env.ECR_REGISTRY }}/myapp:buildcache
      type=registry,ref=${{ env.ECR_REGISTRY }}/myapp:latest
    cache-to: type=registry,ref=${{ env.ECR_REGISTRY }}/myapp:buildcache,mode=max
```
Registry Cache Benefits:

- No size limits (unlike the GitHub cache)
- Shared across workflows and repositories
- Persists indefinitely
- Closer to the build environment (faster downloads)

Registry Cache Disadvantages:

- Must authenticate to ECR
- Counts toward ECR storage costs
- Requires a separate repository or tag
Local Build Cache¶
Docker BuildKit Local Cache:
```bash
# Build with cache export
docker buildx build \
    --cache-from type=local,src=/tmp/buildx-cache \
    --cache-to type=local,dest=/tmp/buildx-cache,mode=max \
    --tag myapp:latest \
    .

# The cache persists in /tmp/buildx-cache;
# the next build reuses cached layers.
```
Useful for:

- Local development builds
- Self-hosted CI runners
- Offline builds
Security Scanning¶
Scanning During Build¶
Scan for vulnerabilities before pushing:
```yaml
- name: Build image
  uses: docker/build-push-action@v5
  with:
    context: .
    load: true
    tags: myapp:scan

- name: Scan for vulnerabilities
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: myapp:scan
    format: sarif
    output: trivy-results.sarif
    severity: CRITICAL,HIGH

- name: Upload scan results
  uses: github/codeql-action/upload-sarif@v2
  with:
    sarif_file: trivy-results.sarif

# Run the action again with exit-code: '1' so critical findings fail the job
# (the trivy binary itself is not on the runner's PATH)
- name: Fail on critical vulnerabilities
  uses: aquasecurity/trivy-action@master
  with:
    image-ref: myapp:scan
    exit-code: '1'
    severity: CRITICAL
```
Vulnerability Scanning Tools:

- Trivy: Fast, comprehensive, supports many output formats
- Grype: Anchore's vulnerability scanner
- Snyk: Commercial, with a free tier
- ECR Scanning: AWS-native scanning (Clair or Inspector)
Base Image Selection¶
Choose minimal, maintained base images:
Official Python Images:
```dockerfile
# Full image (Debian-based): ~1 GB
FROM python:3.13

# Slim image (Debian with minimal packages): ~200 MB
FROM python:3.13-slim

# Alpine image (Alpine Linux): ~50 MB (⚠️ compatibility issues)
FROM python:3.13-alpine
```
Recommendations:
| Image | Size | Use Case | Notes |
|---|---|---|---|
| `python:3.13` | ~1 GB | Never | Too large |
| `python:3.13-slim` | ~200 MB | Production | Recommended |
| `python:3.13-alpine` | ~50 MB | Constrained environments | musl libc issues |
| `ubuntu:22.04` | ~80 MB | Custom builds | More control |
Alpine Considerations:

- Uses musl libc instead of glibc
- Prebuilt binary wheels may not work (psycopg2, numpy)
- Longer build times (everything compiles from source)
- Use only if image size is critical
Dependency Scanning¶
Scan Python dependencies for vulnerabilities:
```dockerfile
FROM python:3.13-slim AS security-scan
COPY requirements.txt .
RUN pip install pip-audit

# Fail the build if vulnerabilities are found
RUN pip-audit -r requirements.txt --strict
```
In CI Pipeline:
```yaml
- name: Install dependencies
  run: pip install -r requirements.txt

- name: Scan dependencies
  run: |
    pip install pip-audit
    pip-audit --strict
```
Tools:
- pip-audit: Official PyPA tool
- safety: Commercial with free tier
- snyk: Comprehensive security scanning
Build Performance Benchmarks¶
Measuring Build Performance¶
Time Each Build Stage:
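BuildKit's plain progress mode prints per-step timings, and `time` gives an overall wall-clock baseline (standard Docker CLI flags):

```shell
# Per-step timings in the build log
DOCKER_BUILDKIT=1 docker build --progress=plain .

# Overall wall-clock baseline (bypass the cache to measure a cold build)
time docker build --no-cache .
```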
Docker Build Output:
```text
[+] Building 124.5s (15/15) FINISHED
 => [base 1/3] FROM python:3.13-slim           5.2s
 => [base 2/3] RUN apt-get update             12.3s
 => [builder 1/5] RUN apt-get install build    8.7s
 => [builder 2/5] COPY requirements.txt        0.1s
 => [builder 3/5] RUN pip install             95.4s
 => [production 1/2] COPY --from=builder       0.3s
 => [production 2/2] COPY . /app               2.5s
```
Identify Bottlenecks:

- The longest steps are optimization candidates
- Focus on steps that run on every build
- Cache frequently accessed dependencies
Optimization Results¶
Example Optimization Journey:
| Optimization | Build Time | Image Size | Notes |
|---|---|---|---|
| Initial single-stage | 12m 30s | 2.1 GB | Baseline |
| Multi-stage build | 11m 45s | 580 MB | Remove dev deps |
| Layer optimization | 8m 20s | 580 MB | Better cache hits |
| BuildKit cache mounts | 2m 15s | 580 MB | Persistent caches |
| uv instead of pip | 1m 30s | 580 MB | Faster installs |
| Parallel stages | 1m 10s | 580 MB | Build stages in parallel |
Incremental Build (cached):

- Initial: 12m 30s
- Optimized: 1m 10s
- Code-only change: 25s
Justfile Integration¶
Build Commands¶
```just
# justfile

# Variables
ECR_REGISTRY := "123456789012.dkr.ecr.us-east-1.amazonaws.com"
AWS_REGION := "us-east-1"
IMAGE_NAME := "myapp"

# Authenticate to ECR
ecr-login:
    @echo "🔐 Logging into ECR..."
    aws ecr get-login-password --region {{AWS_REGION}} | \
        docker login --username AWS --password-stdin {{ECR_REGISTRY}}

# Build production image (tag latest for ECR too, so `push` can push it)
build-production: ecr-login
    @echo "🏗️ Building production image..."
    DOCKER_BUILDKIT=1 docker build \
        --target production \
        --tag {{IMAGE_NAME}}:latest \
        --tag {{ECR_REGISTRY}}/{{IMAGE_NAME}}:$(git rev-parse --short HEAD) \
        --tag {{ECR_REGISTRY}}/{{IMAGE_NAME}}:latest \
        --cache-from {{ECR_REGISTRY}}/{{IMAGE_NAME}}:latest \
        .

# Build development image
build-dev:
    @echo "🏗️ Building development image..."
    DOCKER_BUILDKIT=1 docker build \
        --target development \
        --tag {{IMAGE_NAME}}:dev \
        .

# Build and push to ECR
push: build-production
    @echo "⬆️ Pushing to ECR..."
    docker push {{ECR_REGISTRY}}/{{IMAGE_NAME}}:$(git rev-parse --short HEAD)
    docker push {{ECR_REGISTRY}}/{{IMAGE_NAME}}:latest
    @echo "✅ Pushed {{ECR_REGISTRY}}/{{IMAGE_NAME}}:$(git rev-parse --short HEAD)"

# Build testing image
build-test:
    @echo "🏗️ Building test image..."
    DOCKER_BUILDKIT=1 docker build \
        --target testing \
        --tag {{IMAGE_NAME}}:test \
        .

# Run tests in container
test: build-test
    docker run --rm {{IMAGE_NAME}}:test

# Clean up Docker artifacts
clean:
    @echo "🧹 Cleaning up Docker artifacts..."
    docker system prune -f
    docker volume prune -f
```
Usage:
```bash
# Build production image
just build-production

# Build and push
just push

# Run tests
just test

# Clean up
just clean
```
Next Steps¶
After optimizing container builds:
- ECS Deployment: Deploy optimized containers to AWS ECS
- GitHub Actions: Automate builds in CI/CD pipeline
- Monitoring: Monitor container performance in production
- Security: Implement comprehensive security practices
Measure First, Optimize Second
Always measure build times before optimizing, and focus on the slowest steps first. Use `time docker build .` to establish baselines.
Related Resources¶
Internal Documentation:

- CI/CD Overview: Pipeline architecture and philosophy
- Docker Development: Local Docker development
- Devcontainers: VS Code devcontainer setup
- Security Best Practices: Secure development

External Resources:

- Docker Multi-Stage Builds
- BuildKit Documentation
- ECR User Guide
- uv Documentation
- Trivy Scanner