Logging Best Practices¶
Logging is the primary mechanism for observing application behavior in production. Well-designed logging provides insights into system health, user behavior, error patterns, and performance characteristics without requiring active debugging sessions.
This guide covers modern logging practices for Django applications running on AWS infrastructure with CloudWatch, structured logging, and Sentry integration.
Philosophy
Logs are event streams that tell the story of your application. They should be structured, contextual, and actionable. Every log entry should answer: What happened? When? Where? To whom? Why does it matter?
Logging Fundamentals¶
Log Levels¶
Python's logging framework provides five standard log levels, each serving a distinct purpose:
import logging
logger = logging.getLogger(__name__)
# DEBUG: Detailed diagnostic information
logger.debug("Processing user query with parameters: %s", params)
# INFO: Confirmation that things are working as expected
logger.info("User %s logged in successfully", user.email)
# WARNING: Something unexpected happened, but the application continues
logger.warning("API rate limit at 80%% for tenant %s", tenant_id)
# ERROR: An error occurred, but the application can continue
logger.error("Failed to send email to %s: %s", recipient, error)
# CRITICAL: A serious error, the application may not continue
logger.critical("Database connection pool exhausted")
When to use each level:
| Level | When to Use | Example | Production Logging |
|---|---|---|---|
| DEBUG | Development diagnostics | "SQL query took 0.032s" | Disabled |
| INFO | Normal operations | "User login successful" | Enabled |
| WARNING | Degraded state | "Cache miss, falling back to database" | Enabled |
| ERROR | Recoverable failures | "API request failed, will retry" | Enabled |
| CRITICAL | System instability | "Cannot connect to database" | Enabled + Alert |
Structured Logging¶
Structured logging outputs logs as JSON objects rather than plain text strings. This enables machine parsing, filtering, and analysis.
# ❌ Unstructured: Hard to parse and query
logger.info(f"User {user.email} created order {order.id} for ${order.total}")
# ✅ Structured: Easy to query and analyze
logger.info(
"Order created",
extra={
"event_type": "order_created",
"user_id": user.id,
"user_email": user.email,
"order_id": order.id,
"order_total": float(order.total),
"order_currency": "USD",
"tenant_id": tenant.id
}
)
Benefits of structured logging:
- Queryable: Filter logs by specific fields
- Aggregatable: Calculate metrics from log data
- Correlatable: Link related log entries via common IDs
- Machine-readable: Automated monitoring and alerting
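With the JSON formatter configured below, the "Order created" call above is emitted as a single machine-parseable line per event. The output looks roughly like this (illustrative values):
{"asctime": "2024-01-15 12:00:00,000", "name": "app.views.orders", "levelname": "INFO", "message": "Order created", "event_type": "order_created", "user_id": 42, "order_id": 1001, "order_total": 99.5, "order_currency": "USD", "tenant_id": 7}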
Logger Naming¶
Use module-based logger naming for clear log sources:
# Get logger for the current module
logger = logging.getLogger(__name__)
# Results in hierarchical logger names:
# - app.views.api.users
# - app.services.email
# - app.integrations.stripe
Logger hierarchy benefits:
graph TB
A[root logger] --> B[app]
B --> C[app.views]
B --> D[app.services]
C --> E[app.views.api]
C --> F[app.views.admin]
D --> G[app.services.email]
D --> H[app.services.payments]
style A fill:#e1f5ff
style B fill:#e8f5e9
style C fill:#fff9e1
style D fill:#fff9e1
Configure logging levels hierarchically:
LOGGING = {
"loggers": {
"": {"level": "INFO"}, # Root: INFO and above
"app.integrations": {"level": "DEBUG"}, # Integration debugging
"django.db.backends": {"level": "WARNING"} # Reduce database noise
}
}
Django Logging Configuration¶
Basic Configuration¶
Django uses Python's standard logging configuration format:
# settings/base.py
LOGGING = {
"version": 1,
"disable_existing_loggers": False,
"formatters": {
"json": {
"()": "pythonjsonlogger.jsonlogger.JsonFormatter",
"format": "%(asctime)s %(name)s %(levelname)s %(message)s %(pathname)s %(lineno)d"
},
"console": {
"format": "%(levelname)s %(asctime)s %(name)s %(message)s"
}
},
"handlers": {
"console": {
"class": "logging.StreamHandler",
"stream": "ext://sys.stdout",
"formatter": "json"
}
},
"root": {
"handlers": ["console"],
"level": "INFO"
},
"loggers": {
"django": {
"handlers": ["console"],
"level": "INFO",
"propagate": False
},
"django.request": {
"handlers": ["console"],
"level": "ERROR",
"propagate": False
}
}
}
Environment-Specific Configuration¶
Configure logging differently per environment:
# settings/development.py
LOGGING["formatters"]["console"]["format"] = "%(levelname)s %(asctime)s %(name)s %(message)s"
LOGGING["handlers"]["console"]["formatter"] = "console"
LOGGING["root"]["level"] = "DEBUG"
# settings/production.py
LOGGING["handlers"]["console"]["formatter"] = "json"
LOGGING["root"]["level"] = "INFO"
Configuration differences:
graph LR
A[Base Config] --> B[Development]
A --> C[Production]
B --> D[Human-readable format]
B --> E[DEBUG level]
B --> F[Colorized output]
C --> G[JSON format]
C --> H[INFO level]
C --> I[CloudWatch destination]
style B fill:#e8f5e9
style C fill:#ffebee
Custom Log Handlers¶
Create custom handlers for specific use cases:
# commons/logging/custom_handler.py
import logging
import os

class EnvironmentAwareHandler(logging.Handler):
    """Route logs differently based on environment."""

    def emit(self, record):
        if os.getenv("IN_AWS_FARGATE"):
            # On Fargate, CloudWatch captures stdout
            print(self.format(record))
        else:
            # Local file for development (create the directory on first use)
            os.makedirs("logs", exist_ok=True)
            with open(f"logs/{record.name}.log", "a") as f:
                f.write(self.format(record) + "\n")
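Registering the custom handler follows the same pattern as the built-in ones — a sketch, assuming the module path shown above:
LOGGING["handlers"]["env_aware"] = {
    "class": "commons.logging.custom_handler.EnvironmentAwareHandler",
    "formatter": "json",
}
LOGGING["root"]["handlers"].append("env_aware")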
CloudWatch Integration¶
ECS CloudWatch Configuration¶
AWS ECS automatically captures stdout/stderr and sends to CloudWatch:
{
"containerDefinitions": [{
"name": "web",
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/your-app",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "web"
}
}
}]
}
Log stream organization:
CloudWatch Log Groups
└── /ecs/your-app
├── web/task-id-1
├── web/task-id-2
├── worker/task-id-1
└── worker/task-id-2
CloudWatch Logs Insights Queries¶
Query structured logs with CloudWatch Logs Insights. Note that the JSON formatter configured above emits the log level as levelname, so queries filter on that field (comments use #):
# Find all errors for a specific user
fields @timestamp, message, error_message
| filter user_id = "12345"
| filter levelname = "ERROR"
| sort @timestamp desc
| limit 100
# Count errors by type
fields @timestamp, error_type
| filter levelname = "ERROR"
| stats count() by error_type
# Calculate API response times
fields @timestamp, duration_ms, endpoint
| filter event_type = "api_request"
| stats avg(duration_ms), max(duration_ms), count() by endpoint
# Find slow database queries
fields @timestamp, query, duration_ms
| filter event_type = "database_query"
| filter duration_ms > 1000
| sort duration_ms desc
# Track user activity
fields @timestamp, user_id, event_type, request_id
| filter user_id = "12345"
| sort @timestamp desc
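These queries can also be run programmatically; a minimal boto3 sketch (the log group name and query string are illustrative):
import time

import boto3

logs = boto3.client("logs")
query = logs.start_query(
    logGroupName="/ecs/your-app",
    startTime=int(time.time()) - 3600,  # last hour
    endTime=int(time.time()),
    queryString='fields @timestamp, message | filter levelname = "ERROR" | limit 20',
)

# Insights queries run asynchronously; poll until the query finishes
results = logs.get_query_results(queryId=query["queryId"])
while results["status"] in ("Scheduled", "Running"):
    time.sleep(1)
    results = logs.get_query_results(queryId=query["queryId"])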
CloudWatch Alarms¶
Create alarms for critical log patterns:
{
"MetricFilters": [{
"FilterName": "ErrorCount",
"FilterPattern": "{ $.level = \"ERROR\" }",
"MetricTransformations": [{
"MetricName": "ApplicationErrors",
"MetricNamespace": "YourApp",
"MetricValue": "1"
}]
}],
"Alarms": [{
"AlarmName": "HighErrorRate",
"MetricName": "ApplicationErrors",
"Threshold": 10,
"Period": 300,
"EvaluationPeriods": 1,
"ComparisonOperator": "GreaterThanThreshold"
}]
}
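If these resources are managed from code rather than raw JSON, boto3's put_metric_filter and put_metric_alarm calls map one-to-one onto the fields above — a sketch:
import boto3

logs = boto3.client("logs")
logs.put_metric_filter(
    logGroupName="/ecs/your-app",
    filterName="ErrorCount",
    filterPattern='{ $.levelname = "ERROR" }',
    metricTransformations=[{
        "metricName": "ApplicationErrors",
        "metricNamespace": "YourApp",
        "metricValue": "1",
    }],
)

cloudwatch = boto3.client("cloudwatch")
cloudwatch.put_metric_alarm(
    AlarmName="HighErrorRate",
    Namespace="YourApp",
    MetricName="ApplicationErrors",
    Statistic="Sum",       # sum of error events per period
    Period=300,
    EvaluationPeriods=1,
    Threshold=10,
    ComparisonOperator="GreaterThanThreshold",
)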
Sentry Integration¶
Sentry Setup¶
Sentry provides error tracking and performance monitoring:
# settings/base.py
import os

import sentry_sdk
from sentry_sdk.integrations.django import DjangoIntegration
from sentry_sdk.integrations.celery import CeleryIntegration

sentry_sdk.init(
    dsn=get_parameter("/app/sentry/dsn"),  # get_parameter: project helper (e.g. SSM lookup)
integrations=[
DjangoIntegration(),
CeleryIntegration(),
],
environment=os.getenv("ENVIRONMENT_NAME", "development"),
release=os.getenv("GIT_SHA"),
traces_sample_rate=0.1, # 10% of transactions
profiles_sample_rate=0.1,
send_default_pii=False, # Don't send PII
)
Sentry Logging Integration¶
Connect Python logging to Sentry:
LOGGING["handlers"]["sentry"] = {
"level": "ERROR",
"class": "sentry_sdk.integrations.logging.EventHandler",
}
LOGGING["root"]["handlers"].append("sentry")
Logging flow:
graph LR
A[Application Logs] --> B{Log Level}
B -->|DEBUG/INFO/WARNING| C[CloudWatch]
B -->|ERROR/CRITICAL| D[CloudWatch + Sentry]
D --> E[Sentry Dashboard]
E --> F[Alert: Slack/Email]
style A fill:#e1f5ff
style C fill:#e8f5e9
style D fill:#ffebee
style E fill:#fff9e1
Error Context¶
Add context to Sentry errors:
from sentry_sdk import capture_exception, set_user, set_tag, set_context
def process_payment(user, amount):
try:
# Set user context
set_user({
"id": user.id,
"email": user.email,
"tenant_id": user.tenant_id
})
# Add tags for filtering
set_tag("payment_gateway", "stripe")
set_tag("payment_type", "subscription")
# Add custom context
set_context("payment", {
"amount": amount,
"currency": "USD"
})
# Process payment
result = stripe_client.charge(amount)
    except Exception as e:
        # Capture explicitly with the accumulated context, then re-raise
        # (the Django integration also reports unhandled exceptions)
        capture_exception(e)
        raise
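Note that set_user, set_tag, and set_context mutate the current scope, so context set here can leak into unrelated events handled later by the same worker. Scoping the context avoids this — a sketch using push_scope from sentry-sdk 1.x (2.x prefers sentry_sdk.new_scope, with the same shape):
import sentry_sdk

def process_payment(user, amount):
    with sentry_sdk.push_scope() as scope:
        # Context set on this scope is discarded when the block exits
        scope.set_tag("payment_gateway", "stripe")
        scope.set_context("payment", {"amount": amount, "currency": "USD"})
        scope.user = {"id": user.id, "email": user.email}
        return stripe_client.charge(amount)  # stripe_client as above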
Context Management¶
Request Context¶
Add request-specific context to all logs during a request:
# middleware/logging_context.py
import logging
import time
import uuid

logger = logging.getLogger(__name__)

class LoggingContextMiddleware:
    """Add request context to all logs."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # Generate a request ID and record the start time
        request.id = str(uuid.uuid4())
        start = time.monotonic()

        logger.info(
            "Request started",
            extra={
                "request_id": request.id,
                "method": request.method,
                "path": request.path,
                "user_id": getattr(request.user, "id", None),
                "ip_address": self.get_client_ip(request)
            }
        )

        response = self.get_response(request)

        logger.info(
            "Request completed",
            extra={
                "request_id": request.id,
                "status_code": response.status_code,
                "duration_ms": int((time.monotonic() - start) * 1000)
            }
        )
        return response

    def get_client_ip(self, request):
        """Extract the client IP, honoring X-Forwarded-For from the load balancer."""
        x_forwarded_for = request.META.get("HTTP_X_FORWARDED_FOR")
        if x_forwarded_for:
            return x_forwarded_for.split(",")[0].strip()
        return request.META.get("REMOTE_ADDR")
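As written, the middleware attaches context only to its own two log calls. To stamp the request ID onto every record emitted while the request is handled, including logs from services and libraries, a contextvars-backed filter works well. A sketch — the module path is hypothetical, and the middleware would call request_id_var.set(request.id) at the top of __call__:
# commons/logging/request_id.py (hypothetical location)
import contextvars
import logging

request_id_var = contextvars.ContextVar("request_id", default=None)

class RequestIDFilter(logging.Filter):
    """Copy the current request ID onto every log record."""

    def filter(self, record):
        record.request_id = request_id_var.get()
        return True

# settings: attach the filter to the console handler
LOGGING.setdefault("filters", {})["request_id"] = {
    "()": "commons.logging.request_id.RequestIDFilter"
}
LOGGING["handlers"]["console"].setdefault("filters", []).append("request_id")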
User Context¶
Include user information in logs:
def get_logger_with_user_context(user):
"""Create logger with user context."""
logger = logging.getLogger(__name__)
# Create adapter with user context
return logging.LoggerAdapter(logger, {
"user_id": user.id,
"user_email": user.email,
"tenant_id": user.tenant_id
})
# Usage
logger = get_logger_with_user_context(request.user)
logger.info("User updated profile") # Automatically includes user context
Tenant Context¶
For multi-tenant applications, include tenant information:
class TenantLoggerAdapter(logging.LoggerAdapter):
"""Add tenant context to all log messages."""
def process(self, msg, kwargs):
"""Add tenant_id to all logs."""
if "extra" not in kwargs:
kwargs["extra"] = {}
kwargs["extra"]["tenant_id"] = self.extra.get("tenant_id")
kwargs["extra"]["account_code"] = self.extra.get("account_code")
return msg, kwargs
# Usage in views
class PlanionPeopleViewSet(viewsets.ModelViewSet):
def get_queryset(self):
account = self.get_account(self.request)
logger = TenantLoggerAdapter(
logging.getLogger(__name__),
{"tenant_id": account.id, "account_code": account.code}
)
logger.info("Fetching people for tenant")
return People.objects.using(f"{account.code.upper()}_RO").all()
What to Log¶
Essential Log Events¶
Application lifecycle:
# Application startup
logger.info(
"Application starting",
extra={
"environment": settings.ENVIRONMENT_NAME,
"version": settings.VERSION,
"python_version": sys.version
}
)
# Configuration loaded
logger.info(
"Configuration loaded",
extra={
"database_host": settings.DATABASES["default"]["HOST"],
"cache_backend": settings.CACHES["default"]["BACKEND"]
}
)
User authentication:
# Successful login
logger.info(
"User login successful",
extra={
"user_id": user.id,
"email": user.email,
"ip_address": request.META.get("REMOTE_ADDR"),
"user_agent": request.META.get("HTTP_USER_AGENT")
}
)
# Failed login
logger.warning(
"Login attempt failed",
extra={
"email": email,
"ip_address": request.META.get("REMOTE_ADDR"),
"failure_reason": "invalid_credentials"
}
)
API requests:
# API request started
logger.info(
"API request",
extra={
"request_id": request.id,
"endpoint": request.path,
"method": request.method,
"user_id": request.user.id
}
)
# API request completed
logger.info(
"API response",
extra={
"request_id": request.id,
"status_code": response.status_code,
"duration_ms": duration,
"response_size_bytes": len(response.content)
}
)
Database operations:
# Slow query warning
logger.warning(
"Slow database query",
extra={
"query": query,
"duration_ms": duration,
"database": "planion_ro",
"threshold_ms": 1000
}
)
# Database connection pool
logger.info(
"Database connection pool status",
extra={
"active_connections": pool.active,
"idle_connections": pool.idle,
"max_connections": pool.max
}
)
Integration events:
# External API call
logger.info(
"External API request",
extra={
"service": "stripe",
"endpoint": "/v1/charges",
"method": "POST"
}
)
# External API failure
logger.error(
"External API request failed",
extra={
"service": "stripe",
"endpoint": "/v1/charges",
"status_code": response.status_code,
"error": response.text,
"will_retry": True
}
)
Business events:
# Order created
logger.info(
"Order created",
extra={
"order_id": order.id,
"user_id": user.id,
"total_amount": float(order.total),
"currency": "USD",
"item_count": order.items.count()
}
)
# Payment processed
logger.info(
"Payment processed",
extra={
"payment_id": payment.id,
"order_id": order.id,
"amount": float(payment.amount),
"payment_method": payment.method,
"transaction_id": payment.transaction_id
}
)
What NOT to Log¶
Never log sensitive information:
# ❌ NEVER log these
logger.info(f"User password: {password}") # Passwords
logger.info(f"Credit card: {card_number}") # Payment information
logger.info(f"SSN: {ssn}") # Personal identifiable information
logger.info(f"API key: {api_key}") # Secrets and credentials
logger.info(f"Session token: {token}") # Authentication tokens
# ✅ Log safely
logger.info(
"User authentication",
extra={
"user_id": user.id,
"auth_method": "password" # Method, not the actual password
}
)
logger.info(
"Payment processed",
extra={
"payment_id": payment.id,
"card_last_four": payment.card_last_four, # Only last 4 digits
"amount": payment.amount
}
)
Avoid excessive logging:
# ❌ Don't log inside tight loops
for item in items: # Could be thousands of items
logger.debug(f"Processing item {item.id}") # Creates log spam
# ✅ Log summary instead
logger.info(
"Batch processing started",
extra={"item_count": len(items)}
)
# ... process items ...
logger.info(
"Batch processing completed",
extra={
"processed": processed_count,
"failed": failed_count,
"duration_ms": duration
}
)
Performance Considerations¶
Log Level Filtering¶
Use log level filtering to reduce overhead:
# ❌ Expensive even when DEBUG is disabled
logger.debug(f"User data: {expensive_serialization(user)}")
# ✅ Only serialize if DEBUG logging is enabled
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("User data: %s", expensive_serialization(user))
# ✅ Or defer the work with an object whose __str__ runs only when the
# record is actually formatted (stdlib logging will not call a bare lambda)
class LazyStr:
    def __init__(self, func):
        self.func = func
    def __str__(self):
        return str(self.func())
logger.debug("User data: %s", LazyStr(lambda: expensive_serialization(user)))
Sampling¶
Sample high-volume logs:
import random
def log_with_sampling(logger, message, extra, sample_rate=0.01):
"""Log only a percentage of messages."""
if random.random() < sample_rate:
logger.info(message, extra=extra)
# Log 1% of cache hits
log_with_sampling(
logger,
"Cache hit",
{"key": key, "ttl": ttl},
sample_rate=0.01
)
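Because random.random() makes an independent decision per record, one request's logs may be partially kept and partially dropped. Keying the decision on a stable identifier keeps or drops all of a request's logs together — a small sketch (request.id from the middleware above):
import hashlib

def should_sample(key: str, sample_rate: float = 0.01) -> bool:
    """Deterministic sampling: the same key always yields the same decision."""
    bucket = int(hashlib.sha256(key.encode()).hexdigest()[:8], 16) % 10_000
    return bucket < sample_rate * 10_000

if should_sample(request.id):
    logger.info("Cache hit", extra={"key": key, "ttl": ttl})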
Async Logging¶
For high-throughput applications, use queue-based logging so handler I/O happens off the hot path. Note that QueueListener is not itself a handler, so (before Python 3.12) it cannot be declared in the LOGGING dict; wire it up in code instead:
import atexit
import logging
import logging.handlers
import queue

log_queue = queue.SimpleQueue()

# The hot path only enqueues records; the listener thread does the real I/O
# (console_handler / sentry_handler are the handler instances built above)
listener = logging.handlers.QueueListener(
    log_queue, console_handler, sentry_handler, respect_handler_level=True
)
listener.start()
atexit.register(listener.stop)

logging.getLogger().addHandler(logging.handlers.QueueHandler(log_queue))
Testing and Development¶
Development Logging¶
Configure verbose logging for development:
# settings/development.py
LOGGING["formatters"]["console"]["format"] = (
"%(levelname)s %(asctime)s [%(name)s] %(message)s"
)
# Enable SQL query logging
LOGGING["loggers"]["django.db.backends"] = {
"level": "DEBUG",
"handlers": ["console"]
}
# Colorize output (optional)
LOGGING["formatters"]["colored"] = {
"()": "colorlog.ColoredFormatter",
"format": "%(log_color)s%(levelname)s%(reset)s %(asctime)s %(name)s %(message)s"
}
Testing Logs¶
Test that your code logs correctly:
import logging
from django.test import TestCase
class LoggingTestCase(TestCase):
def test_user_login_logs(self):
"""Test that user login is logged."""
with self.assertLogs("app.views.auth", level="INFO") as logs:
response = self.client.post("/login/", {
"username": "test",
"password": "test123"
})
self.assertEqual(response.status_code, 200)
self.assertIn("User login successful", logs.output[0])
def test_error_logging(self):
"""Test that errors are logged with context."""
logger = logging.getLogger("app.services")
with self.assertLogs(logger, level="ERROR") as logs:
try:
raise ValueError("Test error")
except ValueError:
logger.error(
"Service error",
extra={"service": "payment", "error_type": "validation"},
exc_info=True
)
self.assertIn("Service error", logs.output[0])
Security Considerations¶
PII and GDPR Compliance¶
Ensure logs don't contain personally identifiable information:
class PIISafeLogFilter(logging.Filter):
    """Redact PII fields from log records."""

    PII_FIELDS = {"password", "ssn", "credit_card", "api_key"}

    def filter(self, record):
        """Redact PII passed via extra (extra keys become record attributes)."""
        for field in self.PII_FIELDS:
            if hasattr(record, field):
                setattr(record, field, "[REDACTED]")
        return True

LOGGING.setdefault("filters", {})["pii_safe"] = {
    "()": "app.logging.PIISafeLogFilter"
}
LOGGING["handlers"]["console"].setdefault("filters", []).append("pii_safe")
Log Rotation and Retention¶
Configure CloudWatch log retention; 30 days suits general application logs, while audit logs typically keep a full year:
{
    "logGroupName": "/ecs/your-app",
    "retentionInDays": 30
}
{
    "logGroupName": "/ecs/your-app/audit",
    "retentionInDays": 365
}
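Retention can also be set programmatically with boto3's put_retention_policy call — a sketch:
import boto3

logs = boto3.client("logs")
logs.put_retention_policy(logGroupName="/ecs/your-app", retentionInDays=30)
logs.put_retention_policy(logGroupName="/ecs/your-app/audit", retentionInDays=365)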
Audit Logging¶
Create separate audit logs for compliance:
from datetime import datetime, timezone

# Create dedicated audit logger
audit_logger = logging.getLogger("audit")

# Log security-relevant events
audit_logger.info(
    "User permission changed",
    extra={
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor_id": request.user.id,
        "target_user_id": target_user.id,
        "permission_before": old_permissions,
        "permission_after": new_permissions,
        "ip_address": request.META.get("REMOTE_ADDR")
    }
)
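To keep audit events out of the main application stream (and give them their own retention), route the audit logger to a dedicated handler — a sketch; audit_console is a hypothetical handler that feeds the /ecs/your-app/audit log group:
LOGGING["loggers"]["audit"] = {
    "handlers": ["audit_console"],  # hypothetical handler for the audit log group
    "level": "INFO",
    "propagate": False,  # don't duplicate audit events into root handlers
}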
Monitoring and Alerting¶
Log-Based Metrics¶
Create CloudWatch metrics from logs:
# Log metrics in structured format
logger.info(
"API request completed",
extra={
"metric_name": "api_request_duration",
"metric_value": duration_ms,
"metric_unit": "milliseconds",
"endpoint": request.path,
"status_code": response.status_code
}
)
Alert Patterns¶
Common alerting patterns:
# High error rate
fields @timestamp
| filter levelname = "ERROR"
| stats count() as error_count by bin(5m)
| filter error_count > 10
# Slow requests
fields @timestamp, duration_ms, endpoint
| filter duration_ms > 5000
| stats count() as slow_request_count by bin(5m)
# Failed authentication attempts
fields @timestamp, ip_address
| filter event_type = "login_failed"
| stats count() as failure_count by ip_address
| filter failure_count > 5
Best Practices Summary¶
- Use appropriate log levels: DEBUG for development, INFO for production operations, ERROR for failures
- Structure your logs: Use JSON format with consistent field names
- Add context: Include request IDs, user IDs, tenant IDs
- Never log secrets: Filter out passwords, API keys, tokens, PII
- Log to stdout: Let the platform handle log routing and storage
- Sample high-volume logs: Reduce costs and noise
- Test your logging: Verify logs contain expected information
- Monitor log metrics: Create alerts on error rates and patterns
- Use Sentry for errors: Centralize error tracking and debugging
- Document log format: Maintain a schema for structured logs
Next Steps¶
- Review your current logging configuration
- Implement structured logging with JSON formatter
- Set up CloudWatch Logs Insights queries for common operations
- Configure Sentry integration for error tracking
- Create CloudWatch alarms for critical log patterns
- Document your log schema and field conventions
- Train team on proper logging practices
Further Reading
- Django Logging Documentation
- CloudWatch Logs Insights Query Syntax
- Sentry Django Integration
- 12-Factor Logs - Treat logs as event streams