Logging Best Practices¶
Logging is the primary mechanism for observing application behavior in production. Well-designed logging provides insights into system health, user behavior, error patterns, and performance characteristics without requiring active debugging sessions.
This guide covers modern logging practices for Django applications running on AWS infrastructure with CloudWatch, structured logging, and Sentry integration.
Philosophy
Logs are event streams that tell the story of your application. They should be structured, contextual, and actionable. Every log entry should answer: What happened? When? Where? To whom? Why does it matter?
Logging Fundamentals¶
Log Levels¶
Python's logging framework provides five standard log levels, each serving a distinct purpose:
import logging
logger = logging.getLogger(__name__)
# DEBUG: Detailed diagnostic information
logger.debug("Processing user query with parameters: %s", params)
# INFO: Confirmation that things are working as expected
logger.info("User %s logged in successfully", user.email)
# WARNING: Something unexpected happened, but the application continues
logger.warning("API rate limit at 80%% for tenant %s", tenant_id)
# ERROR: An error occurred, but the application can continue
logger.error("Failed to send email to %s: %s", recipient, error)
# CRITICAL: A serious error, the application may not continue
logger.critical("Database connection pool exhausted")
When to use each level:
| Level | When to Use | Example | Production Logging |
|---|---|---|---|
| DEBUG | Development diagnostics | "SQL query took 0.032s" | Disabled |
| INFO | Normal operations | "User login successful" | Enabled |
| WARNING | Degraded state | "Cache miss, falling back to database" | Enabled |
| ERROR | Recoverable failures | "API request failed, will retry" | Enabled |
| CRITICAL | System instability | "Cannot connect to database" | Enabled + Alert |
Structured Logging¶
Structured logging outputs logs as JSON objects rather than plain text strings. This enables machine parsing, filtering, and analysis.
# ❌ Unstructured: Hard to parse and query
logger.info(f"User {user.email} created order {order.id} for ${order.total}")
# ✅ Structured: Easy to query and analyze
logger.info(
"Order created",
extra={
"event_type": "order_created",
"user_id": user.id,
"user_email": user.email,
"order_id": order.id,
"order_total": float(order.total),
"order_currency": "USD",
"tenant_id": tenant.id
}
)
Benefits of structured logging:
- Queryable: Filter logs by specific fields
- Aggregatable: Calculate metrics from log data
- Correlatable: Link related log entries via common IDs
- Machine-readable: Automated monitoring and alerting
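With the JSON formatter configured below, the "Order created" call above is emitted as a single machine-parseable line per event. The output looks roughly like this (illustrative values):
{"asctime": "2024-01-15 12:00:00,000", "name": "app.views.orders", "levelname": "INFO", "message": "Order created", "event_type": "order_created", "user_id": 42, "order_id": 1001, "order_total": 99.5, "order_currency": "USD", "tenant_id": 7}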
Logger Naming¶
Use module-based logger naming for clear log sources:
# Get logger for the current module
logger = logging.getLogger(__name__)
# Results in hierarchical logger names:
# - app.views.api.users
# - app.services.email
# - app.integrations.stripe
Logger hierarchy benefits:
graph TB
A[root logger] --> B[app]
B --> C[app.views]
B --> D[app.services]
C --> E[app.views.api]
C --> F[app.views.admin]
D --> G[app.services.email]
D --> H[app.services.payments]
style A fill:#e1f5ff
style B fill:#e8f5e9
style C fill:#fff9e1
style D fill:#fff9e1
Configure logging levels hierarchically:
LOGGING = {
"loggers": {
"": {"level": "INFO"}, # Root: INFO and above
"app.integrations": {"level": "DEBUG"}, # Integration debugging
"django.db.backends": {"level": "WARNING"} # Reduce database noise
}
}
Django Logging Configuration¶
Basic Configuration¶
Django uses Python's standard logging configuration format:
# settings/base.py
LOGGING = {
"version": 1,
"disable_existing_loggers": False,
"formatters": {
"json": {
"()": "pythonjsonlogger.jsonlogger.JsonFormatter",
"format": "%(asctime)s %(name)s %(levelname)s %(message)s %(pathname)s %(lineno)d"
},
"console": {
"format": "%(levelname)s %(asctime)s %(name)s %(message)s"
}
},
"handlers": {
"console": {
"class": "logging.StreamHandler",
"stream": "ext://sys.stdout",
"formatter": "json"
}
},
"root": {
"handlers": ["console"],
"level": "INFO"
},
"loggers": {
"django": {
"handlers": ["console"],
"level": "INFO",
"propagate": False
},
"django.request": {
"handlers": ["console"],
"level": "ERROR",
"propagate": False
}
}
}
Environment-Specific Configuration¶
Configure logging differently per environment:
# settings/development.py
LOGGING["formatters"]["console"]["format"] = "%(levelname)s %(asctime)s %(name)s %(message)s"
LOGGING["handlers"]["console"]["formatter"] = "console"
LOGGING["root"]["level"] = "DEBUG"
# settings/production.py
LOGGING["handlers"]["console"]["formatter"] = "json"
LOGGING["root"]["level"] = "INFO"
Configuration differences:
graph LR
A[Base Config] --> B[Development]
A --> C[Production]
B --> D[Human-readable format]
B --> E[DEBUG level]
B --> F[Colorized output]
C --> G[JSON format]
C --> H[INFO level]
C --> I[CloudWatch destination]
style B fill:#e8f5e9
style C fill:#ffebee
Custom Log Handlers¶
Create custom handlers for specific use cases:
# commons/logging/custom_handler.py
import logging
import os

class EnvironmentAwareHandler(logging.Handler):
    """Route logs differently based on environment."""

    def emit(self, record):
        if os.getenv("IN_AWS_FARGATE"):
            # On Fargate, CloudWatch captures stdout
            print(self.format(record))
        else:
            # Local file for development (create the directory on first use)
            os.makedirs("logs", exist_ok=True)
            with open(f"logs/{record.name}.log", "a") as f:
                f.write(self.format(record) + "\n")
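Registering the custom handler follows the same pattern as the built-in ones — a sketch, assuming the module path shown above:
LOGGING["handlers"]["env_aware"] = {
    "class": "commons.logging.custom_handler.EnvironmentAwareHandler",
    "formatter": "json",
}
LOGGING["root"]["handlers"].append("env_aware")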
CloudWatch Integration¶
ECS CloudWatch Configuration¶
AWS ECS automatically captures stdout/stderr and sends to CloudWatch:
{
"containerDefinitions": [{
"name": "web",
"logConfiguration": {
"logDriver": "awslogs",
"options": {
"awslogs-group": "/ecs/your-app",
"awslogs-region": "us-east-1",
"awslogs-stream-prefix": "web"
}
}
}]
}
Log stream organization:
CloudWatch Log Groups
└── /ecs/your-app
├── web/task-id-1
├── web/task-id-2
├── worker/task-id-1
└── worker/task-id-2
CloudWatch Logs Insights Queries¶
Query structured logs with CloudWatch Logs Insights. Note that the JSON formatter configured above emits the log level as levelname, so queries filter on that field (comments use #):
# Find all errors for a specific user
fields @timestamp, message, error_message
| filter user_id = "12345"
| filter levelname = "ERROR"
| sort @timestamp desc
| limit 100
# Count errors by type
fields @timestamp, error_type
| filter levelname = "ERROR"
| stats count() by error_type
# Calculate API response times
fields @timestamp, duration_ms, endpoint
| filter event_type = "api_request"
| stats avg(duration_ms), max(duration_ms), count() by endpoint
# Find slow database queries
fields @timestamp, query, duration_ms
| filter event_type = "database_query"
| filter duration_ms > 1000
| sort duration_ms desc
# Track user activity
fields @timestamp, user_id, event_type, request_id
| filter user_id = "12345"
| sort @timestamp desc
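These queries can also be run programmatically; a minimal boto3 sketch (the log group name and query string are illustrative):
import time

import boto3

logs = boto3.client("logs")
query = logs.start_query(
    logGroupName="/ecs/your-app",
    startTime=int(time.time()) - 3600,  # last hour
    endTime=int(time.time()),
    queryString='fields @timestamp, message | filter levelname = "ERROR" | limit 20',
)

# Insights queries run asynchronously; poll until the query finishes
results = logs.get_query_results(queryId=query["queryId"])
while results["status"] in ("Scheduled", "Running"):
    time.sleep(1)
    results = logs.get_query_results(queryId=query["queryId"])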
CloudWatch Alarms¶
Create alarms for critical log patterns:
{
"MetricFilters": [{
"FilterName": "ErrorCount",
"FilterPattern": "{ $.level = \"ERROR\" }",
"MetricTransformations": [{
"MetricName": "ApplicationErrors",
"MetricNamespace": "YourApp",
"MetricValue": "1"
}]
}],
"Alarms": [{
"AlarmName": "HighErrorRate",
"MetricName": "ApplicationErrors",
"Threshold": 10,
"Period": 300,
"EvaluationPeriods": 1,
"ComparisonOperator": "GreaterThanThreshold"
}]
}
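If these resources are managed from code rather than raw JSON, boto3's put_metric_filter and put_metric_alarm calls map one-to-one onto the fields above — a sketch:
import boto3

logs = boto3.client("logs")
logs.put_metric_filter(
    logGroupName="/ecs/your-app",
    filterName="ErrorCount",
    filterPattern='{ $.levelname = "ERROR" }',
    metricTransformations=[{
        "metricName": "ApplicationErrors",
        "metricNamespace": "YourApp",
        "metricValue": "1",
    }],
)

cloudwatch = boto3.client("cloudwatch")
cloudwatch.put_metric_alarm(
    AlarmName="HighErrorRate",
    Namespace="YourApp",
    MetricName="ApplicationErrors",
    Statistic="Sum",       # sum of error events per period
    Period=300,
    EvaluationPeriods=1,
    Threshold=10,
    ComparisonOperator="GreaterThanThreshold",
)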
Sentry Integration¶
Sentry Setup¶
Sentry provides error tracking and performance monitoring:
# settings/base.py
import os

import sentry_sdk
from sentry_sdk.integrations.django import DjangoIntegration
from sentry_sdk.integrations.celery import CeleryIntegration

sentry_sdk.init(
    dsn=get_parameter("/app/sentry/dsn"),  # get_parameter: project helper (e.g. SSM lookup)
integrations=[
DjangoIntegration(),
CeleryIntegration(),
],
environment=os.getenv("ENVIRONMENT_NAME", "development"),
release=os.getenv("GIT_SHA"),
traces_sample_rate=0.1, # 10% of transactions
profiles_sample_rate=0.1,
send_default_pii=False, # Don't send PII
)
Sentry Logging Integration¶
Connect Python logging to Sentry:
LOGGING["handlers"]["sentry"] = {
"level": "ERROR",
"class": "sentry_sdk.integrations.logging.EventHandler",
}
LOGGING["root"]["handlers"].append("sentry")
Logging flow:
graph LR
A[Application Logs] --> B{Log Level}
B -->|DEBUG/INFO/WARNING| C[CloudWatch]
B -->|ERROR/CRITICAL| D[CloudWatch + Sentry]
D --> E[Sentry Dashboard]
E --> F[Alert: Slack/Email]
style A fill:#e1f5ff
style C fill:#e8f5e9
style D fill:#ffebee
style E fill:#fff9e1
Error Context¶
Add context to Sentry errors:
from sentry_sdk import capture_exception, set_user, set_tag, set_context
def process_payment(user, amount):
try:
# Set user context
set_user({
"id": user.id,
"email": user.email,
"tenant_id": user.tenant_id
})
# Add tags for filtering
set_tag("payment_gateway", "stripe")
set_tag("payment_type", "subscription")
# Add custom context
set_context("payment", {
"amount": amount,
"currency": "USD"
})
# Process payment
result = stripe_client.charge(amount)
    except Exception as e:
        # Capture explicitly with the accumulated context, then re-raise
        # (the Django integration also reports unhandled exceptions)
        capture_exception(e)
        raise
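Note that set_user, set_tag, and set_context mutate the current scope, so context set here can leak into unrelated events handled later by the same worker. Scoping the context avoids this — a sketch using push_scope from sentry-sdk 1.x (2.x prefers sentry_sdk.new_scope, with the same shape):
import sentry_sdk

def process_payment(user, amount):
    with sentry_sdk.push_scope() as scope:
        # Context set on this scope is discarded when the block exits
        scope.set_tag("payment_gateway", "stripe")
        scope.set_context("payment", {"amount": amount, "currency": "USD"})
        scope.user = {"id": user.id, "email": user.email}
        return stripe_client.charge(amount)  # stripe_client as above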
Context Management¶
Request Context¶
Add request-specific context to all logs during a request:
# middleware/logging_context.py
import logging
import time
import uuid

logger = logging.getLogger(__name__)

class LoggingContextMiddleware:
    """Add request context to all logs."""

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        # Generate a request ID and record the start time
        request.id = str(uuid.uuid4())
        start = time.monotonic()

        logger.info(
            "Request started",
            extra={
                "request_id": request.id,
                "method": request.method,
                "path": request.path,
                "user_id": getattr(request.user, "id", None),
                "ip_address": self.get_client_ip(request)
            }
        )

        response = self.get_response(request)

        logger.info(
            "Request completed",
            extra={
                "request_id": request.id,
                "status_code": response.status_code,
                "duration_ms": int((time.monotonic() - start) * 1000)
            }
        )
        return response

    def get_client_ip(self, request):
        """Extract the client IP, honoring X-Forwarded-For from the load balancer."""
        x_forwarded_for = request.META.get("HTTP_X_FORWARDED_FOR")
        if x_forwarded_for:
            return x_forwarded_for.split(",")[0].strip()
        return request.META.get("REMOTE_ADDR")
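As written, the middleware attaches context only to its own two log calls. To stamp the request ID onto every record emitted while the request is handled, including logs from services and libraries, a contextvars-backed filter works well. A sketch — the module path is hypothetical, and the middleware would call request_id_var.set(request.id) at the top of __call__:
# commons/logging/request_id.py (hypothetical location)
import contextvars
import logging

request_id_var = contextvars.ContextVar("request_id", default=None)

class RequestIDFilter(logging.Filter):
    """Copy the current request ID onto every log record."""

    def filter(self, record):
        record.request_id = request_id_var.get()
        return True

# settings: attach the filter to the console handler
LOGGING.setdefault("filters", {})["request_id"] = {
    "()": "commons.logging.request_id.RequestIDFilter"
}
LOGGING["handlers"]["console"].setdefault("filters", []).append("request_id")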
User Context¶
Include user information in logs:
def get_logger_with_user_context(user):
"""Create logger with user context."""
logger = logging.getLogger(__name__)
# Create adapter with user context
return logging.LoggerAdapter(logger, {
"user_id": user.id,
"user_email": user.email,
"tenant_id": user.tenant_id
})
# Usage
logger = get_logger_with_user_context(request.user)
logger.info("User updated profile") # Automatically includes user context
Tenant Context¶
For multi-tenant applications, include tenant information:
class TenantLoggerAdapter(logging.LoggerAdapter):
"""Add tenant context to all log messages."""
def process(self, msg, kwargs):
"""Add tenant_id to all logs."""
if "extra" not in kwargs:
kwargs["extra"] = {}
kwargs["extra"]["tenant_id"] = self.extra.get("tenant_id")
kwargs["extra"]["account_code"] = self.extra.get("account_code")
return msg, kwargs
# Usage in views
class PlanionPeopleViewSet(viewsets.ModelViewSet):
def get_queryset(self):
account = self.get_account(self.request)
logger = TenantLoggerAdapter(
logging.getLogger(__name__),
{"tenant_id": account.id, "account_code": account.code}
)
logger.info("Fetching people for tenant")
return People.objects.using(f"{account.code.upper()}_RO").all()
What to Log¶
Essential Log Events¶
Application lifecycle:
# Application startup
logger.info(
"Application starting",
extra={
"environment": settings.ENVIRONMENT_NAME,
"version": settings.VERSION,
"python_version": sys.version
}
)
# Configuration loaded
logger.info(
"Configuration loaded",
extra={
"database_host": settings.DATABASES["default"]["HOST"],
"cache_backend": settings.CACHES["default"]["BACKEND"]
}
)
User authentication:
# Successful login
logger.info(
"User login successful",
extra={
"user_id": user.id,
"email": user.email,
"ip_address": request.META.get("REMOTE_ADDR"),
"user_agent": request.META.get("HTTP_USER_AGENT")
}
)
# Failed login
logger.warning(
"Login attempt failed",
extra={
"email": email,
"ip_address": request.META.get("REMOTE_ADDR"),
"failure_reason": "invalid_credentials"
}
)
API requests:
# API request started
logger.info(
"API request",
extra={
"request_id": request.id,
"endpoint": request.path,
"method": request.method,
"user_id": request.user.id
}
)
# API request completed
logger.info(
"API response",
extra={
"request_id": request.id,
"status_code": response.status_code,
"duration_ms": duration,
"response_size_bytes": len(response.content)
}
)
Database operations:
# Slow query warning
logger.warning(
"Slow database query",
extra={
"query": query,
"duration_ms": duration,
"database": "planion_ro",
"threshold_ms": 1000
}
)
# Database connection pool
logger.info(
"Database connection pool status",
extra={
"active_connections": pool.active,
"idle_connections": pool.idle,
"max_connections": pool.max
}
)
Integration events:
# External API call
logger.info(
"External API request",
extra={
"service": "stripe",
"endpoint": "/v1/charges",
"method": "POST"
}
)
# External API failure
logger.error(
"External API request failed",
extra={
"service": "stripe",
"endpoint": "/v1/charges",
"status_code": response.status_code,
"error": response.text,
"will_retry": True
}
)
Business events:
# Order created
logger.info(
"Order created",
extra={
"order_id": order.id,
"user_id": user.id,
"total_amount": float(order.total),
"currency": "USD",
"item_count": order.items.count()
}
)
# Payment processed
logger.info(
"Payment processed",
extra={
"payment_id": payment.id,
"order_id": order.id,
"amount": float(payment.amount),
"payment_method": payment.method,
"transaction_id": payment.transaction_id
}
)
What NOT to Log¶
Never log sensitive information:
# ❌ NEVER log these
logger.info(f"User password: {password}") # Passwords
logger.info(f"Credit card: {card_number}") # Payment information
logger.info(f"SSN: {ssn}") # Personal identifiable information
logger.info(f"API key: {api_key}") # Secrets and credentials
logger.info(f"Session token: {token}") # Authentication tokens
# ✅ Log safely
logger.info(
"User authentication",
extra={
"user_id": user.id,
"auth_method": "password" # Method, not the actual password
}
)
logger.info(
"Payment processed",
extra={
"payment_id": payment.id,
"card_last_four": payment.card_last_four, # Only last 4 digits
"amount": payment.amount
}
)
Avoid excessive logging:
# ❌ Don't log inside tight loops
for item in items: # Could be thousands of items
logger.debug(f"Processing item {item.id}") # Creates log spam
# ✅ Log summary instead
logger.info(
"Batch processing started",
extra={"item_count": len(items)}
)
# ... process items ...
logger.info(
"Batch processing completed",
extra={
"processed": processed_count,
"failed": failed_count,
"duration_ms": duration
}
)
Performance Considerations¶
Log Level Filtering¶
Use log level filtering to reduce overhead:
# ❌ Expensive even when DEBUG is disabled
logger.debug(f"User data: {expensive_serialization(user)}")
# ✅ Only serialize if DEBUG logging is enabled
if logger.isEnabledFor(logging.DEBUG):
    logger.debug("User data: %s", expensive_serialization(user))
# ✅ Or defer the work with an object whose __str__ runs only when the
# record is actually formatted (stdlib logging will not call a bare lambda)
class LazyStr:
    def __init__(self, func):
        self.func = func
    def __str__(self):
        return str(self.func())
logger.debug("User data: %s", LazyStr(lambda: expensive_serialization(user)))
Sampling¶
Sample high-volume logs:
import random
def log_with_sampling(logger, message, extra, sample_rate=0.01):
"""Log only a percentage of messages."""
if random.random() < sample_rate:
logger.info(message, extra=extra)
# Log 1% of cache hits
log_with_sampling(
logger,
"Cache hit",
{"key": key, "ttl": ttl},
sample_rate=0.01
)
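Because random.random() makes an independent decision per record, one request's logs may be partially kept and partially dropped. Keying the decision on a stable identifier keeps or drops all of a request's logs together — a small sketch (request.id from the middleware above):
import hashlib

def should_sample(key: str, sample_rate: float = 0.01) -> bool:
    """Deterministic sampling: the same key always yields the same decision."""
    bucket = int(hashlib.sha256(key.encode()).hexdigest()[:8], 16) % 10_000
    return bucket < sample_rate * 10_000

if should_sample(request.id):
    logger.info("Cache hit", extra={"key": key, "ttl": ttl})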
Async Logging¶
For high-throughput applications, use queue-based logging so handler I/O happens off the hot path. Note that QueueListener is not itself a handler, so (before Python 3.12) it cannot be declared in the LOGGING dict; wire it up in code instead:
import atexit
import logging
import logging.handlers
import queue

log_queue = queue.SimpleQueue()

# The hot path only enqueues records; the listener thread does the real I/O
# (console_handler / sentry_handler are the handler instances built above)
listener = logging.handlers.QueueListener(
    log_queue, console_handler, sentry_handler, respect_handler_level=True
)
listener.start()
atexit.register(listener.stop)

logging.getLogger().addHandler(logging.handlers.QueueHandler(log_queue))
Testing and Development¶
Development Logging¶
Configure verbose logging for development:
# settings/development.py
LOGGING["formatters"]["console"]["format"] = (
"%(levelname)s %(asctime)s [%(name)s] %(message)s"
)
# Enable SQL query logging
LOGGING["loggers"]["django.db.backends"] = {
"level": "DEBUG",
"handlers": ["console"]
}
# Colorize output (optional)
LOGGING["formatters"]["colored"] = {
"()": "colorlog.ColoredFormatter",
"format": "%(log_color)s%(levelname)s%(reset)s %(asctime)s %(name)s %(message)s"
}
Testing Logs¶
Test that your code logs correctly:
import logging
from django.test import TestCase
class LoggingTestCase(TestCase):
def test_user_login_logs(self):
"""Test that user login is logged."""
with self.assertLogs("app.views.auth", level="INFO") as logs:
response = self.client.post("/login/", {
"username": "test",
"password": "test123"
})
self.assertEqual(response.status_code, 200)
self.assertIn("User login successful", logs.output[0])
def test_error_logging(self):
"""Test that errors are logged with context."""
logger = logging.getLogger("app.services")
with self.assertLogs(logger, level="ERROR") as logs:
try:
raise ValueError("Test error")
except ValueError:
logger.error(
"Service error",
extra={"service": "payment", "error_type": "validation"},
exc_info=True
)
self.assertIn("Service error", logs.output[0])
Security Considerations¶
PII and GDPR Compliance¶
Ensure logs don't contain personally identifiable information:
class PIISafeLogFilter(logging.Filter):
    """Redact PII fields from log records."""

    PII_FIELDS = {"password", "ssn", "credit_card", "api_key"}

    def filter(self, record):
        """Redact PII passed via extra (extra keys become record attributes)."""
        for field in self.PII_FIELDS:
            if hasattr(record, field):
                setattr(record, field, "[REDACTED]")
        return True

LOGGING.setdefault("filters", {})["pii_safe"] = {
    "()": "app.logging.PIISafeLogFilter"
}
LOGGING["handlers"]["console"].setdefault("filters", []).append("pii_safe")
Log Rotation and Retention¶
Configure CloudWatch log retention; 30 days suits general application logs, while audit logs typically keep a full year:
{
    "logGroupName": "/ecs/your-app",
    "retentionInDays": 30
}
{
    "logGroupName": "/ecs/your-app/audit",
    "retentionInDays": 365
}
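Retention can also be set programmatically with boto3's put_retention_policy call — a sketch:
import boto3

logs = boto3.client("logs")
logs.put_retention_policy(logGroupName="/ecs/your-app", retentionInDays=30)
logs.put_retention_policy(logGroupName="/ecs/your-app/audit", retentionInDays=365)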
Audit Logging¶
Create separate audit logs for compliance:
from datetime import datetime, timezone

# Create dedicated audit logger
audit_logger = logging.getLogger("audit")

# Log security-relevant events
audit_logger.info(
    "User permission changed",
    extra={
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "actor_id": request.user.id,
        "target_user_id": target_user.id,
        "permission_before": old_permissions,
        "permission_after": new_permissions,
        "ip_address": request.META.get("REMOTE_ADDR")
    }
)
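To keep audit events out of the main application stream (and give them their own retention), route the audit logger to a dedicated handler — a sketch; audit_console is a hypothetical handler that feeds the /ecs/your-app/audit log group:
LOGGING["loggers"]["audit"] = {
    "handlers": ["audit_console"],  # hypothetical handler for the audit log group
    "level": "INFO",
    "propagate": False,  # don't duplicate audit events into root handlers
}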
Monitoring and Alerting¶
Log-Based Metrics¶
Create CloudWatch metrics from logs:
# Log metrics in structured format
logger.info(
"API request completed",
extra={
"metric_name": "api_request_duration",
"metric_value": duration_ms,
"metric_unit": "milliseconds",
"endpoint": request.path,
"status_code": response.status_code
}
)
Alert Patterns¶
Common alerting patterns:
# High error rate
fields @timestamp
| filter levelname = "ERROR"
| stats count() as error_count by bin(5m)
| filter error_count > 10
# Slow requests
fields @timestamp, duration_ms, endpoint
| filter duration_ms > 5000
| stats count() as slow_request_count by bin(5m)
# Failed authentication attempts
fields @timestamp, ip_address
| filter event_type = "login_failed"
| stats count() as failure_count by ip_address
| filter failure_count > 5
Best Practices Summary¶
- Use appropriate log levels: DEBUG for development, INFO for production operations, ERROR for failures
- Structure your logs: Use JSON format with consistent field names
- Add context: Include request IDs, user IDs, tenant IDs
- Never log secrets: Filter out passwords, API keys, tokens, PII
- Log to stdout: Let the platform handle log routing and storage
- Sample high-volume logs: Reduce costs and noise
- Test your logging: Verify logs contain expected information
- Monitor log metrics: Create alerts on error rates and patterns
- Use Sentry for errors: Centralize error tracking and debugging
- Document log format: Maintain a schema for structured logs
Next Steps¶
- Review your current logging configuration
- Implement structured logging with JSON formatter
- Set up CloudWatch Logs Insights queries for common operations
- Configure Sentry integration for error tracking
- Create CloudWatch alarms for critical log patterns
- Document your log schema and field conventions
- Train team on proper logging practices
Further Reading
- Django Logging Documentation
- CloudWatch Logs Insights Query Syntax
- Sentry Django Integration
- 12-Factor Logs - Treat logs as event streams