Watchtower: Python CloudWatch Logging
Watchtower is a log handler for Amazon Web Services CloudWatch Logs.
CloudWatch Logs is a log management service built into AWS. It is conceptually similar to services like Splunk, Datadog, and Loggly, but is more lightweight, cheaper, and tightly integrated with the rest of AWS.
Watchtower, in turn, is a lightweight adapter between the Python logging system and CloudWatch Logs. It uses the boto3 AWS SDK, and lets you plug your application logging directly into CloudWatch without the need to install a system-wide log collector like awscli-cwlogs and round-trip your logs through the instance’s syslog. It aggregates logs into batches to avoid sending an API request for each log message, while guaranteeing a delivery deadline (60 seconds by default).
Installation
pip install watchtower
Synopsis
Install awscli and set your AWS credentials (run aws configure).
import watchtower, logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
logger.addHandler(watchtower.CloudWatchLogHandler())
logger.info("Hi")
logger.info(dict(foo="bar", details={}))
After running the example, you can see the log output in your AWS console under the watchtower log group.
IAM permissions
The process running watchtower needs to have access to IAM credentials to call the CloudWatch Logs API. The standard procedure for loading and configuring credentials is described in the Boto3 Credentials documentation.
When running Watchtower on an EC2 instance or other AWS compute resource, boto3 automatically loads credentials from instance metadata (IMDS) or the container credentials provider (AWS_WEB_IDENTITY_TOKEN_FILE or AWS_CONTAINER_CREDENTIALS_FULL_URI). The easiest way to grant the right permissions to the IAM role associated with these credentials is by attaching an AWS managed IAM policy to the role. While AWS provides no generic managed CloudWatch Logs writer policy, we recommend that you use the arn:aws:iam::aws:policy/AWSOpsWorksCloudWatchLogs managed policy, which has just the right permissions without being overly broad.
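As a rough sketch of attaching that managed policy programmatically, the helper below calls the IAM attach_role_policy API through a client supplied by the caller. The helper name attach_logs_writer_policy is hypothetical; in a real deployment the client would typically be boto3.client("iam"), and the role must already exist.

```python
# ARN of the AWS managed policy recommended above.
LOGS_WRITER_POLICY_ARN = "arn:aws:iam::aws:policy/AWSOpsWorksCloudWatchLogs"


def attach_logs_writer_policy(iam_client, role_name):
    """Attach the managed CloudWatch Logs policy to an existing IAM role.

    `iam_client` is expected to behave like boto3.client("iam").
    """
    iam_client.attach_role_policy(
        RoleName=role_name,
        PolicyArn=LOGS_WRITER_POLICY_ARN,
    )
    return LOGS_WRITER_POLICY_ARN
```

Passing the client in, rather than creating it inside the helper, keeps the choice of profile and region with the caller.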
Log Group Tagging
Watchtower supports the tagging of log groups. This can be done by adding the parameter log_group_tags to the CloudWatchLogHandler constructor. This parameter should be a dictionary of tags to apply to the log group.
If you want to add tags to the log group you will need to add permission for the logs:TagResource action to your policy. This will need to be in addition to the AWSOpsWorksCloudWatchLogs policy. (Note: the older logs:TagLogGroup permission is for the tag_log_group() call which is on the path to deprecation and is not used by Watchtower.)
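The extra permission could be granted with a small inline policy. The sketch below builds such a policy document as a Python dictionary and prints it as JSON. The function name and the wildcard resource pattern are assumptions for illustration; narrow the resource to your own log group ARN where possible.

```python
import json


def tagging_policy_document(resource="arn:aws:logs:*:*:log-group:*"):
    """Return an IAM policy document that allows logs:TagResource."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["logs:TagResource"],
                # Assumed wildcard; replace with a specific log group ARN.
                "Resource": resource,
            }
        ],
    }


print(json.dumps(tagging_policy_document(), indent=2))
```

With that permission in place, tags are passed to the handler as a plain dictionary, for example CloudWatchLogHandler(log_group_tags={"team": "payments"}), where the tag key and value are placeholders.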
Example: Flask logging with Watchtower
Use the following configuration to send Flask logs to a CloudWatch Logs log group called “loggable”:
import watchtower, flask, logging
logging.basicConfig(level=logging.INFO)
app = flask.Flask("loggable")
handler = watchtower.CloudWatchLogHandler(log_group_name=app.name)
app.logger.addHandler(handler)
logging.getLogger("werkzeug").addHandler(handler)
@app.route('/')
def hello_world():
    return 'Hello World!'
if __name__ == '__main__':
    app.run()
(See also http://flask.pocoo.org/docs/errorhandling/.)
Example: Django logging with Watchtower
This is an example of Watchtower integration with Django. In your Django project, add the following to settings.py:
import boto3
AWS_REGION_NAME = "us-west-2"
boto3_logs_client = boto3.client("logs", region_name=AWS_REGION_NAME)
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'root': {
        'level': 'DEBUG',
        # Adding the watchtower handler here causes all loggers in the project that
        # have propagate=True (the default) to send messages to watchtower. If you
        # wish to send only from specific loggers instead, remove "watchtower" here
        # and configure individual loggers below.
        'handlers': ['watchtower', 'console'],
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
        },
        'watchtower': {
            'class': 'watchtower.CloudWatchLogHandler',
            'boto3_client': boto3_logs_client,
            'log_group_name': 'YOUR_DJANGO_PROJECT_NAME',
            # Decrease the verbosity level here to send only those logs to watchtower,
            # but still see more verbose logs in the console. See the watchtower
            # documentation for other parameters that can be set here.
            'level': 'DEBUG'
        }
    },
    'loggers': {
        # In the debug server (`manage.py runserver`), several Django system loggers cause
        # deadlocks when using threading in the logging handler, and are not supported by
        # watchtower. This limitation does not apply when running on production WSGI servers
        # (gunicorn, uwsgi, etc.), so we recommend that you set `propagate=True` below in your
        # production-specific Django settings file to receive Django system logs in CloudWatch.
        'django': {
            'level': 'DEBUG',
            'handlers': ['console'],
            'propagate': False
        }
        # Add any other logger-specific configuration here.
    }
}
Using this configuration, logs from Django will be sent to CloudWatch in the log group YOUR_DJANGO_PROJECT_NAME.
To supply AWS credentials to this configuration in development, set your AWS CLI profile settings with aws configure. To supply credentials in production or when running on an EC2 instance, assign an IAM role to your instance, which will cause boto3 to automatically ingest IAM role credentials from instance metadata.
(See also the Django logging documentation.)
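The production override suggested in the configuration comments might look like the sketch below. The helper name enable_django_system_logs is hypothetical. It copies a LOGGING dictionary, turns on propagation for the django logger, and clears that logger's own handlers so that console output is not duplicated by the root logger.

```python
import copy


def enable_django_system_logs(logging_config):
    """Return a copy of a LOGGING dict with Django system logs sent to root."""
    config = copy.deepcopy(logging_config)
    django_logger = config.setdefault("loggers", {}).setdefault("django", {})
    # Let records from the "django" logger reach the root logger's handlers.
    django_logger["propagate"] = True
    # Drop the logger-level handlers so the root console handler is the only
    # one that prints these records.
    django_logger["handlers"] = []
    return config
```

In a production-specific settings module (the module layout is an assumption), this could be applied as LOGGING = enable_django_system_logs(LOGGING) after importing the base settings.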
Examples: Querying CloudWatch logs
This section is not specific to Watchtower. It demonstrates the use of awscli and jq to read and search CloudWatch logs on the command line.
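For the Flask example above, the handler writes to the loggable log group, and the streams are named from the default template ({machine_name}/{program_name}/{logger_name}/{process_id}). List the streams first, then retrieve the events from one of them, replacing STREAM_NAME with a name returned by the first command:
aws logs describe-log-streams --log-group-name loggable | jq '.logStreams[].logStreamName'
aws logs get-log-events --log-group-name loggable --log-stream-name STREAM_NAME | jq '.events[].message'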
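The same retrieval can be sketched in Python with the boto3 get_log_events call. The helper name fetch_messages is hypothetical, and the client is passed in by the caller (typically boto3.client("logs")). This reads a single page of results only; following nextForwardToken for more pages is left out.

```python
def fetch_messages(logs_client, log_group_name, log_stream_name, limit=100):
    """Return the message strings of the earliest events in one log stream.

    `logs_client` is expected to behave like boto3.client("logs").
    """
    response = logs_client.get_log_events(
        logGroupName=log_group_name,
        logStreamName=log_stream_name,
        startFromHead=True,  # read the oldest events first
        limit=limit,
    )
    return [event["message"] for event in response["events"]]
```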
In addition to the raw get-log-events API, CloudWatch Logs supports extraction of your logs into an S3 bucket, log analysis with a query language, and alerting and dashboards based on metric filters, which are pattern rules that extract information from your logs and feed it to alarms and dashboard graphs. If you want to make use of these features on the command line, the author of Watchtower has published an open source CLI toolkit called aegea that includes the commands aegea logs and aegea grep to easily access the S3 Export and Insights features.
Examples: Python Logging Config
The Python logging.config module has the ability to provide a configuration file that can be loaded in order to separate the logging configuration from the code.
The following are two example YAML configuration files that can be loaded using PyYAML. The resulting dict object can then be loaded into logging.config.dictConfig. The first example is a basic example that relies on the default configuration provided by boto3:
# Default AWS Config
version: 1
disable_existing_loggers: False
handlers:
  console:
    class: logging.StreamHandler
    level: DEBUG
    stream: ext://sys.stdout
  logfile:
    class: logging.handlers.RotatingFileHandler
    level: DEBUG
    filename: watchtower.log
    maxBytes: 1000000
    backupCount: 3
  watchtower:
    class: watchtower.CloudWatchLogHandler
    level: DEBUG
    log_group_name: watchtower
    log_stream_name: "{logger_name}-{strftime:%y-%m-%d}"
    send_interval: 10
    create_log_group: False
root:
  level: DEBUG
  propagate: True
  handlers: [console, logfile, watchtower]
loggers:
  botocore:
    level: INFO
  urllib3:
    level: INFO
The above works well if you can use the default boto3 credential configuration, or rely on environment variables. However, sometimes one may want to use different credentials for logging than used for other functionality; in this case the boto3_profile_name option to Watchtower can be used to provide a boto3 profile name:
# AWS Config Profile
version: 1
...
handlers:
  ...
  watchtower:
    boto3_profile_name: watchtowerlogger
    ...
Finally, the following shows how to load the configuration into the working application:
import logging.config
import flask
import yaml
app = flask.Flask("loggable")
@app.route('/')
def hello_world():
    return 'Hello World!'
if __name__ == '__main__':
    with open('logging.yml') as log_config:
        config_yml = log_config.read()
    config_dict = yaml.safe_load(config_yml)
    logging.config.dictConfig(config_dict)
    app.run()
Log stream naming
For high volume logging applications that utilize process pools, it is recommended that you keep the default log stream name ({machine_name}/{program_name}/{logger_name}/{process_id}) or otherwise make it unique per source using a combination of these template variables. Because logs must be submitted sequentially to each log stream, independent processes sending logs to the same log stream will encounter sequence token synchronization errors and spend extra resources automatically recovering from them. As the number of processes increases, this overhead will grow until logs fail to deliver and get dropped (causing a warning on stderr). Partitioning logs into streams by source avoids this contention.
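To see why the default template yields one stream per process, the sketch below expands a template with str.format. It is an illustration of the template variables rather than Watchtower's own implementation. The helper name render_stream_name and the way each value is obtained (for example, taking the program name from sys.argv) are assumptions.

```python
import os
import platform
import sys
from datetime import datetime, timezone


def render_stream_name(template, logger_name, now=None):
    """Expand a log stream name template using str.format."""
    now = now or datetime.now(timezone.utc)
    program = os.path.basename(sys.argv[0]) if sys.argv else ""
    return template.format(
        machine_name=platform.node(),
        program_name=program or "python",
        logger_name=logger_name,
        process_id=os.getpid(),
        # datetime accepts strftime codes as its format spec, so a field
        # such as "{strftime:%y-%m-%d}" expands to the formatted date.
        strftime=now,
    )


default = "{machine_name}/{program_name}/{logger_name}/{process_id}"
print(render_stream_name(default, "myapp"))
```

Because the process ID is part of the default name, each worker in a process pool ends up with its own stream.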
Boto3/botocore/urllib3 logs
Because watchtower uses boto3 to send logs, the act of sending them generates a number of DEBUG level log messages from boto3’s dependencies, botocore and urllib3. To avoid generating a self-perpetuating stream of log messages, watchtower.CloudWatchLogHandler attaches a filter to itself which drops all DEBUG level messages from these libraries, and drops all messages at all levels from them when shutting down (specifically, in watchtower.CloudWatchLogHandler.flush() and watchtower.CloudWatchLogHandler.close()). The filter does not apply to any other handlers you may have processing your messages, so the following basic configuration will cause botocore debug logs to print to stderr but not to CloudWatch:
import watchtower, logging
logging.basicConfig(level=logging.DEBUG)
logger = logging.getLogger()
logger.addHandler(watchtower.CloudWatchLogHandler())
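The filtering behavior described above can be approximated with a standard logging.Filter. The class below is a stand-in written for illustration, not the filter Watchtower ships. It covers only the DEBUG-dropping rule and omits the shutdown-time rule.

```python
import logging


class DropLibraryDebugFilter(logging.Filter):
    """Reject DEBUG records from botocore and urllib3 (illustrative stand-in)."""

    NOISY_LIBRARIES = ("botocore", "urllib3")

    def filter(self, record):
        # Logger names are dotted, e.g. "botocore.endpoint"; compare the root part.
        from_noisy_library = record.name.split(".")[0] in self.NOISY_LIBRARIES
        return not (from_noisy_library and record.levelno <= logging.DEBUG)
```

A filter attached to a handler with handler.addFilter() affects only that handler, which is why other handlers on the same logger still receive the library's debug output.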
AWS Lambda
Watchtower is not suitable or necessary for applications running on AWS Lambda. All AWS Lambda logs (i.e. all lines printed to stderr by the runtime in the Lambda) are automatically sent to CloudWatch Logs, into log groups under the /aws/lambda/ prefix.
AWS Lambda suspends (freezes) all processes in its execution environment once the invocation is complete and until the next invocation, if any. This means any asynchronous background processes and threads, including watchtower, will be suspended and inoperable, so watchtower cannot function correctly in this execution model.
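On Lambda, the standard logging module is usually all that is needed. The sketch below assumes the conventional lambda_handler(event, context) entry point. The function name, log message, and return value are illustrative only.

```python
import logging

# Configure the root logger once, at import time, so it is reused across
# invocations that share the same execution environment.
logger = logging.getLogger()
logger.setLevel(logging.INFO)


def lambda_handler(event, context):
    """Hypothetical Lambda entry point that logs without Watchtower."""
    logger.info("received event with %d keys", len(event))
    return {"statusCode": 200}
```

Each record is written synchronously during the invocation, so nothing is left queued in a background thread when the environment is frozen.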
Bugs
Please report bugs, issues, feature requests, etc. on GitHub.
Versioning
This package follows the Semantic Versioning 2.0.0 standard. To control changes, it is recommended that application developers pin the package version and manage it using pip-tools or similar. For library developers, pinning the major version is recommended.
License
Licensed under the terms of the Apache License, Version 2.0.
API documentation
- class watchtower.CloudWatchLogFormatter(*args, json_serialize_default=None, add_log_record_attrs=None, **kwargs)
Log formatter for CloudWatch messages. Transforms logged message into a message compatible with the CloudWatch API. This is the default formatter for CloudWatchLogHandler.
This log formatter is designed to accommodate structured log messages by correctly serializing them as JSON, which is automatically recognized, parsed, and indexed by CloudWatch Logs. To use this feature, pass a dictionary input to the logger instead of a plain string:
logger = logging.getLogger(__name__)
logger.addHandler(watchtower.CloudWatchLogHandler())
logger.critical({"request": "hello", "metadata": {"size": 9000}})
If the optional add_log_record_attrs attribute or keyword argument is set, it enables the forwarding of specified LogRecord attributes with the message. In this mode, if the message is not already a dictionary, it is converted to one with the original message under the msg key:
logger = logging.getLogger(__name__)
handler = watchtower.CloudWatchLogHandler()
handler.formatter.add_log_record_attrs = ["levelname", "filename", "process", "thread"]
logger.addHandler(handler)
logger.critical({"request": "hello", "metadata": {"size": 9000}})
The resulting raw CloudWatch Logs event will look like this:
{"timestamp": 1636868049692, "message": '{"request": "hello", "metadata": {"size": 9000}, "levelname": "CRITICAL", "filename": "/path/to/app.py", "process": 74542, "thread": 4659336704}', "ingestionTime": 1636868050028}
This enables sending log message metadata as structured log data instead of relying on string formatting. See LogRecord attributes for the full list of available attributes.
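The merging step described above can be sketched with the standard library alone. The function below is an illustration of the behavior, not Watchtower's formatter, and the name merge_record_attrs is hypothetical.

```python
import json
import logging


def merge_record_attrs(record, attr_names):
    """Return a JSON message with selected LogRecord attributes added."""
    if isinstance(record.msg, dict):
        message = dict(record.msg)  # copy to avoid mutating the caller's dict
    else:
        # Plain strings are wrapped under "msg" so attributes can sit beside them.
        message = {"msg": record.getMessage()}
    for name in attr_names:
        message[name] = getattr(record, name)
    return json.dumps(message)


record = logging.LogRecord(
    "myapp", logging.CRITICAL, "/path/to/app.py", 10, "disk full", None, None
)
print(merge_record_attrs(record, ["levelname", "filename"]))
```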
- Parameters:
json_serialize_default (Callable | None) – The ‘default’ function to use when serializing dictionaries as JSON. See the JSON module documentation for more details about the ‘default’ parameter. By default, watchtower uses a serializer that formats datetime objects into strings using the datetime.isoformat() method, and uses repr() to represent all other objects.
add_log_record_attrs (Tuple)
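A default function with the behavior described for json_serialize_default might look like the sketch below. It follows the description above (isoformat() for datetime objects, repr() for everything else) and is not copied from Watchtower's source. The name serialize_default is hypothetical.

```python
import datetime
import json


def serialize_default(value):
    """Fallback for json.dumps: ISO 8601 for datetimes, repr() otherwise."""
    if isinstance(value, datetime.datetime):
        return value.isoformat()
    return repr(value)


payload = {
    "when": datetime.datetime(2021, 11, 14, 5, 34, 9),
    "value": complex(3, 4),  # not JSON serializable, so repr() is used
}
print(json.dumps(payload, default=serialize_default))
```

json.dumps calls the default function only for objects it cannot serialize itself, so strings, numbers, lists, and dictionaries pass through unchanged.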
- format(message)
Format the specified record as text.
The record’s attribute dictionary is used as the operand to a string formatting operation which yields the returned string. Before formatting the dictionary, a couple of preparatory steps are carried out. The message attribute of the record is computed using LogRecord.getMessage(). If the formatting string uses the time (as determined by a call to usesTime()), formatTime() is called to format the event time. If there is exception information, it is formatted using formatException() and appended to the message.
- class watchtower.CloudWatchLogHandler(log_group_name='watchtower', log_stream_name='{machine_name}/{program_name}/{logger_name}/{process_id}', use_queues=True, send_interval=60, max_batch_size=1048576, max_batch_count=10000, boto3_client=None, boto3_profile_name=None, create_log_group=True, log_group_tags={}, json_serialize_default=None, log_group_retention_days=None, create_log_stream=True, max_message_size=262144, log_group=None, stream_name=None, *args, **kwargs)
Create a new CloudWatch log handler object. This is the main entry point to the functionality of the module. See the CloudWatch Logs developer guide and the Python logging module documentation for more information.
- Parameters:
log_group_name (str) – Name of the CloudWatch log group to write logs to. By default, the name of this module is used.
log_stream_name (str) – Name of the CloudWatch log stream to write logs to. By default, a string containing the machine name, the program name, and the name of the logger that processed the message is used. Accepts the following format string parameters: {machine_name}, {program_name}, {logger_name}, {process_id}, {thread_name}, and {strftime:%m-%d-%y}, where any strftime string can be used to include the current UTC datetime in the stream name. The strftime format string option can be used to sort logs into streams on an hourly, daily, or monthly basis.
use_queues (bool) – If True (the default), logs will be queued on a per-stream basis and sent in batches. To manage the queues, a queue handler thread will be spawned. You can set this to False to make it easier to debug threading issues in your application. Setting this to False in production is not recommended, since it will cause performance issues due to the synchronous sending of one CloudWatch API request per log message.
send_interval (int) – Maximum time (in seconds, or a timedelta) to hold messages in queue before sending a batch.
max_batch_size (int) – Maximum size (in bytes) of the queue before sending a batch. From CloudWatch Logs documentation: The maximum batch size is 1,048,576 bytes, and this size is calculated as the sum of all event messages in UTF-8, plus 26 bytes for each log event.
max_batch_count (int) – Maximum number of messages in the queue before sending a batch. From CloudWatch Logs documentation: The maximum number of log events in a batch is 10,000.
boto3_client (BaseClient) –
boto3 client object used to send logs. Use this to pass custom session or client parameters. For example, to specify a custom region:
CloudWatchLogHandler(boto3_client=boto3.client("logs", region_name="us-west-2"))
See the boto3 session reference for details about the available session and client options.
boto3_profile_name (str | None) – Name of the boto3 configuration profile to use. This option is provided for situations where the logger should use a different AWS client configuration from the rest of the system, but declarative configuration via a static dictionary or config file is desired.
create_log_group (bool) – Create CloudWatch Logs log group if it does not exist. True by default.
log_group_retention_days (int | None) – Sets the retention policy of the log group in days. None by default.
log_group_tags (Dict[str, str]) – Tag the log group with the specified tags and values. There is no provision for removing tags. {} by default.
create_log_stream (bool) – Create CloudWatch Logs log stream if it does not exist. True by default.
json_serialize_default (Callable | None) –
The ‘default’ function to use when serializing dictionaries as JSON. See the JSON module documentation for more details about the ‘default’ parameter. By default, watchtower uses a serializer that formats datetime objects into strings using the datetime.isoformat() method, and uses repr() to represent all other objects.
max_message_size (int) – Maximum size (in bytes) of a single message.
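The interaction of max_batch_size and max_batch_count can be illustrated with a simplified batching routine. This is a sketch of the two limits only, not Watchtower's queue logic: it ignores the send_interval deadline and max_message_size, and the function name split_into_batches is hypothetical.

```python
EVENT_OVERHEAD_BYTES = 26  # per-event overhead counted by CloudWatch Logs


def split_into_batches(messages, max_batch_size=1048576, max_batch_count=10000):
    """Group messages so that no batch exceeds the size or count limit."""
    batches, current, current_size = [], [], 0
    for message in messages:
        event_size = len(message.encode("utf-8")) + EVENT_OVERHEAD_BYTES
        would_overflow = (
            current_size + event_size > max_batch_size
            or len(current) >= max_batch_count
        )
        if current and would_overflow:
            # Close the current batch before it breaks a limit.
            batches.append(current)
            current, current_size = [], 0
        current.append(message)
        current_size += event_size
    if current:
        batches.append(current)
    return batches
```

Sizes are measured on the UTF-8 encoding of each message plus the 26-byte overhead, matching the way the CloudWatch Logs limit quoted above is calculated.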
- close()
Send any queued messages to CloudWatch and prevent further processing of messages. This method does nothing if use_queues is set to False.
Change log
- Release Notes
- Changes for v3.3.1 (2024-08-20)
- Changes for v3.3.0 (2024-08-19)
- Changes for v3.2.0 (2024-04-19)
- Changes for v3.1.0 (2024-03-10)
- Changes for v3.0.1 (2023-01-29)
- Changes for v3.0.0 (2022-01-26)
- Changes for v2.1.1 (2022-01-07)
- Changes for v2.1.0 (2022-01-07)
- Changes for v2.0.1 (2021-11-29)
- Changes for v2.0.0 (2021-11-13)
- Changes for v1.0.6 (2021-01-17)
- Changes for v1.0.5 (2021-01-13)
- Changes for v1.0.4 (2021-01-01)
- Changes for v1.0.3 (2021-01-01)
- Changes for v1.0.2 (2020-12-31)
- Changes for v1.0.1 (2020-12-21)
- Changes for v1.0.0 (2020-10-28)
- Changes for v0.8.0 (2020-06-28)
- Changes for v0.7.3 (2019-08-27)
- Changes for v0.7.2 (2019-08-26)
- Changes for v0.7.1 (2019-08-26)
- Changes for v0.7.0 (2019-08-26)
- Changes for v0.6.0 (2019-05-22)
- Changes for v0.5.5 (2019-01-22)
- Changes for v0.5.4 (2018-11-02)
- Changes for v0.5.3 (2018-04-16)
- Changes for v0.5.2 (2017-11-09)
- Changes for v0.5.1 (2017-11-09)
- Changes for v0.5.0 (2017-11-09)
- Changes for v0.4.1 (2017-09-20)
- Changes for v0.4.0 (2017-08-11)
- Changes for v0.3.3 (2016-09-15)
- Changes for v0.3.2 (2016-09-15)
- Changes for v0.3.1 (2016-09-15)
- Changes for v0.3.0 (2016-09-15)