Tracing Requests over Multiple Services and Accounts

If you are using serverless micro-services and AWS Lambda and you need to trace and debug requests, then you may have struggled to track requests as they progress across multiple services, perhaps traversing different AWS regions and accounts.

If so, then SenseDeep correlated logs may be the solution for you.

Serverless applications often have multiple Lambda functions or services that cooperate to respond to a single client request. A single request may pass through many services, and these services may be in different AWS regions or accounts.

A single request will often traverse multiple AWS services such as API Gateway, one or more Lambda functions, Kinesis streams, SQS queues, SNS messages, EventBridge events and other AWS services. The request may fan out to multiple Lambdas and results may be combined back into a single response.

Unfortunately, this means that the log data for a single request is scattered over multiple AWS CloudWatch log groups that are potentially in different regions or accounts.

Consequently, diagnosing a single request can be like searching for a needle in a haystack. Tracing a request over multiple log groups can be difficult and time consuming.

SenseDeep addresses this issue via correlated (meta) logs that combine multiple AWS CloudWatch log groups in real-time into a unified correlated log view.

CloudWatch Insights

Using CloudWatch Insights, you can search for a single request. However, this method has some key limitations: most notably, it only works within a single AWS region and account.

For example, you can use the following CloudWatch query to retrieve log events from multiple log groups. First you select each of the logs for the query, then filter the log entries that match the request ID.

fields @timestamp, @log, @message, `x-correlation-id`
| filter @message like /MENkHjqOIAMESfg=/

This extracts the log messages that contain the given pattern. While this manual method works, it is slow to set up, does not scale well and has a few major limitations.

First and foremost, it can be slow, ... very slow. CloudWatch Insights often takes 20 seconds or more to return results and can take up to 15 minutes if the event you are searching for did not happen recently.

Second, after finding the matching log events, you cannot see the logs either just before or after the event. If the root cause of the request failure is in the log events just prior to the request, you cannot see that failure.

Third, the logs must be in the same account and region. You cannot correlate requests across different AWS accounts or regions. If your services are delivered from different AWS accounts or regions, you are out of luck.
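If you do use CloudWatch Insights for a quick search, the query text can be generated from the request ID rather than edited by hand. The following is a minimal sketch, not part of SenseDeep or the AWS SDK. The helper names buildRequestQuery and escapeForRegex are hypothetical, and the sketch assumes the Insights regular expression engine accepts backslash-escaped special characters.

```javascript
// Escape characters that are special inside a /regex/ literal,
// including the "/" delimiter itself.
function escapeForRegex(text) {
    return text.replace(/[.*+?^${}()|[\]\\\/]/g, '\\$&')
}

// Build a query of the same shape as the one above for a given request ID.
function buildRequestQuery(requestId) {
    return [
        'fields @timestamp, @log, @message',
        `| filter @message like /${escapeForRegex(requestId)}/`,
    ].join('\n')
}

console.log(buildRequestQuery('MENkHjqOIAMESfg='))
```

Escaping matters because request IDs may contain characters such as "+" or "/" that would otherwise change the meaning of the pattern.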

X-Ray

Another potential solution is AWS X-Ray. X-Ray will trace requests across AWS services, but it too has some critical limitations.

By default, X-Ray samples the first request each second and 5% of additional requests. So the percentage of sampled requests is low and the probability that your failing request is not sampled and monitored is high. X-Ray is useful for diagnosing complete service failures or repeated service failures, but not single request issues.

X-Ray also requires a lot of additional code that bloats your Lambdas. Furthermore, your code must be modified to initialize and configure X-Ray. This is intrusive and impacts the performance of your Lambdas.

SenseDeep addresses these issues via correlated meta logs and does not suffer from these limitations.
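To get a feel for how low the sampled percentage becomes, the sampling rule described above can be worked through for a few traffic levels. This is a rough sketch that assumes a steady, evenly spread request rate. The function name sampledFraction and its parameters are illustrative and are not part of the X-Ray SDK.

```javascript
// Expected fraction of requests traced when the first request each second
// is sampled, plus 5% of any additional requests in that second.
function sampledFraction(requestsPerSecond, fixedPerSecond = 1, rate = 0.05) {
    if (requestsPerSecond <= fixedPerSecond) {
        return 1
    }
    const sampled = fixedPerSecond + rate * (requestsPerSecond - fixedPerSecond)
    return sampled / requestsPerSecond
}

for (const rps of [1, 10, 100, 1000]) {
    const percent = (sampledFraction(rps) * 100).toFixed(1)
    console.log(`${rps} requests/sec: about ${percent}% traced`)
}
```

Under these assumptions, at 100 requests per second roughly 6% of requests are traced, so a single failing request is far more likely to be missed than captured.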

Meta Logs

A SenseDeep meta log is a correlated view over multiple CloudWatch log groups that can be in any AWS region or account. From a meta log, you can view log events ordered in sequence regardless of the AWS service or log group. This log view behaves like any other log view, except it combines log groups and streams from multiple sources regardless of AWS account or region.

From the meta log, you can immediately locate any specific request by searching for a request ID or a pattern of your choosing, which isolates the complete request trace.
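Conceptually, a correlated view is a time-ordered merge of events from several log groups, followed by a filter on the request ID. The sketch below illustrates that idea only. It is not how SenseDeep is implemented, and the event shape (timestamp, group, correlationId, message) and the sample data are hypothetical.

```javascript
// Hypothetical events from three log groups
const apiEvents = [
    {timestamp: 1000, group: '/aws/lambda/api', correlationId: 'req-1', message: 'Request start'},
    {timestamp: 1040, group: '/aws/lambda/api', correlationId: 'req-2', message: 'Request start'},
]
const workerEvents = [
    {timestamp: 1020, group: '/aws/lambda/worker', correlationId: 'req-1', message: 'Processing order'},
    {timestamp: 1060, group: '/aws/lambda/worker', correlationId: 'req-2', message: 'Processing order'},
]
const notifyEvents = [
    {timestamp: 1030, group: '/aws/lambda/notify', correlationId: 'req-1', message: 'Notification sent'},
]

// Combine events from any number of log groups into one time-ordered view
function mergeLogs(...groups) {
    return groups.flat().sort((a, b) => a.timestamp - b.timestamp)
}

// Keep only the events that belong to one request
function filterByRequest(events, correlationId) {
    return events.filter((event) => event.correlationId === correlationId)
}

const trace = filterByRequest(mergeLogs(apiEvents, workerEvents, notifyEvents), 'req-1')
for (const event of trace) {
    console.log(event.timestamp, event.group, event.message)
}
```

Because the merge happens before the filter, the unfiltered view still shows the events just before and after the request, which a filtered Insights query does not.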

Creating Meta Logs

To create a meta log, select "Logs" from the side menu and then click "Add". Enter your meta log name and select the logs to combine.

You can select the contributing logs by entering a regular expression pattern that dynamically matches log names. Using a regular expression pattern is preferable if you have a changing set of log group names.

Alternatively, you can select specific logs via the log list combo box.
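To preview which logs a pattern would select before entering it, you can try it against a list of log group names. This sketch assumes the pattern follows ordinary JavaScript regular expression syntax, which may differ in detail from what SenseDeep accepts. The log group names are hypothetical.

```javascript
// Hypothetical log group names
const logGroups = [
    '/aws/lambda/orders-api',
    '/aws/lambda/orders-worker',
    '/aws/lambda/billing-api',
    '/aws/apigateway/orders',
]

// Match every Lambda log group whose function name starts with "orders-"
const pattern = /^\/aws\/lambda\/orders-/

const matched = logGroups.filter((name) => pattern.test(name))
console.log(matched)
```

A prefix pattern like this keeps matching as new functions such as orders-refunds are deployed, without any change to the meta log definition.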

Log Viewer

Once created, you can use the SenseDeep log viewer to display a unified view of the logs for the selected services and events.

Here is a sample view of a request that flows through two lambdas, an EventBridge bus and a final lambda.

Filtering by Request IDs

From the log viewer, you can filter by a correlation ID by double-clicking on that column, which displays the view configuration panel.

The correlation ID will be automatically extracted from the log event and pasted into the Match Events form field. Click Run and the specified request will be selected and all other events will be filtered out.

High Cardinality IDs

To get the most out of meta logs, you should utilize a unique request ID that is passed to your Lambda functions and propagated to any downstream AWS services. This correlation ID should be emitted in all log events. This is called a High Cardinality ID. Using this ID, you can filter log events and display only the events for that request.

We use the SenseLogs library to manage correlation IDs. SenseLogs will extract trace IDs including the Lambda and X-Ray trace IDs from the request context. It will map and emit these as an x-correlation-id log property.
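The propagation pattern can be sketched by hand as follows. This is an illustration under stated assumptions, not a prescribed implementation: the x-correlation-id name is taken from this article, the helper names getCorrelationId and withCorrelationId are hypothetical, and the fallback uses the awsRequestId property of the Lambda context when the caller supplied no ID.

```javascript
const HEADER = 'x-correlation-id'

// Reuse the caller's correlation ID if present, otherwise start a new
// trace using the Lambda request ID from the invocation context.
function getCorrelationId(event, context) {
    const headers = event.headers || {}
    for (const [name, value] of Object.entries(headers)) {
        // Header names may arrive in any letter case
        if (name.toLowerCase() === HEADER && value) {
            return value
        }
    }
    return context.awsRequestId
}

// Attach the ID to a payload that is sent to a downstream service
function withCorrelationId(payload, correlationId) {
    return {...payload, [HEADER]: correlationId}
}

function handler(event, context) {
    const correlationId = getCorrelationId(event, context)
    // Emit the ID in every log event so it can be used as a filter
    console.log(JSON.stringify({[HEADER]: correlationId, message: 'Request start'}))
    return withCorrelationId({orderId: 42}, correlationId)
}

// Local demonstration with hypothetical event and context objects
const downstream = handler(
    {headers: {'X-Correlation-Id': 'abc-123'}},
    {awsRequestId: 'lambda-req-1'}
)
console.log(downstream)
```

Every service that receives the payload repeats the same steps, so the ID assigned at the first hop appears in the logs of every later hop.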

For example:

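import SenseLogs from 'senselogs'

// Create the logger once, outside the handler, so warm invocations reuse it
const log = new SenseLogs()

export const handler = async (event, context) => {
    // Extract the Lambda and X-Ray trace IDs and emit them as x-correlation-id
    log.addTraceIds(event, context)

    log.info('Request start', {body: event.body, event, context})
}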

For more details about SenseLogs, see:

Summary

For effective serverless development and debugging, you need to be able to quickly correlate and isolate requests across multiple cooperating AWS services such as API Gateway, Lambda, SQS and SNS. SenseDeep provides a real-time, fast meta log facility to create a unified view of a request as it passes through multiple AWS services.

More?

Try the SenseDeep Serverless studio with a free developer license at SenseDeep App.

© SenseDeep® LLC. All rights reserved. Privacy Policy and Terms of Use.
