Logging

Editor’s Note: Like most of Fugue’s features, the logging feature is a work in progress, as is this document.

CLI Logging

The Fugue CLI keeps a running log of all CLI activity, including commands entered, API calls sent, errors encountered, and output printed to the screen. The log also tracks configuration information. The log file, fuguecli.log, is created in the same directory in which the Fugue CLI is run. This log is helpful for troubleshooting.

Sample log snippet for a suspend command:

2016-09-27T19:07:30.09 [ fugue ] DEBUG - -------------------------------Configuration Info-------------------------------
2016-09-27T19:07:30.09 [ fugue ] DEBUG - Fugue CLI Version: 0.16.5-1074-7e190cdbf126c8e012b0285cbe746153f56e4097
2016-09-27T19:07:30.09 [ fugue ] DEBUG - Configuration File Path (symbolic links followed): /fugue/fugue.yaml
2016-09-27T19:07:30.09 [ fugue ] DEBUG - Conductor Region: us-east-1
2016-09-27T19:07:30.09 [ fugue ] DEBUG - Compositions Bucket: fugue-compositions-<account>
2016-09-27T19:07:30.09 [ fugue ] DEBUG - Request Queue: fugue-demarc-requests
2016-09-27T19:07:30.09 [ fugue ] DEBUG - Response Queue: fugue-cli-resp-hplybcarhsdlnalhoeoo
2016-09-27T19:07:30.09 [ fugue ] DEBUG - --------------------------------------------------------------------------------
2016-09-27T19:07:30.09 [ fugue ] DEBUG - parsing args "['-y', 'mycomposition']" for suspend
2016-09-27T19:07:30.09 [ fugue.screen ] INFO - [ fugue suspend ] Suspending process with Alias: mycomposition
2016-09-27T19:07:30.09 [ botocore.credentials ] INFO - Found credentials in environment variables.
2016-09-27T19:07:30.09 [ botocore.vendored.requests.packages.urllib3.connectionpool ] INFO - Starting new HTTPS connection (1): queue.amazonaws.com
2016-09-27T19:07:31.09 [ botocore.credentials ] INFO - Found credentials in environment variables.
2016-09-27T19:07:31.09 [ botocore.vendored.requests.packages.urllib3.connectionpool ] INFO - Starting new HTTPS connection (1): cloudformation.us-east-1.amazonaws.com
2016-09-27T19:07:31.09 [ botocore.vendored.requests.packages.urllib3.connectionpool ] INFO - Starting new HTTPS connection (1): ec2.us-east-1.amazonaws.com
2016-09-27T19:07:32.09 [ fugue.screen ] INFO - Requesting the Conductor to suspend process ...
2016-09-27T19:07:34.09 [ fugue.screen ] INFO - [ DONE ] Process with Alias: mycomposition is being suspended.
2016-09-27T19:07:34.09 [ fugue.screen ] INFO - [ HELP ] Run the 'fugue status' command to view details and status of this process suspension.

The above example begins with a section called “Configuration Info” that contains debug-level details about the CLI version, the fugue.yaml file path, the Conductor region, and Conductor resource names (S3 bucket, SQS queues). This is followed by a series of internal messages related to carrying out the suspend command, ending with the user-facing messages output to the screen.
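For troubleshooting it can help to split each log line into its parts. The following is a minimal sketch, assuming every line follows the layout seen in the sample snippet (timestamp, bracketed logger name, level, message). The pattern is inferred from that sample only, and the names parse_cli_log_line and screen_messages are hypothetical helpers, not part of the Fugue CLI.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Layout inferred from the sample snippet; real log lines may differ.
LINE_RE = re.compile(
    r"^(?P<timestamp>\S+) \[ (?P<logger>[^\]]+?) \] (?P<level>[A-Z]+) - (?P<message>.*)$"
)


class CliLogRecord(NamedTuple):
    timestamp: str
    logger: str
    level: str
    message: str


def parse_cli_log_line(line: str) -> Optional[CliLogRecord]:
    """Split one fuguecli.log line into its fields, or return None if it does not match."""
    match = LINE_RE.match(line.rstrip("\n"))
    if match is None:
        return None
    return CliLogRecord(**match.groupdict())


def screen_messages(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the user-facing messages, i.e. those logged by 'fugue.screen'."""
    for line in lines:
        record = parse_cli_log_line(line)
        if record is not None and record.logger == "fugue.screen":
            yield record.message
```

Passing an open fuguecli.log file object to screen_messages would yield just the lines that were printed to the screen, which is a quick way to separate them from the botocore and debug entries.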

For information on using the CLI log to facilitate debugging, see Troubleshooting.

CloudWatch Logs

At present, the best way to find out what is going on in Fugue is to use the CLI’s status or history command or to read the CLI log. However, you can also view Fugue’s logs through the AWS CloudWatch Logs service. This section offers guidance on finding relevant logs, although they remain somewhat technical and opaque for now.

Fugue CloudWatch Log Format

Fugue logs are all output as JSON, with one event per line. Each log is therefore a line-delimited JSON (LDJSON) stream.

Each log event contains the following information:

Field                      Details
timestamp                  Displays the timestamp information in UTC.
component                  Identifies the name of the component.
log_level                  Provides a severity level from 0 to 7.
message                    Contains the primary message for the log entry.
error_detail (optional)    In the event of an error, this field provides diagnostic details.
fid (if applicable)        Reports the associated Fugue process ID.
job_id (if applicable)     Reports the associated component-assigned job ID.
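Because each event sits on its own line, an exported log can be read one line at a time. The sketch below assumes the log text has already been saved or fetched as plain lines; iter_log_events is a hypothetical helper name, and skipping malformed lines is a choice made here, not documented Fugue behavior.

```python
import json
from typing import Iterable, Iterator


def iter_log_events(lines: Iterable[str]) -> Iterator[dict]:
    """Parse a line-delimited JSON stream, yielding one dict per log event."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between events
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: skip lines that are not valid JSON
        if isinstance(event, dict):
            yield event


def events_for_fid(lines: Iterable[str], fid: str) -> Iterator[dict]:
    """Keep only events whose optional 'fid' field matches the given process ID."""
    for event in iter_log_events(lines):
        if event.get("fid") == fid:
            yield event
```

Reading lazily with a generator keeps memory use flat even for large exports, since only one event is held at a time.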

How to Find Fugue Logs in CloudWatch

Most relevant log data for any details you want to find about Fugue are found in the /fugue/conductor log group in CloudWatch.

The /fugue/conductor log group.

The two most relevant logs are those for the Manager, which controls the planning components of Fugue, and the Broker, which interacts with infrastructure provider APIs. These are found in the manager and fugue-broker log streams, respectively.

The fugue-broker log stream.

In either case, the most valuable thing to do is to filter by FID. The FID is the “Fugue ID” of the process. It is returned to you after a fugue run command and is also shown by the fugue status command.

As an example, here is a filter that can be applied to either log:

{ $.fid = "c0bc1b09-c0c1-403a-bcee-d3f56bba8741" }

Of course, you’ll need to substitute your own FID between the quotes.
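The same filter can be applied from a script rather than the console. The following is a sketch under stated assumptions: logs_client is expected to be a CloudWatch Logs client such as the one returned by boto3.client("logs"), and the helper names fid_filter_pattern and fetch_messages_for_fid are hypothetical. The default log group and stream names are the ones mentioned in this section.

```python
from typing import List


def fid_filter_pattern(fid: str) -> str:
    """Build the CloudWatch Logs JSON filter pattern for one FID."""
    if '"' in fid:
        raise ValueError("FID must not contain double quotes")
    return '{ $.fid = "%s" }' % fid


def fetch_messages_for_fid(
    logs_client,
    fid: str,
    log_group: str = "/fugue/conductor",
    stream: str = "fugue-broker",
) -> List[str]:
    """Return the raw message of every event in the stream that matches the FID.

    logs_client is assumed to expose filter_log_events like a boto3 "logs" client.
    """
    kwargs = {
        "logGroupName": log_group,
        "logStreamNames": [stream],
        "filterPattern": fid_filter_pattern(fid),
    }
    messages: List[str] = []
    while True:
        page = logs_client.filter_log_events(**kwargs)
        messages.extend(event["message"] for event in page.get("events", []))
        token = page.get("nextToken")
        if not token:
            return messages
        kwargs["nextToken"] = token  # keep paging until no token is returned
```

Taking the client as a parameter keeps the function free of credentials and region handling, so those can be configured however your environment requires.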

Common Log Messages and Patterns

Here are a few message types to look for to help you find out what’s going on with Fugue. For now, we’ll just focus on looking at processes that are in the Running state.

Manager Log

Planned Actions

When viewing the Manager log filtered by FID as shown above, you can look at actions Fugue has planned to take. These won’t differ much from what you see in the Broker, although you can get a good idea of how Fugue “thinks” by tracking planned instructions in this log.

{
  "account_id": "",
  "component": "manager",
  "fid": "04cf823e-1217-4aaf-b220-697ab4c0ac84",
  "guid": "04cf823e-1217-4aaf-b220-697ab4c0ac84.3db9a301-bd1d-55e7-96d5-80804348ab63",
  "job_id": "1465231683",
  "layer": "emit-instructions",
  "log_level": "debug",
  "message": "Command aws.ec2.create_vpc for resource 04cf823e-1217-4aaf-b220-697ab4c0ac84.3db9a301-bd1d-55e7-96d5-80804348ab63 in account ID  region us-west-2 added by go-planner on layer emit-instructions",
  "params": "{\"CidrBlock\":\"10.0.0.0/16\",\"InstanceTenancy\":\"default\"}",
  "planner": "go-planner",
  "region": "us-west-2",
  "request_type": "aws.ec2.create_vpc",
  "timestamp": "2016-06-06T16:48:08.637564"
}

Note the message field. The plan is generally a sequence of “commands,” so each message like this describes one step in the plan.
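To list the planned steps for a process, the relevant fields can be pulled out of each Manager event. This is a minimal sketch based only on the sample event above. It assumes planning events carry request_type, region, and a params field holding a JSON-encoded string, as in that sample; planned_command is a hypothetical helper name.

```python
import json
from typing import Optional, Tuple


def planned_command(event: dict) -> Optional[Tuple[str, Optional[str], dict]]:
    """Return (request_type, region, params) for a Manager planning event, else None."""
    if event.get("component") != "manager" or "request_type" not in event:
        return None
    raw_params = event.get("params")
    if isinstance(raw_params, str) and raw_params:
        # In the sample event, params is itself a JSON document stored as a string.
        params = json.loads(raw_params)
    elif isinstance(raw_params, dict):
        params = raw_params
    else:
        params = {}
    return event["request_type"], event.get("region"), params
```

Applied to the sample event, this returns the aws.ec2.create_vpc request type, the us-west-2 region, and the decoded CidrBlock and InstanceTenancy parameters.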

Broker Log

API Requests

When you’re looking at the Broker log filtered by FID, you should see lots of messages like this if you ran a composition that defines any infrastructure:

{
  "timestamp": "2016-06-06T16:48:11.592",
  "component": "broker.job",
  "log_level": "INFO",
  "message": "Issuing [<botocore.client.EC2 object at 0x7f8088359358>.create_vpc] with [{'InstanceTenancy': 'default', 'CidrBlock': '10.0.0.0/16'}]",
  "fid": "04cf823e-1217-4aaf-b220-697ab4c0ac84",
  "job_id": "1465231683"
}

Note the message field. A message like this (Issuing...) indicates specific API calls that Fugue is making to AWS.
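When scanning many Broker events, it can be convenient to reduce each Issuing message to the service, the operation, and its arguments. The sketch below assumes the message always has the shape shown in the sample above, which is not guaranteed. The regular expression and the name parse_issuing_message are illustrative only.

```python
import ast
import re
from typing import Optional, Tuple

# Shape inferred from the single sample message above; other messages may differ.
ISSUING_RE = re.compile(
    r"^Issuing \[<botocore\.client\.(?P<service>\w+) object at 0x[0-9a-fA-F]+>"
    r"\.(?P<operation>\w+)\] with \[(?P<args>.*)\]$"
)


def parse_issuing_message(message: str) -> Optional[Tuple[str, str, object]]:
    """Return (service, operation, arguments) for an 'Issuing ...' message, else None."""
    match = ISSUING_RE.match(message)
    if match is None:
        return None
    raw_args = match.group("args")
    try:
        # The arguments appear as a Python literal; literal_eval never executes code.
        args = ast.literal_eval(raw_args)
    except (ValueError, SyntaxError):
        args = raw_args  # fall back to the raw text if it is not a plain literal
    return match.group("service"), match.group("operation"), args
```

Counting the (service, operation) pairs across a filtered log gives a rough picture of which AWS APIs a process has been calling.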

Metrics and Alarms

Performance metrics

By default, Fugue provides two general types of metrics, both of which are visible under Metrics in the CloudWatch section of the AWS Console. The first type is general performance metrics for components of the Conductor, available under the Custom Metrics section.

Custom Fugue metrics.

Note: Fugue can support additional metrics; however, these are typically only enabled as part of troubleshooting. Contact us at support@fugue.co with any questions.

Health Checks and Alarms

The Fugue Conductor’s internal health checker is a component that monitors the health of all other Conductor components. If any components go down or report issues, these details are sent via logs to CloudWatch and trigger any necessary alarms.

This internal health checker monitors two specific statuses for the Conductor:

  • whether the Conductor is alive and reporting data; and
  • whether the Conductor and all of its components are healthy.

All status information, alarms, and logs related to this monitoring are viewable through the AWS Console.

Conductor Alive

The Conductor alive function tracks the Conductor’s internal health checker to confirm that it is up and reporting data. A value of 0 means data is being received (alive, OK); no value (x) means no data is being received (INSUFFICIENT_DATA). If the internal health checker fails to report data for more than 3 minutes, a notification is triggered and the status changes from OK to INSUFFICIENT_DATA. Unlike the component health check, the alive function monitors only whether data is being reported.

Conductor Health Check

The Conductor’s internal health checker is also designed to perform regular health checks and provide metrics on individual components. The CloudWatch alarm is not currently separated into individual alarms for each component; instead, when “healthy,” the AWS Console reports instances as OK. If issues arise with any Conductor component, the internal health checker triggers a CloudWatch alarm with the status ALARM.

  • An alarm triggers when an instance reports a value of 1 that persists for more than 3 minutes (a value of 0 is healthy).
  • Details about the alarm are viewable in the CloudWatch section of the AWS Console, in both the Alarms section and the Logs section.
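The two checks described above can be summarized as a small model. This is only an illustrative sketch of the rules as stated in the text (no data for more than 3 minutes, or a value of 1 persisting for more than 3 minutes). It is not how CloudWatch evaluates alarms internally, and the function names and the sample format are assumptions.

```python
from typing import List, Optional, Tuple

THRESHOLD_SECONDS = 180  # "more than 3 minutes" in the text above


def alive_state(last_report: Optional[float], now: float) -> str:
    """Conductor alive: OK while data keeps arriving, INSUFFICIENT_DATA otherwise."""
    if last_report is None or now - last_report > THRESHOLD_SECONDS:
        return "INSUFFICIENT_DATA"
    return "OK"


def health_state(samples: List[Tuple[float, int]], now: float) -> str:
    """Conductor health: ALARM once value 1 has persisted past the threshold.

    samples is a time-ordered list of (unix_time, value), where 0 is healthy
    and 1 is unhealthy.
    """
    unhealthy_since: Optional[float] = None
    for timestamp, value in samples:
        if value == 1:
            if unhealthy_since is None:
                unhealthy_since = timestamp  # start of the current unhealthy run
        else:
            unhealthy_since = None  # any healthy report resets the run
    if unhealthy_since is not None and now - unhealthy_since > THRESHOLD_SECONDS:
        return "ALARM"
    return "OK"
```

The model makes the difference between the two checks explicit: the alive check looks only at when data last arrived, while the health check looks at the values that were reported.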

Note: The AWS CloudWatch console contains all of the data and logs related to any potential issues with Fugue or the Conductor; however, due to the level of detail and complexity, we recommend that you reach out to support@fugue.co for assistance with any troubleshooting.

The log stream alarms

Fugue Conductor alarms.