Troubleshooting

Having trouble? We’re here to help.

First, contact support@fugue.co.

You may be instructed to use the fugue support command. If so, check out the support documentation.

Below, we’ve provided guidance for possible issues, along with some general troubleshooting tips.

Troubleshooting fugue init

Error: Invalid AMI

In this case the fugue init command generates the error `invalid AMI`.

Error message, Fugue version 2017.05.23 and older

$ fugue init ami-xxxxxxxx
[ fugue init ] Initializing Fugue project with the following configuration:

Fugue Conductor AMI ID: ami-xxxxxxxx
AWS Credentials: Environment variables

Validating Fugue Conductor AMI ID ...
[ ERROR ] ami-xxxxxxxx is not a valid AMI.

Error message, Fugue current release

$ fugue init --ami ami-xxxxxxxx us-east-1
[ fugue init ] Initializing Fugue project with the following configuration:

Fugue Conductor AMI ID: ami-xxxxxxxx
AWS Credentials: Environment variables

Validating Fugue Conductor AMI ID ...
[ ERROR ] No BASIC or TEAM conductor registered in ami-xxxxxxxx.

Explanation:

  • Your account has not been whitelisted for access to the Fugue Conductor AMI.
  • You have set the environment variables AWS_DEFAULT_PROFILE and AWS_DEFAULT_REGION to an account that is not whitelisted or to a region other than us-east-1 or us-gov-west-1.
    • The AWS_DEFAULT_PROFILE and AWS_DEFAULT_REGION environment variables override both the default profile and any profile you specify on the fugue init command line. If the profile specified in AWS_DEFAULT_PROFILE is not whitelisted, you will not have access to the AMI. If the region specified in AWS_DEFAULT_REGION is not us-east-1 or us-gov-west-1, you will also not have access to the AMI.

Note: Our publicly accessible Fugue Conductor AMIs are available only in us-east-1 and us-gov-west-1, because the Fugue Conductor is currently supported only in those two regions.

  • For v2017.05.23 and older: Your IAM policy may be missing the required permissions.

Solution(s):

  • Update your account access to ensure it has access to the appropriate Fugue Conductor AMI.
  • Update your environment variables to reflect the appropriate whitelisted account, or update your specified region to us-east-1 or us-gov-west-1.
  • For v2017.05.23 and older: See the minimum required permissions in our public GitHub repo, or upgrade to the current version of Fugue, which automatically generates the required policies.
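
The environment variable conditions described above can be checked before rerunning fugue init. The following is a minimal sketch for a POSIX shell; check_fugue_env is a hypothetical helper name used for illustration and is not part of the Fugue CLI.

```shell
# Hypothetical helper (not part of the Fugue CLI): report whether the
# AWS_DEFAULT_* overrides could be hiding the Conductor AMI.
check_fugue_env() {
  # Show the profile override, if any, so it can be compared with the
  # whitelisted account.
  echo "AWS_DEFAULT_PROFILE=${AWS_DEFAULT_PROFILE:-<unset>}"
  case "${AWS_DEFAULT_REGION:-}" in
    "")
      echo "AWS_DEFAULT_REGION is unset; no region override in effect"
      ;;
    us-east-1|us-gov-west-1)
      echo "AWS_DEFAULT_REGION=$AWS_DEFAULT_REGION is a supported Conductor region"
      ;;
    *)
      echo "AWS_DEFAULT_REGION=$AWS_DEFAULT_REGION is not us-east-1 or us-gov-west-1" >&2
      return 1
      ;;
  esac
}
```

If the function reports an unsupported region, running export AWS_DEFAULT_REGION=us-east-1 (or us-gov-west-1) before fugue init should remove that cause. The profile still has to be compared with your whitelisted account by hand.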

You can also check out Hello World, Part 1: Fugue Quick Setup for additional information on setting up your account and the associated environment variables.

Troubleshooting fugue install

Error: `fugue.yaml` doesn’t exist

In this case the fugue install command generates the error `fugue.yaml doesn’t exist`.

*Error message:*

$ fugue install

[ ERROR ] config file /Users/test/fugue.yaml does not exist.
You can run "fugue init" to initialize your project.

Explanation:

The fugue.yaml file does not exist because you haven’t yet run fugue init to specify the profile and Fugue Conductor AMI you want to use.

Solution:

Run fugue init to specify your profile and AMI, and then run fugue install.

Error: `Can’t describe cloudformation stack`

In this case the fugue install command generates the error `Can’t describe cloudformation stack`.

*Error message:*

$ fugue install

[ ERROR ] There was a problem executing this command.

   Reason: An error occurred (AccessDenied) when calling the DescribeStacks operation: User: arn:aws:iam::[AWS_ACCOUNT_NUMBER]:user/test_user is not authorized to perform: cloudformation:DescribeStacks on resource: arn:aws:cloudformation:us-east-1:[AWS_ACCOUNT_NUMBER]:stack/fugue/*

Explanation:

In this case, the profile you are using does not have the appropriate access to CloudFormation. Fugue requires CloudFormation access to create the infrastructure needed to run the Fugue Conductor.

Error: `AWS CloudFormation stack creation failed`

In this case the fugue install command generates the error `AWS CloudFormation stack creation failed`.

*Error message:*

$ fugue install

[ fugue install ] Installing Fugue Conductor
Install Details:
   Conductor AMI ID: ami-xxxxxxxx
   AWS Account: test_user/[ACCOUNT NUMBER]
[ WARN ] Would you like to proceed with installing? [y/N]: y
Installing the Fugue Conductor into AWS account test_user/[ACCOUNT NUMBER].
FugueSubnet1                         Working...
FugueVpc                             Working...
FugueSubnet2RouteTableAssociation    Working...
...
FugueVpcGatewayAttachment            Working...
FugueAutoScalingGroup                Working...

-----------------------------------------------
Overall Progress  [#........................]    6%
[ HELP ] Exiting the install command while in progress (CTRL+C) will only stop progress tracking and *not* the install itself.
[ ERROR ] AWS CloudFormation stack creation failed.

Explanation:

  • In this second case (AWS CloudFormation stack creation failed), the AWS account you are using does not have sufficient permissions to install the infrastructure needed for the Fugue Conductor. In addition to CloudFormation access, the account you use to install Fugue must have permissions to create, delete, and update all the infrastructure needed for the Fugue Conductor, which includes any services and infrastructure you want Fugue to manage.

*Solution:*

Update the profile you are using to install Fugue with the appropriate permissions for the services you are attempting to access or create (e.g., for CloudFormation you need to include Describe, Create, Delete, and Update).

Note: Details about Fugue’s AWS IAM permissions policy are available here.
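
One way to see which CloudFormation actions a profile is missing is to ask the IAM policy simulator through the AWS CLI. This is a sketch under assumptions: check_cfn_permissions is a hypothetical helper name, the four action names are our reading of "Describe, Create, Delete and Update" for stacks, and the credentials running the check need permission to call iam:SimulatePrincipalPolicy themselves.

```shell
# Hypothetical helper: ask IAM whether a principal may perform the
# CloudFormation stack actions named above.
# $1: ARN of the IAM user or role used to install Fugue.
check_cfn_permissions() {
  aws iam simulate-principal-policy \
    --policy-source-arn "$1" \
    --action-names \
      cloudformation:DescribeStacks \
      cloudformation:CreateStack \
      cloudformation:DeleteStack \
      cloudformation:UpdateStack \
    --query 'EvaluationResults[].[EvalActionName,EvalDecision]' \
    --output text
}
```

Each output line pairs an action with the simulator's decision. The list covers CloudFormation only; the install also needs permissions for the other services the Conductor uses, so treat a clean result as necessary rather than sufficient.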

Error: `A previous Conductor installation failed to uninstall`

In this case the error when using the fugue install command is `A previous Conductor installation failed to uninstall`.

*Error message:*

$ fugue install -y

[ ERROR ] There was a problem executing this command.

   Reason: A previous Conductor installation failed to uninstall.

   Details: The following resource(s) failed to delete: [FugueResourceEventsTopic].

Explanation:

Sometimes the CloudFormation stack is not deleted correctly on uninstall, leaving the stack in place in us-east-1.

Solution:

Access the AWS console, go to CloudFormation in us-east-1, and delete the stack named FUGUE.
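
If you prefer the command line to the console, the same cleanup can be sketched with the AWS CLI. delete_leftover_stack is a hypothetical helper name; pass the stack name exactly as the console shows it, since this sketch does not guess it.

```shell
# Hypothetical helper: delete a leftover Conductor stack and wait until
# CloudFormation reports the deletion as complete.
# $1: stack name exactly as shown in the CloudFormation console.
# $2: region (defaults to us-east-1).
delete_leftover_stack() {
  stack_name=$1
  region=${2:-us-east-1}
  aws cloudformation delete-stack --stack-name "$stack_name" --region "$region" &&
  aws cloudformation wait stack-delete-complete --stack-name "$stack_name" --region "$region"
}
```

The wait command exits non-zero if the deletion fails again, in which case the stack events in the console should name the resource that is blocking it.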

Troubleshooting fugue run

Error: `(403) Forbidden`

In this case the fugue run command generates a `(403) Forbidden` error.

*Error message:*

$ fugue run alarm.lw -a alarm
[ fugue run ] Running alarm.lw
Run Details:
    Alias: alarm

Compiling Ludwig file /Users/test/fugue/alarm.lw

[ OK ] Successfully compiled. No errors.

Uploading compiled Ludwig composition to S3...

[ ERROR ] There was a problem executing this command.

   Reason: An error occurred (403) when calling the HeadBucket operation: Forbidden

Explanation:

  • This error occurs when the account you are using doesn’t have access to the S3 bucket configured under compositionBucket in the fugue.yaml file.
    • This can occur for a number of reasons. One is that the fugue.yaml file is configured to use your default AWS profile instead of a named profile. If you switch the default profile in ~/.aws/credentials and then use the same folder and fugue.yaml file to install Fugue in the new default account, Fugue does not generate a new bucket name and attempts to use the old bucket.
    • Another less common scenario is one where the bucket name auto-generated by Fugue already exists and is owned by a different account.

*Solution:*

Create a new S3 bucket in the account you are using and then change the fugue.yaml file to point to that S3 bucket.
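
The bucket check and creation can be sketched with the AWS CLI as follows. ensure_composition_bucket is a hypothetical helper name, and the bucket name you pass is your own choice; updating compositionBucket in fugue.yaml remains a manual step.

```shell
# Hypothetical helper: make sure a composition bucket exists and is
# reachable with the current credentials.
# $1: bucket name you intend to put in fugue.yaml.
# $2: region (defaults to us-east-1).
ensure_composition_bucket() {
  bucket=$1
  region=${2:-us-east-1}
  # head-bucket is the same call that failed with 403 above; it exits
  # non-zero when the bucket is missing or not accessible.
  if aws s3api head-bucket --bucket "$bucket" 2>/dev/null; then
    echo "Bucket $bucket is accessible"
  else
    # Try to create the bucket in the current account.
    aws s3 mb "s3://$bucket" --region "$region"
  fi
}
```

If the name is already owned by another account, the create step fails as well, which is a sign to pick a different bucket name.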

Error: `Fugue has timed out`

In this case the fugue status command indicates a `Fugue has timed out` error.

*Error message:*

$ fugue status

[ ERROR ] Fugue has timed out waiting for a response from the server.

Explanation:

  • Fugue Conductor and Client versions are matched pairs, which is why we release them together. If your Conductor and Client are from different releases, you may experience this error.
  • This error may also be caused by delays in AWS queues, in which case it does not indicate a problem with your installation.

Solution:

Use the fugue --version command to determine your versions of the client and CLI, along with the version of your Fugue Conductor AMI. This will allow you to determine if you are working with a matched pair. You can verify the pair via the Download Portal, or for additional assistance reach out to support@fugue.co.

Troubleshooting fugue upgrade

Error: “The Conductor is in the process of installing”

In this case, running fugue upgrade produces the error “The Conductor is in the process of installing.” (Applies to Fugue v2017.05.23 and earlier.)

Error message:

$ fugue upgrade ami-xxxxxxxx
[ fugue upgrade ] Upgrading Conductor

[ ERROR ] There was a problem executing this command.
   Reason: The Conductor is in the process of installing.

Explanation: The Conductor has not completed its installation process, and its CloudFormation stack is still in the CREATING or CREATED stage. This can also happen if the Conductor instance has been terminated and its Auto Scaling group is in the process of launching a new Conductor.

One other possibility is that a previous Conductor was not uninstalled properly, leaving behind some artifacts.

Solution: The Conductor takes 5-15 minutes to boot once its instance has been created. To find out whether the Conductor is up and running, follow the instructions here. Once the Conductor is finished booting, you may execute upgrade.

If the Conductor still hasn’t booted after 15 minutes and you are still receiving this error message, you may need to remove artifacts left behind from a previous Conductor. To effectively remove any such artifacts, execute the following command:

fugue uninstall --force

Note: If you suspect that the previous Conductor installation has left infrastructure running, please email Fugue Support (support@fugue.co) before taking any further action. It is important to address these running workloads before attempting to uninstall a previous installation.

Other Possible Issues

Here are some other possible issues and error messages you may encounter when using Fugue.

Error: “This client and the Conductor are incompatible” on any CLI command

Error message:

[ ERROR ] This client and the Conductor are incompatible: client API version 4.0.4 is incompatible with server API version 3.2.4.

Explanation: The Fugue CLI and the Fugue Conductor form a matched set. If you’ve installed a version of the CLI that is not compatible with your current version of the Conductor, or vice versa, you’ll see this error message upon executing any CLI command.

Solution: Visit the Download Portal to confirm that you have installed the correct version of the CLI and the correct AMI ID of the Conductor. Recall that to upgrade Fugue, the new CLI needs to be installed before you execute the upgrade command. If you encounter problems, reach out to support@fugue.co.

Error: “Fugue requires an English UTF-8 console environment” on any CLI command

Error message:

Fugue requires an English UTF-8 console environment.

Explanation: This error indicates that the character encoding for the shell in use is not U.S. English UTF-8. In order to execute any Fugue command, the encoding must be U.S. English UTF-8.

Solution: You can change the default character encoding in your shell by executing export LANG=en_US.UTF-8, which sets the environment variable LANG to U.S. English UTF-8. You can also add the line export LANG=en_US.UTF-8 to your .bash_profile or .bashrc file, or wherever you keep your shell configuration, so that the encoding is set automatically in every new shell.
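
The two steps in this solution can be combined into one small function. This is a sketch: set_utf8_locale is a hypothetical helper name, and ~/.bashrc is only the default target, so pass a different file if your shell configuration lives elsewhere.

```shell
# Hypothetical helper: switch the current shell to U.S. English UTF-8 and
# persist the setting in a shell configuration file.
# $1: configuration file to update (defaults to ~/.bashrc).
set_utf8_locale() {
  rc_file=${1:-$HOME/.bashrc}
  # Takes effect immediately in the current shell session.
  export LANG=en_US.UTF-8
  # Append the export line only if an identical line is not already there.
  grep -qxF 'export LANG=en_US.UTF-8' "$rc_file" 2>/dev/null ||
    echo 'export LANG=en_US.UTF-8' >> "$rc_file"
}
```

Running the locale command afterwards shows the values in effect. If LC_ALL is set in your environment, it takes precedence over LANG, so it may need the same value.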

Error: “Command is not supported” on any CLI command

Error message:

I'm sorry Dave. I'm afraid I can't do that.
[ ERROR ] This command is not supported by the Conductor currently installed.

Explanation: This error indicates a mismatched Fugue CLI and Fugue Conductor. The CLI and Conductor form a matched set, so if you upgraded the CLI to a newer version but did not upgrade the Conductor to the corresponding AMI ID, the CLI may support a command that the Conductor does not. In this case, attempting to run an unsupported command produces the above error message.

Solution: Visit the Fugue Download Portal to find the corresponding Conductor AMI ID for your version of the CLI, and then run fugue upgrade <ami_id> with that AMI to upgrade the Conductor. See upgrade for more details.

Error: “AWS CloudFormation stack creation failed” on Fugue installation

Error message:

Installing the Fugue Conductor into AWS account <user>/<account number>.

FugueInternetRoute                   Working...
FugueSubnet2                         Working...
FugueSubnet1RouteTableAssociation    Working...
FugueSubnet2RouteTableAssociation    Working...
FugueAutoScalingGroup                Working...
FugueVpcSecurityGroup                Working...
FugueIam                             Complete
FugueSubnet1                         Working...
FugueResourceEventsTopic             Complete
FugueHealthCheckDb                   Working...
FugueVpcGatewayAttachment            Working...
FugueVpc                             Complete
FugueRouteTable                      Complete
FugueLaunchConfiguration             Working...
FugueVpcGateway                      Complete
FugueInstanceProfile                 Working...
-----------------------------------------------
Overall Progress  [#######..................]   31%

[ HELP ] Exiting the install command while in progress (CTRL+C) will only stop progress tracking and *not* the install itself.

[ ERROR ] AWS CloudFormation stack creation failed

In this case, you may have encountered a faulty AZ availability indication – a limitation of the AWS API that can “fool” Fugue’s installation command. To confirm you have this problem, take a look at the fugue CloudFormation stack event log after the failed installation. You should see a message about an invalid AvailabilityZone parameter.

Figure: CloudFormation events typical of a Fugue installation where available AZs were incorrectly reported by AWS (example of erroneous events in this case).
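
The event log can also be read from the command line. The sketch below uses the AWS CLI; failed_stack_events and az_errors are hypothetical helper names, and the default stack name fugue is taken from the DescribeStacks error shown earlier on this page, so adjust it if your stack is named differently.

```shell
# Hypothetical helper: list resources that failed to create in the
# Conductor stack, with the reason CloudFormation recorded for each.
# $1: stack name (defaults to "fugue").
# $2: region (defaults to us-east-1).
failed_stack_events() {
  aws cloudformation describe-stack-events \
    --stack-name "${1:-fugue}" \
    --region "${2:-us-east-1}" \
    --query 'StackEvents[?ResourceStatus==`CREATE_FAILED`].[LogicalResourceId,ResourceStatusReason]' \
    --output text
}

# Keep only the failed events that mention an AvailabilityZone problem.
az_errors() {
  failed_stack_events "$@" | grep -i 'AvailabilityZone'
}
```

If az_errors prints nothing, the failure probably has a different cause, and the full output of failed_stack_events is worth including when you contact support.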

To resolve this issue, follow the simple steps in this example. If you see a different error or are unsure if you are seeing this one, contact us at support@fugue.co.

Troubleshooting Tips

This section covers generalized troubleshooting in the Fugue system. Let’s take a look at a sample error message so we can discuss troubleshooting techniques. Suppose you get the following error when attempting init:

[ fugue init ] Initializing Fugue project with the following configuration:

Fugue Conductor AMI ID: ami-00000000
AWS Credentials: Environment variables

Checking your AWS Credentials for Fugue CLI use ...
[ OK ] Authorized.
Checking your AWS Credentials for Fugue Conductor installation ...
[ OK ] Authorized.

Validating Fugue Conductor AMI ID ...
[ ERROR ] ami-00000000 is not a valid AMI.

There are a few ways to start digging into the issue.

Check the CLI Log

A good place to start troubleshooting is the CLI log. There, you’ll find the full error message, along with the context surrounding it. In this case, we’ve given the CLI an AMI ID that does not exist, and we can confirm this by checking the very end of fuguecli.log:

2016-09-28T18:38:01.09 [ fugue.screen ] ERROR - ami-00000000 is not a valid AMI.
Traceback (most recent call last):
  File "site-packages/fugue_cli/utils.py", line 181, in validate_ami
  File "site-packages/botocore/client.py", line 159, in _api_call
  File "site-packages/botocore/client.py", line 494, in _make_api_call
botocore.exceptions.ClientError: An error occurred (InvalidAMIID.NotFound) when calling the DescribeImages operation: The image id '[ami-00000000]' does not exist

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "site-packages/fugue_cli/cli.py", line 157, in invoke
  File "site-packages/click/core.py", line 1060, in invoke
  File "site-packages/click/core.py", line 889, in invoke
  File "site-packages/click/core.py", line 534, in invoke
  File "site-packages/click/decorators.py", line 27, in new_func
  File "site-packages/fugue_cli/commands/init.py", line 96, in init
  File "site-packages/fugue_cli/utils.py", line 189, in validate_ami
fugue_cli.common.FugueError: ami-00000000 is not a valid AMI.

Here, we can see that the CLI encountered the error InvalidAMIID.NotFound while calling the DescribeImages operation. By cross-referencing AWS’s EC2 error code documentation, we can confirm that this error occurred because the AMI we specified during init does not exist. AWS provides some tips:

The specified AMI does not exist. Check the AMI ID, and ensure that you specify the region in which the AMI is located, if it’s not in the default region. This error may also occur if you specified an incorrect kernel ID when launching an instance.

Should this particular error happen to you, it’s worth checking the Fugue Download Portal to make sure you have the latest AMI ID. If the AMI ID is correct, check the region field in fugue.yaml. AMI IDs are only valid for one region, so if the Conductor is configured to run in a different region than your AMI, the same error is triggered.
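
Both checks described above can be scripted. This is a sketch under assumptions: aws_errors_in_log and ami_visible_in_region are hypothetical helper names, the log pattern is based only on the botocore message format shown in the sample log, and you need to supply the path to your own fuguecli.log.

```shell
# Hypothetical helper: list each distinct AWS error code and the API
# operation that raised it, as recorded in the CLI log.
# $1: path to fuguecli.log.
aws_errors_in_log() {
  sed -n 's/.*An error occurred (\([^)]*\)) when calling the \([A-Za-z]*\) operation.*/\1 \2/p' "$1" | sort -u
}

# Hypothetical helper: check whether an AMI ID is visible to your
# credentials in a given region.
# $1: AMI ID. $2: region (defaults to us-east-1).
ami_visible_in_region() {
  aws ec2 describe-images --image-ids "$1" --region "${2:-us-east-1}" \
    --query 'Images[].ImageId' --output text
}
```

For the sample log above, aws_errors_in_log reports the InvalidAMIID.NotFound code together with the DescribeImages operation. ami_visible_in_region prints the AMI ID when it can be found in that region and otherwise surfaces the AWS error.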

Increase CLI Verbosity

The CLI log doesn’t append extra information for all errors, but there’s another way to find more information. Any Fugue command can be executed in verbose mode by using the global -v option. The option increases the verbosity incrementally, so -vvv shows more detail than -v. With increased verbosity, the CLI displays considerably more data that may be useful in troubleshooting, including the precise messages the CLI sends to the Conductor and the responses AWS returns.
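
A small wrapper can make verbose runs easier to capture for a support request. This is a sketch under assumptions: fugue_verbose and the fugue-verbose.log file name are hypothetical, and placing the global option before the subcommand is our assumption about the CLI's argument order.

```shell
# Hypothetical wrapper: run a Fugue command at a chosen verbosity and keep
# a copy of the output in fugue-verbose.log in the current directory.
# $1: number of "v"s, at least 1 (1 gives -v, 3 gives -vvv).
# Remaining arguments: the Fugue command and its own arguments.
# Assumption: the global option is placed before the subcommand.
fugue_verbose() {
  level=$1
  shift
  flag=-
  while [ "$level" -gt 0 ]; do
    flag="${flag}v"
    level=$((level - 1))
  done
  # Merge stderr into stdout so the saved file holds everything shown.
  fugue "$flag" "$@" 2>&1 | tee fugue-verbose.log
}
```

For example, fugue_verbose 3 status would run the status command with -vvv. Verbose output can include account details, so review the file before sharing it.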

Email Us

Finally, you can always send an email to support@fugue.co describing your issue, and we’ll be happy to help you.

Recovery From Provider Downtimes

In the rare event that provider platforms we support experience downtimes, we will place instructions on recovery from those downtimes here. We designed Fugue to be resilient and fail gracefully in the event of dependent service outages. In short, the system is designed to keep running if possible, and save state and “wait out the storm” if it is not. In the latter case, some human intervention may be required to judge when it is safe to resume normal operation.

Entries here should be helpful not only for a particular incident, but also similar incidents that might occur on a smaller scale.

February 28th, 2017: AWS S3 Increased Error Rates

AWS experienced high error rates for the S3 service in the us-east-1 region, which impacted several other services as well.

We found that, as designed, Fugue Conductors put all processes into the Suspended state during this outage. This state means that the Conductor retains all process information, but remains hands-off. This is the safest response to a provider control plane failure, since we cannot be sure of infrastructure state or the consistency of changes we might try to make. You can read more about Fugue’s process model here.

Recovery from this state is simple. You can just use the resume command like this:

$ fugue resume -y (<alias> | <FID>)

This resumes processes one at a time. If you wish, it is safe to resume all processes at once with the following command:

$ for fid in $(fugue status --json | jq -r '.[].fid'); do fugue resume -y "$fid"; done

As the S3 service recovered, some processes we resumed re-entered suspension due to continued AWS API errors. If this happens in a future incident, we recommend waiting at least a few minutes (or until there is more news about the incident) and trying again.