Troubleshooting

Having trouble? We’re here to help.

First, contact support@fugue.co.

You may be instructed to use the fugue support command. If so, check out the support documentation.

Below, we’ve provided guidance for possible issues, along with some general troubleshooting tips.

You can also find known issues in our release notes.

Troubleshooting fugue init

Error: Invalid AMI

In this case the fugue init command generates the error `invalid AMI`.

Error message, Fugue version 2017.05.23 and older

$ fugue init ami-xxxxxxxx
[ fugue init ] Initializing Fugue project with the following configuration:

Fugue Conductor AMI ID: ami-xxxxxxxx
AWS Credentials: Environment variables

Validating Fugue Conductor AMI ID ...
[ ERROR ] ami-xxxxxxxx is not a valid AMI.

Error message, Fugue current release

$ fugue init --ami ami-xxxxxxxx us-east-1
[ fugue init ] Initializing Fugue project with the following configuration:

Fugue Conductor AMI ID: ami-xxxxxxxx
AWS Credentials: Environment variables

Validating Fugue Conductor AMI ID ...
[ ERROR ] No BASIC or TEAM conductor registered in ami-xxxxxxxx.

Explanation:

  • Your account has not been whitelisted for access to the Fugue Conductor AMI.
  • You have set the environment variables AWS_DEFAULT_PROFILE and AWS_DEFAULT_REGION to an account that is not whitelisted or to an unsupported region.
    • The AWS_DEFAULT_PROFILE and AWS_DEFAULT_REGION environment variables override both the default profile and any profile you specify on the fugue init command line. If the profile specified in AWS_DEFAULT_PROFILE is not whitelisted, you will not have access to the AMI. If the region specified in AWS_DEFAULT_REGION is not supported, you will also not have access to the AMI.
  • For v2017.05.23 and older: Your IAM policy may be missing the required permissions.

Solution(s):

  • Update your account access to ensure it has access to the appropriate Fugue Conductor AMI.
  • Update your environment variables to reflect the appropriate whitelisted account or update your specified region to a supported region.
  • For v2017.05.23 and older: See the minimum required permissions in our public GitHub repo, or upgrade to the current version of Fugue, which automatically generates the required policies.

You can also check out Hello World, Part 1: Fugue Quick Setup for additional information on setting up your account and the associated environment variables.

Important Note:

The Fugue Conductor may only be installed in the following regions:

  • us-east-1
  • us-east-2
  • us-west-2
  • eu-west-1
  • us-gov-west-1
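
Because the environment variables described above silently override the profile and region, it can help to print them and compare the region against the list above before running fugue init. The following is a minimal sketch, not part of the Fugue CLI; the function name is_supported_region is hypothetical, and the region list is copied from this page.

```shell
# Regions where the Conductor may be installed, copied from the list above.
SUPPORTED_REGIONS="us-east-1 us-east-2 us-west-2 eu-west-1 us-gov-west-1"

# is_supported_region REGION (hypothetical helper)
# Returns 0 if REGION appears in SUPPORTED_REGIONS, 1 otherwise.
is_supported_region() {
  for r in $SUPPORTED_REGIONS; do
    if [ "$r" = "$1" ]; then
      return 0
    fi
  done
  return 1
}

# Show the overrides that fugue init would pick up from the environment.
echo "AWS_DEFAULT_PROFILE=${AWS_DEFAULT_PROFILE:-(not set)}"
echo "AWS_DEFAULT_REGION=${AWS_DEFAULT_REGION:-(not set)}"

# Warn only when a region is set and it is not in the supported list.
if [ -n "${AWS_DEFAULT_REGION:-}" ] && ! is_supported_region "$AWS_DEFAULT_REGION"; then
  echo "Warning: $AWS_DEFAULT_REGION is not a supported Conductor region" >&2
fi
```

This check covers only the region. Whether an account is whitelisted for the AMI cannot be determined locally.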

Troubleshooting fugue install

Error: `fugue.yaml` doesn’t exist

In this case the fugue install command generates the error `fugue.yaml doesn’t exist`.

Error message:

$ fugue install

[ ERROR ] config file /Users/test/fugue.yaml does not exist.
You can run "fugue init" to initialize your project.

Explanation:

The fugue.yaml file does not exist because you haven’t yet run fugue init to specify the profile and Fugue Conductor AMI you want to use.

Solution:

Run fugue init to specify your profile and AMI, and then run fugue install.

Error: `Can’t describe cloudformation stack`

In this case the fugue install command generates the error `Can’t describe cloudformation stack`.

Error message:

$ fugue install

[ ERROR ] There was a problem executing this command.

   Reason: An error occurred (AccessDenied) when calling the DescribeStacks operation: User: arn:aws:iam::[AWS_ACCOUNT_NUMBER]:user/test_user is not authorized to perform: cloudformation:DescribeStacks on resource: arn:aws:cloudformation:us-east-1:[AWS_ACCOUNT_NUMBER]:stack/fugue/*

Explanation:

In this case, the profile you are using does not have the appropriate access to CloudFormation. Fugue requires CloudFormation access to create the infrastructure needed to run the Fugue Conductor.
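
To see exactly which permission is missing, you can pull the denied IAM action out of the error text. This is a minimal sketch that assumes the message follows the AWS AccessDenied wording shown above; denied_action is a hypothetical helper name, not a Fugue command.

```shell
# denied_action (hypothetical helper)
# Reads an AWS AccessDenied message on stdin and prints the IAM action
# that was refused, e.g. cloudformation:DescribeStacks.
denied_action() {
  sed -n 's/.*is not authorized to perform: \([A-Za-z0-9-]*:[A-Za-z0-9*]*\).*/\1/p'
}

# Demonstration using the message shown above (placeholders left as-is).
printf '%s\n' 'Reason: An error occurred (AccessDenied) when calling the DescribeStacks operation: User: arn:aws:iam::[AWS_ACCOUNT_NUMBER]:user/test_user is not authorized to perform: cloudformation:DescribeStacks on resource: arn:aws:cloudformation:us-east-1:[AWS_ACCOUNT_NUMBER]:stack/fugue/*' | denied_action
# prints: cloudformation:DescribeStacks
```

The printed action is the one to add to the IAM policy attached to the profile you use for installation.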

Error: `AWS CloudFormation stack creation failed`

In this case the fugue install command generates the error `AWS CloudFormation stack creation failed`.

Error message:

$ fugue install

[ fugue install ] Installing Fugue Conductor

Install Details:
   Conductor AMI ID: ami-xxxxxxxx
   AWS Account: test_user/[ACCOUNT NUMBER]
   Region: us-east-1

[ WARN ] Would you like to proceed with installing? [y/N]: y

Installing the Fugue Conductor into AWS account test_user/[ACCOUNT NUMBER].

FugueSubnet1                         Working...
FugueVpc                             Working...
FugueSubnet2RouteTableAssociation    Working...
...
FugueVpcGatewayAttachment            Working...
FugueAutoScalingGroup                Working...

-----------------------------------------------
Overall Progress  [#........................]    6%
[ HELP ] Exiting the install command while in progress (CTRL+C) will only stop progress tracking and *not* the install itself.
[ ERROR ] AWS CloudFormation stack creation failed.

Note: While the CLI currently indicates that (CTRL+C) will not stop the installation, we do not recommend using this command as it may interrupt the successful creation of credentials. In the event (CTRL+C) is used you can manually create your credentials using fugue support reset-secret. These recommendations will be updated in a future release.

Explanation:

  • In this case (AWS CloudFormation stack creation failed), the AWS account you are using does not have sufficient permissions to install the infrastructure needed for the Fugue Conductor. In addition to CloudFormation access, the account you use to install Fugue must have permissions to create, delete, and update all the infrastructure the Conductor needs, including any services and infrastructure you want Fugue to manage.

Solution:

Update the profile you are using to install Fugue with the appropriate permissions for the services you are attempting to access or create (e.g., for CloudFormation you need to include Describe, Create, Delete, and Update).

Note: Details about Fugue’s policy on AWS IAM permissions are available here.

Error: `A previous Conductor installation failed to uninstall`

In this case the fugue install command generates the error `A previous Conductor installation failed to uninstall`.

Error message:

$ fugue install -y

[ ERROR ] There was a problem executing this command.

   Reason: A previous Conductor installation failed to uninstall.

   Details: The following resource(s) failed to delete: [FugueResourceEventsTopic].

Explanation:

Sometimes CloudFormation doesn’t delete the stack correctly on uninstall, leaving the stack running in us-east-1.

Solution:

Access the AWS console, go to CloudFormation in us-east-1, and delete the stack named FUGUE.
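
If you prefer the command line to the console, the same deletion can be sketched with the AWS CLI. This is a hedged example rather than an official Fugue procedure: the helper names maybe_run and delete_leftover_stack are hypothetical, and the stack name is passed as an argument because the exact name and casing in your account should be confirmed first. By default the sketch only prints the commands; set RUN=1 to execute them.

```shell
# maybe_run (hypothetical helper): execute the command when RUN=1,
# otherwise just print it so it can be reviewed first.
maybe_run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# delete_leftover_stack STACK_NAME [REGION] (hypothetical helper)
# Deletes the stack, then blocks until CloudFormation reports it gone.
# REGION defaults to us-east-1, the region named in the explanation above.
delete_leftover_stack() {
  maybe_run aws cloudformation delete-stack \
    --stack-name "$1" --region "${2:-us-east-1}" &&
  maybe_run aws cloudformation wait stack-delete-complete \
    --stack-name "$1" --region "${2:-us-east-1}"
}

# Dry run: prints the two aws commands without contacting AWS.
RUN=0 delete_leftover_stack FUGUE
```

Once the wait command returns successfully, run fugue install again.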

Troubleshooting fugue run

Error: `(403) Forbidden`

In this case the fugue run command generates a `(403) Forbidden` error.

Error message:

$ fugue run alarm.lw -a alarm
[ fugue run ] Running alarm.lw
Run Details:
    Alias: alarm

Compiling Ludwig file /Users/test/fugue/alarm.lw

[ OK ] Successfully compiled. No errors.

Uploading compiled Ludwig composition to S3...

[ ERROR ] There was a problem executing this command.

   Reason: An error occurred (403) when calling the HeadBucket operation: Forbidden

Explanation:

  • This error occurs when the account you are using doesn’t have access to the S3 bucket configured under compositionBucket in the fugue.yaml file.
    • This can occur for a number of reasons. One is that the fugue.yaml file is configured to use your default AWS profile instead of a named profile. If you switch the default profile in ~/.aws/credentials and then use the same folder and fugue.yaml file to install Fugue in the new default account, Fugue will not generate a new bucket name and will attempt to use the old bucket.
    • Another less common scenario is one where the bucket name auto-generated by Fugue already exists and is owned by a different account.

Solution:

Create a new S3 bucket in the account you are using and then change the fugue.yaml file to point to that S3 bucket.
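
Before creating a new bucket, it can be useful to confirm which bucket fugue.yaml points to. The sketch below assumes the file contains a line of the form compositionBucket: <name>, which is inferred from the key name mentioned above and may not match your file's exact layout. composition_bucket is a hypothetical helper name.

```shell
# composition_bucket FILE (hypothetical helper)
# Prints the first compositionBucket value found in FILE, with any
# surrounding YAML quotes removed. Assumes a "compositionBucket: name" line.
composition_bucket() {
  sed -n 's/^[[:space:]]*compositionBucket:[[:space:]]*\([^[:space:]#]*\).*/\1/p' "$1" |
    head -n 1 |
    tr -d "\"'"
}

# Example usage (commented out because it contacts AWS):
#   bucket=$(composition_bucket fugue.yaml)
#   aws s3api head-bucket --bucket "$bucket"   # the same HeadBucket call that failed above
#   aws s3 mb "s3://my-new-composition-bucket" # create a replacement; the name is a placeholder
```

If head-bucket fails with 403 under the profile you are using, that confirms the account does not have access to the configured bucket.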

Error: `Fugue has timed out`

In this case the fugue status command indicates a `Fugue has timed out` error.

Error message:

$ fugue status

[ ERROR ] Fugue has timed out waiting for a response from the server.

Explanation:

  • Fugue Conductor and Client versions are matched pairs, which is why we release them together. If your Conductor and Client are from different releases, you may experience this error.
  • This error may also be caused by delays in AWS queues, in which case it is transient and does not indicate a problem with your installation.

Solution:

Use the fugue --version command to determine your versions of the client and CLI, along with the version of your Fugue Conductor AMI. This will allow you to determine if you are working with a matched pair. You can verify the pair via the Download Portal, or for additional assistance reach out to support@fugue.co.

Troubleshooting fugue update

Error: “ECS Services cannot be modified in-place”

The following error message shows in the output of fugue status [alias|FID]. The fugue status output will show a “Last Message” of FAILED.

Error message:

LastMessage: "400 Encountered an error while planning [There were multiple errors:\n\
  \tError processing ECS Service [my-service]: ECS Services cannot be modified in-place.\
  \ Please remove the Service first, by removing it from the composition and updating\
  \ the Fugue process [cd553a6a-793f-4344-98f6-8217dc17c26f], then create it again.]:\
  \ Planner returned an error"

Explanation:

This error message is encountered after using fugue update to add an ELB to an existing ECS service. With the exception of the mutable fields taskDefinition and deploymentConfiguration, ECS service properties cannot be changed once the service has been created, so an ECS service’s load balancer must be declared at service creation time. Attempts to update any process with an ECS service by adding an ELB will fail with the above error.

Solution:

If you want to preserve the name of the ECS service:

  • Remove the service from the composition
  • Run fugue update on the process
  • Add the service back into the composition
  • Run fugue update on the process again

This will delete the ECS service and recreate it with the load balancer attached.

Note: You can comment out lines of Ludwig by prepending a # to each line.

If you don’t need to preserve the ECS service’s name:

  • Change the name of the service in the composition
  • Run fugue update on the process

This will delete the ECS service and recreate it with the new name, with the load balancer attached.

Troubleshooting fugue upgrade

Error: “The Conductor is in the process of installing”

In this case, running fugue upgrade produces the error “The Conductor is in the process of installing.” (Applies to Fugue v2017.05.23 and earlier.)

Error message:

$ fugue upgrade ami-xxxxxxxx
[ fugue upgrade ] Upgrading Conductor

[ ERROR ] There was a problem executing this command.
   Reason: The Conductor is in the process of installing.

Explanation: The Conductor has not completed its installation process, and its CloudFormation stack is still in the CREATING or CREATED stage. This can also happen if the Conductor instance has been terminated and its Auto Scaling group is in the process of launching a new Conductor.

One other possibility is that a previous Conductor was not uninstalled properly, leaving behind some artifacts.

Solution: The Conductor takes 5-15 minutes to boot once its instance has been created. To find out whether the Conductor is up and running, follow the instructions here. Once the Conductor is finished booting, you may execute upgrade.

If the Conductor still hasn’t booted after 15 minutes and you are still receiving this error message, you may need to remove artifacts left behind from a previous Conductor. To effectively remove any such artifacts, execute the following command:

fugue uninstall --force

Warning

If you use the --force option with the uninstall command while processes are active/running, you will receive a warning that any existing processes and infrastructure will be orphaned and will no longer be managed by Fugue.

In addition, if you suspect that the previous Conductor installation has left infrastructure running, please email Fugue Support (support@fugue.co) before taking any further action. It is important to address these running workloads before attempting to uninstall a previous installation.

Other Possible Issues

Here are some other possible issues and error messages you may encounter when using Fugue.

RBAC policy error after upgrade

Error message:

[ ERROR ] Due to an attached RBAC policy, command 'ops' is not allowed for user 'alice'.

Explanation: You may encounter RBAC errors if you attach a policy to the Fugue Conductor and then upgrade the Conductor. For example, if you attach a policy that enables alice to have access to allAccountActions, then upgrade to a Conductor AMI that supports newer actions (such as the ops command, above), alice will not have access to the new actions until the policy is attached again.

Solution: Re-attach the RBAC policy:

fugue policy rbac-attach <policy_file>

This recompiles and reapplies the RBAC permissions so that allAccountActions and other Fugue.System.Policy types include actions available on the new Conductor. As long as you attach the same policy as before, each user and secret will remain the same.

Error: “This client and the Conductor are incompatible” on any CLI command

Error message:

[ ERROR ] This client and the Conductor are incompatible: client API version 4.0.4 is incompatible with server API version 3.2.4.

Explanation: The Fugue CLI and the Fugue Conductor form a matched set. If you’ve installed a version of the CLI that is not compatible with your current version of the Conductor, or vice versa, you’ll see this error message upon executing any CLI command.

Solution: Visit the Download Portal to confirm that you have installed the correct version of the CLI and the correct AMI ID of the Conductor. Recall that to upgrade Fugue, the new CLI needs to be installed before you execute the upgrade command. If you encounter problems, reach out to support@fugue.co.

Error: “Fugue requires an English UTF-8 console environment” on any CLI command

Error message:

Fugue requires an English UTF-8 console environment.

Explanation: This error indicates that the character encoding for the shell in use is not U.S. English UTF-8. In order to execute any Fugue command, the encoding must be U.S. English UTF-8.

Solution: You can change the default character encoding in your shell by executing export LANG=en_US.UTF-8, which sets the environment variable $LANG to U.S. English UTF-8. You can also add the line export LANG=en_US.UTF-8 to your .bash_profile or .bashrc file, or wherever you keep your shell configuration. This will automatically set your shell’s character encoding to U.S. English UTF-8.
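
As a small illustration of the fix above, the following sketch checks the current value of LANG and exports the expected value only when it differs. The helper name is_us_english_utf8 is hypothetical, and the check accepts only the exact string en_US.UTF-8 given in this section.

```shell
# is_us_english_utf8 VALUE (hypothetical helper)
# Returns 0 only for the exact locale string named in this section.
is_us_english_utf8() {
  case "$1" in
    en_US.UTF-8) return 0 ;;
    *) return 1 ;;
  esac
}

# Export the expected locale for this session if LANG is unset or different.
if ! is_us_english_utf8 "${LANG:-}"; then
  echo "LANG is '${LANG:-}', exporting en_US.UTF-8 for this session" >&2
  export LANG=en_US.UTF-8
fi
```

In POSIX locale resolution, LC_ALL takes precedence over LANG when it is set, so it may be worth checking that variable as well. Whether the Fugue CLI consults LC_ALL is not stated here.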

Error: “Command is not supported” on any CLI command

Error message:

I'm sorry Dave. I'm afraid I can't do that.
[ ERROR ] This command is not supported by the Conductor currently installed.

Explanation: This error indicates a mismatched Fugue CLI and Fugue Conductor. The CLI and Conductor form a matched set, so if you upgraded the CLI to a newer version but did not upgrade the Conductor to the corresponding AMI ID, the CLI may support a command that the Conductor does not. In this case, attempting to run an unsupported command produces the above error message.

Solution: Visit the Fugue Download Portal to find the corresponding Conductor AMI ID for your version of the CLI, and then run fugue upgrade <ami_id> with that AMI to upgrade the Conductor. See upgrade for more details.

Error: “AWS CloudFormation stack creation failed” on Fugue installation

Error message:

Installing the Fugue Conductor into AWS account <user>/<account number>.

FugueInternetRoute                   Working...
FugueSubnet2                         Working...
FugueSubnet1RouteTableAssociation    Working...
FugueSubnet2RouteTableAssociation    Working...
FugueAutoScalingGroup                Working...
FugueVpcSecurityGroup                Working...
FugueIam                             Complete
FugueSubnet1                         Working...
FugueResourceEventsTopic             Complete
FugueHealthCheckDb                   Working...
FugueVpcGatewayAttachment            Working...
FugueVpc                             Complete
FugueRouteTable                      Complete
FugueLaunchConfiguration             Working...
FugueVpcGateway                      Complete
FugueInstanceProfile                 Working...
-----------------------------------------------
Overall Progress  [#######..................]   31%

[ HELP ] Exiting the install command while in progress (CTRL+C) will only stop progress tracking and *not* the install itself.
[ ERROR ] AWS CloudFormation stack creation failed

In this case, you may have encountered a faulty AZ availability indication – a limitation of the AWS API that can “fool” Fugue’s installation command. To confirm you have this problem, take a look at the fugue CloudFormation stack event log after the failed installation. You should see a message about an invalid AvailabilityZone parameter.

Note: While the CLI currently indicates that (CTRL+C) will not stop the installation, we do not recommend using this command as it may interrupt the successful creation of credentials. These recommendations will be updated in a future release.

[Figure: CloudFormation events typical of a Fugue installation where available AZs were incorrectly reported by AWS.]

To resolve this issue, follow the simple steps in this example. If you see a different error or are unsure if you are seeing this one, contact us at support@fugue.co.

Troubleshooting Tips

This section covers generalized troubleshooting in the Fugue system. Let’s take a look at a sample error message so we can discuss troubleshooting techniques. Suppose you get the following error when attempting init:

[ fugue init ] Initializing Fugue project with the following configuration:

Fugue Conductor AMI ID: ami-00000000
AWS Credentials: Environment variables

Checking your AWS Credentials for Fugue CLI use ...
[ OK ] Authorized.
Checking your AWS Credentials for Fugue Conductor installation ...
[ OK ] Authorized.

Validating Fugue Conductor AMI ID ...
[ ERROR ] ami-00000000 is not a valid AMI.

There are a few ways to start digging into the issue.

Check the CLI Log

A good place to start troubleshooting is the CLI log. There, you’ll find the full error message, along with the context surrounding it. In this case, we’ve given the CLI an AMI ID that does not exist, and we can confirm this by checking the very end of fuguecli.log:

2016-09-28T18:38:01.09 [ fugue.screen ] ERROR - ami-00000000 is not a valid AMI.
Traceback (most recent call last):
  File "site-packages/fugue_cli/utils.py", line 181, in validate_ami
  File "site-packages/botocore/client.py", line 159, in _api_call
  File "site-packages/botocore/client.py", line 494, in _make_api_call
botocore.exceptions.ClientError: An error occurred (InvalidAMIID.NotFound) when calling the DescribeImages operation: The image id '[ami-00000000]' does not exist

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "site-packages/fugue_cli/cli.py", line 157, in invoke
  File "site-packages/click/core.py", line 1060, in invoke
  File "site-packages/click/core.py", line 889, in invoke
  File "site-packages/click/core.py", line 534, in invoke
  File "site-packages/click/decorators.py", line 27, in new_func
  File "site-packages/fugue_cli/commands/init.py", line 96, in init
  File "site-packages/fugue_cli/utils.py", line 189, in validate_ami
fugue_cli.common.FugueError: ami-00000000 is not a valid AMI.

Here, we can see that the CLI encountered the error InvalidAMIID.NotFound while calling the DescribeImages operation. By cross-referencing AWS’s EC2 error code documentation, we can confirm that this error occurred because the AMI we specified during init does not exist. AWS provides some tips:

The specified AMI does not exist. Check the AMI ID, and ensure that you specify the region in which the AMI is located, if it’s not in the default region. This error may also occur if you specified an incorrect kernel ID when launching an instance.

Should this particular error happen to you, it’s worth checking the Fugue Download Portal to make sure you have the latest AMI ID. If the AMI ID is correct, check the region field in fugue.yaml. AMI IDs are only valid for one region, so if the Conductor is configured to run in a different region than your AMI, the same error is triggered.
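
When the log is long, the underlying AWS error codes can be extracted rather than read by eye. This is a minimal sketch that assumes the log contains botocore ClientError lines in the format shown above. aws_errors is a hypothetical helper, and the log path is passed as an argument because its location depends on your setup.

```shell
# aws_errors LOGFILE (hypothetical helper)
# Prints "ErrorCode Operation" for every line in LOGFILE that contains a
# botocore-style "An error occurred (CODE) when calling the OP operation".
aws_errors() {
  sed -n 's/.*An error occurred (\([^)]*\)) when calling the \([A-Za-z0-9]*\) operation.*/\1 \2/p' "$1"
}

# Example usage:
#   aws_errors fuguecli.log | tail -n 1
# To check an AMI ID directly against a region with the AWS CLI
# (the AMI ID and region below are placeholders):
#   aws ec2 describe-images --image-ids ami-00000000 --region us-east-1
```

For the log excerpt above, the helper would report the InvalidAMIID.NotFound code together with the DescribeImages operation, which is the pair to look up in the EC2 error code documentation.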

Increase CLI Verbosity

The CLI log doesn’t append extra information for all errors, but there’s another way to find more information. Any Fugue command can be executed in verbose mode by using the global -v option. The option increases the verbosity incrementally, so -vvv shows more detail than -v. With increased verbosity, the CLI displays considerably more data that may be useful in troubleshooting, including the precise messages the CLI sends to the Conductor and the responses AWS returns.

Email Us

Finally, you can always send an email to support@fugue.co describing your issue, and we’ll be happy to help you.

Recovery From Provider Downtimes

In the rare event that provider platforms we support experience downtimes, we will place instructions on recovery from those downtimes here. We designed Fugue to be resilient and fail gracefully in the event of dependent service outages. In short, the system is designed to keep running if possible, and save state and “wait out the storm” if it is not. In the latter case, some human intervention may be required to judge when it is safe to resume normal operation.

Entries here should be helpful not only for a particular incident, but also similar incidents that might occur on a smaller scale.

February 28th, 2017: AWS S3 Increased Error Rates

AWS experienced high error rates for the S3 service in the us-east-1 region, which impacted several other services as well.

We found that, as designed, Fugue Conductors put all processes into the Suspended state during this outage. This state means that the Conductor retains all process information, but remains hands-off. This is the safest response to a provider control plane failure, since we cannot be sure of infrastructure state or the consistency of changes we might try to make. You can read more about Fugue’s process model here.

Recovery from this state is simple. You can just use the resume command like this:

$ fugue resume -y (<alias> | <FID>)

This resumes processes one at a time. If you wish, it is safe to resume all processes in one pass with the following command:

$ for fid in $(fugue status --json | jq -r '.[].fid'); do fugue resume -y "$fid"; done

As the S3 service recovered, some processes we resumed re-entered suspension due to continued AWS API errors. If this happens in any future incidents, we recommend that you wait at least a few minutes, or until there is more news about the incident, and try again.
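
The "wait and try again" advice can be sketched as a generic retry helper. This is an illustration only: retry is a hypothetical function, and it detects failure purely from the command's exit status. The source does not confirm that fugue resume exits non-zero when a process later re-enters suspension, so check fugue status afterwards as well.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...] (hypothetical helper)
# Runs COMMAND until it succeeds or ATTEMPTS runs have been made,
# sleeping DELAY_SECONDS between runs. Returns 0 on success, 1 otherwise.
retry() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while :; do
    if "$@"; then
      return 0
    fi
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    echo "Attempt $n of $attempts failed; retrying in ${delay}s" >&2
    sleep "$delay"
    n=$((n + 1))
  done
}

# Example usage: up to 5 tries, 5 minutes apart (values are arbitrary):
#   retry 5 300 fugue resume -y <alias>
```

A fixed delay keeps the sketch simple. During a prolonged provider incident, a longer delay or a manual check of the provider's status page is likely the better choice.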