Note: If you’re looking for step-by-step instructions on installing Fugue, check out Hello World, Part 1: Fugue Quick Setup.
Like any operating system, Fugue is installed onto your computer – but not just the computer in front of you. Instead, Fugue is a cloud-native system that installs onto your cloud computer: your infrastructure provider account on Amazon Web Services.
The notion of installing Fugue like an operating system is somewhat metaphorical. In practice, installing Fugue in your account means that a few resources will be created in your AWS account that enable Fugue to run. These include at least one compute instance to host a Conductor, a VPC to host this instance, as well as queues, topics, object and key-value storage, and appropriate security roles and principals.
A Fugue installation occupies one AWS account and one region within that account, although it can communicate with any region’s APIs once it is installed. This way, you can install Fugue in a single region but manage infrastructure in many.
The Fugue Conductor may only be installed in the following regions:
Installation of Fugue is only necessary once per account region. A successful installation follows this outline of steps:
- You install the Fugue Client Tools, which include the Fugue CLI, according to the directions here.
- You run fugue install from a properly configured computer with sufficient permissions to install Fugue. The fugue install command blocks at the command line until the rest of the steps are completed.
- The Fugue CLI uses CloudFormation to “bootstrap” the Fugue installation by creating a stack with the necessary components for the Conductor to boot.
- As the Conductor boots, it creates (if it does not find) necessary communication and storage resources to run processes and serve requests.
- Finally, the Conductor sends a “ready” signal back to the installing CLI client to signal success, and then the CLI will tell you Fugue is ready to use.
Major Component Architecture
The major components of Fugue in AWS are:
- An EC2 instance, hosting the Conductor. We also build a VPC dedicated to hosting Fugue.
- Several SQS queues and SNS topics, usually prefixed with fugue- (with exceptions). For now, avoid blanket deletion of SQS queues.
- Several S3 buckets, usually prefixed with fugue- and also containing your account number. The same caveats apply to avoid blanket deletion.
- Several DynamoDB tables, also usually prefixed with fugue-, but the same caveats apply here as well.
- Several IAM roles, including those applied to the Conductor to allow it to do process work.
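Because the fugue- prefix is only a convention with exceptions, it helps to review resource names before any cleanup rather than deleting in bulk. The following is a minimal sketch of that review step, not part of Fugue itself; the queue names and the `split_by_prefix` helper are hypothetical and exist only for illustration.

```python
# Hypothetical helper: partition resource names by the "fugue-" prefix so that
# likely Fugue resources can be reviewed before any cleanup.
FUGUE_PREFIX = "fugue-"


def split_by_prefix(names, prefix=FUGUE_PREFIX):
    """Return (names starting with prefix, all other names)."""
    prefixed = [n for n in names if n.startswith(prefix)]
    other = [n for n in names if not n.startswith(prefix)]
    return prefixed, other


# Made-up queue names; real names in your account will differ.
queues = ["fugue-requests", "orders-queue", "fugue-responses", "billing-dlq"]
prefixed, other = split_by_prefix(queues)

print("Likely Fugue resources (do not delete):", prefixed)
# Unprefixed names are not guaranteed to be unrelated to Fugue, because the
# naming convention has exceptions. Inspect them individually.
print("Unprefixed (inspect individually):", other)
```

The prefix check only narrows the list. It cannot prove that an unprefixed resource is safe to remove.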
The structural components of the installation are shown here.
                     +
      Local          |              Cloud
                     |
                     |      +---+     +---+
             +-------------->SQS|     |SNS|
             |       |      +-^-+     +-^-+
             |       |        |         |
             |       |        |         |         +---------------+
             |       |        |         |         |               |
+------------+--+    |      +-+---------+-+       |               |
|               |    |      |             |       |   AWS APIs    |
| $> fugue ...  |    |      |  Conductor  +------->   for user    |
|               |    |      |             |       |   workloads   |
|   Client PC   |    |      +-+---------+-+       |               |
|               |    |        |         |         |               |
+---------------+    |        |         |         +---------------+
X X X X X X X X X    |        |         |
X X X X X X X X X    |      +-v--+    +-v-+
XXXXXXXXXXXXXXXXX    |      | S3 |    |DDB|
                     +      +----+    +---+
As you can see, the major components of Fugue are all present in “the cloud,” specifically an infrastructure provider account that you own. Only the Fugue client is present on your local machine. The Fugue Client communicates with the Conductor indirectly, by way of asynchronous messaging. The Conductor has no open ports by default.
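To make the idea of indirect, asynchronous communication concrete, here is a small in-process simulation. It is a sketch under stated assumptions, not Fugue's actual protocol: two `queue.Queue` objects stand in for the cloud messaging services, and the message fields (`id`, `command`, `status`) are invented for illustration.

```python
import queue
import threading

# In-process stand-ins for the messaging layer. In a real installation these
# would be cloud services, not Python objects.
requests = queue.Queue()
responses = queue.Queue()


def conductor():
    """Toy Conductor: takes one request off the queue and posts a reply."""
    msg = requests.get()  # blocks until a message arrives
    responses.put({"id": msg["id"], "status": "ready"})


def client_call(command, timeout=5.0):
    """Toy client: enqueue a request, then block until the reply arrives."""
    requests.put({"id": 1, "command": command})
    return responses.get(timeout=timeout)


worker = threading.Thread(target=conductor)
worker.start()
reply = client_call("install")
worker.join()
print(reply)  # {'id': 1, 'status': 'ready'}
```

The client never opens a connection to the Conductor here. Both sides only read from and write to queues, which is why the Conductor needs no open inbound ports in this model.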
On The Client
There are three primary components on the client machine:
- The Fugue CLI, present in the form of the fugue binary. This is what forms all requests to, and parses all responses from, the Fugue Conductor.
- The Ludwig compiler, present in the form of the lwc binary. In general, the fugue binary (specifically, in the update commands) handles invoking the compiler with the proper switches and preparation, much like make and similar tools.
- The Fugue Libraries for Ludwig, which you can use to create compositions defining infrastructure. These libraries include both basic building blocks for infrastructure, as well as common infrastructure patterns generalized for reuse and abstracted for simplicity.
On The Conductor
The Conductor uses a service-oriented architecture, with significant application concerns separated among different components. There are many, and an exhaustive list is beyond our scope here, but there are three significant components to know about when considering how the system works (as well as deciphering logs):
- The Scheduler, which ensures that processes are executed efficiently;
- The Manager, which coordinates the work of components on the Conductor to plan work; and
- The Broker, which coordinates with third-party APIs, such as that of AWS, to execute work.
These components are discussed in additional detail in the next chapter.
The CloudFormation Stack
When you execute fugue install, Fugue deploys a CloudFormation stack to install the Conductor in the target AWS account. The CloudFormation stack creates the following resources:
- Auto Scaling Group
- DynamoDB Table
- IAM Role
- IAM Instance Profile
- EC2 Route
- Auto Scaling Group Launch Configuration
- SNS Topic
- EC2 Route Table
- 2 EC2 Subnets
- 2 EC2 Subnet Route Table Associations
- EC2 VPC
- EC2 VPC Gateway
- EC2 VPC Gateway Attachment
- EC2 Security Group
Setting the following fields in the conductor block of the fugue.yaml file, or in environment variables, affects the CloudFormation stack in different ways:
- The default region in which the CloudFormation stack is deployed
(and where the Conductor is installed) is
us-east-1. Specifying another value here changes that region. The Fugue Conductor may only be installed in the following regions:
us-gov-west-1. (Environment variable:
- Modifying this value changes the Conductor AMI that the
CloudFormation stack uses to launch the Conductor EC2 instance. The
AMI ID can also be set with
fugue init --ami <ami-id> <region>. (Environment variable:
- Modifying this value changes the instance type (in <family><generation>.<size> format) used for the Conductor EC2 instance. The default type, m4.large, is the only type Fugue formally supports, so change this value at your own risk. For more information on different instance types, see the AWS documentation. (Environment variable:
- By default, there are no open ports on the Conductor. However, specifying an SSH keypair name here changes the stack to create a security group ingress rule allowing access to the Conductor on port 22 from the IP address where fugue install is run. This enables you to log into the Conductor instance via SSH. (Environment variable:
- Specifying a bucket name here means that Fugue stores compiled compositions in the given bucket, rather than creating a new bucket in CloudFormation (named according to the fugue-<account number>-<region> format). Note: Bucket names must be globally unique. (Environment variable:
- Specifying a bucket name here means that Fugue stores large Vars values in the given bucket, rather than creating a new bucket in CloudFormation (named according to the fugue-large-value-<account number>-<region> format). Note: Bucket names must be globally unique. (Environment variable:
- In the region where the CloudFormation stack is deployed, Fugue
automatically selects two AZs for the Conductor’s Auto Scaling Group
to straddle. You may specify the two availability zones yourself
here. Setting your own AZs is useful if you encounter an “AWS
Cloudformation stack creation failed” error. (To read more about
this error, see the
guide.) (Environment variable:
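The naming formats mentioned in the list above can be illustrated with a short sketch. This is not Fugue code: the function names are hypothetical, the account number is a made-up placeholder, and the instance-type pattern deliberately covers only the simple <family><generation>.<size> shape described here (real AWS type names can carry extra suffix letters that this pattern rejects).

```python
import re


def default_bucket_names(account_number, region):
    """Build the default bucket names from the formats given in the text."""
    return {
        "compositions": f"fugue-{account_number}-{region}",
        "large_values": f"fugue-large-value-{account_number}-{region}",
    }


# Simplified pattern for <family><generation>.<size>, e.g. "m4.large".
INSTANCE_TYPE = re.compile(
    r"^(?P<family>[a-z]+)(?P<generation>\d+)\.(?P<size>[a-z0-9]+)$"
)


def parse_instance_type(value):
    """Split an instance type string into family, generation, and size."""
    match = INSTANCE_TYPE.match(value)
    if match is None:
        raise ValueError(f"not in <family><generation>.<size> format: {value!r}")
    return match.groupdict()


# "123456789012" is a placeholder account number, not a real account.
names = default_bucket_names("123456789012", "us-east-1")
print(names["compositions"])  # fugue-123456789012-us-east-1
print(names["large_values"])  # fugue-large-value-123456789012-us-east-1
print(parse_instance_type("m4.large"))
```

Because both default names embed the account number and region, they are unlikely to collide with existing buckets. A custom bucket name carries no such guarantee, so check that it is globally unique.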
AWS Permissions and the Fugue CLI
The Fugue CLI requires a particular set of AWS permissions. During the
install process, Fugue creates two IAM policies,
fugue-user-iam-policy, along with
corresponding IAM roles,
fugue-user-<region>. Together, these policies contain the minimum
required permissions to run the Fugue CLI:
- The installer policy allows only
- The user policy allows only
user, and certain features of