Cloud Guide
This page describes a particular configuration of the workspace service
that allows the cloud-client to operate out of the box. If you have never
configured the workspace service before, you should be able to follow this
page conceptually, but it does not replace the administrator guide, which
you will still need to consult.
This page is for deployers who need to understand the cloud configuration
and set up the workspace service for it. Cloud users do not need to read
or understand it. If you are a cloud user just looking to understand how
to launch and manage VMs on an existing cloud, start at the clouds page.
Overview
The service must be set up in resource pool mode, controlling any number
of VMM nodes. You may use the workspace pilot to integrate with a local
resource scheduler. An image repository must also be set up to host the
workspace image files for each client. When a client runs a workspace,
the image to use is transferred from the repository to the VMM that will
be running it.
For the sake of discussion we will assume that the workspace service and
the file repository are set up on different nodes. This does not
necessarily need to be the case, but it is the recommended configuration
because of the heavy I/O traffic the repository can experience.
The workspace service should be installed as normal on the service node
and GridFTP must be installed on the repository node.
The server addresses must be directly reachable from the Internet
or otherwise configured to deal with being NAT'd. The Globus container
(where the workspace service runs) and GridFTP can both be set up for
NAT or other port forwarding situations.
The diagram above depicts the basic setup.
-
A special workspace client called the "cloud-client" invokes operations
on the service and GridFTP server. A number of defaults are assumed,
which make this work out of the box (these defaults will be discussed
later).
-
Files are transferred from the cloud-client to a client-specific
directory on the repository node (manual or other types of GridFTP
based transfers are also possible if the user is comfortable with
using grid tools directly).
-
The service invokes commands on the VMMs to trigger file transfers
to/from the repository node, VM lifecycle events, and destruction/clean
up.
-
If the workspace state changes, the cloud-client will reflect this to
the screen (and log files) and depending on the change might also take
action in response.
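The last step above can be pictured as a simple polling loop. The following is a minimal sketch only, not the cloud-client's actual code: the configuration appendix lists a default poll interval of 2000 ms and polling rather than notifications, and "Running" appears in the sample output on this page, but the function name watch_state, the get_state callback and the "Pending" and "Transferring" state names are hypothetical placeholders.

```python
import time
from typing import Callable, List


def watch_state(get_state: Callable[[], str],
                interval_ms: int = 2000,
                max_polls: int = 100) -> List[str]:
    """Poll get_state() and report each change until 'Running' is seen.

    get_state is a hypothetical stand-in for a query to the workspace
    service. Returns the list of distinct states in the order observed.
    """
    seen: List[str] = []
    last = None
    for _ in range(max_polls):
        state = get_state()
        if state != last:
            # Reflect the change to the screen, as the cloud-client does.
            print(f"State changed: {state}")
            seen.append(state)
            last = state
        if state == "Running":
            break
        time.sleep(interval_ms / 1000.0)
    return seen


# Hypothetical state source; only "Running" is taken from this page.
_states = iter(["Pending", "Pending", "Transferring", "Running"])
history = watch_state(lambda: next(_states), interval_ms=0)
```

With the placeholder state source above, repeated identical states are reported only once, which is the behaviour a user sees as "Waiting for updates" followed by a single change notice.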
User Experience
Working backwards from the user's cloud-client experience is a
good way to understand how the service needs to be set up.
Here is an abbreviated depiction of a simple user interaction with a cloud,
to give you an idea of it if you have never used one. This does not depict
an image transfer to the repository node, but that is similarly brief.
-
A grid credential is needed; there is an embedded
grid-proxy-init program if that is necessary.
-
You can list what's in your repository directory:
$ ./bin/cloud-client.sh --list
Sample output:
[Image] 'base-cluster-01.gz' Read only
Modified: Jul 06 @ 17:34 Size: 578818017 bytes (~552 MB)
[Image] 'globus-002' Read only
Modified: Jun 12 @ 18:55 Size: 3758097408 bytes (~3584 MB)
[Image] 'hello-cloud' Read only
Modified: May 30 @ 14:16 Size: 524288000 bytes (~500 MB)
[Image] 'hello-cluster' Read only
Modified: Jun 30 @ 20:18 Size: 524288000 bytes (~500 MB)
-
Then pick one to run (ignore the 'cluster' images for now):
$ ./bin/cloud-client.sh --run --name hello-cloud --hours 1
Sample output:
SSH public keyfile contained tilde:
- '~/.ssh/id_rsa.pub' --> '/home/guest/.ssh/id_rsa.pub'
Launching workspace.
Using workspace factory endpoint:
https://cloudurl.edu:8443/wsrf/services/WorkspaceFactoryService
Creating workspace "vm-023"... done.
IP address: 123.123.123.123
Hostname: ahostname.cloudurl.edu
Start time: Fri Feb 29 09:36:39 CST 2008
Shutdown time: Fri Feb 29 10:36:39 CST 2008
Termination time: Fri Feb 29 10:46:39 CST 2008
Waiting for updates.
Some time elapses as the image file is copied to the VMM node. Then
a running notification is printed:
State changed: Running
Running: 'vm-023'
-
The client picked up your default public SSH key and sent it to
be installed on the fly into the VM's authorized_keys policy
for the root account. So after launching you can use the printed
hostname to log in as root:
$ ssh root@ahostname.cloudurl.edu
You can see an example of a cluster cloud-client deployment on the
one-click clusters page.
So how does this happen?
Assumptions and Defaults
A number of things go into making the cloud client work out of the box,
but it is in large part accomplished by giving the user a downloadable
package with a number of default configurations.
These defaults limit functionality options in some cases, but that is the
idea: eliminate decisions that need to be made and set working defaults.
There are avenues left open for experienced users to do more
(for example, by overriding the defaults or even switching over to the
regular workspace client).
In the previous section, the first thing that probably stands out is that
no contact addresses are entered on the command line.
The service and repository URLs are derived from a properties file
that is included in the top-level "conf" directory of the cloud-client
package. An example is this cloud.properties file, which is currently
distributed for the Nimbus cloud.
Note: How properties files and commandline overrides work is covered in
detail in a later section; it is all designed to be flexible under the
covers. If you don't want to follow the conventions laid out in this
"assumptions" section, you will need to understand the later section in
order to put together a good client package or properties file(s) for
your users. Continue reading this section first, though, to get the
basic ideas.
There are three main groups of assumptions and defaults. The first is the
contact and identity information of the workspace service and GridFTP
server (see above for the sample configuration where these are specified).
The other two groups make up the rest of this "Assumptions" section:
Deriving per-user repository directories
For GridFTP based commands (like --list, --delete, and
--transfer) the server to contact is taken from the contact in
the cloud properties file. The X509 identity to verify is also taken from
the cloud properties file. If that property were missing, identity checks
would be based on hostname.
Remember that we are not going to discuss the various ways of
getting options in this "Assumptions" section.
When you transfer a local file, the target of the transfer is the same
filename in your personal repository directory. When you refer to the name
of a workspace to run, this name must correspond to a filename in your
personal repository directory.
We know where the repository comes from but how is that directory derived?
There are two other components to derive the directory used: the configured
base directory property and the hash of the caller's X509
Distinguished Name.
-
The configured base directory property. The default
configuration for the base directory on the repository node is
"/cloud".
-
A hash of the caller's X509 Distinguished Name is used as
the subdirectory of the base directory. The algorithm for this
is based on MD5. It produces a string of eight characters, for
example "31ceb17f". The credential being used for the
call is inspected to get the user's DN.
The directories for each user are created by the administrator. Any
(unlikely) hash collisions would be detected at this point. You can
see the hash of any "Globus style" DN with the --hash-print option
of the cloud client. For example:
$ ./bin/cloud-client.sh --hash-print "/DC=org/DC=agrid/OU=people/CN=John Q. Public"
Sample output:
DN: /DC=org/DC=agrid/OU=people/CN=John Q. Public
HASH: a9bad55
So with a hypothetical repository hostname "repository.cloudurl.edu",
"/cloud" base directory and DN hash of "a9bad55", the derived GridFTP URL
of the user's "my-workspace" file will be
gsiftp://repository.cloudurl.edu:2811//cloud/a9bad55/my-workspace
Note that there is a cloud-client option to input any name or local
file path and see what the derived URL is. See the --extrahelp
description of the --print-file-URL option.
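The derivation described above can be sketched in a few lines. This is an illustration under stated assumptions, not the cloud-client's implementation: the function name derive_repository_url is hypothetical, and the DN hash is taken as an input (for example from --hash-print) because the exact hashing algorithm is not spelled out on this page.

```python
import os


def derive_repository_url(repository: str, basedir: str,
                          dn_hash: str, name_or_path: str) -> str:
    """Build the GridFTP URL of a file in a user's repository directory.

    repository   -- host and port of the repository, e.g. "host:2811"
    basedir      -- configured base directory, e.g. "/cloud" or "/cloud/"
    dn_hash      -- hash of the caller's DN, as printed by --hash-print
    name_or_path -- workspace name or local file path; only the filename
                    is used, since the target keeps the same filename
    """
    filename = os.path.basename(name_or_path)
    # Normalize so that "/cloud" and "/cloud/" give the same result.
    base = "/" + basedir.strip("/")
    # The absolute path follows the host's own "/", hence the "//".
    return f"gsiftp://{repository}/{base}/{dn_hash}/{filename}"


url = derive_repository_url("repository.cloudurl.edu:2811", "/cloud",
                            "a9bad55", "my-workspace")
print(url)
```

For the hypothetical values used in the text, this prints the same URL as shown above. The --print-file-URL option remains the authoritative way to check what the client will actually use.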
Runtime assumptions
The second set of assumptions to cover is how a given image file is going
to actually work. There are many options that you can specify in regular
workspace requests. For example, the memory size, the number of network
interfaces to construct, the pool name(s) to lease network addresses from,
and the partition name the VM is expecting for the base partition.
Some fixed assumptions are made:
- There can be only one network interface
- The network interface is expecting its address via DHCP
-
There can be only one partition file, for the root partition,
configured with an ext2/ext3 filesystem. Other filesystems may not
work correctly (this has to do with the cloud's default kernel as well
as its ability to edit the image's files before boot).
The rest of the launch request is filled in by default configurations:
- Request 3584 MB of memory
- Request networking address from a pool named public
- Mount the partition to sda1
Necessary Configurations
The previous section summed up the defaults and main assumptions. Opting
to follow these conventions in your cloud leads to these configuration
conclusions:
-
Install the workspace service in resource pool mode.
-
Configure an association for addresses to lease and call it "public".
-
Create a cloud.properties file for your cloud with the values in this
example file changed to reflect the correct URLs and identities.
-
If you need to adjust the default memory request, add a line of
text like so to the cloud.properties file you will
distribute: vws.memory.request=2560
-
Create a /cloud directory on the repository node.
-
For each user, take the hash of their DN (using --hash-print)
and create a directory for them under the /cloud base
directory.
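The last two steps lend themselves to a small script. The sketch below is one possible way to do it, assuming the hashes have already been collected with --hash-print; the function name create_user_directories is hypothetical, and ownership and permissions (covered in the Security section) are deliberately left out. Refusing to reuse an existing directory is how a hash collision would surface at this point.

```python
import os
import tempfile
from typing import Iterable, List


def create_user_directories(basedir: str,
                            dn_hashes: Iterable[str]) -> List[str]:
    """Create one subdirectory of basedir per DN hash.

    Raises RuntimeError if a directory already exists, since that may
    indicate that two DNs hashed to the same value.
    """
    created: List[str] = []
    for dn_hash in dn_hashes:
        path = os.path.join(basedir, dn_hash)
        try:
            os.mkdir(path)
        except FileExistsError:
            raise RuntimeError(
                f"{path} already exists, possible hash collision")
        created.append(path)
    return created


# A temporary directory stands in for /cloud in this demonstration.
demo_base = tempfile.mkdtemp()
paths = create_user_directories(demo_base, ["a9bad55", "31ceb17f"])
print(paths)
```

On a real repository node the first argument would be the configured base directory, "/cloud" by default.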
Properties and Options
This section goes into more detail about the property file and commandline
configurations. This is especially important to understand if you want
to diverge from the defaults above.
All commands go through cloud-client.sh which in turn
invokes the actual cloud client program. The cloud client is written
in Java and installed at lib/globus/lib/workspace_client.jar.
Before calling this program, the script sets up some things:
-
../conf/cloud.properties is set as the user properties file
-
../lib/globus becomes the new GLOBUS_LOCATION (overriding
anything previously set)
-
../lib/certs is set as a directory to add to the trusted
X509 certificate directories for identity validations (the client
verifies it is talking to the right servers). Adding the CA cert(s)
of the workspace service and GridFTP host certificates to this
directory ensures that the user will not run into CA (trusted
certificates) problems.
The cloud client program respects settings from three different
places, listed here in the order of precedence:
-
Commandline arguments - If the client uses one of the
optional flags listed in ./bin/cloud-client.sh --extrahelp,
these values are used. Many things can be overridden this way,
including the service contacts.
-
User properties file - An example of this was given
above (the cloud.properties
file which is currently distributed for the
Nimbus cloud).
Note that you can include different properties files and have your
users switch between clouds using
./bin/cloud-client.sh --conf ./conf/some-file.
If no --conf argument is supplied, the default file
cloud.properties needs to exist. If you need to change
this in your client distribution for cosmetic reasons, you can
do so by editing the one relevant line at the top of
./bin/cloud-client.sh
-
Embedded properties file - A properties file lives inside
the workspace client jar (which is installed into
lib/globus/lib/workspace_client.jar). This controls all
the remaining configurations.
There are (intentionally) no fallback settings for the properties
found in that sample
cloud.properties file:
- ssh.pubkey (Path to SSH public key to log in with)
- vws.factory (Host+port of Virtual Workspace Service)
- vws.factory.identity (Virtual Workspace Service X509 identity)
- vws.repository (Host+port of image repository)
- vws.repository.identity (Image repository X509 identity)
See the configuration appendix for other, more
esoteric defaults that can be tampered with.
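The order of precedence described in this section can be modelled with a layered lookup. The following is a rough sketch, not the client's Java code: it treats commandline overrides as a dictionary keyed by property name, which is a simplification, and it assumes that the five properties without fallbacks may come from either the commandline or the user properties file. The function names and the identity values are hypothetical.

```python
from collections import ChainMap
from typing import Dict, Mapping

# The five properties that have no embedded fallback.
REQUIRED = ("ssh.pubkey", "vws.factory", "vws.factory.identity",
            "vws.repository", "vws.repository.identity")


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blanks and # comments."""
    props: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def effective_settings(cli: Mapping[str, str], user: Mapping[str, str],
                       embedded: Mapping[str, str]) -> ChainMap:
    """Layer settings: commandline first, then user file, then embedded."""
    missing = [k for k in REQUIRED if k not in cli and k not in user]
    if missing:
        raise ValueError("no fallback for: " + ", ".join(missing))
    return ChainMap(dict(cli), dict(user), dict(embedded))


embedded = parse_properties("""
# Default memory request
vws.memory.request=3584
vws.metadata.association=public
""")
# Hypothetical cloud.properties content; the identities are made up.
user = parse_properties("""
ssh.pubkey=~/.ssh/id_rsa.pub
vws.factory=cloudurl.edu:8443
vws.factory.identity=/O=Example/CN=host/cloudurl.edu
vws.repository=repository.cloudurl.edu:2811
vws.repository.identity=/O=Example/CN=host/repository.cloudurl.edu
vws.memory.request=2560
""")
settings = effective_settings({}, user, embedded)
print(settings["vws.memory.request"])
print(settings["vws.metadata.association"])
```

Here the user file's memory request wins over the embedded default, while the association name falls through to the embedded value because nothing overrides it.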
Clusters
To enable one-click clusters, you need to enable the context broker (see
the admin guide).
Security
The plugins page discusses the
"groupauthz" plugin which provides for many generally useful policies to
be enforced, but one in particular is necessary for the cloud configuration
to operate properly. The identity-hash based image subdirectories option
ensures that propagation source paths and unpropagation target paths are
specific to the caller using the hashing algorithm discussed above.
The workspace-control user account is empowered to run all workspaces,
so this authorization of specific requests is necessary before the "enactment"
command is sent out to workspace-control. That work is done on behalf of
the client but, importantly, not as the client.
For the repository node you currently need GridFTP to handle remote transfers.
Each cloud user's DN must be in the GridFTP grid-mapfile (an access control
list that also maps each DN to a specific unix account). In order to
prevent users from maliciously overwriting each other's files when talking
to GridFTP directly, each cloud user must currently be mapped to a unique
unix account which is part of a unique unix group on the repository node.
The umask of each user should be set to 0007 to allow for group read/write
permissions on the files. This allows one control account to be used for
propagation: in the workspace-control configuration file, there is a
setting to force it to propagate and unpropagate with one specific user.
That user needs to be a member of every cloud account's group. Thus, the
control user has access to read and write all files (for
propagation and unpropagation tasks) but each remote client may
only read and write their personal files.
If the base directory on the repository node is "/cloud", you will
need to create a directory under it for each DN, named after the DN's hash.
For easy tracking purposes, we recommend also using the hash as the name
of the unix account and of the group.
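To make the bookkeeping concrete, here is a small sketch of what needs to be recorded for each user when the recommendation above is followed. It only computes names and the grid-mapfile line and does not create accounts or touch any files. The function names are hypothetical; the grid-mapfile line uses the usual form of a quoted DN followed by the local account name.

```python
from typing import Dict


def gridmap_entry(dn: str, account: str) -> str:
    """Format one grid-mapfile line: quoted DN, then the unix account."""
    return f'"{dn}" {account}'


def provisioning_plan(dn: str, dn_hash: str,
                      basedir: str = "/cloud") -> Dict[str, str]:
    """Collect the per-user settings an administrator would apply.

    dn_hash is the value printed by --hash-print for this DN. The hash
    is reused as account name, group name and directory name.
    """
    return {
        "account": dn_hash,
        "group": dn_hash,
        "directory": basedir.rstrip("/") + "/" + dn_hash,
        "umask": "0007",  # group read/write, as described above
        "gridmap_line": gridmap_entry(dn, dn_hash),
    }


plan = provisioning_plan("/DC=org/DC=agrid/OU=people/CN=John Q. Public",
                         "a9bad55")
for key, value in plan.items():
    print(f"{key}: {value}")
```

Remember that the control account used for propagation must additionally be made a member of each of these groups.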
Recognizing that this burdens the administrator, we are planning on adding
SAML based authorization support. In this configuration, the
cloud client will (transparent to the human user) get a SAML assertion
from the workspace service and present it to the GridFTP server for access.
The SAML authorization statement restricts that client's rights to only
reading and writing their personal files.
This would mean much less administrative overhead when adding cloud users.
Currently we are waiting on this GridFTP work to be completed.
Configuration Appendix
These are the embedded properties that ship with the cloud client. They
can also be set in the cloud properties files to override the defaults:
# Default ms between polls
vws.poll.interval=2000
# Default client behavior is to poll, not use asynchronous notifications
vws.usenotifications=false
# Default memory request
vws.memory.request=3584
# Image repository base directory
vws.repository.basedir=/cloud/
# CA hash of target cloud
vws.cahash=6045a439
# propagation setup for cloud
vws.propagation.scheme=scp
vws.propagation.keepport=false
# GridFTP transfer timeout, 0 is infinite
vws.gridftp.timeout=0
# Metadata defaults
vws.metadata.association=public
vws.metadata.mountAs=sda1
vws.metadata.nicName=eth0
vws.metadata.cpuType=x86
vws.metadata.vmmType=Xen
vws.metadata.vmmVersion=3
# Filename defaults for history directory
vws.metadata.fileName=metadata.xml
vws.depreq.fileName=deprequest.xml