cisTEM on the cloud
cisTEM is a software package that provides the tools needed to take raw movies from the microscope through all data processing steps to a refined 3D structure.
We have built a workflow, based on cryoem-cloud-tools, that allows users to launch and run cisTEM jobs on AWS. All steps outlined below therefore assume that you have cryoem-cloud-tools installed.
The basic premise of our approach is to use VNC remote desktop software so that users run cisTEM directly on AWS machines. The overall workflow is:
- Run cistem_launch.py to start an instance on AWS with cisTEM pre-installed
- ssh into the machine and leave the ssh connection open (i.e. do not log out) while using cisTEM
- Open RealVNC on your local machine
- Connect to the host: localhost:1
NOTE: The software will move all data from the directory where the launcher was executed onto AWS
Set up before launching cisTEM on AWS
cistem_launch.py will move all data from the directory where you are launching the script to the instance on AWS.
This means that you should put all movies, micrographs, etc. into a single directory on your local machine (symbolic links will work too), and then run cistem_launch.py from that directory.
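If your data is spread across several locations, symbolic links are a convenient way to gather it into one launch directory without copying. The following is a minimal sketch under that assumption; the helper name `stage_links` and the paths in the usage comment are hypothetical and not part of cryoem-cloud-tools.

```shell
# Hypothetical helper: symlink every file in a source directory that
# matches a pattern into a single staging directory.
# Usage: stage_links SRC_DIR DEST_DIR [PATTERN]
stage_links() {
    src="$1"
    dest="$2"
    pattern="${3:-*}"
    mkdir -p "$dest"
    for f in "$src"/$pattern; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$f" ] || continue
        # Use an absolute target so the link resolves from any location.
        ln -sf "$(cd "$(dirname "$f")" && pwd)/$(basename "$f")" "$dest/"
    done
}

# Example (hypothetical paths):
# stage_links /data/session1 "$HOME/cistem_project" '*.mrc'
```

After staging, change into the staging directory and run cistem_launch.py from there.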
Launching instance on AWS with cisTEM preinstalled
After getting your data into a single directory, you are ready to launch cisTEM on AWS. Users can request different instance types; by default, the instance will be m4.4xlarge (16 vCPUs).
To launch cisTEM:
$ cistem_launch.py --run
You will see text printed to the terminal giving you updates on the status of launching the instance.
Once finished, you will see instructions on how to connect using VNC.
Connecting to instance via VNC
After the instance is launched, you will be prompted to connect using a VNC viewer. This requires downloading and installing a VNC viewer on your local computer.
Before you can connect using VNC, you must log into the instance over ssh and leave that ssh session open. This provides a secure way of accessing the remote desktop, since the connection is protected by ssh encryption.
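For orientation: a VNC display number N conventionally corresponds to TCP port 5900 + N, so the address localhost:1 used below refers to port 5901 on your own machine, which presumably reaches the instance through the open ssh session. The sketch below computes that port and prints the general shape of an ssh port-forwarding command. It is an illustration only; the function names, key file, and IP address are placeholders, and the command printed by cistem_launch.py is the one to use for a real session.

```shell
# VNC display :N conventionally listens on TCP port 5900 + N.
vnc_port() {
    echo $((5900 + $1))
}

# Print (do not run) a generic ssh port-forwarding command.
# Arguments: key file, remote host, VNC display number.
# All values are placeholders for illustration.
tunnel_command() {
    port=$(vnc_port "$3")
    echo "ssh -i $1 -L ${port}:localhost:${port} ubuntu@$2"
}

# Example (placeholder key and address):
# tunnel_command mykey.pem 203.0.113.10 1
```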
After opening the VNC viewer on your local machine, type in the search bar: localhost:1
After hitting 'enter', you will be prompted for a password. The password is the abbreviation for cryo-electron microscopy (no hyphen, no capital letters). This connection is secure because you have used ssh to connect, which restricts access to the machine to those with your keypair and IP address.
Then, you will be connected to the desktop of the machine on AWS:
cisTEM expects the same file path structure on AWS as on your local machine. You will therefore need to navigate to the directory on AWS that was created with the same path as the one on your local machine.
This means that if you launched cisTEM on AWS from this directory:
You will see this same directory structure on AWS.
NOTE: By default, a new terminal on the instance opens in this location.
Create/import cisTEM projects
For cisTEM to work seamlessly for both local and cloud computing, you need to place the cisTEM database in the same directory that was uploaded to AWS.
This means that, for the above example, the cisTEM database will be:
Running cisTEM jobs
You will then run all cisTEM jobs using the local option within cisTEM, using all processors that are available.
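To find out how many processors are available, you can ask the operating system from a terminal on the instance. This is a small sketch using standard system commands; the function name `available_cores` is only illustrative.

```shell
# Report the number of processors currently available on this machine.
# Falls back to getconf on systems that do not provide nproc.
available_cores() {
    nproc 2>/dev/null || getconf _NPROCESSORS_ONLN
}

available_cores
```

On the default m4.4xlarge instance the reported number should correspond to the 16 vCPUs mentioned earlier.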
IMPORTANT: Make sure to download all outputs back to your local machine. To do this, run this command from your local machine:
$ scp -r -i [path/to/keypair].pem ubuntu@[IP address]:/path/on/instance/Assets/* .
Once you are done with cisTEM on AWS, you will need to shut everything down:
- Check that the files you want are downloaded to your local machine
- Exit the ssh connection that you used to connect via VNC
- Run cistem_terminate.py to launch the termination script
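Because terminating the instance can remove your data, the first step in the list above is worth checking explicitly. A minimal sketch of such a check is shown here, assuming you only want to confirm that a local download directory contains at least one file; the helper name `has_downloads` and the example path are hypothetical.

```shell
# Hypothetical pre-termination check: succeed only if the given local
# directory exists and contains at least one regular file at any depth.
has_downloads() {
    [ -d "$1" ] && [ -n "$(find "$1" -type f | head -n 1)" ]
}

# Example (hypothetical local path):
# has_downloads ./Volumes && echo "outputs present" || echo "nothing downloaded yet"
```

This confirms only that something was downloaded, not that every output file is complete, so a manual look at the results is still advisable.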
When running cistem_terminate.py, you will need to specify --save if you want your data to remain on AWS after your job completes.
Otherwise, to terminate the instance and remove all of your data on AWS, run this from the directory from which you launched cisTEM: