Integration of Amazon Web Services into Relion-2.0

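We have integrated AWS into Relion-2.0 so that users keep interacting with the Relion-2.0 GUI on their local machine while the computation runs on AWS. Because the GUI runs locally, particle picking and other GUI-based tasks are not slowed by remote-display latency.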

Once installed, users only need to do the following to launch jobs to AWS:

  1. Enter and select job parameters as you would for a local job
  2. Submit to queue? Yes
  3. Select queue: qsub_aws
    • All AWS-side parameters (such as instance type) are determined automatically; there is nothing further to decide or select
  4. Click Run now! to submit job to the cloud
    • All results are synced back to the local machine in real time
  5. Users visualize & interact with outputs on local machine
  6. When job finishes, instance will be automatically terminated.

Cryo-EM structure determination on AWS using only a laptop


Setup & installation

Software required on local machine:

  1. cryoem-cloud-tools
  2. Relion-2.0
    1. Option #1: cryoem-cloud-tools comes with a pre-compiled 'light' version of Relion-2.0 for Mac OS X. This allows users to open the GUI for submitting jobs to AWS.
    2. Option #2: Complete installation via Relion-2.0 website

Benefits of AWS/Relion-2.0 integration

  1. Users only need to understand the Relion-2.0 workflow. All AWS commands run automatically, without user interaction.
  2. Real time downloading of output files
    • Allows users to monitor output files on local machine as they are generated
    • Provides appearance of running job locally
  3. Data storage management:
    • Data are left on S3 and EBS volumes for user-defined periods of time (typically 2-3 weeks). This minimizes transfer time, so users can start new analyses immediately without uploading the files again.
  4. Terminate instance when job finishes
    • Allows users to walk away / go to sleep after job is submitted

General overview

Workflow:

  1. User: select the AWS queue type (qsub_aws) and submit the job through the Relion-2.0 GUI from the local machine
    • queue: qsub_aws
    • Run now!
  2. Execute Relion-2.0 command on instance (automated back-end workflow, no user input required)
  3. Sync results back to local machine every 10 seconds (automated back-end workflow, no user input required)
  4. Turn off instance when job finishes (automated back-end workflow, no user input required)
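
The automated part of this workflow (sync results every 10 seconds, then turn off the instance) can be pictured as a simple polling loop. The following is a minimal sketch, not the actual cryoem-cloud-tools code: the function name sync_until_done and the three callables passed into it are hypothetical placeholders for whatever the back end uses to copy files, check the job, and terminate the instance.

```python
import time
from typing import Callable


def sync_until_done(
    sync_once: Callable[[], None],      # hypothetical: copies new output files to the local machine
    job_finished: Callable[[], bool],   # hypothetical: True once the Relion-2.0 command has exited
    terminate: Callable[[], None],      # hypothetical: shuts down the AWS instance
    interval_s: float = 10.0,           # the 10-second sync period described above
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Sync results until the job finishes, then terminate the instance.

    Returns the number of sync passes that were performed.
    """
    passes = 0
    while True:
        # Check job status before syncing, so that the last pass still
        # picks up files written just before the job exited.
        finished = job_finished()
        sync_once()
        passes += 1
        if finished:
            break
        sleep(interval_s)
    terminate()
    return passes
```

Passing the three actions in as callables keeps the loop independent of any particular AWS or rclone call, which also makes it easy to try out with stand-in functions.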


Detailed description (expert)

AWS integration steps:

  1. Select queue type (qsub_aws) and submit job through Relion-2.0 GUI from local machine
    • queue: qsub_aws
    • Run now! 
  2. Data upload to S3 bucket using rclone
    • Multi-file uploads over a 10G network connection reach ~500 MB/sec
    • Located in temporary locations named:
      • s3://rln-aws-{teamName}/{user}/
  3. Start virtual machine on AWS for GPU-based computations
    • Instance-type automatically determined based upon dataset size and job task
    • Access to the instance is restricted to the local machine's IP address
  4. Download data from S3 to EBS volume using rclone
  5. Execute Relion-2.0 command on instance
  6. Sync results back to local machine every 10 seconds
    • When the job finishes, the instance is terminated, but the S3 bucket and EBS volume remain on AWS
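
A few of the steps above can be made concrete with a short sketch. It is illustrative only and assumes several things not stated here: the helper names are hypothetical, the rclone remote is assumed to be configured under the name "aws", and the --transfers value of 16 is an arbitrary choice. It shows how the S3 location is formed from the team and user names, what an rclone upload command could look like, how a single local IP address becomes an access rule, and how long an upload would take at ~500 MB/sec.

```python
import ipaddress
from typing import List


def s3_location(team_name: str, user: str) -> str:
    """Temporary S3 location, following the s3://rln-aws-{teamName}/{user}/ pattern."""
    return f"s3://rln-aws-{team_name}/{user}/"


def rclone_upload_cmd(local_dir: str, team_name: str, user: str,
                      remote: str = "aws", transfers: int = 16) -> List[str]:
    """Build an 'rclone copy' command line for a multi-file upload.

    Assumption: 'remote' is the name of an S3 remote already set up in the
    local rclone configuration; rclone addresses it as <remote>:<bucket>/<path>.
    """
    dest = f"{remote}:rln-aws-{team_name}/{user}/"
    return ["rclone", "copy", local_dir, dest, "--transfers", str(transfers)]


def single_host_cidr(ip: str) -> str:
    """CIDR block that matches only the given address (e.g. /32 for IPv4)."""
    return str(ipaddress.ip_network(ip))


def upload_seconds(size_gb: float, rate_mb_per_s: float = 500.0) -> float:
    """Estimated upload time, using 1 GB = 1000 MB and the ~500 MB/sec rate above."""
    return size_gb * 1000.0 / rate_mb_per_s


if __name__ == "__main__":
    print(s3_location("mylab", "alice"))
    print(" ".join(rclone_upload_cmd("Micrographs/", "mylab", "alice")))
    print(single_host_cidr("203.0.113.7"))
    print(f"{upload_seconds(1000) / 60:.0f} min for a 1 TB dataset")
```

The command list returned by rclone_upload_cmd could be handed to subprocess.run. At the quoted rate, a 1 TB dataset works out to roughly 2000 seconds, or about half an hour of upload time.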