The following protocol assumes that cryoem-cloud-tools and Relion-2.0 are installed on your local machine/laptop.


Important:

  1. Users do not need to use the command line to monitor resources or job status. All information will be displayed in the standard output window in the Relion-2.0 GUI.
  2. The GUI can be closed and the job will still continue on AWS; the instance will shut down automatically.
  3. Aliased directories are NOT supported at this time. 

General overview

1. Enter job information

(Tabs: I/O, CTF, Optimisation, Sampling, Helix)

Provide all job information as you normally would: select your .star file, set the job parameters, etc.

Additional notes: 

  • .mrcs stacks are also accepted as inputs for 2D & 3D analysis

2. Skip 'Compute' tab

All compute parameters will be determined automatically, so you do not need to set anything on this tab.

3. Specify queue to AWS

The only information needed to run your job on AWS is specifying a new queue submit command: qsub_aws.

Job information:

  • Number of MPI procs - IGNORE
  • Number of threads - IGNORE
  • Submit to queue? YES
  • Queue name - IGNORE
  • Queue submit command - qsub_aws
  • Standard submission script - /path/to/cryoem-cloud-tools/relion/relion_qsub.sh (Should be default)
  • Minimum dedicated cores per node - IGNORE
  • Additional arguments - [All arguments are passed on to the job running on AWS]

Example screenshot:

4. Submit!

After providing the above information, click: Run now!

5. Monitor job

After you submit your job, information will start to be displayed in the standard output window. This will include information regarding data upload to AWS, starting/stopping instances, etc.

Be patient! The output updates as new processes arrive, which sometimes means waiting minutes until you see the next phase of processing.

If anything goes wrong, there will be an error message displayed in the standard error window and the instance will be powered off.

As the job runs on AWS, you will see the same output information that you would've seen normally in the Relion-2.0 GUI:

  • Iteration number & processing times
  • Current resolution

All output files are transferred back to your local machine, so you can immediately start to interact with your data.

6. Finishing up

When your job completes successfully, the instance will be turned off.

To terminate your job (and its instance) before it has completed, open and edit the 'Note' for the job and include the text: "Kill job". This will terminate the job and the instance. See the 'Job termination on AWS' section below for details.

Monitoring job submissions to AWS

  • Once you submit the job by pressing Run now!, all status updates related to your Relion job AND the AWS instance will appear in the standard output window (i.e. the run.out file in the job directory).
    • This means that you will see when your data are being uploaded, when the instance is being turned on, etc:

  • All results will be synced back to your local machine in real time
    • This means that you can visualize any results (final, or intermediate) on your local machine using the Relion-2.0 GUI

Job termination on AWS

  • Every time a job completes successfully or crashes on the cloud, the instance will be turned off.
  • If a job dies, you cannot simply 'restart' on the same instance; you will have to submit a new job.
  • To terminate a job BEFORE it has completed:
    • Edit the note for the run (e.g. Class3D/job005): Job actions > Edit Note
    • Type Kill job into note at bottom of document:

    • Click Save
    • This will terminate your job and you should see text written to the output window that the job is shutting down.
    • Full shutdown will take approximately 2-3 minutes

Monitoring AWS costs

  • After a job has finished (through completion, termination, or job crashing), a text file will be updated in your project directory: aws_relion_costs.txt
  • This file will provide information on the input directory name, output directory name, and cost.
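As a sketch of how you might tally your spending from this file: the snippet below sums the dollar amounts in a list of cost-file lines. Note that the exact layout of aws_relion_costs.txt is an assumption here (input directory, output directory, cost per line); adjust the parsing to match the real file.

```python
import re

def total_cost(lines):
    """Sum the trailing dollar amount from each cost-file line.

    Assumed line format (hypothetical): "Extract/job012, Class2D/job013, $4.25"
    """
    total = 0.0
    for line in lines:
        # Look for a number (optionally preceded by '$') at the end of the line
        match = re.search(r"\$?(\d+(?:\.\d+)?)\s*$", line.strip())
        if match:
            total += float(match.group(1))
    return total

sample = [
    "Extract/job012, Class2D/job013, $4.25",
    "Class2D/job013, Refine3D/job014, $12.80",
]
print(round(total_cost(sample), 2))  # 17.05
```

To run this over the real file, read it with `open("aws_relion_costs.txt")` and pass the lines in.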

Expert options

Include these options via the 'Additional arguments' input line in the Relion-2.0 GUI:

--deleteaws To delete all data from EBS and S3 storage after job completes

--instance [instanceType] To specify the desired instance, overriding automated decision making

  • Note: The only supported instance types are: p2.xlarge, p2.8xlarge, p2.16xlarge, g3.4xlarge, g3.8xlarge, g3.16xlarge

--spot [spot price] Specify a spot price for the AWS instance; the request will only be fulfilled if the requested price is above the current spot price.

--az [availability zone] Specify region+availability zone name (e.g. us-east-2a)
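These flags can be combined on the same 'Additional arguments' line. For example (the instance type and spot price below are illustrative values, not recommendations):

```
--deleteaws --instance p2.8xlarge --spot 0.90 --az us-east-2a
```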


Full processing on AWS

The following steps will walk users through the full processing pipeline utilizing AWS computing resources. Note that you can jump into any step of the pipeline to leverage AWS resources; you do NOT need to go in order.

Import

Import local movies into Relion GUI:

  • Input files: Micrographs/*.mrcs
  • Node type: 2D micrograph movies (*.mrcs)

Motion correction

Specify type of movie alignment to perform. Only MOTIONCORR and MOTIONCOR2 are available at this time. 

Include options as you specify normally for movie alignment and then submit to AWS (qsub_aws).

Running MOTIONCOR2:

  • During movie alignment with MOTIONCOR2, if you specify dose weighting, BOTH output files will be saved to the output directory: the non-dose weighted (.mrc) and the dose weighted (DW.mrc).
  • Resulting output files:
    • corrected_micrographs.star - non-dose weighted STAR File. Use this file for CTF estimation.
    • corrected_micrographs_doseWeighted.star - dose weighted STAR file. Use this file for particle extraction.

Additional information: 

  • Make sure to remove any special characters from the path for the MOTIONCORR executable
  • We highly recommend using projects if you are doing movie alignments. It will help you keep track of your data on AWS. Read more about projects here.
  • All aligned micrographs will be returned to your local machine; however, if you choose to save aligned movies, they will remain on AWS only.
  • If you've already uploaded your movies as a part of a different movie alignment, you can include the project name and folder name as the input star file:
    • Locate project name using aws_list_projects (e.g. empiar-10061)
    • Local directory in project that has movies (e.g. motioncorr-job007)
    • Provide the project name and directory as the input in the Relion GUI, using 's3-' as a prefix.
      • For the example above, the directory: empiar-10061/motioncorr-job007/ will be input as: s3-empiar-10061/motioncorr-job007/
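The 's3-' prefix convention above amounts to simple string concatenation. As an illustration, a hypothetical helper (not part of cryoem-cloud-tools) that builds the input value:

```python
def s3_input_path(project, directory):
    """Build the Relion GUI input value for data already uploaded to AWS.

    The 's3-' prefix tells cryoem-cloud-tools to reuse the movies stored
    under the given project instead of uploading them again.
    """
    return "s3-{}/{}/".format(project, directory.strip("/"))

print(s3_input_path("empiar-10061", "motioncorr-job007"))
# s3-empiar-10061/motioncorr-job007/
```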

CTF estimation

Due to its speed, we are only supporting Gctf at this time.

Provide microscope information as you would normally, and then submit the job to AWS (using qsub_aws). All output results will be synced back to your local machine.

Depending on the size of the job, it will launch on either 1 or 8 GPUs.

Additional information: 

  • If you are using aligned micrographs from MOTIONCOR2 on AWS (from above), input: corrected_micrographs.star (non-dose weighted)
  • Leave Gctf executable field empty
    • Remove any special characters from the Gctf executable field

Manual picking

All manual picking will be performed using your local installation of Relion-2.0. This ensures that there is no network latency slowing down your interaction with the data.

Auto-picking

Workflow for auto-picking with AWS:

  1. Tune auto-picking parameters using LOCAL execution of the command.
    • This is OK to run even on a laptop; it will only take a few minutes.
  2. Once you find parameters that you would like to run over all micrographs, submit to the cloud (qsub_aws). This will execute GPU-accelerated particle picking on AWS.

Particle extraction

Two choices for particle extraction:

  1. Local extraction: Extract particles without submitting to AWS. This should be OK in most cases.
  2. Extract on AWS: If desired, the extraction step can be performed on AWS; just specify Submit to queue? Yes and qsub_aws as the queue submit command.

Particle sorting

This analysis is currently not supported on AWS, and therefore must be performed locally (for the time being).

Subset selection

This is a task that must be performed locally (without AWS).

2D classification

Provide an input star file (or a .mrcs particle stack) for 2D classification. Input parameters for the job as you would normally, and then submit to qsub_aws.

Note: Datasets with more than 100,000 particles will be analyzed using the 16 GPU machine on AWS (p2.16xlarge). This is much faster than a 4 GPU machine. Please check out the benchmarking page for more information regarding run times.
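The instance selection described in the note above is a simple threshold. The sketch below restates that documented rule; it is not the tool's actual selection code, and the choice for smaller datasets is made automatically by cryoem-cloud-tools.

```python
def uses_16_gpu_instance(num_particles):
    """Documented rule for 2D classification: datasets with more than
    100,000 particles are analyzed on the 16-GPU p2.16xlarge instance."""
    return num_particles > 100_000

print(uses_16_gpu_instance(250_000))  # True
print(uses_16_gpu_instance(50_000))   # False
```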

Expert options:

  • Delete all data from AWS after job finishes: --deleteaws
    • By default, this program will leave data on S3 and EBS volumes after job finishes, so that users do not have to re-upload data for each run. This can be bypassed by specifying --deleteaws through the 'Additional commands' input line in the Relion GUI.
  • Change availability zone for job submission: --az {zonename}
    • To override default availability zone selection, users can provide the desired availability zone directly through the GUI through the option: --az {zonename}
    • Example: --az us-west-2c

3D classification

Provide an input star file (or a .mrcs particle stack) for 3D classification. Input parameters for the job as you would normally, and then submit to qsub_aws.

Note: Please check out the benchmarking page for more information regarding run times.

Expert options:

  • Delete all data from AWS after job finishes: --deleteaws
    • By default, this program will leave data on S3 and EBS volumes after job finishes, so that users do not have to re-upload data for each run. This can be bypassed by specifying --deleteaws through the 'Additional commands' input line in the Relion GUI.
  • Change availability zone for job submission: --az {zonename}
    • To override default availability zone selection, users can provide the desired availability zone directly through the GUI through the option: --az {zonename}
    • Example: --az us-west-2c

3D auto-refine

Provide an input star file (or a .mrcs particle stack) for 3D auto-refinement. Input parameters for the job as you would normally, and then submit to qsub_aws.

Note: All datasets will be analyzed using the 8 GPU machines on AWS (p2.8xlarge) since auto-refine requires more than 1 GPU. Please check out the benchmarking page for more information regarding run times.

Expert options:

  • Delete all data from AWS after job finishes: --deleteaws
    • By default, this program will leave data on S3 and EBS volumes after job finishes, so that users do not have to re-upload data for each run. This can be bypassed by specifying --deleteaws through the 'Additional commands' input line in the Relion GUI.
  • Change availability zone for job submission: --az {zonename}
    • To override default availability zone selection, users can provide the desired availability zone directly through the GUI through the option: --az {zonename}
    • Example: --az us-west-2c

Movie refinement

This job will be submitted to AWS. Input all required parameters and then submit using qsub_aws. During this task, multiple d2.8xlarge instances will first be used for movie extraction, followed by a single x1.32xlarge instance performing the 3D refinement step.

Note:

  • Increase your AWS limit for d2.8xlarge instances to 8
  • 'Process micrographs in batches of ...' is ignored in addition to MPI job parameters. These are determined automatically.

Particle polishing

This job will be submitted to AWS. Input all required parameters and then submit using qsub_aws; the job will be processed on x1.32xlarge instances.

Important: This assumes that Movie refinement was performed on AWS.

Mask creation

This will run on your local machine.

Join star files

This will run on your local machine.

Particle subtraction

This job will be submitted to AWS. Input all required parameters and then submit using qsub_aws.

Post-processing

This will run on your local machine.

Local resolution

This will run on your local machine.