Thursday, 5 May 2016

Run your R computing with RStudio in the cloud!

The goal of this tutorial is to give a quick introduction to running R on Amazon Web Services (AWS) Elastic Compute Cloud (EC2) instances.


Two paths are discussed: (1) Quick and (2) Secure.


The quick path does not provide sufficient security for the server running R, so it should only be used for playing with the environment. The secure path tunnels the connection through SSH and offers industry-standard security.


Please note that this is a tutorial for people taking their first steps in cloud computing. Users with AWS experience can simply follow these three steps:


1. Start an EC2 instance using an AMI from http://www.louisaslett.com/RStudio_AMI/

2. Either open an HTTP connection to the EC2 instance on port 80 (Quick path) or set up an SSH tunnel to the remote port (Secure path)

3. Connect with a web browser using the default credentials: "rstudio" as both login and password.

Prerequisites

In order to use R in the cloud you need an Amazon Web Services (AWS) account. Detailed instructions on how to create one were given in previous posts.

1 The quick path

The quick path simply requires clicking through the AWS web page. Please follow these steps to get your GNU R in the cloud:

1. Log in to your AWS account

2. From the available services select "EC2 - Virtual Servers in the Cloud"

3. Press the "Launch Instance" button

4. Click on the "Community AMIs" tab

5. In the search field type "www.louisaslett.com". This will list the prebuilt Amazon Machine Images (AMIs) by Louis Aslett described at http://www.louisaslett.com/RStudio_AMI/

6. Select an AMI with the latest RStudio version

7. In "Step 2: Choose an Instance Type" select the hardware to run your AMI. For computational purposes you usually want c4.* instances. If you do not know what to choose, select c4.large.

8. Click "Review and Launch" in the bottom right corner

9. Now click the "Edit security groups" link

10. Click "Add rule"

11. In the "Type" dropdown list select "HTTP"

12. Click "Review and Launch" in the bottom right corner

13. Click "Launch" in the bottom right corner

14. Select an existing key pair that will be used to connect to your instance, or create a new key pair. If you choose to create a key pair, download the key file and store it in a secure location. Once the key pair is set, click "Launch instances"

15. You will see the message "Your instances are now launching". Click on the instance ID starting with "i-".

16. After clicking the link you will be taken to the "Instance list" screen

17. Copy the "Public DNS address" (in this example ec2-52-23-164-62.compute-1.amazonaws.com) into your web browser

18. Type "rstudio" as both user and password to log in. Carefully read the Welcome.R file and follow the instructions to change the password.

19. The setup is complete! Remember to shut down the instance after finishing your computations. To shut down an instance, right-click on it and choose either "stop" (to continue using it later) or "terminate" (deletes the instance).
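For readers who would rather script the shutdown from step 19 than click through the console, here is a minimal Python sketch. It assumes the third-party boto3 package and configured AWS credentials, neither of which is part of this tutorial; the function name shut_down and the instance ID in the comment are hypothetical placeholders.

```python
def shut_down(ec2_client, instance_id, keep_for_later=True):
    """Stop or terminate a single EC2 instance.

    ec2_client is assumed to be a boto3 EC2 client (boto3.client("ec2")).
    keep_for_later=True  -> "stop": the instance can be started again later.
    keep_for_later=False -> "terminate": the instance is deleted.
    """
    if keep_for_later:
        return ec2_client.stop_instances(InstanceIds=[instance_id])
    return ec2_client.terminate_instances(InstanceIds=[instance_id])


# Example use (requires boto3 and AWS credentials; the ID is a placeholder):
#   import boto3
#   shut_down(boto3.client("ec2"), "i-0123456789abcdef0")
```

The client is passed in as an argument so the function can be tried out with a stand-in object before pointing it at a real account.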

2 The secure path

The instructions in the previous section use an unencrypted HTTP connection over the Internet and hence offer no security for the user: all transferred passwords and data can easily be read by a third party. The easiest solution to this problem is to tunnel your HTTP connection through SSH, so that the data transfer is safely encrypted with the SSH protocol.

Prerequisites

For the secure connection you need:

• The key file discussed in step 14 of the "quick" tutorial.

• putty.exe, which can be obtained from http://www.putty.org/, or another SSH client.

In order to follow the secure path:

• In step 11, do not allow an HTTP connection

• Instead, connect through an SSH tunnel. You can use either PuTTY or ssh.exe to set up SSH tunneling of the HTTP protocol.
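As an illustration of the tunnel, the Python sketch below assembles a command line for the OpenSSH client (ssh / ssh.exe) that forwards a local port to port 80 on the instance. Several values are assumptions not stated in this post: the key file name my-key.pem and the local port 8080 are hypothetical, the login user "ubuntu" assumes the AMI is Ubuntu-based, and SSH (port 22) must be allowed in the instance's security group. The host name is the example address from step 17.

```python
import shlex


def build_tunnel_command(key_file, host, local_port=8080,
                         remote_port=80, user="ubuntu"):
    """Return the argument list for an OpenSSH local port forward."""
    return [
        "ssh",
        "-i", key_file,   # private key downloaded in step 14
        "-N",             # run no remote command, only forward the port
        "-L", f"{local_port}:localhost:{remote_port}",  # local -> remote port
        f"{user}@{host}",  # login user is an assumption (Ubuntu-based AMI)
    ]


# my-key.pem is a placeholder name for your own key file.
cmd = build_tunnel_command("my-key.pem",
                           "ec2-52-23-164-62.compute-1.amazonaws.com")
print(shlex.join(cmd))  # the command to paste into a terminal
```

With the tunnel running, RStudio should be reachable in the browser at http://localhost:8080 instead of the public DNS address, and the traffic travels inside the encrypted SSH connection.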


