Speaker
Dr. Sounak Chakraborty
Online Short Course

Introduction to Cloud Computing with R in Google Cloud and Amazon Web Services

By Dr. Sounak Chakraborty, Associate Professor, Department of Statistics, University of Missouri, Columbia

Date: September 19, 2020 (Saturday), 9.00 am – 4 pm (attendance online only)

Location: Online Short Course

The attendance cap is set at 150. All course materials, software requirements, and installation guidelines will be provided one week before the day of the short course through e-mail to the registered participants.

No refund will be issued after 09/11/2020, 4 pm (CST).

Course Description:

The course will provide a gentle introduction to running R code on cloud computing platforms, using Google Cloud (GC) and Amazon Web Services (AWS), for basic statistical modeling and calculations. R is one of the world's most popular open-source statistical programming languages and is widely used in academia and by major companies.

Top Tier Companies using R and developing cloud-based solutions:

  1. Facebook - For behavior analysis related to status updates and profile pictures.
  2. Google - For advertising effectiveness and economic forecasting.
  3. Twitter - For data visualization and semantic clustering.
  4. Microsoft - Acquired Revolution Analytics (maker of Revolution R) and uses R for a variety of purposes.
  5. Uber - For statistical analysis.
  6. Airbnb - To scale data science.
  7. IBM - Joined the R Consortium.
  8. ANZ - For credit risk modeling.

Google Cloud (GC) and Amazon Web Services (AWS) include many different computational tools, ranging from storage systems and virtual servers to databases and analytical tools. For R programmers, familiarity and experience with these tools can pay off in efficiency, coding style, cost savings, and more. Many of the cloud computing resources provided by GC and AWS are either free or available at minimal cost.

Course topics will include:

  • What is cloud computing and what are the benefits of using a cloud computing platform over traditional desktops and expensive clusters.
  • A basic recap of, and introduction to, R
  • Basic introduction to running R code in Google Cloud (GC)
  • Basic introduction to running R code in Amazon Web Services (AWS)
  • Under both GC and AWS we will learn how to use R for a) reading, manipulating, and writing data files; b) simple statistical analysis (descriptive statistics, t-test, ANOVA, regression and classification methods); c) generating some basic plots for visualization (scatterplots, histograms, etc.)
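
For a rough sense of what the R topics above look like in practice, here is a minimal sketch. It is not taken from the course materials: it uses R's built-in mtcars dataset, and the temporary CSV file and the variable names are illustrative assumptions. The same script should run unchanged on a desktop or on a cloud virtual machine with R installed.

```r
# Illustrative sketch only; not part of the official course materials.
# Uses the built-in mtcars dataset so it runs without any downloads.

# a) Write, read, and manipulate a data file
csv_path <- tempfile(fileext = ".csv")   # hypothetical file location
write.csv(mtcars, csv_path, row.names = FALSE)
cars <- read.csv(csv_path)
# In mtcars, am is coded 0 = automatic, 1 = manual
cars$am <- factor(cars$am, labels = c("automatic", "manual"))

# b) Simple statistical analysis
print(summary(cars$mpg))                         # descriptive statistics
tt      <- t.test(mpg ~ am, data = cars)         # two-sample t-test
aov_fit <- aov(mpg ~ factor(cyl), data = cars)   # one-way ANOVA
lm_fit  <- lm(mpg ~ wt + hp, data = cars)        # linear regression
print(summary(lm_fit))

# c) Basic plots
hist(cars$mpg, main = "Miles per gallon", xlab = "mpg")
plot(cars$wt, cars$mpg, xlab = "Weight (1000 lbs)", ylab = "mpg")
```

When run non-interactively (for example with Rscript on a remote machine), base R writes the plots to a PDF file in the working directory rather than opening a window.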

The course will be completely online and self-paced, so participants can learn all the material at their own convenience from the comfort of their homes. There will be a live interactive Q&A session where students can ask the instructor questions. Instructions for recommended software and other course materials will be made available before the course. All recommended software will be open source and can be downloaded free of cost.

Course Fee Structure:

University of Missouri System Students (undergrad and grad): $30
University of Missouri System Faculty and Staff: $50
Non-University of Missouri System Academic: $100
General Admission Non-Academic: $250

Virtual “seats” are limited, so register as soon as possible. To register, click here.

About the Course Instructor:

Dr. Sounak Chakraborty is an Associate Professor in the Department of Statistics, University of Missouri. His research interests are Bayesian machine learning, variable selection in high-dimensional problems, data mining, non-linear models for complex data sets, statistical models for multi-platform data integration, and Markov chain Monte Carlo driven computational models. He has worked on wide-ranging applications of statistical models and tools in areas such as bioinformatics, medical science, sociology, ecology, business analytics, biomedical engineering, nanotechnology, and nanoscience. Prof. Chakraborty is highly proficient in R, Matlab, SAS, C++, and Python, and has worked extensively on developing and implementing complex statistical models using R, Matlab, C++, and Python in various practical areas.

Materials (To Be Distributed Beforehand through e-mail to all registered participants):

  • Download Instructions (step-by-step guide with screenshots)
  • Short Course Slides (PPT, PDF files)
  • R Code Modules (.txt or .R files)
  • Toy Datasets (.csv, .txt files)

Topics To be Covered & Tentative Schedule (All times in CST):

Session 1: Introduction to Cloud Computing

9.00 am - 9.30 am 

  • Overview 
  • Advantages of Cloud Computing
  • What is Google Cloud
  • What is Amazon Web Services

Break 9.30 am - 9.45 am

Session 2: A Recap/Introduction to R

9.45 am - 11.00 am

  • What is R
  • Reading Data, Data Summaries, Descriptive Statistics 
  • Basic Statistical Analysis using R
  • Data visualization in R

Break 11.00 am - 11.15 am

Session 3: Running R in Google Cloud Platform

11.15 am - 12.15 pm 

  • How to set up virtual machines in Google Cloud
  • Submitting R code and running statistical analysis in Google Cloud
  • Parallel Computing in Google Cloud

Lunch 12.15 pm - 1.15 pm

Session 3: Running R in Google Cloud Platform Continued

1.15 pm - 2.00 pm

  • How to set up virtual machines in Google Cloud
  • Submitting R code and running statistical analysis in Google Cloud
  • Parallel Computing in Google Cloud

Session 4: Running R in Amazon Web Services Platform

2.00 pm - 3.00 pm 

  • How to set up virtual machines in Amazon Web Services
  • Submitting R code and running statistical analysis in AWS
  • Special features and parallel computing in AWS

Break 3.00 pm - 3.15 pm

Session 4: Running R in Amazon Web Services Platform Continued

3.15 pm - 4.00 pm 

  • How to set up virtual machines in Amazon Web Services
  • Submitting R code and running statistical analysis in AWS
  • Special features and parallel computing in AWS