About The Course
This course is designed for professionals who want to learn the R language for analytics. It starts with the basics, such as an introduction to R programming and how to import and manipulate data in various formats, and moves on to advanced topics such as data mining techniques, predictive analysis based on past data, and data visualisation using R Commander and Deducer.
After the completion of ‘Mastering Data Analytics with R’ at LearnChase, you should be able to:
1. Understand the fundamentals of ‘R’ programming
2. Explore data manipulation with functions like grepl(), sub(), apply(), etc.
3. Apply various Data Importing techniques in R
4. Perform exploratory Data Analysis
5. Learn where to use functions such as cor(), llist(), hclust(), lm(), glm(), etc.
6. Apply Data Visualisation to create fancy plots
7. Understand Machine Learning (ML) Techniques
8. Apply Data Mining and understand Decision Trees and Random Forests
9. Implement the k-means clustering algorithm to perform Text Analysis
10. Study Association Rule Mining to predict buyers’ next purchase
11. Explore and understand Sentiment Analysis
12. Understand the concept of Regression
13. Implement Linear and Logistic Regression and understand Anova
14. Apply Predictive Analytics to predict outcomes
15. Work on a real-life Project, implementing R Analytics to create Business Insights
Who should go for this course?
This course is meant for students and professionals who are interested in working in the analytics industry and are keen to enhance their technical skills with exposure to cutting-edge practices. It is a great fit for anyone who aims to become a data analyst in the near future, and is especially relevant for professionals with a background in Mathematics, Statistics or Economics who want to learn Business Analytics.
Towards the end of the Course, you will be working on a live project. You can choose any of the following Problem Statements as your Project work :
Project Title: Census Data Analysis
Industry : Government Dataset
Description : Analyze the census data and predict whether a person's income exceeds $50K per year. Follow an end-to-end modelling process involving:
1. Perform Exploratory Data Analysis and establish hypotheses about the data.
2. Test for Multicollinearity, handle outliers and treat missing data.
3. Create training and validation datasets using Stratified Random Sampling (SRS) of data.
4. Fit Classification model on training set (Logistic Regression/Decision Tree)
5. Perform validation of the models (ROC curve, Confusion Matrix)
6. Evaluate and freeze the final model.
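As a rough illustration of steps 3 to 5, the sketch below uses base R only. The data is simulated, and the column names (age, education_years, high_income), the 70/30 split and the 0.5 cut-off are assumptions made for this example, not part of the actual census dataset or the course material.

```r
# Synthetic stand-in for the census data (hypothetical columns, not the real dataset)
set.seed(42)
n <- 400
census <- data.frame(
  age = round(runif(n, 18, 65)),
  education_years = sample(8:18, n, replace = TRUE)
)
# Simulated outcome: 1 means income above $50K
log_odds <- -9 + 0.08 * census$age + 0.4 * census$education_years
census$high_income <- rbinom(n, 1, plogis(log_odds))

# Stratified random sampling: draw 70% of the rows within each outcome class
train_idx <- unlist(lapply(
  split(seq_len(n), census$high_income),
  function(rows) rows[sample.int(length(rows), floor(0.7 * length(rows)))]
))
train <- census[train_idx, ]
valid <- census[-train_idx, ]

# Logistic regression fitted on the training set
fit <- glm(high_income ~ age + education_years, data = train, family = binomial)

# Confusion matrix on the validation set at an assumed 0.5 cut-off
pred_prob <- predict(fit, newdata = valid, type = "response")
pred_class <- factor(as.integer(pred_prob > 0.5), levels = c(0, 1))
confusion <- table(actual = factor(valid$high_income, levels = c(0, 1)),
                   predicted = pred_class)
print(confusion)
```

Sampling within each class keeps the share of high-income rows roughly equal in the training and validation sets, which is the point of stratifying before fitting a classifier.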
Project Title: Sentiment Analysis of Twitter Data
Industry : Social Media
Description : A sports gear company is planning to promote its brand by putting its logo on the jersey of an IPL team. We assume that the team that is more popular on Twitter will give a better ROI. So, we evaluate two IPL teams based on their social media popularity, and the team that is more popular on Twitter will be chosen for brand endorsement. The data to be analyzed is streamed live from Twitter and sentiment analysis is performed on it. The final output is a comparative plot of both teams, so that the clear winner can be seen. The following steps are involved:
1. Set up a connection with Twitter using the twitteR package and perform authentication using the handshake function.
2. Import tweets from the official Twitter handle of the two teams using the searchTwitter() function.
3. Prepare a sentiment function in R that takes a tweet and lists of positive and negative words as arguments and returns the tweet's positive or negative score.
4. Calculate the score for each tweet.
5. Compare the scores of both teams and visualize them.
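Steps 3 to 5 could look roughly like the sketch below. It uses base R only and made-up tweets instead of a live Twitter connection. The word lists, the team names and the score_sentiment() helper are hypothetical and exist only for this illustration.

```r
# Tiny illustrative word lists (assumed; a real lexicon would be far larger)
positive_words <- c("good", "great", "win", "love", "best")
negative_words <- c("bad", "poor", "lose", "hate", "worst")

# Score one tweet: count of positive words minus count of negative words
score_sentiment <- function(tweet, pos = positive_words, neg = negative_words) {
  cleaned <- tolower(gsub("[^A-Za-z ]", " ", tweet))  # replace punctuation and digits with spaces
  words <- strsplit(cleaned, " +")[[1]]
  sum(words %in% pos) - sum(words %in% neg)
}

# Made-up tweets for two hypothetical teams
tweets <- data.frame(
  team = c("Team A", "Team A", "Team B", "Team B"),
  text = c("Great win, love this team!", "Best game ever",
           "Worst bowling, bad day", "Good effort but poor finish"),
  stringsAsFactors = FALSE
)
tweets$score <- vapply(tweets$text, score_sentiment, numeric(1), USE.NAMES = FALSE)

# Total score per team, then a simple comparison plot
team_scores <- tapply(tweets$score, tweets$team, sum)
print(team_scores)
barplot(team_scores, main = "Sentiment score by team", ylab = "Total score")
```

With these four sample tweets, Team A totals 4 and Team B totals -2. A word-count score like this ignores negation and sarcasm, so it is only a starting point.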
Can I work on my own Use-Case?
Sure, you can. You can either choose one of the Use-Cases from the LearnChase Repository or create your own.
1. Introduction to Data Analytics
Learning Objectives – This module explains what Business Analytics is and how R can play an important role in solving complex analytical problems. It also explains what R is and how it is used by giants like Google, Facebook, Bank of America, etc.
Topics – Understand Business Analytics and R, Knowledge of the R language, community and ecosystem, Understand the use of R in the industry, Compare R with other analytics software, Install R and the packages useful for the course, Perform basic operations in R using the command line, Learn to use the RStudio IDE and various GUIs, Use the help feature in R, Knowledge of the worldwide R community collaboration.
2. Introduction to R Programming
Learning Objectives – This module starts from the very basics of R programming, like data types and functions. We present a scenario and let you think about the options to resolve it. For example, which data type would you use to store a given variable, or which R function can help you in this scenario?
Topics – The various data types in R and their appropriate uses, Built-in functions in R like seq(), cbind(), rbind(), merge(), Knowledge of the various subsetting methods, Summarize data by using functions like str(), class(), length(), nrow(), ncol(), Use of functions like head() and tail() for inspecting data, Take part in a class activity to summarize data.
3. Data Manipulation in R
Learning Objectives – In this module, we start with a sample dirty data set and perform Data Cleaning on it, resulting in a data set that is ready for analysis. Along the way, we use and explore the popular functions required to clean data in R.
Topics – The various steps involved in Data Cleaning, Functions used in Data Inspection, Tackling the problems faced during Data Cleaning, Uses of functions like grepl(), grep(), sub(), Coercing data, Uses of the apply() functions.
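To give a flavour of these functions, here is a minimal base R sketch on a made-up dirty data set. The column names and values are invented for illustration and are not taken from the course material.

```r
# A small, made-up "dirty" data set: stray text in numeric columns
dirty <- data.frame(
  name = c("Asha", "Ben", "N/A", "Dev"),
  height_cm = c("170", "165cm", "180", "unknown"),
  weight_kg = c("65", "70", "82kg", "77"),
  stringsAsFactors = FALSE
)

# grepl() flags rows whose name is a placeholder rather than real data
clean <- dirty[!grepl("^N/A$", dirty$name), ]

# sub() strips a trailing unit label; as.numeric() then coerces text to numbers
# (anything still non-numeric, such as "unknown", becomes NA)
clean$height_cm <- suppressWarnings(as.numeric(sub("cm$", "", clean$height_cm)))
clean$weight_kg <- suppressWarnings(as.numeric(sub("kg$", "", clean$weight_kg)))

# apply() over columns (MARGIN = 2) computes the mean of each numeric column
col_means <- apply(clean[, c("height_cm", "weight_kg")], 2, mean, na.rm = TRUE)
print(col_means)
```

The placeholder row is dropped, "165cm" becomes 165, and "unknown" becomes NA, which na.rm = TRUE then skips when the column means are computed.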
4. Data Import Techniques in R
Learning Objectives – This module covers the versatility and robustness of R, which can take in data in a variety of formats, from a CSV file to data scraped from a website. It teaches you various data importing techniques in R.
Topics – Import data from spreadsheets and text files into R, Import data from other statistical formats like sas7bdat and SPSS, Install the packages used for database import, Connect to an RDBMS from R using ODBC and run basic SQL queries in R, Basics of Web Scraping.
5. Exploratory Data Analysis
Learning Objectives – In this module, you will learn why exploratory data analysis is an important step in any analysis. EDA is for seeing what the data can tell us beyond formal modeling or hypothesis testing. You will also learn about the various tasks involved in a typical EDA process.
Topics – Understanding Exploratory Data Analysis (EDA), Implementation of EDA on various datasets, Boxplots, Understanding cor() in R, EDA functions like summarize(), llist(), Multiple packages in R for data analysis, Fancy plots like the Segment plot and HC plot in R.
6. Data Visualization in R
Learning Objectives – In this module, you will learn that visualization is the USP of R. You will learn the concepts of creating simple as well as complex visualizations in R.
Topics – Understanding Data Visualization, Graphical functions present in R, Plot various graphs like tableplot, histogram, boxplot, Customizing Graphical Parameters to improve the plots, Understanding GUIs like Deducer and R Commander, Introduction to Spatial Analysis.
7. Data Mining: Clustering Techniques
Learning Objectives – This module introduces the various Machine Learning algorithms. It covers the two types of Machine Learning, Supervised Learning and Unsupervised Learning, and the difference between them. We will also discuss ‘K-means Clustering’ and implement it in this module.
Topics – Introduction to Data Mining, Understanding Machine Learning, Supervised and Unsupervised Machine Learning Algorithms, K-means Clustering.
8. Data Mining: Association Rule Mining and Sentiment Analysis
Learning Objectives – This module discusses the very popular ‘Association Rule Mining’ technique, covering the algorithm and its various aspects. We will also discuss what ‘Sentiment Analysis’ is and how we can fetch, extract and mine live data from Twitter to find out the sentiment of the tweets.
Topics – Association Rule Mining, Sentiment Analysis.
9. Linear and Logistic Regression
Learning Objectives – This module covers the basics of regression techniques. Linear and logistic regression are explained from the very basics with examples and implemented in R using case studies dedicated to each type of regression discussed.
Topics – Linear Regression, Logistic Regression.
10. Anova and Predictive Analysis
Learning Objectives – This module tells you about the Analysis of Variance (Anova) Technique. Another topic that is discussed in this module is Predictive Analysis.
Topics – Anova, Predictive Analysis.
11. Data Mining: Decision Trees and Random Forest
Learning Objectives – This module covers the concepts of Decision Trees and Random Forest. The algorithm for creating trees and forests is discussed step by step and explained with examples. At the end of the class, these concepts are implemented on a real-life data set. The case studies are present in the LMS.
Topics – Decision Trees, Algorithm for creating Decision Trees, Greedy Approach: Entropy and Information Gain, Creating a Perfect Decision Tree, Classification Rules for Decision Trees, Concepts of Random Forest, Working of Random Forest, Features of Random Forest.
Learning Objectives – This module discusses the concepts taught throughout the course and their implementation in a Project.
Topics – Analyze Census Data to derive insights on people's income, based on factors like age, education, work-class, occupation, etc.