What you'll learn
- Basic statistical skills such as fundamental theories and terminology
- Introduction to R, data cleaning, data visualisation and packages in R that can be used for data analysis
- How to use Excel for descriptive statistics
- Introduction to Tableau
- How to interpret and present results
Diploma in Data Analytics
Starting Your Data Analyst Journey
Your journey to becoming a data analyst starts with understanding more about data and how it is analysed. This lesson is geared to help you understand why data analysis is an important skill and where it can be used to improve your business decision-making. Each lesson in this module is carefully balanced between theory and practicals, and in this lesson, you will learn how to import and clean data using a variety of methods and tools. You will focus on logical checks that guide you towards thinking about data in a more structured way.
In this lesson, you will begin to understand data in a bit more detail. The aim is to help you understand the different data types (such as categorical vs numerical) and how to read graphically represented data. You will also learn how to describe data using descriptive statistics and how to apply those statistics.
As your journey continues, you will learn how to install the Analysis ToolPak in Excel and use it to calculate some descriptive statistics. This lesson will touch briefly on the basics of probability (with specific reference to Bayes' theorem) and delve into the details of the mean and variance of random variables. This topic neatly ties together a concept that you have previously covered (the mean) with one you are yet to cover (variance).
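To make Bayes' theorem and the mean and variance of a random variable a little more concrete, here is a minimal sketch in Python. It is an illustration only and not part of the course material: the function name `bayes_posterior`, the screening-test probabilities and the fair-die example are all assumptions chosen for the demonstration.

```python
def bayes_posterior(prior, likelihood, false_positive_rate):
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B).

    prior               -- P(A), e.g. the share of people with a condition
    likelihood          -- P(B|A), e.g. chance of a positive test if A is true
    false_positive_rate -- P(B|not A)
    All three inputs are hypothetical illustration values.
    """
    # Total probability of observing B (law of total probability)
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence


# Hypothetical screening test: 1% prevalence, 90% sensitivity, 5% false positives
posterior = bayes_posterior(prior=0.01, likelihood=0.9, false_positive_rate=0.05)

# Mean and variance of a discrete random variable: a fair six-sided die
outcomes = [1, 2, 3, 4, 5, 6]
probabilities = [1 / 6] * 6
mean = sum(x * p for x, p in zip(outcomes, probabilities))
variance = sum((x - mean) ** 2 * p for x, p in zip(outcomes, probabilities))

print(f"P(condition | positive test) = {posterior:.3f}")
print(f"die mean = {mean:.2f}, die variance = {variance:.4f}")
```

Even with a fairly accurate test, the posterior probability stays low when the prior is small, which is the kind of result Bayes' theorem helps you reason about.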
Lesson 4 is all about data distributions. You will learn about the various data distributions (with reference to the Central Limit Theorem) and understand how to use the mean, median and standard deviation to see how your data is distributed. Lastly, you will also learn about skewness and kurtosis.
How Confident Are You in the Sample?
Being confident in your sample is important. This lesson will focus on understanding the difference between a sample and a population, as well as which form of the variance or standard deviation to use for each. You will also cover confidence intervals in more detail, and by the end of this lesson, you will be well on your way to feeling more confident!
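The difference between the population and sample versions of the variance, and the idea of a confidence interval, can be sketched in a few lines of Python. This is a hedged illustration rather than the course's own method: the data values are made up, and the interval uses a normal approximation, whereas a t-based interval is usually preferred for a sample this small.

```python
import math
import statistics

# Hypothetical measurements, invented for this illustration
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

# Population variance divides by n: use it when the data IS the whole population
population_variance = statistics.pvariance(data)

# Sample variance divides by n - 1: use it when the data is a sample
sample_variance = statistics.variance(data)
sample_sd = math.sqrt(sample_variance)

# Approximate 95% confidence interval for the mean (normal approximation)
sample_mean = statistics.mean(data)
z = statistics.NormalDist().inv_cdf(0.975)  # roughly 1.96
margin = z * sample_sd / math.sqrt(n)
lower, upper = sample_mean - margin, sample_mean + margin

print(f"population variance = {population_variance}")
print(f"sample variance     = {sample_variance:.3f}")
print(f"95% CI for the mean = ({lower:.2f}, {upper:.2f})")
```

The sample variance is always slightly larger than the population variance computed on the same numbers, because dividing by n - 1 corrects for estimating the mean from the sample itself.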
Hypothesising About the Outcome
Understanding what a hypothesis is forms an important step in your journey. This lesson will expand on what null and alternative hypotheses are and explore the difference between a Type I and a Type II error. This lesson will also include more information on the Central Limit Theorem and the law of large numbers.
Testing for Differences: Categorical Vars
The penultimate lesson is focused on testing for differences in categorical variables. You will explore one-sample tests, the difference between two means of two populations, and chi-square tests.
Testing for Differences: Numerical Vars
This module will wrap up with testing for differences in numerical variables. In this lesson, one-sample tests, the difference between two means of two populations, and t-tests will be covered. By the end of this lesson, you will have a firm understanding of the basics of data and data analysis. However, the journey does not stop here: in Module 2 you can expect more complex concepts and a deeper understanding of the topic.
Intermediate in Data Analytics
In this lesson, we add a new tool to our data analyst toolkit: R. We will go through the basic steps of downloading and installing the tool and start exploring some of the packages that are available in R. We will end the lesson by introducing another common method for estimating population parameters, the maximum likelihood method.
The first topic will introduce the brilliant tidyverse package by Hadley Wickham, the chief data scientist at RStudio. Thereafter, we will use R to reproduce some of the exploratory data analysis we did with the Titanic dataset in Excel in Module 1. We will end this lesson with a short and sweet introduction to merging and joining datasets.
Introduction to Linear Regression
The first topic for this lesson will introduce linear regression; thereafter, we will dive deeper into understanding the concept of correlation. We will end this lesson by going back to basics with vectors and factors in R.
Linear Regression Continued
This lesson will continue to broaden our understanding of linear regression and data frames. We will understand what it means for the model to fit the data well and gain some further insight into treating data in R. We will end the lesson by exploring some basics of date values in R.
Dates and Times
Lesson 5 will continue to broaden our understanding of dealing with dates and times in R. Many datasets contain dates and times, so handling them needs to be as simple and effective a part of our data analytics arsenal as possible. Therefore, we will continue building on date and time data wrangling throughout a large part of this lesson. We will end the lesson by introducing time series analysis concepts.
Time Series Analysis
This lesson will delve deeper into time series analysis. We will further discuss the concepts that we introduced in the previous lesson and add some new ones. Thereafter, we will break down some of the time series models to understand them a bit better. We will end the lesson by looking at how we can apply these concepts in R in a more practical sense.
Multiple Linear Regression
In this lesson, we will elaborate on the principles of multiple linear regression. We will talk more about the assumptions that accompany linear regression, how to simplify a multiple linear regression model, and problems that can occur when fitting a multiple linear regression model to the data. Thereafter, we will discuss what happens if your model does not fit a linear trend well, in other words, if the data is non-linear. The lesson will end with an introduction to logistic regression.
Introduction to Logistic Regression
In this lesson, we will elaborate on the introduction to logistic regression from Lesson 7. We will better understand when to use this model and how to interpret the outcome. Thereafter, we will elaborate on the model fit statistics we have briefly touched on in previous lessons, such as the AIC and BIC statistics. We will end the lesson by cementing all the knowledge we have gained through Module 2 with a practical demonstration.
Advanced in Data Analytics
Intro to Classification
The first lesson of Module 3 is aimed at introducing you to classification. This will cover what it is and what types of data it's used for.
Part one of logistic regression introduces linear and logistic models, glm() in R, prediction and odds ratios.
More on Logistic Regression
Part two of logistic regression explains probabilities and log odds ratios, the confusion matrix, and accuracy, sensitivity and specificity.
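As a rough illustration of these terms, the sketch below builds a confusion matrix by hand and derives accuracy, sensitivity and specificity from it, along with the log odds of a probability. It is written in Python purely for demonstration; the labels, predictions and helper names are hypothetical and are not taken from the course.

```python
import math


def log_odds(p):
    """Log odds (logit) of a probability p, where 0 < p < 1."""
    return math.log(p / (1 - p))


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp, fn, tn, fp


# Hypothetical actual classes and model predictions
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

tp, fn, tn, fp = confusion_counts(y_true, y_pred)

accuracy = (tp + tn) / (tp + fn + tn + fp)  # share of all cases classified correctly
sensitivity = tp / (tp + fn)                # share of actual positives that were found
specificity = tn / (tn + fp)                # share of actual negatives that were found

print(f"TP={tp} FN={fn} TN={tn} FP={fp}")
print(f"accuracy={accuracy:.2f} sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
print(f"log odds of p=0.8: {log_odds(0.8):.3f}")
```

A model can have high accuracy and still have poor sensitivity or specificity, which is why the three measures are usually reported together.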
This lesson on shrinkage methods delves into lasso and ridge regression.
Dimension Reduction Methods
Lesson 5 is geared towards helping you grasp the concepts of principal component analysis and partial least squares.
We then move to subset selection where we explain stepwise selection as well as forward and backward stepwise regression in more detail.
Time Series Analysis
Nearing the end of the advanced module, you will explore time series analysis, and more specifically manipulating time series data, autoregression and moving averages.
More On Time Series Analysis
The final lesson for this module details the ARIMA models and will help you to visualise time series data more effectively.
Proficient in Data Analytics
Intro to Tableau
In Lesson 1 of the final module, you will be introduced to Tableau (including tips on how to install it).
Building a Business-Savvy Dashboard
This lesson will take you through the steps and best practices for creating a business-savvy dashboard for your specific needs.
Integrating R and Tableau
As you come to understand this program in more detail, we will help you bring appropriate insights and models from R into Tableau.
Functions in Tableau
This lesson is focused on helping you to understand all the statistical functions available in Tableau, helping you get one step closer to mastering this program.
Segmentation and Cohort Analysis
Start to unpack segmentation and cohort analysis.
Scenario and What-if Analysis
In Lesson 6, you will unpack scenario and what-if analysis (using IF-THEN-ELSE statements in Tableau).
Time Series and Predictive Analysis
Nearing the end of the final module, we will take a deep dive into time series and predictive analysis.
Presenting Your Findings and Tying It All Together
To finalise your Diploma in Data Analytics, we bring all the learnings together and help you identify the best way to present your findings based on your audience.