Data Analyst
A data analyst sits between business intelligence and data science. They provide vital information to business stakeholders.
Data Management in SQL (PostgreSQL)
Data Analysis in SQL (PostgreSQL)
Exploratory Analysis Theory
Statistical Experimentation Theory
Free Certification: Data Analyst Certification
Data Scientist Associate
A data scientist is a professional responsible for collecting, analyzing and interpreting extremely large amounts of data.
R / Python Programming
Data Manipulation in R/Python
1.1 Calculate metrics to effectively report characteristics of data and relationships between features
● Calculate measures of center (e.g. mean, median, mode) for variables using R or Python.
● Calculate measures of spread (e.g. range, standard deviation, variance) for variables using R or Python.
● Calculate skewness for variables using R or Python.
● Calculate missingness for variables and explain its influence on reporting characteristics of data and relationships in R or Python.
● Calculate the correlation between variables using R or Python.
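The metrics listed in 1.1 can be sketched in Python as follows. This is a minimal illustration using only the standard library; the `heights` and `weights` values are made-up sample data, `None` stands in for a missing value, and the `skewness` and `pearson` helpers are written here for illustration. In practice a library such as pandas is a common choice for the same calculations.

```python
import statistics

# Hypothetical sample data; None marks a missing value.
heights = [150.0, 160.0, 160.0, 170.0, None, 210.0]
weights = [50.0, 58.0, 61.0, 70.0, 65.0, 95.0]

# Most summary statistics are only defined on the observed values.
observed = [h for h in heights if h is not None]

# Measures of center
mean = statistics.mean(observed)
median = statistics.median(observed)
mode = statistics.mode(observed)

# Measures of spread
value_range = max(observed) - min(observed)
variance = statistics.variance(observed)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(observed)      # sample standard deviation


def skewness(values):
    """Population skewness: third central moment divided by sigma cubed."""
    n = len(values)
    mu = sum(values) / n
    m2 = sum((v - mu) ** 2 for v in values) / n
    m3 = sum((v - mu) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


skew = skewness(observed)

# Missingness: share of entries with no value.
missing_share = sum(h is None for h in heights) / len(heights)

# Correlation uses only the rows where both variables are present.
pairs = [(h, w) for h, w in zip(heights, weights) if h is not None]
r = pearson([h for h, _ in pairs], [w for _, w in pairs])

print(mean, median, mode, value_range, variance, std_dev)
print(skew, missing_share, r)
```

Note that dropping the missing height also drops its paired weight from the correlation, which is one way missingness can influence reported relationships.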
1.2 Create data visualizations in a coding language to demonstrate the characteristics of data
● Create and customize bar charts using R or Python.
● Create and customize box plots using R or Python.
● Create and customize line graphs using R or Python.
● Create and customize histograms using R or Python.
1.3 Create data visualizations in a coding language to represent the relationships between features
● Create and customize scatterplots using R or Python.
● Create and customize heatmaps using R or Python.
● Create and customize pivot tables using R or Python.
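Of the items in 1.3, the pivot table can be sketched without a plotting library. The example below is a minimal illustration that assumes a small, made-up `sales` list of (region, quarter, amount) records and sums the amounts for each region and quarter.

```python
from collections import defaultdict

# Hypothetical records: (region, quarter, amount).
sales = [
    ("North", "Q1", 100.0),
    ("North", "Q2", 150.0),
    ("South", "Q1", 80.0),
    ("South", "Q2", 120.0),
    ("North", "Q1", 50.0),
]

# Rows are regions, columns are quarters, cells hold the summed amount.
pivot = defaultdict(lambda: defaultdict(float))
for region, quarter, amount in sales:
    pivot[region][quarter] += amount

quarters = sorted({quarter for _, quarter, _ in sales})
table = [
    [region] + [pivot[region][q] for q in quarters]
    for region in sorted(pivot)
]

# Print a header row followed by one row per region.
print(["region"] + quarters)
for row in table:
    print(row)
```

A grid of this shape is also the usual input for a heatmap, where each cell value is mapped to a color.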
1.4 Identify and reduce the impact of characteristics of data
● Identify when imputation methods should be used and implement them to reduce the impact of missing data on analysis or modeling using R or Python.
● Describe when a transformation to a variable is required and implement corresponding transformations using R or Python.
● Describe the differences between types of missingness and identify relevant approaches to handling types of missingness.
● Identify and handle outliers using R or Python.
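The tasks in 1.4 can be sketched as below. This is one possible approach under stated assumptions, not a prescribed method: the `incomes` values are made up, missing entries are filled with the median, outliers are flagged with the 1.5 × IQR rule and clipped to the fences, and a log transformation is applied to reduce right skew. Simple median imputation is generally easier to justify when values are missing completely at random; other types of missingness may call for different handling.

```python
import math
import statistics

# Hypothetical incomes with missing entries (None) and one extreme value.
incomes = [32_000, 35_000, None, 38_000, 41_000, 36_000, None, 400_000]

observed = [x for x in incomes if x is not None]

# Imputation: fill missing values with the median, which is less
# sensitive to the extreme value than the mean would be.
median_income = statistics.median(observed)
imputed = [median_income if x is None else x for x in incomes]

# Outlier detection with the 1.5 * IQR rule.
q1, _, q3 = statistics.quantiles(imputed, n=4)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in imputed if x < lower or x > upper]

# One way to handle outliers: clip values to the fences.
clipped = [min(max(x, lower), upper) for x in imputed]

# Transformation: a log transform compresses the long right tail.
log_incomes = [math.log(x) for x in imputed]

print(median_income, outliers)
print(max(clipped), max(log_incomes) - min(log_incomes))
```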
Statistical Fundamentals in R/Python
2.1 Perform standard data import, joining and aggregation tasks
● Import data from flat files into R or Python.
● Import data from databases into R or Python.
● Aggregate numeric variables, categorical variables, and dates by group using R or Python.
● Combine multiple tables by rows or columns using R or Python.
● Filter data based on different criteria using R or Python.
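The import, aggregation, combining, and filtering tasks in 2.1 can be sketched as follows. This is a minimal illustration with several stand-ins: the CSV content is held in a string rather than a file on disk, an in-memory SQLite database substitutes for a real database server, and all table names, column names, and values are hypothetical.

```python
import csv
import io
import sqlite3
from collections import defaultdict

# Flat file: an in-memory stand-in for a CSV file on disk.
csv_text = """order_id,region,amount
1,North,120.5
2,South,80.0
3,North,45.5
4,East,200.0
"""
orders = [
    {
        "order_id": int(row["order_id"]),
        "region": row["region"],
        "amount": float(row["amount"]),
    }
    for row in csv.DictReader(io.StringIO(csv_text))
]

# Database: an in-memory SQLite database stands in for a real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE regions (region TEXT PRIMARY KEY, manager TEXT)")
conn.executemany(
    "INSERT INTO regions VALUES (?, ?)",
    [("North", "Ana"), ("South", "Ben")],
)
managers = dict(conn.execute("SELECT region, manager FROM regions"))
conn.close()

# Aggregate: total amount per region.
totals = defaultdict(float)
for order in orders:
    totals[order["region"]] += order["amount"]

# Combine by columns: left join of orders onto the regions lookup.
joined = [{**o, "manager": managers.get(o["region"])} for o in orders]

# Combine by rows: append records that share the same fields.
more_orders = [{"order_id": 5, "region": "South", "amount": 60.0}]
all_orders = orders + more_orders

# Filter: keep rows that satisfy several criteria at once.
large_north = [
    o for o in all_orders if o["region"] == "North" and o["amount"] > 100
]

print(dict(totals))
print(joined[3], len(all_orders), large_north)
```

The left join keeps the order from the region with no manager on record and fills the gap with `None`, which makes unmatched rows easy to spot afterwards.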
2.2 Perform standard cleaning tasks to prepare data for analysis
● Match strings in a dataset with specific patterns using R or Python.
● Convert values between data types in R or Python.
● Clean categorical and text data by manipulating strings in R or Python.
● Clean date and time data in R or Python.
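The cleaning tasks in 2.2 can be sketched as below. The `raw_rows` records, the deliberately loose `EMAIL_PATTERN`, and the two accepted date formats are assumptions chosen for illustration; real data would need patterns and formats matched to its own conventions.

```python
import re
from datetime import datetime

# Hypothetical messy records as they might arrive from a flat file.
raw_rows = [
    {"email": " Alice@Example.com ", "price": "$1,200.50",
     "city": "new york", "signup": "2023-01-15"},
    {"email": "bob[at]example.com", "price": "980",
     "city": " NEW YORK", "signup": "15/02/2023"},
]

# A deliberately loose email pattern: something@something.something
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")

# Date formats assumed to occur in the data, tried in order.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Return a date for the first matching format, or None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


cleaned = []
for row in raw_rows:
    email = row["email"].strip().lower()
    cleaned.append({
        "email": email,
        # Pattern matching: flag strings that do not look like an email.
        "email_valid": EMAIL_PATTERN.fullmatch(email) is not None,
        # Type conversion: drop currency symbols and separators, then cast.
        "price": float(re.sub(r"[^\d.]", "", row["price"])),
        # Categorical cleanup: trim whitespace and normalize case.
        "city": row["city"].strip().title(),
        # Date cleanup: parse mixed formats into date objects.
        "signup": parse_date(row["signup"]),
    })

for row in cleaned:
    print(row)
```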
2.3 Assess data quality and perform validation tasks
● Identify and replace missing values using R or Python.
● Perform different types of data validation tasks (e.g. consistency, constraints, range validation, uniqueness) using R or Python.
● Identify and validate data types in a data set using R or Python.
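The validation tasks in 2.3 can be sketched as below. The `records`, the allowed plan values, and the 0 to 120 age range are hypothetical rules chosen for illustration; the `validate` helper is not part of any library.

```python
import statistics
from collections import Counter

# Hypothetical rules and records for illustration.
ALLOWED_PLANS = {"basic", "premium"}

records = [
    {"id": 1, "age": 34, "plan": "basic",
     "start": "2023-01-01", "end": "2023-06-01"},
    {"id": 2, "age": None, "plan": "premium",
     "start": "2023-02-01", "end": "2023-03-01"},
    {"id": 2, "age": 212, "plan": "gold",
     "start": "2023-05-01", "end": "2023-04-01"},
    {"id": 4, "age": "41", "plan": "basic",
     "start": "2023-03-01", "end": "2023-09-01"},
]

id_counts = Counter(r["id"] for r in records)


def validate(record):
    """Return a list of problems found in one record."""
    problems = []
    age = record["age"]
    if age is None:
        problems.append("age missing")
    elif not isinstance(age, int):            # data type validation
        problems.append("age has wrong type")
    elif not 0 <= age <= 120:                 # range validation
        problems.append("age out of range")
    if record["plan"] not in ALLOWED_PLANS:   # constraint validation
        problems.append("plan not allowed")
    if id_counts[record["id"]] > 1:           # uniqueness validation
        problems.append("id not unique")
    # Consistency: ISO-formatted date strings sort chronologically.
    if record["start"] > record["end"]:
        problems.append("start after end")
    return problems


report = [validate(r) for r in records]

# Replace missing ages with the median of the ages that passed validation.
valid_ages = [
    r["age"] for r in records
    if isinstance(r["age"], int) and 0 <= r["age"] <= 120
]
fill_value = statistics.median(valid_ages)
repaired = [
    {**r, "age": fill_value} if r["age"] is None else r for r in records
]

for record, problems in zip(records, report):
    print(record["id"], problems)
```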
2.4 Collect data from non-standard formats by modifying existing code
● Adapt provided code to import data from an API using R or Python.
● Identify the structure of HTML and JSON data and parse them into a usable format for data processing and analysis using R or Python.
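Parsing JSON and HTML into rows, as described in 2.4, can be sketched as below. To keep the example self-contained, `api_body` is a hard-coded string standing in for the body of an API response (a real request might fetch it with `urllib.request.urlopen`), and both the payload structure and the HTML table are hypothetical. The `TableParser` class is a small helper written for this illustration.

```python
import json
from html.parser import HTMLParser

# Stand-in for the body of an API response (hypothetical payload).
api_body = (
    '{"results": ['
    '{"name": "Ada", "stats": {"score": 91}}, '
    '{"name": "Lin", "stats": {"score": 78}}]}'
)

# JSON: parse the text, then flatten the nested structure into rows.
payload = json.loads(api_body)
rows_from_json = [
    {"name": item["name"], "score": item["stats"]["score"]}
    for item in payload["results"]
]

# HTML: a small table with a header row followed by data rows.
html_doc = """
<table>
  <tr><th>name</th><th>score</th></tr>
  <tr><td>Ada</td><td>91</td></tr>
  <tr><td>Lin</td><td>78</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data.strip()


parser = TableParser()
parser.feed(html_doc)
header, *body = parser.rows
rows_from_html = [dict(zip(header, cells)) for cells in body]

print(rows_from_json)
print(rows_from_html)
```

Values parsed from HTML arrive as strings, so a type conversion step (for example `int(row["score"])`) is usually needed before analysis.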
Importing & Cleaning in R/Python
Machine Learning Fundamentals in R/Python
Free Certification: Data Science
Data Engineer
A data engineer collects, stores, and pre-processes data for easy access and use within an organization. Associate certification is available.
Data Management in SQL (PostgreSQL)
Exploratory Analysis Theory