Data Science Foundations

Cut through the deluge of data science material by focusing on the essentials. This course uses illustrations, code-based examples and case studies to demonstrate essential data science topics and the practical application of existing machine learning frameworks. By the end of the course, you will have trained and validated machine learning algorithms to make continuous-value as well as discrete-value predictions from data sources relevant in business and engineering. You will also be able to make statistically sound, data-driven decisions in business from sales and production data.

The breakdown for this course is as follows:

Data Topics
- Bias-variance tradeoff
- Regression: linear, logistic, and multivariate
- Regularization: L1 and L2
- Inferential statistics: Mood's median test, t-tests, F-tests, ANOVA
- Descriptive statistics: mean, median, mode, kurtosis, skew
- Beyond regression coefficients: tree-based and resampling methods
- Unsupervised learning: clustering and dimensionality reduction
Software Topics
- Unit Tests
Sessions
- S1: Regression and Analysis
- S2: Inferential Statistics
- S3: Model Selection and Validation
- S4: Feature Engineering
- S5: Unsupervised Learning: Clustering and Dimensionality Reduction
- S6: Bagging: Decision Trees and Random Forests
- S7: Boosting: AdaBoost and XGBoost
Labs
- L1: Descriptive Statistics Data Hunt
- L2: Inferential Statistics Data Hunt
- L3: Feature Engineering
- L4: Supervised Learners
- L5: Writing Unit Tests
Projects
- P1: Statistical Analysis of Tic-Tac-Toe Games
- P2: Heuristic Tic-Tac-Toe Agents
- P3: 1-Step Look Ahead Agents
- P4: N-Step Look Ahead Agents
Extras
- X1: Thinking Data
Reading
- JVDP chapter 5