An Ultimate Data Science Starter Kit for Beginners

Hello everyone! This is the first blog post on my personal website. Since I have received many requests and queries on how to start learning data science, I have decided to write this post in a more practical way.

This post will guide you through the fundamental concepts to learn in the initial stage. Learning these foundational skills will make it much easier to build advanced data science skills later.

Data Science Starter Kit

Month 1 – Foundation phase

In month 1, concentrate on the foundational concepts listed below. From the second month onwards, we will start learning intermediate topics such as regression analysis, classification and time series forecasting.

Tech – 1: Python programming language

S.No  Python topic
1     Python variables, statements, data types
2     Arithmetic, logical and relational operations
3     For loops, while loops, if conditions
4     User-defined functions
5     Python data structures, strings
6     Object-oriented programming – classes and objects
7     NumPy – matrix operations
8     Pandas – data frame operations
9     Seaborn, Matplotlib, Plotly
10    SciPy, statsmodels
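
To give a feel for topics 7 and 8, here is a minimal sketch of a few NumPy and pandas operations. The sales records and column names are made up for illustration and are not from any real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical sales records, invented purely for this example.
sales = pd.DataFrame({
    "region": ["North", "South", "North", "South", "North"],
    "units": [10, 4, 6, 8, 2],
    "unit_price": [2.5, 4.0, 2.5, 4.0, 2.5],
})

# Column arithmetic is vectorised, so no explicit loop is needed.
sales["revenue"] = sales["units"] * sales["unit_price"]

# Filter rows with a boolean mask.
big_orders = sales[sales["units"] >= 6]

# Aggregate revenue per region.
revenue_by_region = sales.groupby("region")["revenue"].sum()

# NumPy matrix operations: a matrix times its inverse gives the identity.
a = np.array([[2.0, 1.0], [1.0, 3.0]])
identity = a @ np.linalg.inv(a)

print(revenue_by_region)
print(np.round(identity, 6))
```

Once this feels comfortable, try the same filtering and aggregation on a dataset of your own.
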
Tech – 2: SQL and Excel

Area   S.No  Topics to learn
SQL    1     Create, insert, alter tables
SQL    2     Aggregating, manipulating and filtering data
SQL    3     Joins and subqueries
SQL    4     Window functions
SQL    5     Unions, index, regex
Excel  1     Mathematical, relational, logical operations
Excel  2     Lookups, INDEX, MATCH
Excel  3     Charts and pivot tables
Excel  4     Macros
Excel  5     Dashboarding
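
The SQL topics can be practiced without installing a database server. The sketch below uses Python's built-in sqlite3 module with an in-memory database. The table names, column names and rows are hypothetical, and the window function assumes a reasonably recent SQLite (3.25 or later), which current Python builds normally ship with.

```python
import sqlite3

# An in-memory database disappears when the connection is closed.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create and fill two small, made-up tables.
cur.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO orders VALUES (1, 1, 120.0), (2, 1, 80.0), (3, 2, 50.0);
""")

# Join + aggregation: total order amount per customer.
totals = cur.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY total DESC
""").fetchall()

# Window function: rank each customer's orders by amount.
ranked = cur.execute("""
    SELECT customer_id, amount,
           RANK() OVER (PARTITION BY customer_id ORDER BY amount DESC) AS rnk
    FROM orders
    ORDER BY customer_id, rnk
""").fetchall()

conn.close()

print(totals)
print(ranked)
```
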
Math – 1: Statistics and probability

S.No  Concepts to learn
1     Different types of data and visualizations
2     Mean, median, frequency, mode, sample and population
3     Central tendency, normal distribution, central limit theorem
4     Point estimate and confidence interval, sampling distribution
5     Null and alternative hypotheses, hypothesis testing and distributions
6     t-tests, F-tests, z-test, ANOVA, chi-square
7     Correlation and covariance
8     Non-parametric hypothesis testing
9     Probability and distributions
10    Different probability distributions
11    Conditional probability and inverse probability
12    Expectation and variance
13    Bayes' theorem
14    Maximum likelihood
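
Several of these concepts can be tried out with nothing more than Python's standard library. The following is a rough sketch, not a rigorous analysis: the sample values and the test probabilities are invented, and the confidence interval uses a normal approximation (for a sample this small, a t-distribution would be the more appropriate choice).

```python
import math
import statistics
from statistics import NormalDist

# Hypothetical sample of 10 measurements, made up for illustration.
sample = [52, 48, 55, 50, 47, 53, 49, 51, 54, 46]
n = len(sample)

# Point estimates of central tendency and spread.
mean = statistics.mean(sample)
median = statistics.median(sample)
std = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# Approximate 95% confidence interval for the mean (normal approximation).
z = NormalDist().inv_cdf(0.975)
margin = z * std / math.sqrt(n)
ci = (mean - margin, mean + margin)

# Bayes' theorem with assumed numbers: P(condition | positive test).
prior = 0.01                # assumed share of people with the condition
sensitivity = 0.95          # assumed P(positive | condition)
false_positive_rate = 0.05  # assumed P(positive | no condition)
p_positive = sensitivity * prior + false_positive_rate * (1 - prior)
posterior = sensitivity * prior / p_positive

print(f"mean={mean}, median={median}, std={std:.2f}")
print(f"approx. 95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"P(condition | positive) = {posterior:.3f}")
```

Notice how low the posterior stays even with an accurate test, because the condition is rare to begin with.
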
Business – 1: Understanding applications and case studies of data science across different domains

Pharmaceuticals

Finance

Banking

Retail

CPG

Telecom

At the end of the first month, you will have a fundamental idea of what data science is and how to handle, visualize and manipulate data and draw findings from it. Try to cross-learn the concepts, for example by practicing the statistics lessons in Python or Excel. At this stage you will have the skill set for a data analyst or business analyst role.

Month 2 – Statistical and machine learning algorithms

Math – 1: Introduction to statistical learning and machine learning

S.No  Algorithms to learn
1     Simple linear regression, multiple linear regression
2     Penalized linear regression and non-linear regression
3     Logistic regression and generalized linear models
4     Linear mixed effects models
5     Decision trees, random forests (bagging)
6     Gradient boosting, adaptive boosting (AdaBoost)
7     Extreme gradient boosting (XGBoost), CatBoost, LightGBM
8     Stacking and blending
9     K-nearest neighbors, naïve Bayes classification
10    Support vector machines
11    K-means clustering, DBSCAN, hierarchical clustering, PCA
12    Time series – ARIMA, AR, MA, SARIMA, exponential smoothing
13    Basics of neural networks – weights and biases
14    Backpropagation and multilayer perceptron
15    Machine learning development lifecycle – MLOps

Tech stack to learn – Month 2

Learn and Practice the following frameworks along with the listed algorithms

Benchmark problems to practice:

Practice regression, classification, clustering and forecasting problems using benchmark datasets.

  1. Flower species classification – Iris dataset
  2. Boston house price prediction – Boston dataset
  3. Clustering different wine categories
  4. Time series forecasting
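
As a starting point for the first problem in the list, here is a minimal sketch that trains a decision tree on the Iris dataset. It assumes scikit-learn is installed. The choice of model, the tree depth and the 70/30 split are illustrative assumptions, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load the Iris data bundled with scikit-learn (150 flowers, 4 features).
X, y = load_iris(return_X_y=True)

# Hold out 30% of the rows for testing, keeping class proportions equal.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Fit a shallow decision tree on the training rows only.
model = DecisionTreeClassifier(max_depth=3, random_state=42)
model.fit(X_train, y_train)

# Evaluate on rows the model has never seen.
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```

Swapping DecisionTreeClassifier for another classifier from the Month 2 list is a good way to compare algorithms on the same split.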

Don’t limit yourself to these problems: explore more datasets and practice the listed algorithms on them. Learn like a machine (I mean, learn from your errors).

Month 3 – Projects and portfolio development

Based on what you learned in the first two months, start building your own projects to your own design.

Manage your projects on GitHub, and use online platforms like Kaggle and Colab to practice and to work alongside the data science community.

Start participating in coding and problem-solving competitions on online platforms like LeetCode, HackerRank, HackerEarth, TechGig, Analytics Vidhya and Analytics India Magazine.

Create your portfolio with a variety of projects covering regression, classification, clustering and forecasting problems.

Prepare a data storytelling deck for your projects and present it along with your portfolio.

Start applying for internships or jobs.

Never stop learning – continue to learn advanced algorithms and methodologies.

Apply the algorithms to serve a purpose or solve a problem in a domain such as retail, pharma, banking, CPG or telecom.

This will help you integrate your business, tech and math skills.

Happy learning – I hope this blog helps you.

Thanks for reading.
