DATA SCIENCE
Course Overview
Python Programming
MODULE 1 PYTHON BASICS
• Introduction to Python
• Installation of Python and IDE
• Python objects
• Tokens in Python and Variables
MODULE 2 PYTHON DATA TYPES
• Basic data types in Python
• Basics of List
• List: Object, methods
• Tuple: Object, methods
• Sets: Object, methods
• Dictionary: Object, methods
MODULE 3 PYTHON CONTROL STATEMENTS
• IF Conditional statement
• IF-ELSE
• NESTED IF
• Python Loops basics
• WHILE Statement
• FOR statements
• BREAK and CONTINUE statements
MODULE 4 PYTHON FUNCTIONS
• Functions basics
• Function Parameter passing
• Lambda functions
• Map, reduce, filter functions
MACHINE LEARNING ASSOCIATE
MODULE 1 MACHINE LEARNING INTRODUCTION
• What Is ML? ML Vs AI Vs DL
• Types of Machine Learning
• Supervised Vs Unsupervised Vs Reinforcement
MODULE 2 PYTHON NUMPY PACKAGE
• Introduction to NumPy Package
• Array as Data Structure
• Core NumPy functions
• Matrix Operations, Broadcasting in Arrays
MODULE 3 PYTHON PANDAS PACKAGE
• Introduction to Pandas package
• Series in Pandas
• Data Frame in Pandas
• File Reading in Pandas
• Operations performed in Data Frame
MODULE 4 VISUALIZATION WITH PYTHON - Matplotlib
• Visualization Packages (Matplotlib)
• Components Of A Plot, Sub-Plots
• Basic Plots: Line, Bar, Pie, Scatter, Histogram etc.
MODULE 5 PYTHON VISUALIZATION PACKAGE - SEABORN
• Seaborn: Basic Plot
• Advanced Python Data Visualizations
MODULE 6 EVALUATION METRICS
• Types of Evaluation metrics
• Evaluation metrics in code
MODULE 7 ML ALGO: LINEAR REGRESSION
• Introduction to Linear Regression
• How it works: Regression and Best Fit Line
• Modeling and Evaluation in Python
MODULE 8 ML ALGO: LOGISTIC REGRESSION
• Introduction to Logistic Regression
• How it works: Classification & Sigmoid Curve
• Modeling and Evaluation in Python
MODULE 9 ML ALGO: K MEANS CLUSTERING
• Understanding Clustering (Unsupervised)
• K Means Algorithm
• How it works: K Means theory
MODULE 10 ML ALGO: KNN
• Introduction to KNN
• How It Works: Nearest Neighbor Concept
• Modeling and Evaluation in Python
MACHINE LEARNING
MODULE 1 ML ALGO: SUPPORT VECTOR MACHINE (SVM)
• Introduction to SVM
• How It Works: SVM Concept, Kernel Trick
• Modeling and Evaluation of SVM in Python
MODULE 2 PRINCIPAL COMPONENT ANALYSIS (PCA)
• Building Blocks Of PCA
• How it works: Finding Principal Components
• Modeling PCA in Python
MODULE 3 ML ALGO: DECISION TREE
• Random Forest Ensemble technique
• How it works: Bagging Theory
• Modeling and Evaluation in Python
MODULE 4 ENSEMBLE TECHNIQUES - BAGGING
• Introduction to Ensemble technique and Bagging
• Modeling and Evaluation in Python
MODULE 5 ML ALGO: NAÏVE BAYES
• Introduction to Naive Bayes
• How it works: Bayes' Theorem
• Naive Bayes For Text Classification
• Modeling and Evaluation in Python
MODULE 6 GRADIENT BOOSTING, XGBOOST
• Introduction to Boosting and XGBoost
• How it works?
• Modeling and Evaluation in Python
MODULE 7 ADVANCED ML CONCEPTS
• Advanced Metrics (ROC AUC, R², Precision, Recall)
• K-Fold Cross validation
• Grid And Randomized Search CV In Sklearn
• Imbalanced Data Sets: SMOTE Technique
• Feature Selection Techniques
MACHINE LEARNING EXPERT
MODULE 1 STATISTICS
• Population Vs Sample
• Central Tendencies
• Correlation vs Covariance
• Central Limit Theorem (CLT)
• Measures of dispersion
• Hypothesis Testing
• Z-statistic, t-statistic, p-value, Significance level
MODULE 2 TIME SERIES FORECASTING - ARIMA
• What is Time Series?
• Trend, seasonal, cyclical, and random components
• Stationarity of Time Series
• Autoregressive Model (AR)
• Moving Average Model (MA)
• ARIMA and SARIMA model
• Autocorrelation and AIC
• Time Series Analysis in Python
MODULE 3 FEATURE ENGINEERING
• Introduction to Feature Selection & Methods
• Wrapper method: Forward selection, Backward Elimination, Exhaustive Selection
• Filter method & Types
• Filter method: Variance threshold, Correlation coefficient, Chi-Square
• Embedded method: Regularization, Tree-based method
MODULE 4 REGULAR EXPRESSIONS WITH PYTHON
• Regex Introduction
• Regex codes
• Text extraction with Python Regex
MODULE 5 ML MODEL DEPLOYMENT WITH FLASK
• Introduction to Flask
• URL and App routing
• Flask application – ML Model deployment of a Project
MODULE 6 ADVANCED DATA ANALYSIS WITH MS EXCEL
• MS Excel core Functions
• Pivot Table
• Advanced Functions (VLOOKUP, INDIRECT, etc.)
• Linear Regression with EXCEL
• Goal Seek Analysis
• Data Table
• Solving Data Equation with EXCEL
• Monte Carlo Simulation with MS EXCEL
MODULE 7 AWS CLOUD FOR DATA SCIENCE
• Introduction to cloud computing
• Differences between GCP, Azure, and AWS
• AWS Services (EC2 instance)
MODULE 8 INTRODUCTION TO DEEP LEARNING
• Introduction to Artificial Neural Network, Architecture
• Artificial Neural Network in Python
• Introduction to Convolutional Neural Network, Architecture
• Convolutional Neural Network in Python
MODULE 9 DEEP LEARNING MODEL WITH IMAGES
• Introduction to Image processing
• PyTorch vs TensorFlow
• DenseNet, VGG16, and YOLO models in Python
MODULE 10 ARTIFICIAL INTELLIGENCE
• Introduction to Artificial Intelligence
• LLMs and OpenAI
• LangChain Applications
• RAG application, Text Summarization, and Chatbot creation
DATABASE: SQL AND MONGODB
MODULE 1 DATABASE INTRODUCTION
• DATABASE Overview
• Key concepts of database management
• Relational Database Management System
MODULE 2 SQL BASICS
• Introduction to Databases
• Introduction to SQL
• SQL Commands
• MySQL Workbench installation
MODULE 3 DATA TYPES AND CONSTRAINTS
• Numeric, character, and date/time data types
• Primary key, Foreign key, Not null
• Unique, Check, default, Auto increment
MODULE 4 DATABASES AND TABLES (MySQL)
• Create database
• Delete database
• Show and use databases
• Create table, Rename table
• Delete table, Delete table records
• Create new table from existing data types
• Insert into, Update records
• Alter table
MODULE 5 SQL JOINS
• Inner join
• Outer join
• Left join
• Right join
• Cross join
• Self join
• Window functions: OVER, PARTITION BY, RANK
MODULE 6 SQL COMMANDS AND CLAUSES
• Select, Select distinct
• Aliases, Where clause
• Relational and logical operators
• Between, Order by, In
• Like, Limit, null/not null, group by
• Having, Sub queries
MODULE 7 DOCUMENT DB/NO-SQL DB
• Introduction to Document DB
• Document DB vs SQL DB
• Popular Document DBs
• MongoDB basics
• Data format and Key methods
VERSION CONTROL WITH GIT
MODULE 1 GIT INTRODUCTION
• Purpose of Version Control
• Popular Version control tools
• Git: Distributed Version Control
• Terminologies
• Git Workflow
• Git Architecture
MODULE 2 GIT REPOSITORY and GitHub
• Git Repo Introduction
• Create New Repo with Init command
• Git Essentials: Copy & User Setup
• Mastering Git and GitHub
MODULE 3 COMMITS, PULL, FETCH AND PUSH
• Code commits
• Pull, Fetch and conflicts resolution
• Pushing to Remote Repo
MODULE 4 TAGGING, BRANCHING AND MERGING
• Organize code with branches
• Checkout branch
• Merge branches
• Editing Commits
• Commit command: --amend flag
• Git reset and revert
MODULE 5 GIT WITH GITHUB AND BITBUCKET
• Creating GitHub Account
• Local and Remote Repo
• Collaborating with other developers
BIG DATA FOUNDATION
MODULE 1 BIG DATA INTRODUCTION
• Big Data Overview
• Five Vs of Big Data
• What is Big Data and Hadoop
• Introduction to Hadoop
• Components of Hadoop Ecosystem
• Big Data Analytics Introduction
MODULE 2 HDFS AND MAP REDUCE
• HDFS – Big Data Storage
• Distributed Processing with Map Reduce
• Mapping and reducing stages concepts
• Key Terms: Output Format, Partitioners, Combiners, Shuffle, and Sort
MODULE 3 PYSPARK FOUNDATION
• PySpark Introduction
• Spark Configuration
• Resilient distributed datasets (RDD)
• Working with RDDs in PySpark
• Aggregating Data with Pair RDDs
MODULE 4 SPARK SQL and HADOOP HIVE
• Introducing Spark SQL
• Spark SQL vs Hadoop Hive
CERTIFIED BI ANALYST
MODULE 1 POWER-BI BASICS
• Power BI Introduction
• Basic Visualizations
• Dashboard Creation
• Basic Data Cleaning
• Basic DAX Functions
MODULE 2 DATA TRANSFORMATION TECHNIQUES
• Exploring Query Editor
• Data Cleansing and Manipulation:
• Creating Our Initial Project File
• Connecting to Our Data Source
• Editing Rows
• Changing Data Types
• Replacing Values
MODULE 3 CONNECTING TO VARIOUS DATA SOURCES
• Connecting to a CSV File
• Connecting to a Webpage
• Extracting Characters
• Splitting and Merging Columns
• Creating Conditional Columns
• Creating Columns from Examples
• Create Data Model