GitHub - rakshith1928/Rakshithn-elevateLabs1

Task 1: Data Cleaning & Preprocessing

📌 Objective

Learn how to clean and prepare raw data for machine learning.

🛠 Tools Used

Python

Pandas

NumPy

Matplotlib

Seaborn

📂 Dataset

You can use any dataset relevant to the task, for example the Titanic dataset.

🚀 Steps Performed

Imported dataset and explored basic information (null values, data types).
Handled missing values using mean or median imputation.
Converted categorical features into numerical using encoding techniques.
Normalized/standardized numerical features.
Visualized outliers using boxplots and handled them.
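
The first four steps above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the toy DataFrame, its Titanic-style column names, and the specific choices (median and mode imputation, a binary map plus one-hot encoding, z-score scaling) are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with Titanic-style column names (not the real dataset).
df = pd.DataFrame({
    "Age": [22.0, 38.0, np.nan, 35.0, 54.0],
    "Fare": [7.25, 71.28, 7.92, np.nan, 51.86],
    "Sex": ["male", "female", "female", "male", "male"],
    "Embarked": ["S", "C", "S", None, "Q"],
})

# Step 1: explore data types and count missing values per column.
print(df.dtypes)
print(df.isnull().sum())

# Step 2: impute numeric columns with the median, the categorical one with the mode.
for col in ["Age", "Fare"]:
    df[col] = df[col].fillna(df[col].median())
df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])

# Step 3: encode categoricals (binary map for Sex, one-hot for Embarked).
df["Sex"] = df["Sex"].map({"male": 0, "female": 1})
df = pd.get_dummies(df, columns=["Embarked"], dtype=int)

# Step 4: standardize numeric columns to zero mean and unit variance.
for col in ["Age", "Fare"]:
    df[col] = (df[col] - df[col].mean()) / df[col].std(ddof=0)

print(df.head())
```

With a real dataset, the DataFrame construction would be replaced by `pd.read_csv(...)` on the downloaded file, and the column lists adjusted to match it.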

📊 What I Learned

Data cleaning

Handling null values

Encoding categorical variables

Feature scaling (normalization/standardization)

Outlier detection
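
As an illustration of the outlier detection topic, the sketch below applies the 1.5 × IQR rule, which is the same convention a boxplot uses to draw its whiskers. The fare values are made up for the example, and whether to drop or cap the flagged points is a judgment call that depends on the dataset.

```python
import pandas as pd

# Hypothetical fare values; 512.33 is an obvious extreme value.
fares = pd.Series([7.25, 7.92, 8.05, 13.0, 26.55, 51.86, 71.28, 512.33], name="Fare")

# The box in a boxplot spans Q1 to Q3; whiskers conventionally reach 1.5 * IQR beyond it.
q1, q3 = fares.quantile(0.25), fares.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Points outside the whisker range are flagged as outliers.
outliers = fares[(fares < lower) | (fares > upper)]
print(outliers)

# Two common ways to handle them: drop the rows, or cap (clip) at the bounds.
dropped = fares[(fares >= lower) & (fares <= upper)]
capped = fares.clip(lower=lower, upper=upper)
```

To see the same points visually, `seaborn.boxplot(x=fares)` draws them as individual markers beyond the whiskers.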

❓ Interview Questions

What are the different types of missing data?
How do you handle categorical variables?
What is the difference between normalization and standardization?
How do you detect outliers?
Why is preprocessing important in ML?
What is one-hot encoding vs label encoding?
How do you handle data imbalance?
Can preprocessing affect model accuracy?
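
Two of these questions (normalization vs standardization, and one-hot vs label encoding) are easiest to answer with a side-by-side comparison. The sketch below uses made-up values and plain pandas operations; it is one possible way to show the contrast, not a complete answer to the questions.

```python
import pandas as pd

ages = pd.Series([22.0, 38.0, 26.0, 35.0, 54.0])  # hypothetical values

# Normalization (min-max scaling): rescales values into the range [0, 1].
normalized = (ages - ages.min()) / (ages.max() - ages.min())

# Standardization (z-score): shifts to mean 0 and scales to unit standard deviation.
standardized = (ages - ages.mean()) / ages.std(ddof=0)

ports = pd.Series(["S", "C", "S", "Q"], name="Embarked")  # hypothetical values

# Label encoding: one integer per category (implies an order that may not exist).
label_encoded = ports.astype("category").cat.codes

# One-hot encoding: one binary column per category, with no implied order.
one_hot = pd.get_dummies(ports, prefix="Embarked", dtype=int)

print(normalized.tolist())
print(standardized.round(3).tolist())
print(label_encoded.tolist())
print(one_hot)
```

Label encoding keeps the data compact but suits ordinal categories or tree-based models; one-hot encoding avoids a false ordering at the cost of extra columns.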

📌 Submission Guidelines

Created a GitHub repository for this task.

Added code, dataset (if needed), and this README.md file.


👨‍💻 Author - Rakshith N
