Data Hub

Curated Datasets

Download the clean, pre-processed datasets used throughout our tutorials, formatted for use with scikit-learn and pandas.

Preparing data is often said to take up most of the work in machine learning. The datasets below are the same files referenced in our GitHub repository and video tutorials, so you can follow along without hunting for data.

California Housing Prices (Regression)

A classic dataset for regression tasks. It contains features such as median income, housing median age, and total rooms, which are used to predict the median house value for California districts.

Titanic Passenger Survival (Classification)

The famous introductory classification problem. It contains passenger details such as age, sex, and ticket class, and the task is to predict whether each passenger survived the sinking.
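As a rough illustration of how a table like this is usually split into features and a target for scikit-learn, here is a minimal sketch. The column names (Survived, Pclass, Sex, Age) follow the common Kaggle layout and are an assumption about our file, the rows are invented toy values, and split_features_target is a hypothetical helper rather than part of any library.

```python
import pandas as pd

# Toy rows invented for illustration only; they are NOT taken from the real file.
# Column names are assumed, so check df.columns on the actual dataset first.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0],
    "Pclass": [3, 1, 2, 3],
    "Sex": ["male", "female", "female", "male"],
    "Age": [22.0, 38.0, None, 35.0],
})


def split_features_target(frame, target="Survived"):
    """Return (X, y) with Sex encoded as 0/1 and missing ages filled."""
    X = frame.drop(columns=[target]).copy()
    # Encode the text column as integers so estimators can consume it.
    X["Sex"] = X["Sex"].map({"male": 0, "female": 1})
    # Fill missing ages with the median of the known ages.
    X["Age"] = X["Age"].fillna(X["Age"].median())
    y = frame[target]
    return X, y


X, y = split_features_target(df)
print(X.shape, y.shape)
```

Median imputation and a 0/1 encoding are only one simple choice; the tutorials may handle these columns differently.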

Fetching Directly in Python

You can skip the manual download and load the files directly in Python with pandas. Replace the URL below with the raw GitHub URL of the dataset you want:

import pandas as pd

url = "https://raw.githubusercontent.com/SaryaDemir/ML-Tutorials/main/datasets/titanic.csv"
df = pd.read_csv(url)
print(df.head())
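If you rerun a notebook often, you may prefer not to hit the network every time. The sketch below shows one possible way to keep a local copy after the first fetch. The helper name load_csv_cached and the cache path are hypothetical choices for illustration, not part of pandas or of our repository.

```python
from pathlib import Path

import pandas as pd


def load_csv_cached(url, cache_path):
    """Read cache_path if it exists; otherwise download url and save a copy."""
    cache = Path(cache_path)
    if cache.exists():
        # A local copy is already present, so no network request is made.
        return pd.read_csv(cache)
    df = pd.read_csv(url)  # network request happens only when no cache exists
    cache.parent.mkdir(parents=True, exist_ok=True)
    df.to_csv(cache, index=False)
    return df


# Example call (the cache path "data/titanic.csv" is an assumed location):
# df = load_csv_cached(url, "data/titanic.csv")
```

Delete the cached file whenever you want to pick up a newer version of the dataset from GitHub.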