
List of datasets for regression and classification

This is a list of popular datasets.

Datasets for classification:

1) Adult Census Income:

48,842 samples with 14 features like age, education, and occupation.
A binary classification dataset predicting whether annual income exceeds $50,000.

2) Breast Cancer Wisconsin (Diagnostic):

569 samples with 30 features.
Classification of tumors as benign or malignant based on features like radius,
texture, and smoothness.

3) Titanic Survival Prediction Dataset:

1,309 samples with features like age, sex, and ticket class.
A binary classification dataset predicting survival on the Titanic.

4) Iris Dataset:

Size: 150 samples.
Features: 4 (sepal length, sepal width, petal length, petal width).
Task: Multi-class classification to classify flowers into one of three species:
Setosa, Versicolor, or Virginica.
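
Two of the classification datasets above, Breast Cancer Wisconsin (Diagnostic) and Iris, ship with scikit-learn, so the sizes quoted here can be checked locally. The snippet below is a minimal sketch that assumes scikit-learn is installed. The variable names are only illustrative.

```python
from sklearn.datasets import load_breast_cancer, load_iris

# Both loaders read copies bundled with scikit-learn (no download needed).
iris = load_iris()
cancer = load_breast_cancer()

# .data is the feature matrix: (number of samples, number of features).
iris_shape = iris.data.shape
cancer_shape = cancer.data.shape

# .target_names holds the class labels for each dataset.
iris_classes = [str(name) for name in iris.target_names]
cancer_classes = [str(name) for name in cancer.target_names]

print("Iris:", iris_shape, iris_classes)
print("Breast Cancer Wisconsin:", cancer_shape, cancer_classes)
```

The other classification datasets in this list are not bundled with scikit-learn and have to be obtained separately.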

Datasets for regression:

1) Concrete Compressive Strength Dataset:

1,030 samples with 8 features related to the properties of concrete materials.
Predicts the compressive strength of concrete based on material proportions.

2) Medical Cost Personal Dataset:

1,338 samples with 7 columns: 6 features such as age, sex, and BMI, plus the target (insurance charges).
Predicts individual insurance costs based on demographic and health features.

3) California Housing Prices:

Size: 20,640 samples with 8 features.
Features: Median income, house age, latitude, longitude, etc.
Task: Predict the median house value of California census block groups.

4) Auto MPG Dataset (UCI):

Size: 398 samples with 8 features.
Features: Cylinders, horsepower, weight, etc.
Task: Predict miles per gallon (MPG) for various cars.

5) Boston Housing:

Size: 506 samples.
Features: 13 features (e.g., crime rate, average number of rooms, distance to
employment centers).
Task: Predict the median house prices in Boston neighborhoods.

/////////////////////////////////////////////// The New York City Airbnb Open Data /////////////////////////////////////////

New York City Airbnb Open Data:

Size: 49,000 samples with 96 features.
Features: Neighborhood, room type, price, minimum nights, etc.

-> The New York City Airbnb Open Data can also be used for:

-----> Regression: To predict rental prices based on features like location, room
type, and number of reviews.
-----> Classification: To predict if a listing is booked frequently (e.g., high/low
demand) based on similar features.
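
The two uses above differ only in the target that is taken from the same rows. The sketch below illustrates this with plain Python on a few made-up listings. The field names (`room_type`, `minimum_nights`, `number_of_reviews`, `price`) are assumed for illustration, and using the review count as a stand-in for how often a listing is booked is likewise an assumption, not something the dataset states.

```python
from statistics import median

# Made-up rows for illustration only; not taken from the real dataset.
listings = [
    {"room_type": "Private room", "minimum_nights": 1, "number_of_reviews": 45, "price": 80},
    {"room_type": "Entire home/apt", "minimum_nights": 3, "number_of_reviews": 2, "price": 220},
    {"room_type": "Shared room", "minimum_nights": 1, "number_of_reviews": 10, "price": 40},
    {"room_type": "Entire home/apt", "minimum_nights": 2, "number_of_reviews": 120, "price": 150},
]

def regression_target(rows):
    """Continuous target: the price of each listing."""
    return [float(row["price"]) for row in rows]

def demand_labels(rows):
    """Binary target: 1 (high demand) if a listing has more reviews
    than the median listing, else 0 (low demand)."""
    threshold = median(row["number_of_reviews"] for row in rows)
    return [1 if row["number_of_reviews"] > threshold else 0 for row in rows]

y_price = regression_target(listings)   # target for the regression task
y_demand = demand_labels(listings)      # target for the classification task

print(y_price)
print(y_demand)
```

Any column used to build a target (price for regression, review count for the demand label) should then be left out of the input features for that task.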

//////////////////////////////////////////////////////////////////////////////////
//////////////////////////////////////////

Final datasets:

For classification, apart from the wine data:

1) adult census
2) cerebral one
3) vehicle one (backup)

For regression:

1) house sales, county USA (backup)
