Dataset Documentation
csv
2. Variables Description:
- age:
- Type: Numerical
- Description: Age of the individual.
- Range: [29, 77]
- sex:
- Type: Categorical (Binary)
- Description: Gender of the individual.
- Categories:
- `0`: Female
- `1`: Male
- cp:
- Type: Categorical
- Description: Chest pain type experienced by the individual.
- Categories: [0, 1, 2, 3]
- trestbps:
- Type: Numerical
- Description: Resting blood pressure (in mm Hg) upon admission to the hospital.
- chol:
- Type: Numerical
- Description: Serum cholesterol level in mg/dl.
- fbs:
- Type: Categorical (Binary)
- Description: Fasting blood sugar.
- Categories:
- `0`: <= 120 mg/dl
- `1`: > 120 mg/dl
- restecg:
- Type: Categorical
- Description: Resting electrocardiographic results.
- Categories: [0, 1, 2]
- thalach:
- Type: Numerical
- Description: Maximum heart rate achieved during the thallium stress test.
- exang:
- Type: Categorical (Binary)
- Description: Exercise-induced angina.
- Categories:
- `0`: No
- `1`: Yes
- oldpeak:
- Type: Numerical
- Description: ST depression induced by exercise relative to rest.
- slope:
- Type: Categorical
- Description: Slope of the peak exercise ST segment.
- Categories: [0, 1, 2]
- ca:
- Type: Numerical
- Description: Number of major vessels colored by fluoroscopy.
- thal:
- Type: Categorical
- Description: Thalassemia type.
- Categories: [1, 2, 3]
- target:
- Type: Categorical (Binary)
- Description: Diagnosis of heart disease.
- Categories:
- `0`: Absence of heart disease
- `1`: Presence of heart disease
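The variable list above can be captured as a small schema for sanity-checking rows before modeling. The following is a minimal sketch in Python using only the standard library. The names `SCHEMA` and `validate_row` are hypothetical, the fasting blood sugar column is assumed to be named `fbs`, and only the range and category codes stated above are enforced; variables with no documented range are checked for being numeric only.

```python
# Hypothetical schema built from the variable descriptions above.
# Categorical variables map to their allowed codes; numerical variables
# map to a (min, max) range, or None when no range is documented.
SCHEMA = {
    "age": (29, 77),
    "sex": {0, 1},
    "cp": {0, 1, 2, 3},
    "trestbps": None,
    "chol": None,
    "fbs": {0, 1},  # assumed column name for fasting blood sugar
    "restecg": {0, 1, 2},
    "thalach": None,
    "exang": {0, 1},
    "oldpeak": None,
    "slope": {0, 1, 2},
    "ca": None,
    "thal": {1, 2, 3},
    "target": {0, 1},
}


def validate_row(row):
    """Return a list of problems found in one row (dict of column -> value)."""
    problems = []
    for name, rule in SCHEMA.items():
        if name not in row:
            problems.append(f"{name}: missing")
            continue
        try:
            value = float(row[name])
        except (TypeError, ValueError):
            problems.append(f"{name}: not numeric ({row[name]!r})")
            continue
        if isinstance(rule, set) and value not in rule:
            problems.append(f"{name}: unexpected category {row[name]!r}")
        elif isinstance(rule, tuple) and not (rule[0] <= value <= rule[1]):
            problems.append(f"{name}: {row[name]!r} outside {list(rule)}")
    return problems
```

Because `validate_row` accepts string values, rows read with `csv.DictReader` can be passed to it directly. An empty list means the row is consistent with the documented codes and ranges.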
3. Usage:
The dataset is intended for researchers and practitioners in the healthcare and medical
domain for developing and validating AI and machine learning models aimed at predicting
cardiovascular diseases. It serves as a resource for the exploration of feature importance,
model explainability, and the development of interpretable models in the cardiovascular
health domain.
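As a small illustration of the kind of interpretable exploration described above, the sketch below computes the share of rows with `target` equal to `1` within each category of a chosen variable. It is only a sketch under stated assumptions: `disease_rate_by_category` is a hypothetical helper, and the sample rows are invented for illustration and are not taken from the dataset.

```python
from collections import defaultdict


def disease_rate_by_category(rows, feature, target="target"):
    """Map each category of `feature` to the fraction of rows with target == 1."""
    counts = defaultdict(lambda: [0, 0])  # category -> [positives, total]
    for row in rows:
        category = int(row[feature])
        counts[category][0] += int(row[target])
        counts[category][1] += 1
    return {cat: pos / total for cat, (pos, total) in sorted(counts.items())}


# Invented rows for illustration only; they are not from the dataset.
sample = [
    {"exang": "0", "target": "1"},
    {"exang": "0", "target": "1"},
    {"exang": "0", "target": "0"},
    {"exang": "1", "target": "0"},
]
rates = disease_rate_by_category(sample, "exang")
print(rates)
```

A per-category rate like this is a descriptive summary, not a measure of causal effect; it is a quick first look before fitting any model.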
5. Acknowledgments:
Researchers utilizing this dataset should acknowledge the source, and if applicable, the
funding agencies and institutions supporting the work related to the dataset's creation and
distribution.