ErgastF1
Ergast.com is a web service that provides a database of Formula 1 races from the 1950 season onward. This version of the dataset covers all Formula 1 races from 1950 to 2017 and includes information such as the time taken for each lap, the duration of pit stops, and the performance in the qualifying rounds. The task is to predict the winner (or tied winners) of a race using only the data available up to the start of the race (e.g., the list of race participants and their qualifying times are known, but their lap times in the race are not).
Original source: ergast.com
Versions
ErgastF1 (by Jan Motl)
Dataset details
- Associated task: Classification
- Domain: Sport
- Data types:
- Size: 60.4 MB
- Count of tables: 14
- Count of rows: 544,056
- Count of columns: 98
- Missing values: Yes
- Compound keys: No
- Loops: Yes
- Type: Real
- Instance count: 31,313
- Target table: target
- Target column: win
- Target ID: targetId
- Target timestamp: raceId
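Because the task only allows data known before the race starts, features built from in-race tables need a cut-off relative to the target's raceId. The following is a minimal sketch of such a filter, assuming raceId can be used as the ordering key, as the "Target timestamp" field suggests. The rows are made-up toy values, and driverId and position are illustrative column names rather than a confirmed schema.

```python
from typing import Dict, List

# Toy rows standing in for an in-race history table (made-up values).
results: List[Dict[str, int]] = [
    {"raceId": 1, "driverId": 10, "position": 1},
    {"raceId": 2, "driverId": 10, "position": 3},
    {"raceId": 3, "driverId": 10, "position": 2},
]


def rows_known_before(rows: List[Dict[str, int]], target_race_id: int,
                      include_target_race: bool = False) -> List[Dict[str, int]]:
    """Keep rows usable for predicting the race with the given raceId.

    include_target_race=False suits in-race data (lap times, pit stops),
    which is unknown at the start; True suits pre-race data such as
    qualifying times, which the task description treats as known.
    """
    if include_target_race:
        return [r for r in rows if r["raceId"] <= target_race_id]
    return [r for r in rows if r["raceId"] < target_race_id]


history = rows_known_before(results, target_race_id=3)
print([r["raceId"] for r in history])  # [1, 2]
```

Whether raceId really increases with race date should be checked against the races table before relying on this ordering.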
Algorithms
Dataset version | Target | Algorithm | Author text | Measure | Value |
---|---|---|---|---|---|
ErgastF1 | win | FastProp | getML: Feature Learning with AutoML to build end-to-end prediction pipelines | ROC AUC | 0.9242 |
ErgastF1 | win | Deep Feature Synthesis | featuretools | ROC AUC | 0.9202 |
ErgastF1 | win | FastProp | getML: Feature Learning with AutoML to build end-to-end prediction pipelines | Accuracy | 0.9727 |
ErgastF1 | win | Deep Feature Synthesis | featuretools | Accuracy | 0.9724 |
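The two measures in the table can be illustrated with a short, dependency-free sketch. It is not the evaluation code behind the reported values: the labels and scores are toy data, and the 0.5 decision threshold used for accuracy is an assumption, since the table does not state how predictions were thresholded.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> float:
    """Share of rows whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    return sum(p == t for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data: 1 = win, 0 = no win (values invented for illustration).
y_true = [1, 0, 0, 0, 0, 1, 0, 0]
y_score = [0.9, 0.2, 0.1, 0.4, 0.3, 0.35, 0.05, 0.6]

print(round(roc_auc(y_true, y_score), 4))   # 0.8333
print(accuracy(y_true, y_score))            # 0.75
print(accuracy(y_true, [0.0] * 8))          # 0.75, always predicting "no win"
```

In this toy example a model that never predicts a win reaches the same accuracy as the scored model, which is one reason it can help to read accuracy together with ROC AUC when winners are rare among the instances.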
How to download the dataset
The dataset is publicly available directly from a MariaDB database.
- Open your favourite MariaDB client (MySQL Workbench works, but see FAQ)
- Use the following credentials:
  - hostname: relational.fel.cvut.cz
  - port: 3306
  - username: guest
  - password: ctu-relational
- Export the "ErgastF1" database (or another version of the dataset, if available) in your favourite format (e.g. CSV or SQL dump).
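As an alternative to a graphical client, the export can be scripted. The sketch below uses the credentials listed above and writes one CSV file per table. It assumes the third-party PyMySQL package is installed (`pip install pymysql`). The function names and the output directory `ergastf1_csv` are hypothetical choices for illustration.

```python
import csv
from pathlib import Path
from typing import Iterable, List, Sequence, Union

# Connection settings copied from the credentials listed above.
CONNECTION = {
    "host": "relational.fel.cvut.cz",
    "port": 3306,
    "user": "guest",
    "password": "ctu-relational",
    "database": "ErgastF1",
}


def write_csv(path: Union[str, Path], header: Sequence[str],
              rows: Iterable[Sequence]) -> None:
    """Write a header line followed by the data rows to a CSV file."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def export_database(out_dir: str = "ergastf1_csv") -> List[str]:
    """Dump every table of the database to <out_dir>/<table>.csv."""
    import pymysql  # assumption: PyMySQL is the chosen client library

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    conn = pymysql.connect(**CONNECTION)
    try:
        with conn.cursor() as cur:
            cur.execute("SHOW TABLES")
            tables = [row[0] for row in cur.fetchall()]
            for table in tables:
                # Table names come from the server itself, so quoting
                # them with backticks is sufficient here.
                cur.execute(f"SELECT * FROM `{table}`")
                header = [col[0] for col in cur.description]
                write_csv(out / f"{table}.csv", header, cur.fetchall())
    finally:
        conn.close()
    return tables
```

Calling `export_database()` requires network access to the server. Each table is fetched into memory in full, which should be manageable for a database of about 60 MB.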