TPCDS
TPC-DS is a decision support benchmark that models several generally applicable aspects of a decision support system, including queries and data maintenance. Although the underlying business model of TPC-DS is a retail product supplier, the database schema, data population, queries, data maintenance model and implementation rules have been designed to be broadly representative of modern decision support systems.
Original source: www.tpc.org
Versions
Tpcds (by Jan Motl)
Dataset details
- Associated task: Classification
- Domain: Retail
- Data types:
- Size: 4.8 GB
- Count of tables: 24
- Count of rows: 21,005,545
- Count of columns: 425
- Missing values: Yes
- Compound keys: No
- Loops: Yes
- Type: Synthetic
- Instance count: 97,006
- Target table: customer
- Target column: c_preferred_cust_flag
- Target ID: c_customer_sk
- Target timestamp: ?
How to download the dataset
The dataset is publicly available directly from a MariaDB database.
- Open your favourite MariaDB client (MySQL Workbench works, but see FAQ)
- Use the following credentials:
- hostname: relational.fel.cvut.cz
- port: 3306
- username: guest
- password: ctu-relational
- Export the "tpcds" database (or another version of the dataset, if available) in your favourite format (e.g. CSV or SQL dump).
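For a scripted export instead of a GUI client, the steps above can be sketched in Python. This is a minimal sketch, not an official export tool: it assumes the third-party PyMySQL package is installed (`pip install pymysql`) and that the server accepts the guest credentials listed above. The function names (`quote_identifier`, `export_table`, `export_database`) are illustrative, not part of any library.

```python
import csv
import os

# Connection settings copied from the credentials listed above.
CONNECTION = {
    "host": "relational.fel.cvut.cz",
    "port": 3306,
    "user": "guest",
    "password": "ctu-relational",
    "database": "tpcds",
}


def quote_identifier(name):
    """Backtick-quote a MariaDB identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def export_table(connection, table, path, batch_size=10000):
    """Write every row of `table` to a CSV file at `path`.

    `connection` is any DB-API style connection whose cursor works as a
    context manager (PyMySQL cursors do). Returns the number of data rows
    written. NULL values are written as empty fields.
    """
    with connection.cursor() as cursor:
        cursor.execute("SELECT * FROM " + quote_identifier(table))
        header = [column[0] for column in cursor.description]
        rows_written = 0
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            while True:
                rows = cursor.fetchmany(batch_size)
                if not rows:
                    break
                writer.writerows(rows)
                rows_written += len(rows)
    return rows_written


def export_database(out_dir="tpcds_csv"):
    """Export each table of the configured database to `out_dir`/<table>.csv."""
    import pymysql  # third-party dependency, assumed to be installed

    os.makedirs(out_dir, exist_ok=True)
    connection = pymysql.connect(**CONNECTION)
    try:
        with connection.cursor() as cursor:
            cursor.execute("SHOW TABLES")
            tables = [row[0] for row in cursor.fetchall()]
        for table in tables:
            export_table(connection, table, os.path.join(out_dir, table + ".csv"))
    finally:
        connection.close()
```

Calling `export_database()` would write one CSV file per table into a `tpcds_csv` directory. PyMySQL's default cursor buffers each full result set in memory, which may be a problem for the largest tables of a 4.8 GB database; its unbuffered `pymysql.cursors.SSCursor` is one possible alternative.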