CORA
The Cora dataset consists of 2708 scientific publications classified into one of seven classes. The citation network consists of 5429 links. Each publication in the dataset is described by a 0/1-valued word vector indicating the absence/presence of the corresponding word from the dictionary. The dictionary consists of 1433 unique words.
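As a rough illustration of the 0/1-valued word vector described above, the sketch below encodes a paper against a tiny dictionary. The five-word dictionary, the sample words, and the helper name `to_binary_vector` are made up for illustration; the real Cora dictionary has 1433 words.

```python
# Toy illustration only: the real Cora dictionary has 1433 words.
dictionary = ["graph", "neural", "learning", "bayesian", "genetic"]


def to_binary_vector(words_in_paper, dictionary):
    """Return a 0/1 list: 1 if the dictionary word occurs in the paper, else 0."""
    present = set(words_in_paper)
    return [1 if word in present else 0 for word in dictionary]


# Repeated words still map to 1; words outside the dictionary are ignored.
paper_words = ["learning", "graph", "graph", "kernel"]
vector = to_binary_vector(paper_words, dictionary)
print(vector)  # [1, 0, 1, 0, 0]
```

The vector records only presence or absence, so word counts and word order are lost.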
Original source: web.archive.org
Versions
CORA (by Arnaud Barragao)
Dataset details
- Associated task: Classification
- Domain: Education
- Data types:
- Size: 4.5 MB
- Count of tables: 3
- Count of rows: 57,884
- Count of columns: 6
- Missing values: No
- Compound keys: No
- Loops: Yes
- Type: Real
- Instance count: 2,708
- Target table: paper
- Target column: class_label
- Target ID: paper_id
- Target timestamp: ?
Algorithms
Dataset version | Target | Algorithm | Author text | Measure | Value |
---|---|---|---|---|---|
CORA | | ACORA | Distribution-based aggregation for relational learning with identifier attributes | AUC ROC | 0.97 |
CORA | | CBCC | Case-Based Collective Classification | Accuracy | 0.754 |
CORA | | EPRN | Relational Ensemble Classification | Accuracy | 0.84 |
CORA | | LBP | A Link-Based Method for Propositionalization | Accuracy | 0.85 |
CORA | | MLN | Investigating Markov Logic Networks for Collective Classification | Accuracy | 0.798 |
CORA | | PRN | Relational Ensemble Classification | Accuracy | 0.81 |
CORA | | RBC | Relational Ensemble Classification | Accuracy | 0.8 |
CORA | | RDN | Relational Ensemble Classification | Accuracy | 0.75 |
CORA | | RelF | A Link-Based Method for Propositionalization | Accuracy | 0.857 |
CORA | | RPT | Collective Classification with Relational Dependency Networks | Accuracy | 0.792 |
How to download the dataset
The datasets are publicly available directly from a MariaDB database.
- Open your favourite MariaDB client (MySQL Workbench works, but see FAQ)
- Use the following credentials:
- hostname: relational.fel.cvut.cz
- port: 3306
- username: guest
- password: ctu-relational
- Export the "CORA" database (or another version of the dataset, if available) in your favourite format (e.g. CSV or SQL dump).
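For a scripted alternative to a GUI client, the steps above could be sketched in Python roughly as follows. This is a minimal sketch, not an official export tool: it assumes the third-party PyMySQL package is installed, and the helper names `export_table` and `export_cora` are hypothetical. Table names are discovered with `SHOW TABLES` rather than hard-coded.

```python
import csv


def export_table(cursor, table, path):
    """Write all rows of `table` to a CSV file at `path`, with a header row.

    Works with any DB-API cursor. Table names cannot be passed as query
    parameters, so only call this with trusted table names.
    """
    cursor.execute(f"SELECT * FROM {table}")
    header = [col[0] for col in cursor.description]
    rows = cursor.fetchall()
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)


def export_cora(out_dir="."):
    """Connect with the guest credentials listed above and dump every table."""
    import pymysql  # assumption: PyMySQL is installed (pip install pymysql)

    conn = pymysql.connect(
        host="relational.fel.cvut.cz",
        port=3306,
        user="guest",
        password="ctu-relational",
        database="CORA",
    )
    try:
        with conn.cursor() as cur:
            cur.execute("SHOW TABLES")
            tables = [row[0] for row in cur.fetchall()]
            for table in tables:
                n = export_table(cur, table, f"{out_dir}/{table}.csv")
                print(f"{table}: {n} rows")
    finally:
        conn.close()
```

Calling `export_cora()` would write one CSV file per table into the current directory, provided the server is reachable. The connection logic is kept separate from `export_table` so the CSV step can be tried against any local database first.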