API Docs

load

The main function for loading a dataset is load, which returns a training fold and a test fold, each of type RelationalDataset.

RelationalDatasets.load (function)
load(name::String, version::Union{String, Nothing} = nothing; fold::Int64 = 1)

Load the training and test folds for a dataset.
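
A minimal usage sketch follows. The dataset name and version string come from the DATASETS and LATEST_VERSION constants listed on this page. It assumes that the two folds can be destructured as a (train, test) pair and that each fold exposes the same pos and facts fields seen in the from_vector demos. Running it requires the package to be installed and the dataset to be retrievable.

```julia
using RelationalDatasets

# Name and version are taken from the constants documented on this page.
# Assumption: the result destructures into a (train, test) pair.
train, test = RelationalDatasets.load("toy_cancer", "v0.0.6"; fold = 1)

# Assumption: folds carry `pos` and `facts` vectors, as in the
# from_vector demos.
println(length(train.pos), " positive training examples")
println(length(train.facts), " training facts")
```

Omitting the version argument passes nothing, which is the documented default.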

Convert propositional->relational

Many standard machine learning tasks are built around predicting a vector of outcomes $y$ from a data matrix $X$.

Here we include methods for converting data like these into an Inductive Logic Programming (ILP), or relational, representation.
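
For readers less familiar with Julia array literals, here is a small sketch of what such a propositional dataset looks like. It uses the same values as the classification demo further down this page and relies only on base Julia. Space-separated column vectors inside brackets are concatenated horizontally into a matrix.

```julia
# Three column vectors joined side by side into a 3x3 integer matrix.
X = [[0, 1, 1] [1, 0, 2] [2, 2, 0]]
# One discrete outcome per example.
y = [0, 0, 1]

size(X)    # (3, 3)
X[:, 1]    # first column: [0, 1, 1]
X[1, :]    # first row:    [0, 1, 2]

# The same matrix written row by row.
X == [0 1 2; 1 0 2; 1 2 0]   # true
```

Checking size(X) before a conversion is a cheap way to confirm that the matrix has the intended number of rows and columns.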

from_vector

from_vector infers the machine learning task from the element type of $y$: if $y$ holds integers the task is treated as classification, and if $y$ holds floating-point numbers it is treated as regression.
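
The type-based rule can be illustrated with Julia's multiple dispatch. The infer_task helper below is hypothetical and not part of RelationalDatasets. It only mirrors the way a method is selected from the element type of y, under the assumption that the two documented from_vector methods are distinguished in the same way.

```julia
# Hypothetical helper (not part of RelationalDatasets): one method per
# element type, chosen by Julia's multiple dispatch.
infer_task(y::Vector{Int}) = :classification
infer_task(y::Vector{Float64}) = :regression

infer_task([0, 0, 1])          # :classification
infer_task([1.1, 1.2, 1.3])    # :regression

# Whole-number floats are still Float64, so they select the regression method.
infer_task([0.0, 1.0])         # :regression
# Converting the elements to Int selects the classification method instead.
infer_task(Int.([0.0, 1.0]))   # :classification
```

In practice this means class labels stored as floats need an explicit conversion such as Int.(y) before they are handled as a classification target.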

RelationalDatasets.from_vector (function)
from_vector(X::Matrix{Int}, y::Vector{Int}, names::Union{Vector{String}, Nothing} = nothing)

Convert a classification dataset to an ILP representation.

from_vector(X::Matrix{Int}, y::Vector{Float64}, names::Union{Vector{String}, Nothing} = nothing)

Convert a regression dataset to an ILP representation.

Demo for converting a classification problem:

data, modes = RelationalDatasets.from_vector(
  [[0, 1, 1] [1, 0, 2] [2, 2, 0]],
  [0, 0, 1],
)
data.pos
1-element Vector{String}:
 "v4(id3)."

Regression is similar:

data, modes = RelationalDatasets.from_vector(
  [[0, 1, 1] [1, 0, 2] [2, 2, 0]],
  [1.1, 1.2, 1.3],
)
data.pos
3-element Vector{String}:
 "regressionExample(v4(id1),1.1)."
 "regressionExample(v4(id2),1.2)."
 "regressionExample(v4(id3),1.3)."

Custom names can also be passed to make variables more interpretable. Below is a small example based on the Boston Housing dataset.

The first two names are covariates and the last ("medv") is the dependent variable:

data, modes = RelationalDatasets.from_vector(
  [[1, 1] [1, 2] [2, 1]],
  [33.2, 27.5, 18.9],
  ["age", "dis", "medv"],
)
data.facts
6-element Vector{String}:
 "age(id1,age_1)."
 "age(id2,age_1)."
 "dis(id1,dis_1)."
 "dis(id2,dis_2)."
 "medv(id1,medv_2)."
 "medv(id2,medv_1)."

Constants

DATASETS

RelationalDatasets.DATASETS
14-element Vector{String}:
 "toy_cancer"
 "toy_father"
 "citeseer"
 "cora"
 "uwcse"
 "webkb"
 "financial_nlp_small"
 "nell_sports"
 "icml"
 "boston_housing"
 "drug_interactions"
 "toy_machines"
 "california_housing"
 "roofworld20"

LATEST_VERSION

RelationalDatasets.LATEST_VERSION
"v0.0.6"