TorchQL’s Database

class torchql.Database(name: str = 'mldb')

A database is a collection of Tables over which queries can be executed. Each Table is a dataset, or a collection of samples.

execute_pipeline(pipeline: List[Operation], name: str = 'default_pipeline', **kwargs) → Table

Execute a query pipeline over the database. The pipeline must be a list of operations, such as the pipeline built by the torchql.Query class, and its first operation must be a register operation.

Args:

pipeline (List[Operation]): The pipeline to execute.

name (str): The name of the pipeline. Defaults to 'default_pipeline'.

kwargs: Global options that override the options local to the operations in the query pipeline.

Returns:

A Table containing the result of the query.
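The override rule for kwargs can be pictured with a small stand-in. The sketch below is not TorchQL's implementation: ToyOperation, run_toy_pipeline, keep_first, and the limit option are hypothetical names invented for illustration. It only shows one plausible way for global options to take precedence over the options local to each operation.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ToyOperation:
    """Hypothetical stand-in for an operation; not TorchQL's Operation class."""
    fn: Callable[..., List[Any]]
    options: Dict[str, Any] = field(default_factory=dict)  # options local to this step


def run_toy_pipeline(rows: List[Any], pipeline: List[ToyOperation],
                     **global_options: Any) -> List[Any]:
    """Apply each operation in order, letting global options win over local ones."""
    for op in pipeline:
        # In a dict merge the later keys win, so global options override local ones.
        effective = {**op.options, **global_options}
        rows = op.fn(rows, **effective)
    return rows


def keep_first(rows: List[Any], limit: Any = None, **_ignored: Any) -> List[Any]:
    """Toy operation: keep the first `limit` rows, or all rows if no limit is set."""
    return rows if limit is None else rows[:limit]


pipeline = [ToyOperation(keep_first, {"limit": 3})]

# Only the local option applies: three rows survive.
local_result = run_toy_pipeline(list(range(10)), pipeline)

# The global option replaces the local one: five rows survive.
global_result = run_toy_pipeline(list(range(10)), pipeline, limit=5)
```

In this toy model the pipeline itself is never modified; the override applies only for the duration of one run.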

load(path: str, which_tables: List[str] | None = None)

Load a database from disk.

Args:

path (str): The path from which to load the database.

which_tables (List[str]): The list of table names to load. If it is None, all files with the .pt extension will be loaded.
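The selection rule for which_tables can be sketched with the standard library alone. The helper select_table_files below is hypothetical and is not part of TorchQL. It also assumes that each table lives in a file named after the table with a .pt suffix, which the documentation above does not state.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def select_table_files(path: str, which_tables: Optional[List[str]] = None) -> List[Path]:
    """Hypothetical helper: list the files a load call would be expected to read."""
    root = Path(path)
    if which_tables is None:
        # No explicit list: take every file with the .pt extension.
        return sorted(root.glob("*.pt"))
    # Assumption: each table is stored as "<table name>.pt".
    return [root / f"{name}.pt" for name in which_tables]


with tempfile.TemporaryDirectory() as tmp:
    # Create empty placeholder files; only the names matter for this sketch.
    for filename in ("train.pt", "test.pt", "notes.txt"):
        (Path(tmp) / filename).touch()

    all_names = [p.name for p in select_table_files(tmp)]
    some_names = [p.name for p in select_table_files(tmp, ["train"])]
```

Here notes.txt is skipped when which_tables is None because it lacks the .pt extension.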

register_dataset(d, name: str, num_workers: int | None = None, batch_size=None, disable=False)

Register a dataset with the database. This will create a Table from the dataset and store it in the database.

Args:

d (Dataset): The dataset to register.

name (str): The name of the dataset.
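As a rough mental model of registration, the sketch below turns an indexable dataset into a named collection of samples. ToyDatabase is a hypothetical stand-in, not torchql.Database: it represents a table as a plain Python list, and it ignores the num_workers, batch_size, and disable parameters, whose behavior is not described here.

```python
from typing import Any, Dict, List, Sequence


class ToyDatabase:
    """Hypothetical stand-in, not torchql.Database: maps table names to lists of samples."""

    def __init__(self, name: str = "mldb") -> None:
        self.name = name
        self.tables: Dict[str, List[Any]] = {}

    def register_dataset(self, d: Sequence[Any], name: str) -> None:
        # Read every sample by index, which any dataset with len() and indexing allows.
        self.tables[name] = [d[i] for i in range(len(d))]


# Each sample is an (input, label) pair, a common layout for supervised datasets.
samples = [([0.0, 1.0], 0), ([1.0, 0.0], 1), ([1.0, 1.0], 1)]

db = ToyDatabase()
db.register_dataset(samples, "train")
```

After registration, the samples are reachable under the chosen name, which is what later queries would refer to.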

store(path: str, which_tables: List[str] | None = None)

Store the database to disk.

Args:

path (str): The path to store the database to.

which_tables (List[str]): The list of table names to store. If it is None, all tables in the database will be saved.