TorchQL’s Database
- class torchql.Database(name: str = 'mldb')
A database is a collection of Tables over which queries can be executed. Each Table is a dataset, or a collection of samples.
- execute_pipeline(pipeline: List[Operation], name: str = 'default_pipeline', **kwargs) → Table
Execute a query pipeline over the database. The pipeline must be a list of operations, such as the one built by the torchql.Query class, and its first operation must be a register operation.
- Args:
pipeline (List[Operation]): The pipeline to execute.
name (str): The name of the pipeline. Defaults to “default_pipeline”.
kwargs: Global options that override the options local to the operations in the query pipeline.
- Returns:
A Table containing the result of the query.
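The override rule for kwargs can be illustrated with a toy model. The sketch below does not use TorchQL: `ToyOperation` and `toy_execute_pipeline` are hypothetical names, and it assumes that global options take precedence key by key over each operation's own options. The real implementation may resolve options differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToyOperation:
    """Hypothetical stand-in for a pipeline operation (not a TorchQL class)."""
    kind: str
    options: Dict[str, Any] = field(default_factory=dict)


def toy_execute_pipeline(pipeline: List[ToyOperation], **kwargs) -> List[Dict[str, Any]]:
    """Return the effective options each operation would run with."""
    if not pipeline or pipeline[0].kind != "register":
        raise ValueError("the first operation must be a register operation")
    # Assumed precedence: global kwargs win over per-operation options.
    return [{**op.options, **kwargs} for op in pipeline]


pipeline = [
    ToyOperation("register", {"disable": False}),
    ToyOperation("filter", {"batch_size": 32, "disable": False}),
]
effective = toy_execute_pipeline(pipeline, disable=True)
print(effective)
```

Here `disable=True` replaces the local `disable=False` on both operations, while the local `batch_size` is left untouched because no global value was given for it.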
- load(path: str, which_tables: List[str] | None = None)
Load a database from disk.
- Args:
path (str): The path from which to load the database.
which_tables (List[str]): The list of table names to load. If None, all files with the .pt extension are loaded.
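The selection rule for which_tables can be sketched without TorchQL. The helper below is hypothetical, and it assumes that a table called `name` lives in a file `<name>.pt` directly under `path`. The source only states that files with the .pt extension are loaded when no list is given.

```python
import tempfile
from pathlib import Path
from typing import List, Optional


def toy_select_table_files(path: str, which_tables: Optional[List[str]] = None) -> List[Path]:
    """Pick the files a load call would read (hypothetical helper, not TorchQL code)."""
    root = Path(path)
    if which_tables is None:
        # No explicit list: every file with the .pt extension is selected.
        return sorted(root.glob("*.pt"))
    # Assumption: table `name` is stored as `<name>.pt` directly under `path`.
    return [root / f"{name}.pt" for name in which_tables]


with tempfile.TemporaryDirectory() as tmp:
    for filename in ("train.pt", "test.pt", "notes.txt"):
        (Path(tmp) / filename).touch()
    all_tables = [p.name for p in toy_select_table_files(tmp)]
    only_train = [p.name for p in toy_select_table_files(tmp, ["train"])]

print(all_tables)
print(only_train)
```

With no list, the .txt file is skipped and both .pt files are selected. With `["train"]`, only `train.pt` is selected.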
- register_dataset(d, name: str, num_workers: int | None = None, batch_size=None, disable=False)
Register a dataset with the database. This will create a Table from the dataset and store it in the database.
- Args:
d (Dataset): The dataset to register.
name (str): The name of the dataset.
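As a rough picture of what registering a dataset means, the sketch below builds a named table from a small map-style dataset. `ToySquares` and `ToyDatabase` are hypothetical stand-ins, not TorchQL classes, and the eager copy into a list is an assumption made for illustration. How TorchQL materializes a Table, and how it uses num_workers, batch_size, and disable, is not shown here.

```python
from typing import Any, Dict, List, Tuple


class ToySquares:
    """Minimal map-style dataset: item i is the pair (i, i * i)."""

    def __init__(self, n: int) -> None:
        self.n = n

    def __len__(self) -> int:
        return self.n

    def __getitem__(self, i: int) -> Tuple[int, int]:
        if not 0 <= i < self.n:
            raise IndexError(i)
        return i, i * i


class ToyDatabase:
    """Hypothetical stand-in for a database (not torchql.Database)."""

    def __init__(self, name: str = "mldb") -> None:
        self.name = name
        self.tables: Dict[str, List[Any]] = {}

    def register_dataset(self, d, name: str) -> None:
        # Assumption: materialize every sample once and keep it under `name`.
        self.tables[name] = [d[i] for i in range(len(d))]


db = ToyDatabase()
db.register_dataset(ToySquares(4), "squares")
print(db.tables["squares"])
```

After registration the samples are addressable by table name, which is what lets later queries refer to "squares" instead of the dataset object.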
- store(path: str, which_tables: List[str] | None = None)
Store the database to disk.
- Args:
path (str): The path to store the database to.
which_tables (List[str]): The list of table names to store. If None, all tables in the database are saved.