TorchQL’s Table
- class torchql.Table(samples, id_index: dict | None = None, transform=None, disable=False)
A Table is a collection of samples designed to be queried from within a database. It is a subclass of torch.utils.data.Dataset, so it can be used anywhere a PyTorch dataset is expected.
- batch(size, shuffle, batch_size=0, disable=False) → Table
Batch this table.
- Args:
size (int): The size of the batches.
shuffle (bool): Whether to shuffle the records before batching.
- Returns:
A new table with the batches.
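The batching described above can be pictured with a plain-Python sketch. It does not call TorchQL: `batch_sketch` and its `seed` argument are hypothetical names, and keeping a shorter final batch is an assumption rather than documented behaviour.

```python
import random


def batch_sketch(records, size, shuffle=False, seed=None):
    """Split records into consecutive chunks of at most `size` items."""
    rows = list(records)
    if shuffle:
        # Shuffle a copy so the caller's data is left untouched.
        random.Random(seed).shuffle(rows)
    # Assumption: a trailing partial batch is kept, not dropped.
    return [rows[i:i + size] for i in range(0, len(rows), size)]


batches = batch_sketch(range(7), size=3)
print(batches)  # [[0, 1, 2], [3, 4, 5], [6]]
```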
- filter(cond: Callable[[...], bool], batch_size=0, disable=False) → Table
Filter this table by a condition.
- Args:
cond (Callable): The condition to filter by.
batch_size (int): The batch size to enable batch-processing of the query. Note that a batch size >= 1 assumes your supplied functions run on batches of records as opposed to a single record.
disable (bool): A flag that disables progress bars if set to True.
- Returns:
A new table with the filtered records.
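To make the `batch_size` note concrete, here is a hedged plain-Python sketch of the two modes. `filter_sketch` is a hypothetical helper, not part of TorchQL, and the batch calling convention shown (the condition receives a list of records and returns one boolean per record) is an assumption made for illustration only.

```python
def filter_sketch(records, cond, batch_size=0):
    """Keep the records for which `cond` holds."""
    rows = list(records)
    if batch_size < 1:
        # Per-record mode: cond receives the columns of one record.
        return [row for row in rows if cond(*row)]
    kept = []
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        # Batch mode (assumed convention): cond receives a list of
        # records and returns one boolean per record.
        mask = cond(chunk)
        kept.extend(row for row, keep in zip(chunk, mask) if keep)
    return kept


rows = [("cat", 0.9), ("dog", 0.4), ("cat", 0.2)]
per_record = filter_sketch(rows, lambda label, score: score > 0.3)
batched = filter_sketch(
    rows, lambda chunk: [score > 0.3 for _, score in chunk], batch_size=2
)
print(per_record == batched)  # True
```

Either way the same records survive; only the shape of the function you supply changes.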
- flatten(batch_size=0, disable=False) → Table
Flatten this table. If the records of this table are lists, the records of the new table will be the elements of the lists.
- Args:
batch_size (int): The batch size to enable batch-processing of the query. Note that a batch size >= 1 assumes your supplied functions run on batches of records as opposed to a single record.
disable (bool): A flag that disables progress bars if set to True.
- Returns:
A new table with the flattened records.
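As a rough illustration of flattening, the following sketch works on ordinary Python lists. `flatten_sketch` is a hypothetical name, and passing non-list records through unchanged is an assumption; the documentation only describes the case where records are lists.

```python
def flatten_sketch(records):
    """Replace each list-valued record with its elements."""
    flat = []
    for record in records:
        if isinstance(record, list):
            flat.extend(record)
        else:
            # Assumption: records that are not lists are kept as they are.
            flat.append(record)
    return flat


flattened = flatten_sketch([[1, 2], [3], [4, 5]])
print(flattened)  # [1, 2, 3, 4, 5]
```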
- group_by(key: Callable[[...], Any], batch_size=0, disable=False) → Table
Group the records of this table by the value of a key function, which must return a hashable value. Each record of the new table is a tuple of a key and the list of records that share that key.
- Args:
key (Callable): The key to group by. Must return a hashable value.
batch_size (int): The batch size to enable batch-processing of the query. Note that a batch size >= 1 assumes your supplied functions run on batches of records as opposed to a single record.
disable (bool): A flag that disables progress bars if set to True.
- Returns:
A new table with the grouped records.
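The (key, list of records) shape described above can be sketched without TorchQL. `group_by_sketch` is a hypothetical helper; calling the key function with the unpacked columns of a record, and emitting groups in first-seen order, are assumptions.

```python
from collections import defaultdict


def group_by_sketch(records, key):
    """Return (key, [records with that key]) tuples."""
    groups = defaultdict(list)
    for record in records:
        # Assumption: the key function receives the record's columns.
        groups[key(*record)].append(record)
    return list(groups.items())


rows = [("cat", 0.9), ("dog", 0.4), ("cat", 0.2)]
grouped = group_by_sketch(rows, lambda label, score: label)
print(grouped)
# [('cat', [('cat', 0.9), ('cat', 0.2)]), ('dog', [('dog', 0.4)])]
```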
- group_by_with_index(key: Callable[[...], Any], batch_size=0, disable=False) → Table
Group this table by a key. Records are grouped by the key function. The key function should return a hashable value. The records of the new table will be tuples of the key and a list of the records that have that key.
- Args:
key (Callable): The key to group by. Must return a hashable value.
batch_size (int): The batch size to enable batch-processing of the query. Note that a batch size >= 1 assumes your supplied functions run on batches of records as opposed to a single record.
disable (bool): A flag that disables progress bars if set to True.
- Returns:
A new table with the grouped records.
- group_reduce(key: Callable[[...], Any], reduction: Callable[[...], Any], batch_size=0, disable=False) → Table
Group this table by a key and reduce the records of each group using a reduction function.
- Args:
key (Callable): The key to group by. Must return a hashable value.
reduction (Callable): The reduction function that takes in the records of each group.
- Returns:
A new table with the grouped and reduced records.
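A minimal sketch of grouping followed by a per-group reduction, written in plain Python. `group_reduce_sketch` is hypothetical, and pairing each key with its reduced value in the output is an assumption about the record layout, not something the documentation states.

```python
from collections import defaultdict


def group_reduce_sketch(records, key, reduction):
    """Group records by key, then collapse each group to one value."""
    groups = defaultdict(list)
    for record in records:
        groups[key(*record)].append(record)
    # Assumption: each output record pairs the key with the reduced value.
    return [(k, reduction(rows)) for k, rows in groups.items()]


rows = [("cat", 0.9), ("dog", 0.4), ("cat", 0.2)]
best = group_reduce_sketch(
    rows,
    key=lambda label, score: label,
    reduction=lambda group: max(score for _, score in group),
)
print(best)  # [('cat', 0.9), ('dog', 0.4)]
```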
- head(n=10, print_id=False)
Get the first n records of this table.
- Args:
n (int, optional): The number of records to get. Defaults to 10.
print_id (bool, optional): Whether to print the id of each row. Defaults to False.
- Returns:
A table with the first n records of this table.
- intersect(table: Dataset, batch_size=0, disable=False) → Table
Intersect this table with another table. The new table is built from the records common to this table and the other table, where common records are identified by the id of the row. The columns of the other table are used in the new table.
- Args:
table (Dataset): The table to intersect with.
batch_size (int): The batch size to enable batch-processing of the query. Note that a batch size >= 1 assumes your supplied functions run on batches of records as opposed to a single record.
disable (bool): A flag that disables progress bars if set to True.
- Returns:
A new table with the common records.
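The id-based matching can be illustrated with dictionaries standing in for tables. This is only a sketch: representing a table as a mapping from row id to row is a hypothetical simplification, `intersect_sketch` is not a TorchQL function, and the output order is an assumption.

```python
def intersect_sketch(this, other):
    """Both arguments map a row id to a row (hypothetical representation)."""
    # Keep the ids present in both tables and take the row from `other`,
    # mirroring "the columns of the other table are used".
    return {row_id: other[row_id] for row_id in this if row_id in other}


left = {1: ("cat",), 2: ("dog",), 3: ("bird",)}
right = {2: ("dog", 0.4), 3: ("bird", 0.7), 4: ("fish", 0.1)}
common = intersect_sketch(left, right)
print(common)  # {2: ('dog', 0.4), 3: ('bird', 0.7)}
```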
- join(table: Dataset, key=None, fkey=None, batch_size=0, disable=False) → Table
Join this table with another table. Records are joined when the key function on this table and the foreign key function on the other table return the same value. If no key function is provided, the index is used.
- Args:
table (Dataset): The table to join with.
key (Callable, optional): The key function to use for this table. Defaults to None. Must take a set of columns from a row as input and return a hashable value that serves as a key.
fkey (Callable, optional): The foreign key function to use for the other table. Defaults to None. Must take a set of columns from a row as input and return a hashable value that serves as a key.
batch_size (int): The batch size to enable batch-processing of the query. Note that a batch size >= 1 assumes your supplied functions run on batches of records as opposed to a single record.
disable (bool): A flag that disables progress bars if set to True.
- Returns:
A new table with the joined records.
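One way to picture key/foreign-key matching is a small hash join over lists of tuples. The helper below is a hypothetical sketch, not TorchQL's implementation. It assumes inner-join semantics (unmatched rows are dropped) and assumes a joined record is the concatenation of both rows' columns; neither detail is confirmed by the documentation.

```python
from collections import defaultdict


def join_sketch(left, right, key, fkey):
    """Pair rows whose key(left row) equals fkey(right row)."""
    # Index the other table by its foreign key for constant-time lookup.
    index = defaultdict(list)
    for row in right:
        index[fkey(*row)].append(row)
    joined = []
    for row in left:
        for match in index.get(key(*row), []):
            # Assumption: the joined record concatenates both rows.
            joined.append(row + match)
    return joined


images = [(0, "img0.png"), (1, "img1.png")]
labels = [(1, "dog"), (0, "cat"), (2, "bird")]
joined = join_sketch(
    images, labels, key=lambda i, path: i, fkey=lambda i, label: i
)
print(joined)  # [(0, 'img0.png', 0, 'cat'), (1, 'img1.png', 1, 'dog')]
```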
- order_by(key: Callable[[...], Any], reverse=False, batch_size=0, disable=False) → Table
Order this table by a key.
- Args:
key (Callable): The key to order by.
reverse (bool, optional): Whether to reverse the order. Defaults to False.
batch_size (int): The batch size to enable batch-processing of the query. Note that a batch size >= 1 assumes your supplied functions run on batches of records as opposed to a single record.
disable (bool): A flag that disables progress bars if set to True.
- Returns:
A new table with the ordered records.
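Ordering by a key function behaves much like Python's built-in `sorted`. The sketch below uses a hypothetical `order_by_sketch` helper over plain tuples and assumes the key function receives the columns of each record.

```python
def order_by_sketch(records, key, reverse=False):
    """Sort records by the value the key function returns."""
    # Assumption: the key function is called with the record's columns.
    return sorted(records, key=lambda row: key(*row), reverse=reverse)


rows = [("dog", 0.4), ("cat", 0.9), ("bird", 0.2)]
ranked = order_by_sketch(rows, lambda label, score: score, reverse=True)
print(ranked)  # [('cat', 0.9), ('dog', 0.4), ('bird', 0.2)]
```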
- project(cols: Callable[[...], list], batch_size=0, disable=False) → Table
Select or perform an operation on the columns of this table.
- Args:
cols (Callable): A function that takes the columns of this table as arguments and returns a list of the projected columns.
batch_size (int): The batch size to enable batch-processing of the query. Note that a batch size >= 1 assumes your supplied functions run on batches of records as opposed to a single record.
disable (bool): A flag that disables progress bars if set to True.
- Returns:
A new table with the projected columns.
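Projection amounts to mapping a column function over every record. A minimal plain-Python sketch follows; `project_sketch` and the sample columns are hypothetical and chosen only to show a function that both selects and derives columns.

```python
def project_sketch(records, cols):
    """Apply `cols` to the columns of each record."""
    return [cols(*row) for row in records]


rows = [("cat", 0.9), ("dog", 0.4)]
# Keep an upper-cased label and derive a boolean column from the score.
projected = project_sketch(
    rows, lambda label, score: [label.upper(), score >= 0.5]
)
print(projected)  # [['CAT', True], ['DOG', False]]
```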
- reduce(reduction: Callable[[...], Any], batch_size=0, disable=False) → Table
Reduce the records of this table using a reduction function. This function operates over all the records of the table as opposed to each row individually.
- Args:
reduction (Callable): The reduction function that takes in the records of the table.
batch_size (int): The batch size to enable batch-processing of the query. Note that a batch size >= 1 assumes your supplied functions run on batches of records as opposed to a single record.
disable (bool): A flag that disables progress bars if set to True.
- Returns:
A new table with the reduced records.
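The difference from a per-row operation is that the reduction sees every record at once. The sketch below shows this with a hypothetical `reduce_sketch` helper; wrapping the result as a single-record list is an assumption about the output shape.

```python
def reduce_sketch(records, reduction):
    """Collapse the whole collection with one call to `reduction`."""
    # The reduction receives all records, not one row at a time.
    # Assumption: the result is returned as a one-record collection.
    return [reduction(list(records))]


rows = [("cat", 0.9), ("dog", 0.4), ("cat", 0.2)]
top_score = reduce_sketch(rows, lambda recs: max(score for _, score in recs))
print(top_score)  # [0.9]
```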
- sample(print_id=False)
Get a random row of this table.
- Args:
print_id (bool, optional): Whether to print the id of the row. Defaults to False.
- Returns:
A random row of this table.
- sample_many(n=10, print_id=False)
Get n random records of this table.
- Args:
n (int, optional): The number of records to get. Defaults to 10.
print_id (bool, optional): Whether to print the id of each row. Defaults to False.
- Returns:
A list of n random records of this table.
- transform(transform) → Table
Register a PyTorch Transform to apply to this table.
- Args:
transform (Callable): The PyTorch transform to apply to the records of this table.
- Returns:
A new table with the transformed records.
- union(table: Dataset, batch_size=0, disable=False) → Table
Union this table with another table. The records of the other table are appended to the bottom of this table.
- Args:
table (Dataset): The table to union with.
batch_size (int): The batch size to enable batch-processing of the query. Note that a batch size >= 1 assumes your supplied functions run on batches of records as opposed to a single record.
disable (bool): A flag that disables progress bars if set to True.
- Returns:
A new table with the combined records.
- unique(batch_size=0, disable=False) → Table
Select the unique records of this table.
- Args:
batch_size (int): The batch size to enable batch-processing of the query. Note that a batch size >= 1 assumes your supplied functions run on batches of records as opposed to a single record.
disable (bool): A flag that disables progress bars if set to True.
- Returns:
A new table with the unique records.
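Selecting unique records can be sketched as order-preserving de-duplication. `unique_sketch` is a hypothetical helper; it assumes records are hashable and that the first occurrence of each record is the one kept, neither of which the documentation specifies.

```python
def unique_sketch(records):
    """Drop repeated records, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for record in records:
        # Assumption: records are hashable (for example, tuples).
        if record not in seen:
            seen.add(record)
            unique.append(record)
    return unique


deduped = unique_sketch([("cat",), ("dog",), ("cat",)])
print(deduped)  # [('cat',), ('dog',)]
```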