Custom Datasets
This module contains custom PyTorch dataset classes for working with OpenML datasets.
GenericDataset
Bases: Dataset
Generic dataset that takes X and y as input and returns them as tensors.
Source code in openml_pytorch/custom_datasets/generic_dataset.py
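To make the "X and y in, tensors out" behaviour concrete, here is a minimal sketch of a dataset of this kind. It is not the library's source: the class name `XYDatasetSketch` and the chosen dtypes (float32 features, int64 labels) are assumptions made for illustration.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class XYDatasetSketch(Dataset):
    """Hypothetical stand-in: wraps array-like X and y and serves them as tensors."""

    def __init__(self, X, y):
        # Assumed dtypes: float32 features and int64 class labels.
        self.X = torch.as_tensor(X, dtype=torch.float32)
        self.y = torch.as_tensor(y, dtype=torch.long)

    def __len__(self):
        # Number of samples is the size of the first dimension of X.
        return len(self.X)

    def __getitem__(self, idx):
        # Return one (features, label) pair as tensors.
        return self.X[idx], self.y[idx]


dataset = XYDatasetSketch([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]], [0, 1, 0])
loader = DataLoader(dataset, batch_size=2, shuffle=False)
first_X, first_y = next(iter(loader))  # first batch of two samples
```

Because the class implements `__len__` and `__getitem__`, it can be passed directly to a `DataLoader`, which stacks the per-sample tensors into batches.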
OpenMLImageDataset
Bases: Dataset
Class representing an image dataset from OpenML for use in PyTorch.
Methods:
__init__(self, X, y, image_size, image_dir, transform_x=None, transform_y=None)
Initializes the dataset with given data, image size, directory, and optional transformations.
__getitem__(self, idx)
Retrieves an image and its corresponding label (if available) from the dataset at the specified index. Applies transformations if provided.
__len__(self)
Returns the total number of images in the dataset.
Source code in openml_pytorch/custom_datasets/image_dataset.py
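The control flow described for `__getitem__` (load an image, apply optional transforms, return the label only if one exists) can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `ImageDatasetSketch` is a hypothetical plain class, `load_image` is a hypothetical callable that stands in for reading a file from `image_dir` and resizing it to `image_size`, and returning the image alone when `y` is `None` is an assumed convention. A PyTorch version would subclass `torch.utils.data.Dataset`.

```python
class ImageDatasetSketch:
    """Hypothetical stand-in showing optional transforms and optional labels."""

    def __init__(self, X, y, load_image, transform_x=None, transform_y=None):
        self.X = X                    # image identifiers, e.g. file names
        self.y = y                    # labels, or None when unavailable
        self.load_image = load_image  # assumed callable: identifier -> image
        self.transform_x = transform_x
        self.transform_y = transform_y

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        image = self.load_image(self.X[idx])
        if self.transform_x is not None:
            image = self.transform_x(image)
        if self.y is None:
            # Assumed behaviour for unlabeled data: return the image only.
            return image
        label = self.y[idx]
        if self.transform_y is not None:
            label = self.transform_y(label)
        return image, label


# Fake "images" as pixel lists, so the sketch runs without any files on disk.
fake_pixels = {"a.png": [0, 128, 255], "b.png": [255, 255, 0]}

labeled = ImageDatasetSketch(
    X=["a.png", "b.png"],
    y=["cat", "dog"],
    load_image=fake_pixels.__getitem__,
    transform_x=lambda img: [p / 255 for p in img],  # scale pixels to [0, 1]
    transform_y={"cat": 0, "dog": 1}.__getitem__,    # map class names to ints
)
unlabeled = ImageDatasetSketch(
    X=["a.png"], y=None, load_image=fake_pixels.__getitem__
)
```

Keeping the transforms as optional callables lets the same class serve both training data (with labels and augmentation) and unlabeled test data.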
OpenMLTabularDataset
Bases: Dataset
A custom dataset class to handle tabular data from OpenML (or any similar tabular dataset). It encodes categorical features and the target column using LabelEncoder from sklearn.
Methods:
Name | Description |
---|---|
__init__ | Initializes the dataset with the data and the target column. Encodes the categorical features and target if provided. |
__getitem__ | Retrieves the input data and target value at the specified index. Converts the data to tensors and returns them. |
__len__ | Returns the length of the dataset. |
Source code in openml_pytorch/custom_datasets/tabular_dataset.py
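The encoding step can be illustrated without any dependencies. sklearn's `LabelEncoder` assigns integer codes according to the sorted order of the distinct values, and the sketch below mirrors that in plain Python. The helper names `fit_label_encoding` and `encode_table` and the row-of-dicts input format are hypothetical, and the conversion to tensors is left out.

```python
def fit_label_encoding(values):
    """Map each distinct value to an integer code, in sorted order of the values."""
    return {value: code for code, value in enumerate(sorted(set(values)))}


def encode_table(rows, categorical_columns, target_column):
    """Return (features, targets) with categorical columns and the target encoded."""
    columns_to_encode = set(categorical_columns) | {target_column}
    # Fit one encoding per column that needs it.
    encoders = {
        col: fit_label_encoding([row[col] for row in rows])
        for col in columns_to_encode
    }
    features, targets = [], []
    for row in rows:
        # Replace encoded columns by their integer codes; keep the rest as-is.
        encoded = {
            col: encoders[col][val] if col in encoders else val
            for col, val in row.items()
        }
        targets.append(encoded.pop(target_column))
        features.append(list(encoded.values()))
    return features, targets


rows = [
    {"color": "red", "size": 1.5, "label": "yes"},
    {"color": "blue", "size": 2.0, "label": "no"},
    {"color": "red", "size": 0.5, "label": "no"},
]
features, targets = encode_table(rows, ["color"], "label")
# features == [[1, 1.5], [0, 2.0], [1, 0.5]] and targets == [1, 0, 0]
```

Note that the codes carry no ordering information about the categories; "blue" gets 0 and "red" gets 1 only because of alphabetical sorting.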