histo_kit.grand_qc.dataset

Classes

GrandQCDataset(region, bg, bbox_list[, ...])

PyTorch Dataset for extracting fixed-size patches from a region while applying padding to boundary areas.

class histo_kit.grand_qc.dataset.GrandQCDataset(region, bg, bbox_list, patch_size=512, overlap=0.7, pad_value=0, encoder='timm-efficientnet-b0', weights='imagenet')[source]

Bases: Dataset

PyTorch Dataset for extracting fixed-size patches from a region while applying padding to boundary areas. Each item also includes the corresponding background mask patch and metadata describing the location of the patch.

Parameters:
  • region (np.ndarray) – Source RGB region image from which patches will be extracted. Expected shape is (H, W, 3).

  • bg (np.ndarray) – Background mask associated with region, matching spatial dimensions (H, W).

  • bbox_list (list of tuples) – List of bounding boxes defining areas of interest. Each bounding box should be represented as (x_start, y_start, x_end, y_end).

  • patch_size (int, optional) – Target size (height and width) of the extracted patches (default is 512, the input size expected by the GrandQC model).

  • overlap (float, optional) – Fractional overlap between neighboring patches (default is 0.7).

  • pad_value (int, optional) – Value used to pad pixels when a patch extends beyond the region boundary; typically the background value (default is Artifact.BG_THR.value, which corresponds to 0).

  • encoder (str, optional) – Name of the encoder used for preprocessing, passed to segmentation_models_pytorch.encoders.get_preprocessing_fn (default is "timm-efficientnet-b0").

  • weights (str, optional) – Pre-trained weights to use with the encoder (default is "imagenet").
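The boundary-padding behavior described for pad_value can be sketched as follows. This is a minimal NumPy illustration, not the class's actual implementation; the helper name pad_patch and its logic are assumptions:

```python
import numpy as np

def pad_patch(region, y, x, patch_size=512, pad_value=0):
    """Extract a patch_size x patch_size patch at (y, x), padding
    any part that falls outside the region with pad_value."""
    h, w = region.shape[:2]
    patch = region[y:min(y + patch_size, h), x:min(x + patch_size, w)]
    pad_h = patch_size - patch.shape[0]
    pad_w = patch_size - patch.shape[1]
    if pad_h or pad_w:
        # Pad only on the bottom/right so patch coordinates stay valid.
        patch = np.pad(patch, ((0, pad_h), (0, pad_w), (0, 0)),
                       mode="constant", constant_values=pad_value)
    return patch

region = np.full((600, 600, 3), 255, dtype=np.uint8)
patch = pad_patch(region, 300, 300)  # extends 212 px past the edge
print(patch.shape)  # (512, 512, 3)
```

Padding on the trailing edges keeps (x_start, y_start) aligned with the original region coordinates, which is what allows predictions on padded patches to be mapped back.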

coords

Dictionary of patch coordinates with keys "x_start", "y_start", "x_end", "y_end".

Type:

dict

prep_fn

Preprocessing function for encoder normalization.

Type:

callable

patch_size

Final patch spatial size.

Type:

int

pad_value

Background padding value.

Type:

int

Notes

Items returned by __getitem__ are dictionaries, so downstream inference pipelines can access each patch together with its bounding-box metadata.
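How the coords attribute could be derived from a bounding box and the overlap fraction can be sketched as below. This is an assumed reconstruction; the stride computation inside GrandQCDataset may differ in rounding or edge handling:

```python
def patch_coords(bbox, patch_size=512, overlap=0.7):
    """Generate overlapping patch coordinates covering a bounding box.

    bbox is (x_start, y_start, x_end, y_end); the stride is derived
    from the fractional overlap between neighboring patches.
    """
    x0, y0, x1, y1 = bbox
    stride = max(1, int(patch_size * (1 - overlap)))
    coords = {"x_start": [], "y_start": [], "x_end": [], "y_end": []}
    for y in range(y0, y1, stride):
        for x in range(x0, x1, stride):
            coords["x_start"].append(x)
            coords["y_start"].append(y)
            coords["x_end"].append(x + patch_size)
            coords["y_end"].append(y + patch_size)
    return coords

coords = patch_coords((0, 0, 1024, 1024))
print(len(coords["x_start"]))  # 49 patches in a 7 x 7 grid
```

With overlap=0.7 the stride is roughly 30% of the patch size, so each pixel is covered by several patches; this redundancy is typical for tile-based segmentation inference.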

preprocess(img)[source]

Apply encoder-specific preprocessing and convert the image to a tensor.

Parameters:

img (np.ndarray) – Input patch of shape (patch_size, patch_size, 3).

Returns:

Preprocessed tensor suitable as input to the GrandQC model.

Return type:

torch.Tensor
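For "imagenet" weights, the preprocessing function returned by segmentation_models_pytorch.encoders.get_preprocessing_fn amounts to channel-wise mean/std normalization. A minimal NumPy sketch of the equivalent math follows; the channel statistics are the standard ImageNet values and are an assumption here, and the final conversion to a torch.Tensor is represented only by the HWC-to-CHW reordering:

```python
import numpy as np

# Standard ImageNet channel statistics (assumed to match the
# 'imagenet' preprocessing applied by the encoder).
MEAN = np.array([0.485, 0.456, 0.406])
STD = np.array([0.229, 0.224, 0.225])

def preprocess_sketch(img):
    """Normalize an RGB uint8 patch and reorder to CHW layout,
    mirroring the math behind GrandQCDataset.preprocess before
    the tensor conversion."""
    x = img.astype(np.float32) / 255.0  # scale to [0, 1]
    x = (x - MEAN) / STD                # channel-wise normalization
    return x.transpose(2, 0, 1)        # HWC -> CHW

patch = np.full((512, 512, 3), 128, dtype=np.uint8)
out = preprocess_sketch(patch)
print(out.shape)  # (3, 512, 512)
```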