histo_kit.grand_qc.dataset¶
Classes
GrandQCDataset – PyTorch Dataset for extracting fixed-size patches from a region while applying padding to boundary areas.
- class histo_kit.grand_qc.dataset.GrandQCDataset(region, bg, bbox_list, patch_size=512, overlap=0.7, pad_value=0, encoder='timm-efficientnet-b0', weights='imagenet')[source]¶
Bases: Dataset

PyTorch Dataset for extracting fixed-size patches from a region while applying padding to boundary areas. Also returns a background mask patch and metadata describing the location of each patch.
- Parameters:
  - region (np.ndarray) – Source RGB region image from which patches will be extracted. Expected shape is (H, W, 3).
  - bg (np.ndarray) – Background mask associated with region, matching spatial dimensions (H, W).
  - bbox_list (list of tuples) – List of bounding boxes defining areas of interest. Each bounding box should be represented as (x_start, y_start, x_end, y_end).
  - patch_size (int, optional) – Target size (height and width) for the extracted patches (default is 512, which is valid for the GrandQC model).
  - overlap (float, optional) – Fractional overlap between neighboring patches (default is 0.7).
  - pad_value (int, optional) – Value used to pad pixels when patches extend beyond the region boundary. Typically background (default is Artifact.BG_THR.value, which corresponds to 0).
  - encoder (str, optional) – Name of the encoder used for preprocessing, passed to segmentation_models_pytorch.encoders.get_preprocessing_fn.
  - weights (str, optional) – Pre-trained weights to use with the encoder (default is "imagenet").
- coords¶
Dictionary of patch coordinates with keys "x_start", "y_start", "x_end", "y_end".
- Type:
dict
- prep_fn¶
Preprocessing function for encoder normalization.
- Type:
callable
- patch_size¶
Final patch spatial size.
- Type:
int
- pad_value¶
Background padding value.
- Type:
int
Notes
Returned items are dictionaries to allow downstream inference pipelines to use bounding box metadata.
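The boundary padding described above can be sketched as follows. This is a minimal standalone illustration of the concept, not the class's actual implementation: a window that extends past the region's bottom or right edge is cropped to the valid area and then padded with pad_value back up to patch_size.

```python
import numpy as np


def extract_padded_patch(region, x_start, y_start, patch_size=512, pad_value=0):
    """Illustrative sketch: extract a fixed-size patch from ``region``,
    padding with ``pad_value`` where the window exceeds the boundary."""
    h, w = region.shape[:2]
    # Crop the window to the part that lies inside the region.
    patch = region[y_start:min(y_start + patch_size, h),
                   x_start:min(x_start + patch_size, w)]
    # Pad the bottom/right edges back up to (patch_size, patch_size),
    # leaving any trailing channel axis untouched.
    pad_h = patch_size - patch.shape[0]
    pad_w = patch_size - patch.shape[1]
    pad_width = ((0, pad_h), (0, pad_w)) + ((0, 0),) * (region.ndim - 2)
    return np.pad(patch, pad_width, mode="constant", constant_values=pad_value)
```

The same cropping applied to the bg mask would produce the matching background mask patch, and the clipped window coordinates are what populate the coords metadata.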