API documentation#
CLI#
PyOPIA top-level code, primarily for managing command-line entry points
- pyopia.cli.convert_raw_images(config_filename: str)#
Convert raw image files to bitmap format (png).
Input images are inferred from the config.toml file. The output folder is created.
- Parameters:
config_filename (str) – Config filename
- pyopia.cli.docs()#
Open browser at PyOPIA’s readthedocs page
- pyopia.cli.export_to_ecotaxa(stats_filename: Path, export_filename: Path, make_label_folders: bool = False, filter_variable: Tuple[str, float, float] = [None, None, None])#
EcoTaxa export: Create a zip file with particle images and statistics csv
- Parameters:
config_filename (pathlib.Path) – Config filename
output_filename (pathlib.Path) – Name of zip file to contain exported particle images and statistics
make_label_folders (bool, optional, default False) – If True, store particle images in sub-folders with label names. NB: This must be False to create an EcoTaxa compatible zip file.
filter_variable (list) – Variables to filter on (name, min, max), e.g. [‘depth’, 5, None]
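The filter_variable triple can be read as a (name, min, max) range filter. The sketch below is not PyOPIA's implementation: it assumes that None leaves that side of the range open and that bounds are inclusive, and it uses plain dicts in place of rows of the stats table. The function name filter_rows is hypothetical.

```python
def filter_rows(rows, filter_variable=(None, None, None)):
    """Keep rows whose `name` value lies within [minimum, maximum].

    Assumptions (not confirmed by the PyOPIA docs): a bound of None leaves
    that side open, bounds are inclusive, and a name of None disables
    filtering altogether.
    """
    name, minimum, maximum = filter_variable
    if name is None:
        return list(rows)
    kept = []
    for row in rows:
        value = row[name]
        if minimum is not None and value < minimum:
            continue
        if maximum is not None and value > maximum:
            continue
        kept.append(row)
    return kept


# Illustrative data only: three particles at different depths.
particles = [{"depth": 2.0}, {"depth": 5.0}, {"depth": 12.5}]
deep = filter_rows(particles, ("depth", 5, None))  # depth >= 5, no upper bound
```

Under these assumptions, ['depth', 5, None] keeps every particle at a depth of 5 or more.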
- pyopia.cli.generate_config(instrument: str, raw_files: str, model_path: str, outfolder: str, output_prefix: str)#
Put an example config.toml file in the current directory
- Parameters:
instrument (str) – either silcam, holo or uvp
raw_files (str) – raw_files
model_path (str) – model_path
outfolder (str) – outfolder
output_prefix (str) – output_prefix
- pyopia.cli.get_custom_progress_bar(description, disable)#
Create a custom rich.progress.Progress object for displaying progress bars
- pyopia.cli.init_project(project_name: str, instrument: str = 'silcam', example_data: bool = False)#
Initialize a PyOPIA processing project with a standard config file and folder layout
- Parameters:
project_name (str) – Name of project, a folder with this name will be created
instrument (str) – Either silcam, holo or uvp
example_data (bool) – If specified, download 10 example SilCam images and put them in the images/ folder
- pyopia.cli.make_montage(stats_filename: Path, output_filename: str = 'montage.png', filter_variable: Tuple[str, float, float] = [None, None, None])#
Create a montage of particles
- Parameters:
config_filename (str) – Config filename
output_filename (str) – Store montage figure to this filename
filter_variable (list) – Variables to filter on (name, min, max), e.g. [‘depth’, 5, None]
- pyopia.cli.merge_mfdata(path_to_data: str, prefix='*', overwrite_existing_partials: bool = True, chunk_size: int = None)#
Combine a multi-file directory of STATS.nc files into a single ‘-STATS.nc’ file that can then be loaded with pyopia.io.load_stats()
- Parameters:
path_to_data (str) – Folder name containing nc files with pattern ‘Image-D-STATS.nc’
prefix (str) – Prefix to multi-file dataset (for replacing the wildcard in ‘Image-D-STATS.nc’). Defaults to ‘*’
overwrite_existing_partials (bool) – Do not reprocess existing merged netcdf files for each chunk if False. Otherwise reprocess (load) and overwrite. This can be used to restart or continue a previous merge operation as new files become available.
chunk_size (int) – Process this many files together and store as partially merged netcdf files, which are then merged at the end. Default: None, process all files together.
- pyopia.cli.modify_config(existing_filename: str, modified_filename: str, raw_files=None, pixel_size=None, step_name=None, modify_arg=None, modify_value=None)#
Modify an existing config.toml file and write a new one to disk
- Parameters:
existing_filename (str) – e.g. config.toml
modified_filename (str) – e.g. config_new.toml
raw_files (str, optional) – modify the raw file input in the [general] settings, by default None
pixel_size (str, optional) – modify the pixel size in the [general] settings, by default None
step_name (str, optional) – the name of the step to modify e.g. segmentation, by default None
modify_arg (str, optional) – the name of the argument to modify within the step, e.g. threshold. Existing arguments will be overwritten, non-existent arguments will be created, by default None
modify_value (str or float, optional) – new value to attach to the ‘modify_arg’ setting e.g. 0.85. Accepts either string or float input, by default None
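The effect of these arguments can be sketched on an already-loaded settings dictionary. This is an illustration only, not the library's code: the helper name modify_settings and the example values are hypothetical, and the dict layout (a general table plus one table per step under steps) is assumed from the [general] and [steps.*] headings used in the example configs.

```python
import copy


def modify_settings(settings, raw_files=None, pixel_size=None,
                    step_name=None, modify_arg=None, modify_value=None):
    """Return a modified copy of a TOML-style settings dict (hypothetical helper)."""
    new_settings = copy.deepcopy(settings)  # leave the input untouched
    if raw_files is not None:
        new_settings["general"]["raw_files"] = raw_files
    if pixel_size is not None:
        new_settings["general"]["pixel_size"] = pixel_size
    if step_name is not None and modify_arg is not None:
        # Overwrites an existing argument or creates a new one.
        new_settings["steps"][step_name][modify_arg] = modify_value
    return new_settings


# Illustrative settings; the file pattern and values are made up.
existing = {
    "general": {"raw_files": "images/*.png", "pixel_size": 28},
    "steps": {"segmentation": {"threshold": 0.98}},
}
modified = modify_settings(existing, step_name="segmentation",
                           modify_arg="threshold", modify_value=0.85)
```

Working on a deep copy mirrors the documented behaviour of writing a new file rather than changing the existing one.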
- pyopia.cli.process(config_filename: str, num_chunks: int = 1, strategy: str = 'block')#
Run a PyOPIA processing pipeline based on a given config.toml
- Parameters:
config_filename (str) – Config filename
num_chunks (int, optional) – Split the dataset into chunks, and process in parallel, by default 1
strategy (str, optional) – Strategy to use for chunking dataset, either block or interleave. Default: block
- pyopia.cli.process_file_list(file_list, pipeline_config, c)#
Run a PyOPIA processing pipeline for a chunked list of files based on a given config.toml
- Parameters:
file_list (str) – List of file paths to process, where each file will be passed individually through the processing pipeline
pipeline_config (str) – Loaded config.toml file to initialize the processing pipeline and setup logging
c (int) – Chunk index for tracking progress and logging. If set to 0, enables the progress bar; for other values, the progress bar is disabled.
- pyopia.cli.process_realtime(config_filename: str, watch_folder: str = None)#
Run a PyOPIA processing pipeline in realtime by watching a folder.
- Parameters:
config_filename (str) – Config filename
watch_folder (str, optional) – Folder to monitor. If not provided, inferred from general.raw_files in config.
Notes
Single-core only: files are processed sequentially by a single worker thread.
Uses watchdog moved events only (rename into place) to avoid half-written files.
- pyopia.cli.setup_logging(pipeline_config)#
Configure logging
- Parameters:
pipeline_config (dict) – TOML settings
Pipeline#
Module for managing the PyOpia processing pipeline
Refer to the Pipeline class documentation for examples of how to process datasets and images
- class pyopia.pipeline.Data#
Data dictionary which is passed between pyopia.pipeline steps.
- bgstack: float#
List of images making up the background (either static or moving). Obtained from
pyopia.background.CorrectBackgroundAccurate
- cl: object#
classifier object from
pyopia.classify.Classify
- filename: str#
Filename string
- im_corrected: float#
Single composite image of focussed particles ready for segmentation. Obtained from e.g.
pyopia.background.CorrectBackgroundAccurate
- im_focussed: float#
Focussed holographic image
- im_masked: float#
Masked raw image with removed potentially noisy border region before further processing. Obtained from e.g.
pyopia.instrument.common.RectangularImageMask
- im_minimum: float#
A 2-d flattened RGB image representing the minimum intensity of all channels. Obtained from e.g.
pyopia.instrument.silcam.ImagePrep
- im_stack: float#
3-d array of reconstructed real hologram images. Obtained from
pyopia.instrument.holo.Reconstruct
- imbg: float#
Background image that can be used to correct pyopia.pipeline.Data.imraw and calculate pyopia.pipeline.Data.im_corrected. Obtained from
pyopia.background.CorrectBackgroundAccurate
- imbw: float#
Segmented binary image identifying particles from water. Obtained from e.g.
pyopia.process.Segment
- imc: float#
Deprecated. Replaced by im_corrected
- img: float#
Deprecated. Replaced by imraw
- imraw: float#
Raw uncorrected image
- imref: float#
Reference background corrected image passed to silcam classifier
- imss: float#
Stack summary image used to locate possible particles. Obtained from
pyopia.instrument.holo.Focus
- raw_files: str#
String used by glob to obtain file list of data to be processed. This is extracted automatically from ‘general.raw_files’ in the toml config during pipeline initialisation.
- stats: DataFrame#
stats DataFrame containing particle statistics of every particle. Obtained from e.g.
pyopia.process.CalculateStats
- steps_string: str#
String documenting the steps given to pyopia.pipeline. This is put here for documentation purposes, and saving as metadata.
- timestamp: datetime#
timestamp from e.g.
pyopia.instrument.silcam.timestamp_from_filename()
- class pyopia.pipeline.FilesToProcess(glob_pattern=None)#
Build a file list from a glob pattern, if specified. Create FilesToProcess.chunked_files if chunks are specified. The file list from glob will be sorted. If a filelist file is specified, load the list from there without sorting.
- Parameters:
glob_pattern (str, optional) – Glob pattern, by default None. If it ends with .txt, interpret as a filelist file.
- build_initial_background_files(average_window=0)#
Create a list of files to use for initializing the background in the first chunk
- Parameters:
average_window (int, optional) – number of images to use in creating a background, by default 0
- chunk_files(num_chunks: int, strategy: str = 'block')#
Chunk the file list and create FilesToProcess.chunked_files
- Parameters:
num_chunks (int) – number of chunks to produce (must be at least 1)
strategy (str, optional) – Strategy to use for chunking dataset, either block or interleave. Default: block
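The two strategy names suggest the following behaviour, sketched here in plain Python. This is an assumption about what block and interleave mean (contiguous runs versus round-robin assignment), not a copy of the FilesToProcess code, and the function name chunk_list is hypothetical.

```python
def chunk_list(files, num_chunks, strategy="block"):
    """Split a list into num_chunks chunks (hypothetical helper).

    Assumed meaning of the strategies:
      block      - contiguous runs of files, earlier chunks take any remainder
      interleave - file i goes to chunk i % num_chunks (round-robin)
    """
    if num_chunks < 1:
        raise ValueError("num_chunks must be at least 1")
    if strategy == "block":
        size, remainder = divmod(len(files), num_chunks)
        chunks, start = [], 0
        for i in range(num_chunks):
            end = start + size + (1 if i < remainder else 0)
            chunks.append(files[start:end])
            start = end
        return chunks
    if strategy == "interleave":
        return [files[i::num_chunks] for i in range(num_chunks)]
    raise ValueError(f"unknown strategy: {strategy!r}")


files = ["a", "b", "c", "d", "e"]
blocks = chunk_list(files, 2, strategy="block")
interleaved = chunk_list(files, 2, strategy="interleave")
```

With a time-ordered file list, block keeps neighbouring images together in one chunk, while interleave spreads each chunk across the whole deployment.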
- from_filelist_file(path_to_filelist)#
Initialize explicit list of files to process from a text file. The text file should contain one path to an image per line, which should be processed in order.
- prepare_chunking(num_chunks, average_window, bgshift_function, strategy='block')#
Chunk the file list and add initial background files to each chunk
- Parameters:
num_chunks (int) – Number of chunks to produce (must be at least 1)
average_window (int) – Number of images to use for background correction
bgshift_function (str) – Background update strategy, either pass (static background) or accurate
strategy (str, optional) – Strategy to use for chunking dataset, either block or interleave. Default: block
- to_filelist_file(path_to_filelist)#
Write file list to a txt file
- Parameters:
path_to_filelist (str) – Path to txt file to write
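The filelist format described above (one image path per line, processed in the order given) can be sketched with the standard library alone. The helper names write_filelist and read_filelist are hypothetical and stand in for to_filelist_file() and from_filelist_file(); skipping blank lines is an assumption.

```python
import os
import tempfile


def write_filelist(paths, path_to_filelist):
    """Write one path per line, preserving the given order."""
    with open(path_to_filelist, "w") as f:
        for path in paths:
            f.write(f"{path}\n")


def read_filelist(path_to_filelist):
    """Read paths back in file order, without sorting (blank lines skipped)."""
    with open(path_to_filelist) as f:
        return [line.strip() for line in f if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    filelist = os.path.join(tmp, "filelist.txt")
    write_filelist(["b.png", "a.png"], filelist)
    loaded = read_filelist(filelist)  # order is kept: b.png, then a.png
```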
- class pyopia.pipeline.Pipeline(settings, initial_steps=['initial', 'classifier', 'createbackground'])#
The processing pipeline class
Note
The classes called in the Pipeline steps can be modified, and the names of the steps changed. New steps can be added or deleted as required.
The classes called in the Pipeline steps need to take a TOML-formatted dictionary as input and return a dictionary of data as output. This common data dictionary, pyopia.pipeline.Data, is therefore passed between steps so that data or variables generated by each step can be passed along the pipeline.
By default, the step names initial, classifier, and createbackground are run when initialising Pipeline. The remaining steps will be run on Pipeline.run(). You can add initial steps with the optional input initial_steps, which takes a list of strings of the step key names that should only be run on initialisation of the pipeline, e.g.:
processing_pipeline = pyopia.pipeline.Pipeline(toml_settings, initial_steps=['classifier', 'novel_initial_process'])
The step called ‘classifier’ must return a dict containing pyopia.pipeline.Data.cl in order to run successfully.
Pipeline.run() takes a string as input. This string is put into pyopia.pipeline.Data, available to the steps in the pipeline as data['filename']. This is intended for use in looping through several files during processing, so run can be called multiple times with different filenames.
Examples
Examples of setting up and running a pipeline can be found for SilCam here, and holographic analysis here.
Example config files can be found for SilCam here, and for holographic analysis here.
You can check the workflow used by reading the steps from the metadata in the output file using pyopia.io.steps_from_xstats()
More examples and guides can be found on the PyOPIA By Example page.
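The idea of a shared data dictionary passed from step to step can be illustrated without importing PyOPIA. The sketch below assumes that each step is a callable object that receives the data dict and returns it with new keys added; the classes LoadStep and ThresholdStep and the function run_steps are hypothetical stand-ins, not library classes.

```python
class LoadStep:
    """Hypothetical step: puts a tiny fake image into the data dict."""

    def __call__(self, data):
        data["imraw"] = [[0.2, 0.9], [0.95, 1.0]]
        return data


class ThresholdStep:
    """Hypothetical step: marks pixels darker than a threshold."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold

    def __call__(self, data):
        data["imbw"] = [[px < self.threshold for px in row]
                        for row in data["imraw"]]
        return data


def run_steps(steps, filename):
    """Pass one data dict through every step in order."""
    data = {"filename": filename}  # each step can read data['filename']
    for step in steps.values():
        data = step(data)
    return data


steps = {"load": LoadStep(), "segmentation": ThresholdStep(threshold=0.5)}
result = run_steps(steps, "example.png")
```

Because every step reads from and writes to the same dict, a later step can use any key produced earlier, which is why step order matters.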
- print_steps()#
Print the version number and steps dict (for log_level = DEBUG)
- run(filename)#
Method for executing the processing pipeline.
- Parameters:
filename (str) – file to be processed
- Returns:
stats – particle statistics associated with ‘filename’
- Return type:
DataFrame
Note
The returned stats from this function are single-image only and not appended if you loop through several filenames! It is recommended to use this step in the pipeline for properly appending data into NetCDF format when processing several files.
[steps.output]
pipeline_class = 'pyopia.io.StatsDisc'
output_datafile = 'proc/test' # prefix path for output nc file
- run_step(stepname)#
Execute a pipeline step and update the pipeline data
- Parameters:
stepname (str) – Name of the step defined in the settings
- step_callobj(stepname)#
Generate a callable object for use in run_step()
- Parameters:
stepname (str) – Name of the step defined in the settings
- Returns:
callable object for use in run_step()
- Return type:
obj
- pyopia.pipeline.build_repr(toml_steps, step_name)#
Build a callable object from settings, which can be used to construct the pipeline steps dict
- Parameters:
toml_steps (dict) – TOML-formatted steps
step_name (str) – the key of the TOML-formatted steps which should be used to create a callable object
- Returns:
callable object, useable in a pipeline steps dict
- Return type:
obj
- pyopia.pipeline.build_steps(toml_steps)#
Build a steps dictionary, ready for pipeline use, from a TOML-formatted steps dict
- Parameters:
toml_steps (dict) – TOML-formatted steps (usually loaded from a config.toml file)
- Returns:
steps dict that is useable by pyopia.pipeline.Pipeline
- Return type:
dict
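One way to turn a TOML step table into a callable object is to resolve the dotted pipeline_class string with importlib and pass the remaining keys as keyword arguments. The sketch below shows that general technique only: whether build_repr() and build_steps() work this way is an assumption, the helper name build_callable is hypothetical, and fractions.Fraction is used purely as a stand-in class from the standard library.

```python
import importlib


def build_callable(step_settings):
    """Instantiate the class named by 'pipeline_class' (hypothetical helper).

    All other keys in the step table are passed as keyword arguments.
    """
    settings = dict(step_settings)  # do not mutate the caller's dict
    module_name, class_name = settings.pop("pipeline_class").rsplit(".", 1)
    cls = getattr(importlib.import_module(module_name), class_name)
    return cls(**settings)


# Stand-in example: a stdlib class instead of a PyOPIA pipeline class.
step = build_callable({
    "pipeline_class": "fractions.Fraction",
    "numerator": 3,
    "denominator": 4,
})
```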
- pyopia.pipeline.steps_to_string(steps)#
Deprecated. Convert pipeline steps dictionary to a human-readable string
- Parameters:
steps (dict) – pipeline steps dictionary
- Returns:
steps_str – human-readable string of the types and variables
- Return type:
str
Background#
Background correction module (inherited from PySilCam)
- class pyopia.background.CorrectBackgroundAccurate(bgshift_function='pass', average_window=1, image_source='imraw', divide_bg=False)#
pyopia.pipeline compatible class that calls pyopia.background.correct_im_accurate() and will shift the background using a moving average function if given.
The background stack and background image are created during the first ‘average_window’ (int) calls to this class, and the skip_next_steps flag is set in the pipeline Data. No background correction is performed during these steps.
- Required keys in
pyopia.pipeline.Data:
- Parameters:
bgshift_function ((string, optional)) –
Function used to shift the background. Defaults to passing (i.e. static background). Available options are ‘accurate’, ‘fast’, or ‘pass’ (to apply a static background correction).
average_window (int) – number of images to use in the background image stack
image_source ((str, optional)) – The key in Pipeline.data of the image to be background corrected. Defaults to ‘imraw’
divide_bg ((bool)) – If True, it performs background correction by dividing the raw image by the background. Defaults to False.
- Returns:
data – containing the following new keys:
pyopia.pipeline.Data.im_corrected
- Return type:
Examples
Apply moving average using pyopia.background.shift_bgstack_accurate():
[steps.correctbackground]
pipeline_class = 'pyopia.background.CorrectBackgroundAccurate'
bgshift_function = 'accurate'
average_window = 5
Apply static background correction:
[steps.correctbackground]
pipeline_class = 'pyopia.background.CorrectBackgroundAccurate'
bgshift_function = 'pass'
average_window = 5
If you do not want to do background correction, leave this step out of the pipeline. If a placeholder step is needed instead, you could use pyopia.background.CorrectBackgroundNone.
- class pyopia.background.CorrectBackgroundNone#
pyopia.pipeline compatible class for use when no background correction is required. This simply makes data['im_corrected'] = data['imraw'] in the pipeline.
- Required keys in
pyopia.pipeline.Data:
- Parameters:
None –
- Returns:
data – containing the following new keys:
- Return type:
Do not apply any background correction after the image load step:
[steps.nobackground]
pipeline_class = 'pyopia.background.CorrectBackgroundNone'
- pyopia.background.correct_im_accurate(imbg, imraw, divide_bg=False)#
Corrects raw image by subtracting or dividing the background and scaling the output
For the dividing method see: https://doi.org/10.1016/j.marpolbul.2016.11.063
There is a small chance of clipping of imc in both crushed blacks and blown highlights if the background or raw images are very poorly obtained
- Parameters:
imbg (float64) – background averaged image
imraw (float64) – raw image
divide_bg ((bool, optional)) – If True, the correction will be performed by dividing the raw image by the background. Defaults to False
- Returns:
im_corrected – corrected image, same type as input
- Return type:
float64
- pyopia.background.correct_im_fast(imbg, imraw)#
Corrects raw image by subtracting the background and clipping the output without scaling
There is high potential for clipping of imc in both crushed blacks and blown highlights, especially if the background or raw images are not properly obtained
- Parameters:
imraw (array) – raw image
imbg (array) – background averaged image
- Returns:
im_corrected – corrected image
- Return type:
array
- pyopia.background.ini_background(bgfiles, load_function)#
Create an initial background stack and average image
- Parameters:
bgfiles (list) – List of strings of filenames to be used in background creation
load_function (object) – This function should take a filename and return an image, for example:
pyopia.instrument.silcam.load_image()
- Returns:
bgstack (list) – list of all images in the background stack
imbg (array) – background image
- pyopia.background.shift_and_correct(bgstack, imbg, imraw, stacklength, real_time_stats=False)#
Shifts the background stack and averaged image and corrects the new raw image.
This is a wrapper for shift_bgstack and correct_im
- Parameters:
bgstack (list) – list of all images in the background stack
imbg (float64) – background image
imraw (float64) – raw image
stacklength (int) – unused int here - just there to maintain the same behaviour as shift_bgstack_fast()
real_time_stats (bool, optional) – If True, use fast functions; if False, use accurate functions. By default False
- Returns:
bgstack (list) – list of all images in the background stack
imbg (float64) – background averaged image
im_corrected (float64) – corrected image
- pyopia.background.shift_bgstack_accurate(bgstack, imbg, imnew)#
Shifts the background by popping the oldest image and adding a new image
The new background is calculated slowly by computing the mean of all images in the background stack.
- Parameters:
bgstack (list) – list of all images in the background stack
imbg (array) – background image
imnew (array) – new image to be added to stack
- Returns:
bgstack (list) – updated list of all background images
imbg (array) – updated actual background image
- pyopia.background.shift_bgstack_fast(bgstack, imbg, imnew)#
Shifts the background by popping the oldest image and adding a new image
The new background is approximated quickly by subtracting the old image and adding the new image (both scaled by the stacklength). This is close to a running mean, but not quite.
- Parameters:
bgstack (list) – list of all images in the background stack
imbg (uint8) – background image
imnew (uint8) – new image to be added to stack
- Returns:
bgstack (list) – updated list of all background images
imbg (array) – updated actual background image
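The difference between the accurate and fast background updates can be sketched for a single pixel, using plain floats in place of images. This follows the descriptions above rather than the library source, and the function names shift_accurate and shift_fast are hypothetical.

```python
def shift_accurate(bgstack, new_value):
    """Drop the oldest value, append the new one, recompute the full mean."""
    bgstack = bgstack[1:] + [new_value]
    return bgstack, sum(bgstack) / len(bgstack)


def shift_fast(bgstack, background, new_value):
    """Update the mean incrementally instead of re-averaging the stack."""
    n = len(bgstack)
    oldest = bgstack[0]
    # Subtract the old contribution and add the new one, both scaled by n.
    background = background - oldest / n + new_value / n
    return bgstack[1:] + [new_value], background


stack = [10.0, 20.0, 30.0]            # one pixel across three images
background = sum(stack) / len(stack)  # current mean of the stack

stack_a, bg_a = shift_accurate(stack, 40.0)
stack_f, bg_f = shift_fast(stack, background, 40.0)
```

In this floating-point sketch the two results agree to rounding error. The "not quite" in the description may come from effects the sketch does not model, such as limited-precision image types or accumulated rounding over many updates.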
Process#
Module containing tools for processing particle image data
- class pyopia.process.CalculateImageStats#
PyOpia pipeline-compatible class for collecting whole-image statistics
- Required keys in
pyopia.pipeline.Data:
- Parameters:
None –
- Returns:
data – containing the following new keys:
pyopia.pipeline.Data.image_stats
- Return type:
- class pyopia.process.CalculateStats(max_coverage=30, max_particles=5000, export_outputpath=None, min_length=0, propnames=['major_axis_length', 'minor_axis_length', 'equivalent_diameter'], roi_source='im_corrected', bbox_expansion=0.0)#
PyOpia pipeline-compatible class for calling statextract
- Required keys in
pyopia.pipeline.Data:
- Parameters:
max_coverage ((int, optional)) – percentage of the image that is allowed to be filled by particles. Defaults to 30.
max_particles ((int, optional)) – maximum allowed number of particles in an image. Exceeding this will discard the image from analysis. Defaults to 5000.
export_outputpath ((str, optional)) – Path to folder to put extracted particle ROIs (in h5 files). Required for making montages later.
min_length ((int, optional)) – The minimum length of particles (in pixels) to include in output ROIs
propnames ((list, optional)) – Specifies properties wanted from skimage.regionprops. Defaults to [‘major_axis_length’, ‘minor_axis_length’, ‘equivalent_diameter’]
roi_source ((str, optional)) – Key of an image in Pipeline.data that is used for outputting ROIs and passing to the classifier. Defaults to ‘im_corrected’
bbox_expansion ((float, optional)) – Fractional expansion applied to each particle bounding box before the ROI is cropped and exported, e.g. 0.1 enlarges the crop by 10% in width and height (5% on each side, clamped to image bounds). The regionprops measurements and the minr/minc/maxr/maxc columns written into stats are unaffected. Defaults to 0.0 (no expansion).
Configure from a TOML pipeline as:
[steps.statextract]
pipeline_class = "pyopia.process.CalculateStats"
export_outputpath = "/path/to/rois"
bbox_expansion = 0.1
- Returns:
data – containing the following new keys:
- Return type:
- class pyopia.process.Segment(minimum_area=12, threshold=0.98, fill_holes=True, segment_source='im_corrected', segmentation_method='fast')#
PyOpia pipeline-compatible class for calling segment
- Required keys in
pyopia.pipeline.Data:
- Parameters:
minimum_area ((int, optional)) – minimum number of pixels for particle detection. Defaults to 12.
threshold ((float, optional)) – threshold for segmentation. Defaults to 0.98.
fill_holes ((bool)) – runs ndi.binary_fill_holes if True. Defaults to True.
segment_source ((str, optional)) – The key in Pipeline.data of the image to be segmented. Defaults to ‘im_corrected’
segmentation_method ((str, optional)) – Segmentation method to use: 'fast' or 'accurate'. Defaults to 'fast'.
- Returns:
data – containing the following new keys:
- Return type:
- pyopia.process.clean_bw(imbw, minimum_area)#
Cleans up particles which are too small and particles touching the border
- Parameters:
imbw (array) – Segmented image
minimum_area (float) – Minimum number of accepted pixels for a particle
- Returns:
imbw_clean – cleaned up segmented image
- Return type:
array
- pyopia.process.concentration_check(imbw, max_coverage=30)#
Check saturation level of the sample volume by comparing area of particles with settings.Process.max_coverage
- Parameters:
imbw (array) – segmented image
max_coverage (int, optional) – percentage of image allowed to be black, by default 30
- Returns:
sat_check (bool) – True if the saturation of the image is acceptable
saturation (float) – Percentage of maximum acceptable saturation defined by max_coverage
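The relationship between coverage, max_coverage and the returned saturation can be sketched as follows. Nested lists stand in for the binary image array, the function name coverage_check is hypothetical, and treating a saturation of exactly 100% as acceptable is an assumption.

```python
def coverage_check(imbw, max_coverage=30):
    """Return (sat_check, saturation) for a binary image given as nested lists.

    saturation is the covered area expressed as a percentage of the
    maximum acceptable coverage, so values above 100 fail the check.
    """
    n_pixels = sum(len(row) for row in imbw)
    n_particle_pixels = sum(1 for row in imbw for px in row if px)
    covered_percent = 100 * n_particle_pixels / n_pixels
    saturation = 100 * covered_percent / max_coverage
    return saturation <= 100, saturation  # assumption: exactly 100 passes


imbw = [[1, 0],
        [0, 0]]  # one of four pixels belongs to a particle: 25% coverage
ok, saturation = coverage_check(imbw, max_coverage=30)
too_dense, saturation_strict = coverage_check(imbw, max_coverage=20)
```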
- pyopia.process.expand_bbox(bbox, image_shape, fraction)#
Expand a bounding box by a fraction of its width and height, clamped to image bounds.
The expansion is split evenly on each side, so a fraction of 0.10 grows the bounding box by 5% on each side (total +10% width, +10% height). Coordinates are clamped to remain inside the image. Useful for adding visual context around exported particle ROIs without altering the underlying regionprops measurements.
- Parameters:
bbox (array-like of int) – [min_row, min_col, max_row, max_col], following the skimage regionprops convention where max_row and max_col are exclusive.
image_shape (tuple) – Shape of the full image. Only the first two elements (H, W) are used, so passing imc.shape works for both 2-D and 3-D images.
fraction (float) – Total fractional expansion of width and height. 0.1 = +10%. 0 (or None) returns the bbox unchanged. Must be non-negative.
- Returns:
expanded – Expanded and clamped bounding box, integer-valued.
- Return type:
ndarray of int, shape (4,)
- Raises:
ValueError – If fraction is negative.
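The documented behaviour (half of the expansion on each side, clamping to the image, an error for negative fractions) can be sketched in plain Python. This is an illustration rather than the library code: the name expand_bbox_sketch is hypothetical, it returns a tuple instead of an ndarray, and rounding the padding to the nearest integer is an assumption.

```python
def expand_bbox_sketch(bbox, image_shape, fraction):
    """Expand [min_row, min_col, max_row, max_col] by a total fraction."""
    if fraction is None or fraction == 0:
        return tuple(int(v) for v in bbox)  # unchanged
    if fraction < 0:
        raise ValueError("fraction must be non-negative")
    min_row, min_col, max_row, max_col = bbox
    height, width = image_shape[:2]  # works for 2-D and 3-D shapes
    # Half of the total expansion goes on each side (rounding is assumed).
    pad_rows = int(round((max_row - min_row) * fraction / 2))
    pad_cols = int(round((max_col - min_col) * fraction / 2))
    return (
        max(min_row - pad_rows, 0),
        max(min_col - pad_cols, 0),
        min(max_row + pad_rows, height),
        min(max_col + pad_cols, width),
    )


# A 20 x 40 pixel box grows by 1 row and 2 columns on each side.
expanded = expand_bbox_sketch((10, 10, 30, 50), (100, 100), 0.1)
# A box already filling the image cannot grow past its bounds.
clamped = expand_bbox_sketch((0, 0, 10, 10), (10, 10, 3), 1.0)
```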
- pyopia.process.extract_particles(imc, timestamp, Classification, region_properties, export_outputpath=None, min_length=0, propnames=['major_axis_length', 'minor_axis_length', 'equivalent_diameter'], bbox_expansion=0.0)#
Extracts the particles to build stats and export particle rois to HDF5 files
- Parameters:
imc (array) – background-corrected image
timestamp (timestamp) – timestamp of image collection
Classification (keras model) – initialised classification class from pyopia.classify
region_properties (object) – region properties object returned from regionprops (skimage.measure.regionprops)
export_outputpath (str, optional) – path for writing h5 output files, by default None, which switches off file writing
min_length (int, optional) – specifies minimum particle length in pixels to include, by default 0
propnames (list, optional) – Specifies list of skimage regionprops to export to the output file. Must contain default values that can be appended to, by default [‘major_axis_length’, ‘minor_axis_length’, ‘equivalent_diameter’]
bbox_expansion (float, optional) – Fractional expansion of the bounding box used when cropping each ROI for export. 0.0 (default) preserves prior behaviour. 0.1 grows the crop by 10% in width and height (5% on each side), clamped to image bounds. Only the exported ROI image is affected; the minr/minc/maxr/maxc columns saved in stats continue to report the un-expanded regionprops bbox so that measurements are unchanged.
- Returns:
stats – List of particle statistics for every particle, according to Partstats class
- Return type:
DataFrame
- pyopia.process.extract_roi(input_image, bbox)#
Given a full image and bounding box, this will return the roi image from within the bounding box
- Parameters:
input_image (array) – Full image. Can be any image, such as background-corrected image
bbox (array) – bounding box from regionprops [r1, c1, r2, c2]
- Returns:
roi – Image cropped to region of interest
- Return type:
array
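Cropping with a regionprops-style bounding box amounts to slicing rows r1 to r2 and columns c1 to c2, with the upper bounds exclusive. A minimal sketch using nested lists in place of an image array follows; the name crop_roi is hypothetical.

```python
def crop_roi(image, bbox):
    """Return the sub-image inside bbox = [r1, c1, r2, c2] (r2, c2 exclusive)."""
    r1, c1, r2, c2 = bbox
    # With a NumPy array the same crop is image[r1:r2, c1:c2].
    return [row[c1:c2] for row in image[r1:r2]]


image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
]
roi = crop_roi(image, (1, 1, 3, 3))  # rows 1-2, columns 1-2
```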
- pyopia.process.get_spine_length(imbw_roi)#
Extracts the spine length of particles from a binary particle image (imbw is a binary roi)
- Parameters:
imbw_roi (array) – a binary roi of a single particle
- Returns:
spine_length – spine length of particle (in pixels)
- Return type:
float
- pyopia.process.image2blackwhite_accurate(input_image, greythresh)#
Converts corrected image (im_corrected) to a binary image using greythresh as the threshold value (some auto-scaling of greythresh is done)
- Parameters:
input_image (array) – image. Usually a background-corrected image
greythresh (float) – threshold multiplier (greythresh is multiplied by 50th percentile of the image histogram)
- Returns:
imbw – segmented image (binary image)
- Return type:
array
- pyopia.process.image2blackwhite_fast(input_image, greythresh)#
Converts an image (input_image) to a binary image using greythresh as the threshold value (fixed scaling of greythresh is done)
- Parameters:
input_image (array) – image. Usually a background-corrected image
greythresh (float) – threshold multiplier (greythresh is multiplied by 50th percentile of the image histogram)
- Returns:
imbw – segmented image (binary image)
- Return type:
array
- pyopia.process.measure_particles(imbw, max_particles=5000)#
Measures properties of particles
- Parameters:
imbw (array) – full-frame binary image
max_particles (int, optional) – maximum number of particles accepted, by default 5000
- Returns:
region_properties – Region properties object returned from regionprops (skimage.measure.regionprops)
- Return type:
object
- Raises:
RuntimeError – Raises an error if the number of particles exceeds max_particles
- pyopia.process.put_roi_in_h5(export_outputpath, HDF5File, roi, filename, i)#
Adds rois to an open hdf file if export_outputpath is not None. For use within pyopia.process.export_particles()
- Parameters:
export_outputpath (str) – path to folder in which to put ROIs
HDF5File (h5 file object) – file object for h5 file
roi (uint8) – particle ROI image
i (int) – particle number
- Returns:
filename – filename
- Return type:
str
- pyopia.process.segment(img, threshold=0.98, minimum_area=12, fill_holes=True, segmentation_method='fast')#
Create a binary image from a background-corrected image.
- Parameters:
img (np.array) – background-corrected image
threshold (float, optional) – segmentation threshold, by default 0.98
minimum_area (int, optional) – minimum number of pixels to be considered a particle, by default 12
fill_holes (bool, optional) – runs ndi.binary_fill_holes if True, by default True
segmentation_method (str, optional) – Segmentation function to use. Supports 'fast' and 'accurate'. By default 'fast'.
- Returns:
imbw – segmented image
- Return type:
np.array
- pyopia.process.statextract(imbw, timestamp, imc, Classification=None, max_coverage=30, max_particles=5000, export_outputpath=None, min_length=0, propnames=['major_axis_length', 'minor_axis_length', 'equivalent_diameter'], bbox_expansion=0.0)#
Extracts statistics of particles in a binary image (imbw)
- Parameters:
imbw (array) – Segmented binary image
timestamp (timestamp) – Timestamp of image collection
imc (array) – Image to analyse (e.g. background-corrected image)
Classification (keras model, optional) – Initialised classification class from pyopia.classify, by default None
max_coverage (int, optional) – Maximum percentage of image that is acceptable as covered by particles. Image skipped if exceeded, by default 30
max_particles (int, optional) – Maximum number of particles accepted in the image. Image skipped if exceeded, by default 5000
export_outputpath (str, optional) – Path for writing h5 output files, by default None, which switches off file writing
min_length (int, optional) – Specifies minimum particle length in pixels to include, by default 0
propnames (list, optional) – Specifies list of skimage regionprops to export to the output file. Must contain default values that can be appended to, by default [‘major_axis_length’, ‘minor_axis_length’, ‘equivalent_diameter’]
bbox_expansion (float, optional) – Fractional expansion of bounding boxes when cropping ROI images for export. See
extract_particles(). Defaults to 0.0 (no expansion).
- Returns:
stats (DataFrame) – Pandas DataFrame of particle statistics for every particle
saturation (float) – Percentage saturation of image
Statistics#
Module containing tools for handling particle image statistics after processing
- class pyopia.statistics.PerClassConcentration(output_csv, probability_threshold=0.0, overwrite=False)#
PyOpia pipeline-compatible class for computing per-class number concentrations (in numbers/litre) for each processed image, and appending the result as a timestamp-indexed row to a CSV file.
- Required keys in
pyopia.pipeline.Data: pyopia.pipeline.Data.stats (with probability_<class> columns)
For every particle in the current image’s stats, the most likely class is selected as argmax over the probability_<class> columns. Particles whose best-guess probability is below probability_threshold are counted in an unclassified column. Counts are divided by the per-image sample volume (computed from pixel size, path length and image dimensions) to give a concentration in numbers/litre.
A row is appended to output_csv per image, with the image timestamp as the index. Columns are <class_name> for each classifier class, plus unclassified, total and sample_volume_L.
Both pixel_size and path_length are read from the general section of the settings dict (data['settings']['general']).
- Parameters:
output_csv (str) – Path to the CSV file to write. Parent directory is created if missing.
probability_threshold (float, optional) – Minimum best-guess probability for a particle to count toward its class. Particles below this are counted in the unclassified column. Default 0.0.
overwrite (bool, optional) – If True, remove any existing output_csv at construction time so the run starts with a fresh file. Default False (append).
- Returns:
data – The pipeline data dict, unchanged.
- Return type:
Example
Example config for pipeline usage:
[general]
pixel_size = 28
path_length = 40

[steps.classconcentration]
pipeline_class = 'pyopia.statistics.PerClassConcentration'
output_csv = 'proc/per_class_concentration.csv'
probability_threshold = 0.5
overwrite = false
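The counting rule described above can be sketched with plain pandas. This is an illustration only, not the PyOPIA implementation: the function name per_class_concentration, the example class names, and the choice to compute total as the sum of all class columns plus unclassified are assumptions made for the example.

```python
import pandas as pd


def per_class_concentration(stats, sample_volume_litres, probability_threshold=0.0):
    """Count particles per most-likely class and divide by the sample volume.

    Hypothetical helper: mirrors the documented behaviour, not the library code.
    """
    prob_cols = [c for c in stats.columns if c.startswith("probability_")]
    counts = {c[len("probability_"):]: 0 for c in prob_cols}
    counts["unclassified"] = 0

    if len(stats) > 0:
        probs = stats[prob_cols]
        best_col = probs.idxmax(axis=1)  # argmax over the probability columns
        best_val = probs.max(axis=1)     # probability of that best guess
        for col, val in zip(best_col, best_val):
            if val < probability_threshold:
                counts["unclassified"] += 1
            else:
                counts[col[len("probability_"):]] += 1

    # Convert counts to numbers/litre.
    row = {name: n / sample_volume_litres for name, n in counts.items()}
    row["total"] = sum(row.values())  # assumption: total includes unclassified
    row["sample_volume_L"] = sample_volume_litres
    return row


# Three particles from one image; the third has no confident class.
stats = pd.DataFrame({
    "probability_oil": [0.9, 0.2, 0.4],
    "probability_bubble": [0.1, 0.8, 0.45],
})
row = per_class_concentration(stats, sample_volume_litres=0.5, probability_threshold=0.5)
```

With a threshold of 0.5, one particle lands in each of oil, bubble and unclassified, so each column holds 2.0 numbers/litre for a 0.5 L sample volume.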
- pyopia.statistics.add_best_guesses_to_stats(stats)#
Calculates the most likely tensorflow classification and adds best guesses to stats dataframe.
- Parameters:
stats (DataFrame) – particle statistics from silcam process
- Returns:
stats – particle statistics from silcam process with new columns for best guess and best guess value
- Return type:
DataFrame
- pyopia.statistics.add_depth_to_stats(stats, time, depth)#
If you have a depth time-series, use this function to find the depth of each line in stats
- Parameters:
stats (DataFrame) – particle statistics
time (array) – time stamps associated with depth argument
depth (array) – depths associated with the time argument
- Returns:
stats – particle statistics now with a ‘Depth’ column for each particle
- Return type:
DataFrame
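As a rough illustration of matching a depth time-series to particles, the sketch below linearly interpolates depth onto each particle's timestamp. It assumes linear interpolation and a 'timestamp' column in stats; the method used by add_depth_to_stats itself is not stated here, and depth_for_each_particle is a hypothetical name.

```python
import numpy as np
import pandas as pd


def depth_for_each_particle(stats, time, depth):
    """Linearly interpolate a depth time-series onto each particle's timestamp."""
    out = stats.copy()
    t0 = pd.Timestamp(time[0])
    # Express both time axes as seconds since the first depth sample.
    particle_s = (pd.to_datetime(out["timestamp"]) - t0).dt.total_seconds().to_numpy()
    series_s = (pd.to_datetime(pd.Series(time)) - t0).dt.total_seconds().to_numpy()
    out["Depth"] = np.interp(particle_s, series_s, np.asarray(depth, dtype=float))
    return out


time = pd.to_datetime(["2024-01-01 00:00:00", "2024-01-01 00:00:10"])
depth = [0.0, 10.0]
stats = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2024-01-01 00:00:05", "2024-01-01 00:00:05", "2024-01-01 00:00:10",
    ]),
})
with_depth = depth_for_each_particle(stats, time, depth)
```

Particles from the same image share a timestamp and therefore receive the same depth.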
- pyopia.statistics.bright_norm(im, brightness=1.0)#
Eye-candy function for normalising the image brightness
- Parameters:
im (array) – image (normally a particle ROI)
brightness (float, optional) – median of histogram will be shifted to align with this value. Should be a float between 0-1, by default 1
- Returns:
im – image with modified brightness
- Return type:
array
- pyopia.statistics.count_images_in_stats(stats)#
count the number of raw images used to generate stats
- Parameters:
stats (DataFrame) – particle statistics
- Returns:
n_images – number of images in the stats data
- Return type:
int
- pyopia.statistics.crop_stats(stats, crop_stats)#
Filters stats file based on whether the particles are within a rectangle specified by crop_stats.
- Parameters:
stats (DataFrame) – Particle stats dataframe for every particle
crop_stats (tuple) – 4-tuple of lower-left (row, column) then upper-right (row, column) coord of crop
- Returns:
cropped_stats – cropped silcam stats file
- Return type:
DataFrame
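A minimal sketch of rectangle filtering follows. It assumes the stats DataFrame carries the scikit-image bounding-box columns minr, minc, maxr and maxc, and that a particle is kept only when its whole bounding box lies inside the rectangle. Whether crop_stats tests the whole box or only the particle centre is not stated here, so treat this as one possible interpretation; crop_to_rectangle is a hypothetical name.

```python
import pandas as pd


def crop_to_rectangle(stats, crop):
    """Keep particles whose bounding box lies inside the crop rectangle.

    crop is (row_min, col_min, row_max, col_max) in pixels.
    """
    row_min, col_min, row_max, col_max = crop
    inside = (
        (stats["minr"] >= row_min)
        & (stats["minc"] >= col_min)
        & (stats["maxr"] <= row_max)
        & (stats["maxc"] <= col_max)
    )
    return stats[inside]


stats = pd.DataFrame({
    "minr": [10, 5, 50],
    "minc": [10, 50, 50],
    "maxr": [20, 15, 120],  # the third particle extends past row 100
    "maxc": [20, 60, 60],
})
cropped = crop_to_rectangle(stats, (0, 0, 100, 100))
```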
- pyopia.statistics.d50_from_stats(stats, pixel_size)#
Calculate the d50 from the stats and settings
- Parameters:
stats (DataFrame) – particle statistics from silcam process
pixel_size (float) – pixel size in microns per pixel
- Returns:
d50 – the 50th percentile of the cumulative sum of the volume distribution, in microns
- Return type:
float
- pyopia.statistics.d50_from_vd(volume_distribution, dias)#
Calculate d50 from a volume distribution
- Parameters:
volume_distribution (array) – Particle volume distribution calculated from vd_from_stats()
dias (array) – mid-points of the size classes corresponding to the volume distribution, returned from get_size_bins()
- Returns:
d50 – The 50th percentile of the cumulative sum of the volume distribution, in microns
- Return type:
float
- pyopia.statistics.explode_contrast(im)#
Eye-candy function for exploding the contrast of a particle image (ROI)
- Parameters:
im (array) – image (normally a particle ROI)
- Returns:
im_mod – image following exploded contrast
- Return type:
array
- pyopia.statistics.extract_latest_stats(stats, window_size)#
Extracts the stats data from within the last number of seconds specified by window_size.
- Parameters:
stats (DataFrame) – particle statistics
window_size (float) – number of seconds to extract from the end of the stats data
- Returns:
stats_selected – particle statistics after specified time window (given by window_size)
- Return type:
DataFrame
- pyopia.statistics.extract_nth_largest(stats, n=0)#
Return statistics of the nth largest particle
- Parameters:
stats (DataFrame) – particle statistics
n (int, optional) – nth largest particle to use, by default 0
- Returns:
statistics of the nth largest particle
- Return type:
stats_extract
- pyopia.statistics.extract_nth_longest(stats, n=0)#
Return statistics of the nth longest particle
- Parameters:
stats (DataFrame) – particle statistics
n (int, optional) – nth longest particle to use, by default 0
- Returns:
statistics of the nth longest particle
- Return type:
stats_extract
- pyopia.statistics.extract_oil(stats, probability_threshold=0.85, solidity_threshold=0.95, feret_threshold=0.3)#
Creates a new stats dataframe containing only oil, based on some thresholds on calculated statistics
- Parameters:
stats (DataFrame) – particle statistics
probability_threshold (float, optional) – Threshold applied to probability_oil (from the classifier), by default 0.85
solidity_threshold (float, optional) – Threshold applied to the solidity statistic (area of object / convex hull). For droplets, this threshold is used as a crude way of removing overlapping droplets by ensuring there are no substantial indents in the alpha shape, by default 0.95
feret_threshold (float, optional) – Threshold of deformation (minor/major axis) beyond which the droplet is considered significantly deformed or at risk of breakup, by default 0.3
- Returns:
oilstats – particle statistics for just oil (a new stats DataFrame containing only oil). Warning: the returned DataFrame will likely be shorter than the original, and can even be empty for single-image DataFrames if no particles satisfy the thresholds, so be careful to include all analysed images when calculating volume concentrations
- Return type:
DataFrame
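The three thresholds can be pictured as a combined boolean filter. The sketch below assumes the stats columns probability_oil, solidity, minor_axis_length and major_axis_length, and assumes a particle is kept when each value is strictly above its threshold; select_oil is a hypothetical name, not the library function.

```python
import pandas as pd


def select_oil(stats, probability_threshold=0.85, solidity_threshold=0.95,
               feret_threshold=0.3):
    """Keep particles that pass all three oil-droplet thresholds."""
    axis_ratio = stats["minor_axis_length"] / stats["major_axis_length"]
    keep = (
        (stats["probability_oil"] > probability_threshold)
        & (stats["solidity"] > solidity_threshold)
        & (axis_ratio > feret_threshold)
    )
    return stats[keep]


stats = pd.DataFrame({
    "probability_oil": [0.95, 0.95, 0.50],
    "solidity": [0.99, 0.80, 0.99],  # the second particle is too indented
    "minor_axis_length": [9.0, 9.0, 9.0],
    "major_axis_length": [10.0, 10.0, 10.0],
})
oil = select_oil(stats)
none_left = select_oil(stats, probability_threshold=0.99)
```

The second call returns an empty DataFrame, which is the situation the warning above describes: the images still count toward the sampled volume even though no oil rows remain.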
- pyopia.statistics.gen_roifiles(stats, auto_scaler=500)#
Generates a list of filenames suitable for making montages with
- Parameters:
stats (DataFrame) – particle statistics
auto_scaler (int) – approximate number of particles to attempt to pack into the montage, by default 500
- Returns:
roifiles (list) – a list of filename strings that can be passed to montage_maker() for making montages
- pyopia.statistics.get_j(dias, number_distribution)#
Calculates the Junge slope from a correctly scaled number distribution (the units of number_distribution must be number per micron per litre)
- Parameters:
dias (array) – mid-point of size bins
number_distribution (array) – number distribution in number per micron per litre
- Returns:
junge_slope – Junge slope from fitting of psd between 150 and 300um
- Return type:
float
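One common way to estimate such a slope is a straight-line fit in log-log space over the stated 150 to 300 micron range. The sketch below assumes that approach and returns the raw fitted slope (negative for a decreasing distribution); the sign convention and fitting details of get_j are not stated here, and junge_slope_sketch is a hypothetical name.

```python
import numpy as np


def junge_slope_sketch(dias, number_distribution, d_min=150.0, d_max=300.0):
    """Slope of log(number distribution) against log(diameter) within [d_min, d_max]."""
    dias = np.asarray(dias, dtype=float)
    nd = np.asarray(number_distribution, dtype=float)
    use = (dias >= d_min) & (dias <= d_max) & (nd > 0)  # the log needs positive values
    slope, _intercept = np.polyfit(np.log(dias[use]), np.log(nd[use]), 1)
    return float(slope)


# Synthetic power-law distribution: nd proportional to d**-3.
dias = np.linspace(100.0, 400.0, 31)
nd = 1e6 * dias ** -3.0
slope = junge_slope_sketch(dias, nd)
```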
- pyopia.statistics.get_sample_volume(pix_size, path_length, imx=2048, imy=2448)#
calculate the sample volume of one image
- Parameters:
pix_size (float) – size of pixels in microns
path_length (float) – path length of sample volume in mm
imx (int, optional) – image x dimension in pixels, by default 2048
imy (int, optional) – image y dimension in pixels, by default 2448
- Returns:
sample_volume_litres – Volume of the sample volume in litres
- Return type:
float
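The unit arithmetic implied by these parameters (pixel size in microns, path length in mm, result in litres) can be written out as follows. This is a sketch of the geometry, with a hypothetical function name, rather than a copy of the library code.

```python
def sample_volume_litres(pix_size, path_length, imx=2048, imy=2448):
    """Imaged area times path length, converted to litres."""
    width_mm = imx * pix_size / 1000.0   # microns -> mm
    height_mm = imy * pix_size / 1000.0  # microns -> mm
    volume_mm3 = width_mm * height_mm * path_length
    return volume_mm3 * 1e-6             # 1 litre = 1e6 mm^3


# 28 micron pixels and a 40 mm path length, as in the example config above.
volume = sample_volume_litres(28, 40)
```

For these values the imaged area is about 57.3 mm by 68.5 mm, giving roughly 0.157 litres per image.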
- pyopia.statistics.get_size_bins()#
Retrieve log-spaced size bins for PSD analysis by doing the same binning as LISST-100x, but with 53 size bins
- Returns:
dias (array) – Mid-points of size bins in microns
bin_limits (array) – Limits of size bins in microns
- pyopia.statistics.make_montage(stats_file_or_df, pixel_size, roidir, auto_scaler=500, msize=1024, maxlength=100000, crop_stats=None, brightness=1, eyecandy=True)#
Makes nice looking montage from a directory of extracted particle images
- Parameters:
stats_file_or_df (DataFrame or str) – either a str specifying the location of the STATS.nc file that comes from processing, or a stats dataframe
pixel_size (float) – pixel size of system
roidir (str) – location of roifiles
auto_scaler (int, optional) – approximate number of particles to attempt to pack into the montage, by default 500
msize (int, optional) – size of canvas in pixels, by default 1024
maxlength (int, optional) – maximum length in microns of particles to be included in montage, by default 100000
crop_stats (tuple, optional) – None or 4-tuple of lower-left then upper-right coord of crop, by default None
brightness (int, optional) – brightness of packaged particles used with eyecandy option, by default 1
eyecandy (bool, optional) – boolean which if True will explode the contrast of packed particles (nice for natural particles, but not so good for oil and gas), by default True
- Returns:
montage_image – montage image that can be plotted with pyopia.plotting.montage_plot()
- Return type:
array
- pyopia.statistics.make_timeseries_vd(stats, pixel_size, path_length, time_reference)#
Makes a dataframe of time-series volume distribution and d50 similar to Sequoia LISST-100 output, and exportable to things like Excel or csv.
Note
If zero particles are detected within the stats dataframe, then the volume concentration should be reported as zero for that time. For this function to be aware of these times, it requires the time_reference variable. If you use stats[‘timestamp’].unique() for this, then you are assuming you have at least one particle per image. It is better to use image_stats[‘timestamp’].values instead, which can be obtained from pyopia.io.load_image_stats()
- Parameters:
stats (DataFrame) – loaded from a *-STATS.nc file (convert from xarray like this: stats = xstats.to_dataframe())
pixel_size (float) – pixel size in microns per pixel
path_length (float) – path length of the sample volume in mm
time_reference (array) – time-series associated with the stats dataset stats[‘timestamp’].unique()
- Returns:
time_series – time series of volume concentration in uL/L; columns with numeric headings are diameter mid-points
- Return type:
DataFrame
Example
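A small pandas sketch of why time_reference matters, independent of the PyOPIA function itself. The timestamps and column values are made up for illustration. An image with no particles contributes no rows to stats, so it only appears in the time series if the full list of image times is supplied.

```python
import pandas as pd

# Timestamps of every processed image (three images).
image_times = pd.to_datetime([
    "2024-01-01 00:00:00", "2024-01-01 00:00:01", "2024-01-01 00:00:02",
])

# Particle stats: the second image contained no particles, so it has no rows.
stats = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2024-01-01 00:00:00", "2024-01-01 00:00:00", "2024-01-01 00:00:02",
    ]),
    "equivalent_diameter": [10.0, 12.0, 8.0],
})

# Using only the timestamps present in stats silently drops the empty image.
counts_from_stats = stats.groupby("timestamp").size()

# Re-indexing against all image times restores it with a count of zero.
counts_all_images = counts_from_stats.reindex(image_times, fill_value=0)
```

Here counts_from_stats has two entries while counts_all_images has three, with a zero for the empty image.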
- pyopia.statistics.nc_from_nd(number_distribution, sample_volume)#
Calculate the number concentration from the count and sample volume
- Parameters:
number_distribution (array) – number distribution
sample_volume (float, optional) – sample volume size (litres), by default 1
- Returns:
number_concentration – Particle number concentration in #/L
- Return type:
float
- pyopia.statistics.nc_vc_from_stats(stats, pix_size, path_length, imx=2048, imy=2448)#
Calculates important summary statistics from a stats DataFrame
- Parameters:
stats (DataFrame) – particle statistics
pix_size (float) – size of pixels in microns
path_length (float) – path length of sample volume in mm
imx (int, optional) – number of x-dimension pixels in the image, by default 2048
imy (int, optional) – number of y-dimension pixels in the image, by default 2448
- Returns:
number_concentration (float) – Total number concentration in #/L
volume_concentration (float) – Total volume concentration in uL/L
sample_volume (float) – Total volume of water sampled in L
junge_slope (float) – Slope of a fitted Junge distribution between 150-300um
- pyopia.statistics.nd_from_stats(stats, pix_size)#
Calculate the number distribution from stats. Units are number per bin per sample volume.
- Parameters:
stats (DataFrame) – particle statistics from silcam process
pix_size (float) – pixel size in microns
- Returns:
dias (array) – mid-points of size bins
number_distribution (array) – number distribution in number/size-bin/sample-volume
- pyopia.statistics.nd_from_stats_scaled(stats, pix_size, path_length)#
Calculate a scaled number distribution from stats. Units of the number distribution are number per micron per litre.
- Parameters:
stats (DataFrame) – Particle statistics from silcam process
pix_size (float) – size of pixels in microns
path_length (float) – path length of sample volume in mm
- Returns:
dias (array) – mid-points of size bins
number_distribution (array) – number distribution in number/micron/litre
- pyopia.statistics.nd_rescale(dias, number_distribution, sample_volume)#
Rescale a number distribution from number per bin per sample volume to number per micron per litre.
- Parameters:
dias (array) – mid-points of size bins
number_distribution (array) – unscaled number distribution
sample_volume (float) – sample volume of each image
- Returns:
number_distribution_scaled – scaled number distribution (number per micron per litre)
- Return type:
array
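The unit conversion described here amounts to dividing each bin count by the bin width (in microns) and by the sample volume (in litres). The sketch below takes the bin limits explicitly, which is an assumption made for the example; nd_rescale itself receives the bin mid-points, and rescale_sketch is a hypothetical name.

```python
import numpy as np


def rescale_sketch(number_distribution, bin_limits, sample_volume):
    """Number per bin per sample volume -> number per micron per litre."""
    widths = np.diff(np.asarray(bin_limits, dtype=float))  # bin widths in microns
    return np.asarray(number_distribution, dtype=float) / widths / sample_volume


nd = [10.0, 20.0]               # counts in two bins
bin_limits = [0.0, 5.0, 15.0]   # bin widths of 5 and 10 microns
nd_scaled = rescale_sketch(nd, bin_limits, sample_volume=2.0)
```

Both bins come out at 1.0 number per micron per litre, because the second bin holds twice the count but is twice as wide.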
- pyopia.statistics.roi_from_export_name(exportname, path)#
Returns an image from the export_name string in the -STATS.h5 file
Get the exportname like this:
exportname = stats['export_name'].values[0]
- Parameters:
exportname (str) – string containing the name of the exported particle e.g. stats[‘export_name’].values[0]
path (str) – path to exported h5 files
- Returns:
im – particle ROI image
- Return type:
array
- pyopia.statistics.show_h5_meta(h5file)#
prints metadata from an exported hdf5 file created from silcam process
- Parameters:
h5file (str) – h5 filename from exported data from silcam process
- pyopia.statistics.statscsv_to_statshdf(stats_file)#
Convert old STATS.csv file to a STATS.h5 file
- Parameters:
stats_file (str) – filename of stats file
- pyopia.statistics.trim_stats(stats_file, start_time, end_time, write_new=False, stats=[])#
Chops a STATS.h5 file given a start and end time
- Parameters:
stats_file (str) – filename of stats file
start_time (timestr) – start time of interesting window
end_time (timestr) – end time of interesting window
write_new (bool, optional) – if True will write a new stats csv file to disc, by default False
stats (DataFrame, optional) – pass stats DataFrame into here if you don’t want to load the data from the stats_file given. In this case the stats_file string is only used for creating the new output datafilename., by default []
- Returns:
trimmed_stats (DataFrame) – particle statistics
outname (str) – name of new stats csv file written to disc
- pyopia.statistics.vd_from_nd(number_distribution, dias, sample_volume=1.0)#
Calculate volume concentration from particle count
- Parameters:
number_distribution (array) – number distribution
dias (array) – particle diameters in microns associated with number_distribution
sample_volume (float, optional) – sample volume size (litres), by default 1
- Returns:
volume_distribution – Particle volume distribution
- Return type:
array
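A sketch of the count-to-volume conversion, assuming each particle is treated as a sphere of the bin's diameter and that the result is expressed in micro-litres per litre of sample. The spherical assumption and the function name are illustrative and are not confirmed by the text above.

```python
import numpy as np


def volume_distribution_sketch(number_distribution, dias, sample_volume=1.0):
    """Equivalent-sphere volume per size bin, in micro-litres per litre."""
    dias = np.asarray(dias, dtype=float)
    sphere_volume_um3 = np.pi / 6.0 * dias ** 3   # volume of a sphere of diameter d
    sphere_volume_ul = sphere_volume_um3 * 1e-9   # 1 uL = 1 mm^3 = 1e9 um^3
    counts = np.asarray(number_distribution, dtype=float)
    return counts * sphere_volume_ul / sample_volume


# One particle of 1000 microns (1 mm) diameter in a 1 litre sample.
vd = volume_distribution_sketch([1.0], [1000.0], sample_volume=1.0)
```

A single 1 mm sphere has a volume of pi/6 cubic millimetres, which is about 0.52 micro-litres.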
- pyopia.statistics.vd_from_stats(stats, pix_size)#
Calculate the volume distribution from stats. Units are micro-litres per sample volume.
- Parameters:
stats (DataFrame) – particle statistics from silcam process
pix_size (float) – pixel size in microns
- Returns:
dias (array) – mid-points of size bins
volume_distribution (array) – volume distribution in micro-litres/sample-volume
- pyopia.statistics.vd_to_nc(volume_distribution, dias)#
calculate number concentration from volume distribution
- Parameters:
volume_distribution (array) – particle volume distribution calculated from vd_from_stats()
dias (array) – mid-points of the size classes corresponding to the volume distribution, returned from get_size_bins()
- Returns:
number_concentration – number concentration (scaling is the same unit as the input vd). If vd is a 2d array [time, vd_bins], nc will be the concentration for each row
- Return type:
float
- pyopia.statistics.vd_to_nd(volume_distribution, dias)#
convert volume distribution to number distribution
- Parameters:
volume_distribution (array) – particle volume distribution calculated from vd_from_stats()
dias (array) – mid-points of the size classes corresponding to the volume distribution, returned from get_size_bins()
- Returns:
number_distribution – number distribution as number per micron per bin (scaling is the same unit as the input vd)
- Return type:
array
Plotting#
Particle plotting functionality for standardised figures e.g. image presentation, size distributions, montages etc.
- pyopia.plotting.classify_plot_class_rois(class_name, classifier, filelist)#
Classify single-object (ROI) images and plot images in a grid with best guess class.
- Parameters:
class_name (str) – Name of class ROI files belong to (e.g. ‘copepod’)
classifier (pyopia.classify.Classify) – PyOPIA classifier instance
filelist (list) – List of single-object (ROI) files
- Returns:
df_ – Classification results for each image
- Return type:
pandas.DataFrame
- pyopia.plotting.classify_rois(roilist, classifier)#
Classify list of single-object images
If true_class is specified, mark ROIs not matching this class in the figure.
- Parameters:
roilist (list) – List of ROI images to classify
classifier (pyopia.classify.Classify) – Used to classify ROIs
- Returns:
df – Class probabilities for each item in roilist
- Return type:
pd.DataFrame
- pyopia.plotting.montage_plot(montage, pixel_size)#
Plots a SilCam particle montage with a 1mm scale reference
- Parameters:
montage (uint8) – a montage created with pyopia.statistics.make_montage()
pixel_size (float) – the pixel size (um) of the imaging system used
- pyopia.plotting.plot_classified_rois(roilist, df_class_labels, true_class=None)#
Plot classified single-object images and show them in a figure grid with classification info
If true_class is specified, mark ROIs not matching this class in the figure.
- Parameters:
roilist (list) – List of ROI images to classify
df_class_labels (pd.DataFrame) – Class label for each image in roilist in a column named “best guess”
true_class (str) – True class of listed ROIs
- Returns:
fig (matplotlib figure)
ax (matplotlib axes)
- pyopia.plotting.show_image(image, pixel_size)#
Plots a scaled figure (in mm) of an image
- Parameters:
image (float) – Image (usually a corrected image, such as im_corrected)
pixel_size (float) – the pixel size (um) of the imaging system used
IO#
Module containing tools for datafile and metadata handling
- pyopia.io.StatsH5(**kwargs)#
Deprecated since version 2.4.8: pyopia.io.StatsH5 will be removed in version 3.0.0; it is replaced by pyopia.io.StatsToDisc.
PyOpia pipeline-compatible class for calling write_stats() that creates h5 files.
- Parameters:
output_datafile (str) – prefix path for output nc file
dataformat (str) – either ‘nc’ or ‘h5’
export_name_len (int) – max number of chars allowed for col ‘export_name’. Defaults to 40
append (bool) – Append all processed data into one nc file. Defaults to True. If False, then one nc file will be generated per raw image, which can be loaded using pyopia.io.combine_stats_netcdf_files(). This is useful for larger datasets, where appending causes substantial slowdown as the dataset gets larger.
- Returns:
data – data from the pipeline
- Return type:
Example
Example config for pipeline usage:
[steps.output]
pipeline_class = 'pyopia.io.StatsH5'
output_datafile = './test' # prefix path for output nc file
append = true
- class pyopia.io.StatsToDisc(output_datafile='data', dataformat='nc', export_name_len=40, append=True, project_metadata_file=None, auxillary_data_file=None)#
PyOpia pipeline-compatible class for calling write_stats() that creates NetCDF files.
- Parameters:
output_datafile (str) – prefix path for output nc file
dataformat (str) – either ‘nc’ or ‘h5’
export_name_len (int) – max number of chars allowed for col ‘export_name’. Defaults to 40
append (bool) – Append all processed data into one nc file. Defaults to True. If False, then one nc file will be generated per raw image, which can be loaded using pyopia.io.combine_stats_netcdf_files(). This is useful for larger datasets, where appending causes substantial slowdown as the dataset gets larger.
project_metadata_file (str) – Path to project metadata file; this is added to the stats netcdf file global metadata.
auxillary_data_file (str) – Path to auxillary data file; columns in this (csv) file are interpolated and added to the stats netcdf file.
- Returns:
data – data from the pipeline
- Return type:
Example
Example config for pipeline usage:
[steps.output]
pipeline_class = 'pyopia.io.StatsToDisc'
output_datafile = './test' # prefix path for output nc file
append = true
project_metadata_file = 'metadata.json'
auxillary_data_file = 'auxillarydata/auxillary_data.csv'
- pyopia.io.add_cf_attributes(xstats)#
Adds CF-compliant attributes and units to the xarray Dataset.
- Parameters:
xstats (xarray.Dataset) – The dataset to which CF-compliant attributes will be added.
- pyopia.io.combine_stats_netcdf_files(path_to_data, prefix='*')#
Deprecated since version 2.4.11: pyopia.io.combine_stats_netcdf_files will be removed in version 3.0.0; it is replaced by pyopia.io.concat_stats_netcdf_files.
Combine a multi-file directory of STATS.nc files into a ‘stats’ xarray dataset created by pyopia.io.write_stats() when using ‘append = false’
- Parameters:
path_to_data (str) – Folder name containing nc files with pattern ‘Image-D-STATS.nc’
prefix (str) – Prefix to multi-file dataset (for replacing <prefix> in the file name pattern ‘<prefix>Image-D*-STATS.nc’). Defaults to ‘*’
- Returns:
xstats (xarray.Dataset) – Particle statistics and metadata from processing steps
image_stats (xarray.Dataset) – summary statistics of each raw image (including those with no particles)
- pyopia.io.concat_stats_netcdf_files(sorted_filelist)#
Concatenate the specified list of STATS.nc files into one ‘xstats’ xarray dataset created by pyopia.io.write_stats() when using ‘append = false’.
Existing files are first loaded and then combined, so memory usage will go up with longer file lists.
- Parameters:
sorted_filelist (str) – List of files to be combined into single dataset
- Returns:
xstats (xarray.Dataset or None) – Particle statistics and metadata from processing steps
image_stats (xarray.Dataset or None) – Summary statistics of each raw image (including those with no particles)
- pyopia.io.load_image_stats(datafilename)#
Load the summary stats and time information for each image
- Parameters:
datafilename (str) – filename of -STATS.nc
- Returns:
image_stats – summary statistics of each raw image (including those with no particles)
- Return type:
xarray.Dataset
- pyopia.io.load_stats(datafilename)#
Load -STATS.nc file as xarray Dataset
Warning
Support for loading of old -STATS.h5 formats will be removed in version 3.0.0. They will need to be converted to .nc prior to loading. Data loaded from -STATS.h5 are returned as an xarray Dataset without metadata.
- Parameters:
datafilename (str) – filename of -STATS.h5 or STATS.nc
- Returns:
xstats – Particle statistics
- Return type:
xarray.Dataset
- pyopia.io.load_stats_as_dataframe(stats_file)#
A loading function for stats files that forces stats into a pandas DataFrame
- Parameters:
stats_file (str) – filename of NetCDF or H5 -STATS file
- Returns:
stats – stats pandas dataframe
- Return type:
DataFrame
- pyopia.io.load_toml(toml_file)#
Load a TOML settings file from file
- Parameters:
toml_file (str) – TOML filename
- Returns:
settings – TOML settings
- Return type:
dict
- pyopia.io.make_xstats(stats, toml_steps, proj_metadata=None, auxillary_data: ~pyopia.auxillarydata.AuxillaryData = <pyopia.auxillarydata.AuxillaryData object>)#
Converts a stats dataframe into xarray DataSet, with metadata
- Parameters:
stats (Pandas DataFrame) – particle statistics
toml_steps (dict) – TOML-based steps dictionary
proj_metadata (pyopia.metadata.Metadata) – Project metadata, such as license, creator, etc. Added to stats netcdf
auxillary_data (xr.Dataset) – Auxillary data variables, such as depth, temperature, etc. Added to stats netcdf
- Returns:
xstats – Xarray version of stats dataframe, including metadata
- Return type:
xarray.Dataset
- pyopia.io.merge_and_save_mfdataset(path_to_data, prefix='*', overwrite_existing_partials=False, chunk_size=None)#
Combine a multi-file directory of STATS.nc files into a single ‘-STATS.nc’ file that can then be loaded with pyopia.io.load_stats()
- Parameters:
path_to_data (str) – Folder name containing nc files with pattern ‘Image-D-STATS.nc’
prefix (str) – Prefix to multi-file dataset (for replacing the wildcard in ‘Image-D-STATS.nc’). Defaults to ‘*’
overwrite_existing_partials (bool) – Do not reprocess existing merged netcdf files for each chunk if False. Otherwise reprocess (load) and overwrite. This can be used to restart or continue a previous merge operation as new files become available.
chunk_size (int) – Number of files to be loaded and merged in each step. Produces a number of intermediate/partially merged netcdf files equal to the total number of input files divided by chunk_size. The last chunk may contain fewer files than specified, depending on the total number of files. Default: None, which processes all files together.
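The chunking behaviour described for chunk_size can be illustrated with a few lines of plain Python. The file names and the split_into_chunks helper are hypothetical; the sketch only shows how a file list divides into chunks and why the last chunk may be shorter.

```python
import math


def split_into_chunks(files, chunk_size=None):
    """Split a file list into consecutive chunks; None means one single chunk."""
    if chunk_size is None:
        return [list(files)]
    return [list(files[i:i + chunk_size]) for i in range(0, len(files), chunk_size)]


# Ten hypothetical per-image files merged four at a time.
files = [f"image-{i:03d}-STATS.nc" for i in range(10)]
chunks = split_into_chunks(files, chunk_size=4)
n_partials = math.ceil(len(files) / 4)  # number of partially merged files
```

Ten files with a chunk size of four give three chunks of sizes 4, 4 and 2, so the number of partial files is the file count divided by chunk_size, rounded up.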
- pyopia.io.setup_xstats_encoding(xstats, string_vars=['export_name', 'holo_filename'])#
Setup encoding for writing to NetCDF, where string variables are explicitly defined as string types
Notes
Setting up encoding like this for xstats is needed because default behaviour is to set everything as float if there is no value, so in a situation where the first image contains no particles we must ensure that string variables are set as string types.
- Parameters:
xstats (xarray.Dataset) – Xarray version of stats dataframe, including metadata
string_vars (list, optional) – list of string columns in xstats, by default [‘export_name’, ‘holo_filename’]
- Returns:
encoding – ‘encoding’ input argument to be given to xstats.to_netcdf()
- Return type:
dict
- pyopia.io.show_h5_meta(h5file)#
prints metadata from an exported hdf5 file created from pyopia.process
- Parameters:
h5file (str) – h5 filename from exported data from pyopia.process
- pyopia.io.steps_from_xstats(xstats)#
Get the steps attribute from xarray version of the particle stats into a dictionary
- Parameters:
xstats (xarray.DataSet) – xarray version of the particle stats dataframe, containing metadata
- Returns:
steps – TOML-formatted dictionary of pipeline steps
- Return type:
dict
- pyopia.io.write_stats(stats, datafilename, settings=None, export_name_len=40, dataformat='nc', append=True, image_stats=None, proj_metadata=None, auxillary_data: ~pyopia.auxillarydata.AuxillaryData = <pyopia.auxillarydata.AuxillaryData object>)#
Writes particle stats into the output file. Appends if file already exists.
- Parameters:
datafilename (str) – Filename prefix for -STATS.h5 file that may or may not include a path
stats (DataFrame or xr.Dataset) – Particle statistics
export_name_len (int) – Max number of chars allowed for col ‘export_name’
append (bool) – Append all processed data into one nc file. Defaults to True. If False, then one nc file will be generated per raw image, which can be loaded using pyopia.io.combine_stats_netcdf_files(). This is useful for larger datasets, where appending causes substantial slowdown as the dataset gets larger.
image_stats (xr.Dataset) – Summary statistics of each raw image (including those with no particles)
proj_metadata (pyopia.metadata.Metadata) – Project metadata, such as license, creator, etc. Added to stats netcdf
auxillary_data (AuxillaryData) – Auxillary data variables, such as depth, temperature, etc. Added to stats netcdf
Classify#
ExampleData#
- pyopia.exampledata.get_classifier_database_from_pysilcam_blob(download_directory='./')#
Downloads and unzips the silcam_database of labelled example images from pysilcam.blob into the working directory, if it doesn’t already exist
- Parameters:
download_directory (string) – directory to download and unzip the silcam_database.zip into. Defaults to “./”
- Returns:
download_directory, the directory that the silcam_database.zip was downloaded and unzipped into
- Return type:
string
- pyopia.exampledata.get_example_hologram_and_background(download_directory='./')#
calls get_file_from_pysilcam_blob for a raw hologram, and its associated background image.
- Parameters:
download_directory (string) – directory to download the file into. Defaults to “./”
- Returns:
string – holo_filename
string – holo_background_filename
- pyopia.exampledata.get_example_model(download_directory='./')#
Download PyOPIA default CNN model classifier
Download from the pysilcam blob storage into the working dir. If the file exists, skip the download.
- Parameters:
download_directory (string) – directory to download the file into. Defaults to “./”
- Returns:
model_filename
- Return type:
string
- pyopia.exampledata.get_example_silc_image(download_directory='./')#
calls get_file_from_pysilcam_blob for a silcam image
- Parameters:
download_directory (string) – directory to download the file into. Defaults to “./”
- Returns:
filename of the downloaded silcam image
- Return type:
string
- pyopia.exampledata.get_file_from_pysilcam_blob(filename, download_directory='./')#
Downloads a specified filename from the pysilcam.blob into the working directory, if it doesn’t already exist.
Only works for known filenames that are on this blob.
- Parameters:
filename (string) – known filename on the blob
download_directory (string) – directory to download the file into. Defaults to “./”
- Returns:
filename of the downloaded file
- Return type:
string
- pyopia.exampledata.get_folder_from_holo_repository(foldername='holo_test_data_01', existsok=False)#
Downloads a specified folder from the holo testing repository into the working directory, if it doesn’t already exist.
Only works for known folders that are on the GoogleDrive repository. By default, a known-good folder is downloaded. Additional elif statements can be added to implement additional folders.
- Parameters:
foldername (string) – known folder name in the repository
existsok (bool, optional) – if True, then don’t download if the specified folder already exists, defaults to False
Instruments#
SilCam#
Module containing SilCam specific tools to enable compatibility with the pyopia.pipeline
See: Davies, E. J., Brandvik, P. J., Leirvik, F., & Nepstad, R. (2017). The use of wide-band transmittance imaging to size and classify suspended particulate matter in seawater. Marine Pollution Bulletin, 115(1–2). https://doi.org/10.1016/j.marpolbul.2016.11.063
- class pyopia.instrument.silcam.ImagePrep(image_level='im_corrected')#
PyOpia pipeline-compatible class for preparing silcam images for further analysis
- Required keys in
pyopia.pipeline.Data:
- Returns:
data – containing the following new keys:
- Return type:
- class pyopia.instrument.silcam.SilCamLoad(image_format='infer', prefix_chars=1)#
PyOpia pipeline-compatible class for loading a single silcam image and extracting the timestamp using pyopia.instrument.silcam.timestamp_from_filename()
- Required keys in pyopia.pipeline.Data: pyopia.pipeline.Data.filename conforming to the format ‘DYYYYmmddTHHMMSS.ffffff.silc’
e.g. ‘D20240919T074500.183294.silc’
- Parameters:
image_format (str, optional) – Image file format. Can be either ‘infer’, ‘rgb8’, ‘bayer_rg8’ or ‘mono8’, by default ‘infer’.
prefix_chars (int, optional) – number of characters to ignore at start of filename when parsing timestamp, by default 1 (e.g. to ignore ‘D’ in ‘D20221101T120000.silc’)
Note
- ‘infer’ uses the file extension to determine the image format using the following convention:
‘.silc’ for RGB8
‘.msilc’ for MONO8
‘.bsilc’ for BAYER_RG8
‘.bmp’ for using skimage.io.imread
- Returns:
data – containing the following new keys:
- Return type:
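The ‘infer’ convention listed in the note above can be pictured as a lookup from file extension to image format. This is a sketch, not the SilCamLoad code: the helper name, the dictionary, and the 'bmp' label used for the skimage.io.imread case are assumptions made for illustration.

```python
import os

# Extension -> format name, following the convention in the note above.
# The 'bmp' label is a placeholder for "read with skimage.io.imread".
EXTENSION_TO_FORMAT = {
    ".silc": "rgb8",
    ".msilc": "mono8",
    ".bsilc": "bayer_rg8",
    ".bmp": "bmp",
}


def infer_image_format(filename):
    """Return the image format implied by the file extension."""
    extension = os.path.splitext(filename)[1].lower()
    if extension not in EXTENSION_TO_FORMAT:
        raise ValueError(f"Cannot infer image format from extension: {extension!r}")
    return EXTENSION_TO_FORMAT[extension]


fmt = infer_image_format("D20240919T074500.183294.silc")
```

Because os.path.splitext splits at the last dot, the fractional seconds in the filename do not interfere with the extension lookup.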
- pyopia.instrument.silcam.generate_config(raw_files: str, model_path: str, outfolder: str, output_prefix: str)#
Generate example silcam config.toml as a dict
- Parameters:
raw_files (str) – raw_files
model_path (str) – model_path
outfolder (str) – outfolder
output_prefix (str) – output_prefix
- Returns:
pipeline_config toml dict
- Return type:
dict
- pyopia.instrument.silcam.load_bayer_rgb8(filename)#
load a Bayer-RG8 .bsilc file from disc and convert it to an RGB image
Assumes 8-bit Bayer-RG (Red-Green) image in range 0-255
- Parameters:
filename (string) – filename to load
- Returns:
raw image float between 0-1
- Return type:
array
- pyopia.instrument.silcam.load_image(filename)#
Deprecated since version 2.4.6: pyopia.instrument.silcam.load_image() will be removed in version 3.0.0; it is replaced by pyopia.instrument.silcam.load_rgb8() because this is more explicit about the image type.
Load an RGB .silc file from disc
- Parameters:
filename (string) – filename to load
- Returns:
raw image float between 0-1
- Return type:
array
- pyopia.instrument.silcam.load_mono8(filename)#
load a mono8 .msilc file from disc
Assumes 8-bit mono image in range 0-255
- Parameters:
filename (string) – filename to load
- Returns:
raw image float between 0-1
- Return type:
array
- pyopia.instrument.silcam.load_rgb8(filename)#
load an RGB .silc file from disc
Assumes 8-bit RGB image in range 0-255
- Parameters:
filename (string) – filename to load
- Returns:
raw image float between 0-1
- Return type:
array
- pyopia.instrument.silcam.timestamp_from_filename(filename, prefix_chars=1)#
get a pandas timestamp from a silcam filename
- Parameters:
filename (string) – silcam filename (.silc)
prefix_chars (int, optional) – number of characters to ignore at start of filename when parsing timestamp, by default 1 (e.g. to ignore ‘D’ in ‘D20221101T120000.silc’)
- Returns:
timestamp – timestamp from pandas.to_datetime()
- Return type:
pandas.Timestamp
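To illustrate how prefix_chars and the filename format interact, here is a small sketch that parses the documented pattern with the standard library. It is an illustration only: PyOPIA returns a pandas.Timestamp via pandas.to_datetime(), whereas this hypothetical helper, parse_silcam_timestamp, returns a datetime.datetime and assumes the two filename forms shown in this section.

```python
from datetime import datetime
from pathlib import Path


def parse_silcam_timestamp(filename, prefix_chars=1):
    """Parse 'DYYYYmmddTHHMMSS[.ffffff].silc' into a datetime (illustrative helper)."""
    stem = Path(filename).stem    # drop any folder and the final extension
    text = stem[prefix_chars:]    # skip the leading characters, e.g. 'D'
    # Try the form with fractional seconds first, then the form without.
    for fmt in ("%Y%m%dT%H%M%S.%f", "%Y%m%dT%H%M%S"):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    raise ValueError(f"Unrecognised timestamp in filename: {filename!r}")


print(parse_silcam_timestamp("D20240919T074500.183294.silc"))
print(parse_silcam_timestamp("D20221101T120000.silc"))
```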
Holo#
This is a module containing basic processing for reconstruction of in-line holographic images with pyopia.pipeline.
See (and references therein): Davies EJ, Buscombe D, Graham GW & Nimmo-Smith WAM (2015) ‘Evaluating Unsupervised Methods to Size and Classify Suspended Particles Using Digital In-Line Holography’ Journal of Atmospheric and Oceanic Technology 32, (6) 1241-1256, https://doi.org/10.1175/JTECH-D-14-00157.1 https://journals.ametsoc.org/view/journals/atot/32/6/jtech-d-14-00157_1.xml
2022-11-01 Alex Nimmo-Smith alex.nimmo.smith@plymouth.ac.uk
- class pyopia.instrument.holo.Focus(stacksummary_function='std_map', threshold=0.9, focus_function='find_focus_imax', discard_end_slices=True, increase_depth_of_field=False, merge_adjacent_particles=0)#
PyOPIA pipeline-compatible class for creating a focussed image from an image stack
- Required keys in
pyopia.pipeline.Data:
- Parameters:
stacksummary_function (string, optional) –
Function used to summarise the stack. Available functions are:
pyopia.instrument.holo.max_map()
pyopia.instrument.holo.std_map() (default)
threshold (float) – threshold to apply during initial segmentation
focus_function (string, optional) –
Function used to focus particles within the stack. Available functions are:
pyopia.instrument.holo.find_focus_imax() (default)
discard_end_slices (bool, optional) – set to True to discard particles that focus at either the first or last slice
increase_depth_of_field (bool, optional) – set to True to use max values from planes either side of main focus plane to create focussed image (default False)
merge_adjacent_particles (int, optional) – set to 0 (default) to deactivate; set to a positive integer to give the radius, in pixels, of the smoothing applied to the stack summary image to merge adjacent particles
- Returns:
data – containing the following keys:
pyopia.pipeline.Data.im_focussed
pyopia.pipeline.Data.stack_rp
pyopia.pipeline.Data.stack_ifocus
- Return type:
- class pyopia.instrument.holo.Initial(wavelength, n, offset, minZ, maxZ, stepZ)#
PyOPIA pipeline-compatible class for one-time setup of holographic reconstruction
- Parameters:
wavelength (float) – laser wavelength in nm
n (float) – refractive index of medium
offset (float) – offset of focal plane from hologram plane in mm
minZ (float) – minimum reconstruction distance in mm
maxZ (float) – maximum reconstruction distance in mm
stepZ (float) – step size in mm (i.e. resolution of reconstruction between minZ and maxZ)
- Returns:
kern (np.array) – reconstruction kernel
im_stack (np.array) – pre-allocated array to receive reconstruction
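As a rough guide to how minZ, maxZ and stepZ determine the depth of the reconstruction stack, the sketch below lists the reconstruction distances they imply. It is not PyOPIA's code: the helper name is hypothetical, the example values are made up, and whether the maxZ end point is included in the actual kernel is an assumption here.

```python
def reconstruction_distances_mm(minZ, maxZ, stepZ):
    """Distances (mm) from minZ up to maxZ in steps of stepZ.

    Assumption: the maxZ end point is included when it falls on a step.
    """
    if stepZ <= 0 or maxZ < minZ:
        raise ValueError("need stepZ > 0 and maxZ >= minZ")
    # Small epsilon guards against floating-point error in the division.
    n_planes = int((maxZ - minZ) / stepZ + 1e-9) + 1
    return [minZ + i * stepZ for i in range(n_planes)]


# Hypothetical settings: reconstruct from 0 mm to 50 mm every 0.5 mm.
z = reconstruction_distances_mm(minZ=0, maxZ=50, stepZ=0.5)
print(len(z), z[0], z[-1])
```

Halving stepZ roughly doubles the number of planes, and with it the memory needed for the pre-allocated im_stack.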
- class pyopia.instrument.holo.Load(prefix_chars=1)#
PyOPIA pipeline-compatible class for loading a single holo image
- Parameters:
filename (string) – hologram filename (.pgm)
prefix_chars (int, optional) – number of characters to ignore at start of filename when parsing timestamp, by default 1 (e.g. to ignore ‘D’ in ‘D20221101T120000.pgm’)
- Returns:
timestamp (pandas.Timestamp) – timestamp from filename
imraw (np.array) – hologram
- class pyopia.instrument.holo.MergeStats#
PyOPIA pipeline-compatible class for merging holo-specific statistics into output stats
- Parameters:
None –
- Returns:
data – Updated pipeline data, where data[‘stats’] includes the new columns: ‘holo_filename’, ‘z’, and ‘ifocus’
- Return type:
- class pyopia.instrument.holo.Reconstruct(stack_clean=0, forward_filter_option=0, inverse_output_option=0)#
PyOPIA pipeline-compatible class for reconstructing a single holo image
- Required keys in
pyopia.pipeline.Data:
- Parameters:
stack_clean (float) – defines amount of cleaning of stack (fraction of max value below which to zero)
forward_filter_option (int) – switch to control filtering in frequency domain (0=none/default, 1=DC only, 2=zero frequency)
inverse_output_option (int) – switch to control optional scaling of output intensity (0=square/default,1=linear)
- Returns:
data – containing the following new keys:
- Return type:
- pyopia.instrument.holo.clean_stack(im_stack, stack_clean)#
clean the im_stack by removing low value pixels - set to 0 to disable
- Parameters:
im_stack (np.array) –
stack_clean (float) – pixels below this value will be zeroed
- Returns:
cleaned version of im_stack
- Return type:
np.array
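The Reconstruct class describes stack_clean as a fraction of the stack maximum below which values are zeroed. Under that reading, the cleaning step can be sketched as below. This is an assumption-laden illustration rather than the library code: the function name is hypothetical, and plain nested lists (a list of 2D planes) stand in for the np.array so the example runs without NumPy.

```python
def clean_stack_sketch(im_stack, stack_clean):
    """Zero every value below stack_clean * max(im_stack); 0 disables cleaning."""
    if stack_clean <= 0:
        return im_stack
    # Global maximum over all planes, rows and pixels.
    stack_max = max(value for plane in im_stack for row in plane for value in row)
    threshold = stack_clean * stack_max
    return [
        [[value if value >= threshold else 0 for value in row] for row in plane]
        for plane in im_stack
    ]


stack = [[[0.1, 0.9], [0.5, 1.0]]]  # one 2x2 plane with made-up values
print(clean_stack_sketch(stack, 0.5))
```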
- pyopia.instrument.holo.create_kernel(im, pixel_size, wavelength, n, offset, minZ, maxZ, stepZ)#
create reconstruction kernel
- Parameters:
im (np.array) – hologram
pixel_size (float) – pixel_size in microns per pixel (i.e. usually 4.4 for lisst-holo type of resolution)
wavelength (float) – laser wavelength in nm
minZ (float) – minimum reconstruction distance in mm
maxZ (float) – maximum reconstruction distance in mm
stepZ (float) – step size in mm (i.e. resolution of reconstruction between minZ and maxZ)
- Returns:
holographic reconstruction kernel (3D array of complex numbers)
- Return type:
np.array
- pyopia.instrument.holo.find_focus_imax(im_stack, bbox, increase_depth_of_field)#
finds and returns the focussed image for the bbox region within im_stack using intensity of bbox area
- Parameters:
im_stack (nparray) – image stack
bbox (tuple) – Bounding box (min_row, min_col, max_row, max_col)
increase_depth_of_field (bool) – set to True to use max values from planes either side of main focus plane to create focussed image (default False)
- Returns:
im (image) – focussed image for bbox
ifocus (int) – index through stack of focussed image
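The idea of focusing on the intensity of the bbox area can be sketched as picking the plane whose cropped region has the largest summed intensity. This is a simplified illustration, not PyOPIA's implementation: the function name is hypothetical, the stack is represented as a list of 2D planes, "intensity" is assumed to mean the sum of pixel values, the bbox maxima are treated as exclusive, and increase_depth_of_field is not modelled.

```python
def find_focus_by_intensity(planes, bbox):
    """Return (focussed_crop, ifocus) for the plane with the brightest bbox region."""
    min_row, min_col, max_row, max_col = bbox

    def crop(plane):
        # Half-open slices: max_row and max_col are excluded.
        return [row[min_col:max_col] for row in plane[min_row:max_row]]

    crops = [crop(plane) for plane in planes]
    totals = [sum(sum(row) for row in c) for c in crops]
    ifocus = max(range(len(totals)), key=totals.__getitem__)
    return crops[ifocus], ifocus


planes = [
    [[0, 0], [0, 0]],
    [[1, 2], [3, 4]],
    [[1, 1], [1, 1]],
]
im, ifocus = find_focus_by_intensity(planes, (0, 0, 2, 2))
print(ifocus, im)
```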
- pyopia.instrument.holo.find_focus_sobel(im_stack, bbox, increase_depth_of_field)#
finds and returns the focussed image for the bbox region within im_stack using edge magnitude of bbox area
- Parameters:
im_stack (nparray) – image stack
bbox (tuple) – Bounding box (min_row, min_col, max_row, max_col)
increase_depth_of_field (bool) – set to True to use max values from planes either side of main focus plane to create focussed image (default False)
- Returns:
im (image) – focussed image for bbox
ifocus (int) – index through stack of focussed image
- pyopia.instrument.holo.forward_transform(im, forward_filter_option=2)#
Perform forward transform with optional filtering
- Parameters:
im (np.array) – hologram (usually background-corrected)
forward_filter_option (int) – filtering in frequency domain (0=none, 1=DC only, 2=zero frequency/default)
- Returns:
im_fft – im_fft
- Return type:
np.array
- pyopia.instrument.holo.generate_config(raw_files: str, model_path: str, outfolder: str, output_prefix: str)#
Generate example holo config.toml as a dict
- Parameters:
raw_files (str) – raw_files
model_path (str) – model_path
outfolder (str) – outfolder
output_prefix (str) – output_prefix
- Returns:
pipeline_config toml dict
- Return type:
dict
- pyopia.instrument.holo.inverse_transform(im_fft, kern, im_stack, inverse_output_option=0)#
create the reconstructed hologram stack of real images
- Parameters:
im_fft (np.array) – calculated from forward_transform
kern (np.array) – calculated from create_kernel
im_stack (np.array) – pre-allocated array to receive output
inverse_output_option (int) – optional scaling of output intensity (0=square/default,1=linear)
- Returns:
im_stack
- Return type:
np.array
- pyopia.instrument.holo.load_image(filename)#
load a hologram image file from disc
- Parameters:
filename (string) – filename to load
- Returns:
raw image
- Return type:
array
- pyopia.instrument.holo.max_map(im_stack)#
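Stack summary function, selectable through the stacksummary_function parameter of pyopia.instrument.holo.Focus
- Parameters:
im_stack (np.array) – image stack
- Returns:
stack summary image
- Return type:
np.array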
- pyopia.instrument.holo.read_lisst_holo_info(filename)#
reads the non-image information (timestamp, etc) from LISST-HOLO holograms
- Parameters:
filename (string) – filename to load
- Returns:
timestamp – timestamp
- Return type:
timestamp
- pyopia.instrument.holo.rescale_image(im)#
rescale im (e.g. may be stack summary) to be dark particles on light background
- Parameters:
im (image) – input image to be scaled
- Returns:
im – scaled and inverted image
- Return type:
image
- pyopia.instrument.holo.std_map(im_stack)#
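Stack summary function, the default choice for the stacksummary_function parameter of pyopia.instrument.holo.Focus
- Parameters:
im_stack (np.array) – image stack
- Returns:
stack summary image
- Return type:
np.array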
UVP#
Module containing UVP-specific tools to enable compatibility with the pyopia.pipeline
- class pyopia.instrument.uvp.UVPLoad#
PyOPIA pipeline-compatible class for loading a single UVP image using pyopia.instrument.uvp.load_image() and extracting the timestamp using pyopia.instrument.uvp.timestamp_from_filename()
- Required keys in
pyopia.pipeline.Data:
- Returns:
data – containing the following new keys:
- Return type:
- pyopia.instrument.uvp.generate_config(raw_files: str, model_path: str, outfolder: str, output_prefix: str)#
Generate example uvp config.toml as a dict
- Parameters:
raw_files (str) – raw_files
model_path (str) – model_path
outfolder (str) – outfolder
output_prefix (str) – output_prefix
- Returns:
pipeline_config toml dict
- Return type:
dict
- pyopia.instrument.uvp.load_image(filename)#
load a UVP .png file from disc
- Parameters:
filename (string) – filename to load
- Returns:
raw image float between 0-1, inverted so that particles are dark on a light background
- Return type:
array
- pyopia.instrument.uvp.timestamp_from_filename(filename)#
get a pandas timestamp from a UVP vignette image filename
- Parameters:
filename (string) – UVP vignette image filename
- Returns:
timestamp – timestamp from pandas.to_datetime()
- Return type:
pandas.Timestamp
Common#
Non-instrument-specific functions that operate at the image loading or initial processing level.
- class pyopia.instrument.common.CircularImageMask(radius, center=None)#
PyOPIA pipeline-compatible class for masking out part of the raw image with a centered circular disc
Required keys in
pyopia.pipeline.Data:
pyopia.pipeline.Data.imraw
- Parameters:
radius (int) – Radius in pixels of the circular disc mask (image outside the disc is set to 0)
center (list of ints, optional) – Center coordinate of the masking circle (pixels)
- Returns:
data – containing the new key:
- Return type:
Put this in your pipeline right after load step to mask out border outside specified pixel coordinates. Remember to update the next step in the pipeline to use ‘im_masked’ as source.
[steps.mask]
pipeline_class = 'pyopia.instrument.common.CircularImageMask'
radius = 500
center = [600, 600]  # Optional, is image center by default
- class pyopia.instrument.common.RectangularImageMask(mask_bbox=None)#
PyOPIA pipeline-compatible class for masking out part of the raw image.
Required keys in
pyopia.pipeline.Data:
pyopia.pipeline.Data.imraw
- Parameters:
mask_bbox (list, optional) – Pixel corner coordinates of rectangle to mask (image outside the rectangle is set to 0)
- Returns:
data – containing the new key:
- Return type:
Put this in your pipeline right after load step to mask out border outside specified pixel coordinates. Remember to update the next step in the pipeline to use ‘im_masked’ as source.
[steps.mask]
pipeline_class = 'pyopia.instrument.common.RectangularImageMask'
mask_bbox = [[200, 1850], [400, 2048], [0, 3]]
The mask_bbox is [[start_row, end_row], [start_col, end_col], [start_colorchan, end_colorchan]]
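To make the mask_bbox layout concrete, the sketch below applies the row and column ranges to a small greyscale image held as nested lists. It is illustrative only: the function name is hypothetical, the colour-channel range is ignored, and treating the end indices as exclusive (Python slice semantics) is an assumption.

```python
def apply_rectangular_mask_sketch(image, mask_bbox):
    """Keep pixels inside the row/column ranges of mask_bbox; set the rest to 0."""
    (start_row, end_row), (start_col, end_col) = mask_bbox[0], mask_bbox[1]
    return [
        [
            value if start_row <= r < end_row and start_col <= c < end_col else 0
            for c, value in enumerate(row)
        ]
        for r, row in enumerate(image)
    ]


image = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
# Keep rows 1-2 and columns 0-1 of a 3x3 image.
print(apply_rectangular_mask_sketch(image, [[1, 3], [0, 2]]))
```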
- pyopia.instrument.common.apply_circular_mask(image, radius, center=None)#
Apply a circular mask to an RGB image, zeroing out pixels outside the disc.
- Parameters:
image (np.array) – numpy array of shape (h, w) or (h, w, 3)
radius (int) – radius of the circular mask, in pixels
center ((int, int), optional) – center of the circular mask, in pixels from the top left corner, vertical coordinate first. The center of the image is used if not given
- Returns:
masked_image – numpy array of shape (h, w) or (h, w, 3) with mask applied
- Return type:
np.array
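The geometry of the circular mask can be sketched as a distance test against the center, with the vertical coordinate first. This is a minimal illustration on a greyscale image stored as nested lists, not the PyOPIA function: the name is hypothetical, the default center is assumed to be the integer image center, and pixels exactly on the radius are assumed to be kept.

```python
def apply_circular_mask_sketch(image, radius, center=None):
    """Zero every pixel farther than radius from center (row, col)."""
    height, width = len(image), len(image[0])
    if center is None:
        center = (height // 2, width // 2)  # assumption: integer image center
    center_row, center_col = center  # vertical coordinate first
    return [
        [
            value if (r - center_row) ** 2 + (c - center_col) ** 2 <= radius ** 2 else 0
            for c, value in enumerate(row)
        ]
        for r, row in enumerate(image)
    ]


image = [[1] * 5 for _ in range(5)]
for row in apply_circular_mask_sketch(image, radius=1):
    print(row)
```

Comparing squared distances avoids a square root per pixel and keeps the test exact for integer coordinates.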
Simulators#
Silcam-Simulator#
Module containing tools for assessing statistical reliability of silcam size distributions
- class pyopia.simulator.silcam.SilcamSimulator(total_volume_concentration=1000, d50=1000, MinD=10, PIX_SIZE=28, PATH_LENGTH=40, imx=2048, imy=2448, nims=50)#
SilCam simulator
- Parameters:
total_volume_concentration (int, optional) – total volume concentration, by default 1000
d50 (int, optional) – median particle size, by default 1000
MinD (int, optional) – minimum diameter to simulate, by default 10
PIX_SIZE (int, optional) – pixel size (um), by default 28
PATH_LENGTH (int, optional) – path length (mm), by default 40
imx (int, optional) – image x dimension, by default 2048
imy (int, optional) – image y dimension, by default 2448
nims (int, optional) – number of images to simulate, by default 50
Example
from pyopia.simulator.silcam import SilcamSimulator
sim = SilcamSimulator()
sim.check_convergence()
sim.synthesize()
sim.process_synthetic_image()
sim.plot()
- check_convergence()#
Check statistical convergence of randomly selected size distributions over the `nims` number of images
- Parameters:
data['volume_distribution'] (array) – volume distribution of shape (nims, dias)
data['cumulative_volume_concentration'] (float) – cumulative mean volume concentration of length nims
data['cumulative_d50'] (float) – cumulative average d50 of length nims
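The cumulative quantities listed above are running averages over the simulated images, so convergence shows up as the running value settling down as more images are included. A minimal sketch of such a running mean follows; the helper name and the per-image d50 values are made up for illustration and are not taken from the simulator.

```python
from itertools import accumulate


def cumulative_mean(values):
    """Running mean: element k is the mean of values[0] .. values[k]."""
    return [total / count for count, total in enumerate(accumulate(values), start=1)]


# Hypothetical per-image d50 estimates (um) from five simulated images.
d50_per_image = [900, 1100, 1050, 950, 1000]
print(cumulative_mean(d50_per_image))
```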
- process_synthetic_image()#
Put the synthetic image data[‘synthetic_image_data’][‘image’] through a basic pyopia processing pipeline
- Parameters:
data['synthetic_image_data']['pyopia_processed_volume_distribution'] (array) – pyopia processed volume distribution associated with `dias` size classes
- synthesize(add_noise=False, noise_var=0.001, database_path='', database_image_ext='tiff')#
Synthesize an image and document the distributions of particles used as input
- Parameters:
add_noise (bool, optional) – Uses skimage.util.random_noise() to add gaussian noise with variance defined by noise_var, by default False
noise_var (float, optional) – Passed to the var argument of skimage.util.random_noise(), by default 0.001
database_path (str, optional) – Path to a folder of particle ROI images to be randomly selected from to build the synthetic image. If this is an empty string (default), black discs are used instead of real images. By default ‘’
database_image_ext (str, optional) – Image file extension to look for within the folder specified by database_path (must be a type that is loadable by skimage.io.imread(), e.g. png or tiff), by default ‘tiff’
data['synthetic_image_data']['image'] (array) – synthetic image
data['synthetic_image_data']['input_volume_distribution'] (array) – Volume distribution used to create the synthetic image
- weibull_distribution(x)#
calculate weibull distribution
- Parameters:
x (array) – size bins of input
- Returns:
weibull distribution
- Return type:
array
- pyopia.simulator.silcam.extract_and_scale_example_image(output_length, file_list)#
Randomly select a file from file_list, load it, and scale it (maintaining aspect ratio) so that its longest x-y dimension is output_length pixels
- Parameters:
output_length (float) – desired length of the longest dimension
file_list (list) – list of filenames to choose from (to be read with skimage.io.imread)
- Returns:
example_image – resized image
- Return type:
array
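Scaling while maintaining aspect ratio means applying one scale factor, chosen from the longest side, to both dimensions. The sketch below computes only the target shape; the helper name is hypothetical, rounding to whole pixels is an assumption, and the actual resizing (for example with an image library) is left out.

```python
def scaled_shape(rows, cols, output_length):
    """Shape after scaling so that the longest side equals output_length."""
    scale = output_length / max(rows, cols)
    # Round to whole pixels and keep at least one pixel per side.
    return max(1, round(rows * scale)), max(1, round(cols * scale))


print(scaled_shape(40, 80, 20))
print(scaled_shape(100, 50, 20))
```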