trainCellpose

Train custom Cellpose model

Since R2023b

Syntax

trainCellpose(dataFolder,outputModelFile)

trainCellpose(dataFolder,outputModelFile,Name=Value)

Description

trainCellpose(dataFolder,outputModelFile) trains a custom Cellpose model with default options, through an interface to the Cellpose Library. The function identifies pairs of training and label images in the dataFolder folder. It assumes that each label image has the same file name as the corresponding training image, plus the suffix "_labels".

trainCellpose(dataFolder,outputModelFile,Name=Value) specifies options using one or more name-value arguments. For example, ImageSuffix="_imRGB" trains the model using only the images in the data folder whose file names, excluding the file extension, end in _imRGB.

Note

This functionality requires Deep Learning Toolbox™, Computer Vision Toolbox™, and the Medical Imaging Toolbox™ Interface for Cellpose Library. You can install the Medical Imaging Toolbox Interface for Cellpose Library from Add-On Explorer. For more information about installing add-ons, see Get and Manage Add-Ons.

Examples

Train Cellpose Model

Train a custom Cellpose model. This example shows how to use the trainCellpose function for a hypothetical training data set.

The function requires two inputs: the path to the training data and the path at which to save the newly trained model.

```matlab
dataFolder = "C:\trainingData";
outputModelFile = "C:\cellposeModels\retrainedCyto2Model";
```

By default, the function retrains a copy of the cyto2 model from the Cellpose Library. This code uses an ImageSuffix value of _imRGB and a LabelSuffix value of _mask to specify the suffixes for the training and label images, respectively. For example, the function recognizes files named im1_imRGB.png and im1_mask.png as a training image and its ground truth label image.

```matlab
trainCellpose(dataFolder,outputModelFile,...
    MaxEpochs=2,...
    ImageSuffix="_imRGB",...
    LabelSuffix="_mask");
```

Input Arguments

`dataFolder` — Path to data folder
string scalar | character vector

Path to the data folder, specified as a string scalar or character vector. Specify dataFolder as the path to a folder that contains training images and their corresponding ground truth label images.

Training images must be in the TIFF, JPEG, or PNG file format.
Ground truth label images must be in the TIFF or PNG file format. Each ground truth image must have the same name as the corresponding training image, with a suffix specified by LabelSuffix.

Note

The function writes intermediate flow files to the data folder, so you must have write permission for that folder. If you train multiple times on the same data, the function reuses these intermediate files to make training faster.

Data Types: char | string
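
Before training, it can help to confirm that every training image has a matching label file. The following is a minimal sketch of the naming rule described above, not the check that trainCellpose itself performs. The folder name and the file names are hypothetical placeholders created only for illustration, and the sketch assumes the default LabelSuffix value and PNG files.

```matlab
% Create a throwaway folder with empty placeholder files (hypothetical names).
dataFolder = fullfile(string(tempdir),"cellposePairCheck");
if ~isfolder(dataFolder)
    mkdir(dataFolder);
end
names = ["im1.png","im1_labels.png","im2.png"];
for k = 1:numel(names)
    fclose(fopen(fullfile(dataFolder,names(k)),"w"));
end

labelSuffix = "_labels";   % default LabelSuffix value
files = dir(fullfile(dataFolder,"*.png"));
[~,baseNames] = fileparts(string({files.name}));

% Treat files whose base name ends in the suffix as label images.
isLabel = endsWith(baseNames,labelSuffix);
imageNames = baseNames(~isLabel);

% An image is paired when a file named <image><suffix> also exists.
hasLabel = ismember(imageNames + labelSuffix,baseNames(isLabel));
unpaired = imageNames(~hasLabel);
disp("Images without a label file: " + join(unpaired,", "));

% Remove the placeholder folder.
rmdir(dataFolder,"s");
```

In this sketch, im1.png is paired with im1_labels.png, while im2.png has no label file and is reported as unpaired.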

`outputModelFile` — Output model file
string scalar | character vector

Output model file, specified as a string scalar or character vector. Specify the full path, including the file name, at which you want the function to write the trained model.

Data Types: char | string

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Example: trainCellpose(dataFolder,outputModelFile,PretrainedModel="") trains an uninitialized Cellpose model, rather than retraining a pretrained model.

`ImageSuffix` — Training image suffix
`""` (default) | string scalar | character vector

Training image suffix, specified as a string scalar or character vector. Use this argument to specify a suffix, excluding the file extension, by which to filter training images in dataFolder. When you specify a suffix, trainCellpose excludes from training any image whose file name does not end in that suffix.

Data Types: char | string

`MainChannel` — Main channel to segment
`"average"` (default) | `"R"` | `"G"` | `"B"`

Main channel for the trained network to segment, specified as one of these options.

"average" — Use the average value across channels for training. Use this value for grayscale images.
"R" — Use the first image channel for training, corresponding to the red channel of an RGB image.
"G" — Use the second image channel for training, corresponding to the green channel of an RGB image.
"B" — Use the third image channel for training, corresponding to the blue channel of an RGB image.

This argument corresponds to the chan parameter in the Cellpose Library.

Data Types: char | string

`AuxiliaryChannel` — Auxiliary channel
`"none"` (default) | `"R"` | `"G"` | `"B"`

Auxiliary channel to use for training, specified as "none", "R", "G", or "B". If this value is "none", then the function uses an auxiliary image containing all zeros during training. This argument corresponds to the chan2 parameter in the Cellpose Library.

Data Types: char | string
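
To make the channel options concrete, this sketch shows which image data each MainChannel and AuxiliaryChannel value refers to, using a small synthetic RGB array. It only illustrates the channel indexing described above. How trainCellpose and the Cellpose Library preprocess images internally is not shown here, and the variable names are illustrative.

```matlab
% Synthetic 2-by-2 RGB image with a constant value in each channel.
img = cat(3,0.2*ones(2),0.4*ones(2),0.9*ones(2));

mainR   = img(:,:,1);    % "R": first channel
mainG   = img(:,:,2);    % "G": second channel
mainB   = img(:,:,3);    % "B": third channel
mainAvg = mean(img,3);   % "average": mean across the channels

% AuxiliaryChannel="none" corresponds to an auxiliary image of all zeros.
auxNone = zeros(size(img,1),size(img,2));
```

For a grayscale image there is only one channel, so the average across channels is the image itself.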

`LabelSuffix` — Label suffix
`"_labels"` (default) | string scalar | character vector

Label suffix, specified as a string scalar or character vector. The function uses the label suffix to search for ground truth label files in dataFolder. By default, the function uses files ending in "_labels" as the ground truth labels.

Data Types: char | string

`PretrainedModel` — Pretrained model
`"cyto2"` (default) | `""` | string scalar | character vector

Pretrained model to use as a base model for training using transfer learning, specified as one of these values.

"" — Start training with an uninitialized Cellpose network.
Absolute path — Start training with a custom trained model by specifying the absolute path to the model on your machine.
Name of a pretrained Cellpose Library model — Start training with a pretrained Cellpose model, specified as one of these options. To learn more about the pretrained models and their training data, see the Cellpose Library Documentation.
- "cyto"
- "cyto2"
- "CP"
- "CPx"
- "nuclei"
- "livecell"
- "LC1"
- "LC2"
- "LC3"
- "LC4"
- "tissuenet"
- "TN1"
- "TN2"
- "TN3"

Data Types: char | string

`ModelFolder` — Pretrained model folder path
string scalar | character vector

Pretrained model folder path, specified as a string scalar or character vector. This argument must be the full path to a folder containing the Cellpose model you want to train. By default, ModelFolder is a subfolder called cellposeModels within the folder returned by the userpath function. This argument has no effect when you train an uninitialized model by specifying PretrainedModel as "".

Data Types: char | string
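
As a quick check of the default location described above, this sketch builds the path to the cellposeModels subfolder of the folder returned by userpath and reports whether it exists. The variable name is illustrative, and the folder might not exist until models have been downloaded or saved there.

```matlab
% Default location described above: <userpath>/cellposeModels
defaultModelFolder = fullfile(string(userpath),"cellposeModels");

if isfolder(defaultModelFolder)
    disp("Found model folder: " + defaultModelFolder);
else
    disp("No folder yet at: " + defaultModelFolder);
end
```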

`DetectableCellDiameter` — Detectable cell diameter
`30` (default) | numeric scalar

Detectable cell diameter, specified as a numeric scalar. This argument specifies the cell diameter that you want the trained model to detect. This argument has an effect only when you train an uninitialized model by specifying PretrainedModel as "". If you start training from a pretrained model, the detectable cell diameter of the newly trained model is the same as that of the pretrained model. This argument corresponds to the diam_mean parameter in the Cellpose Library.
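
Because DetectableCellDiameter matters only for an uninitialized model, the two arguments are typically set together. This sketch assumes hypothetical paths and an illustrative diameter of 17. The call is guarded so that the snippet does nothing when the Cellpose add-on or the data folder is not available.

```matlab
dataFolder = "C:\trainingData";                      % hypothetical path
outputModelFile = "C:\cellposeModels\scratchModel";  % hypothetical path

% PretrainedModel="" requests an uninitialized network, the only case in
% which DetectableCellDiameter has an effect. The value 17 is illustrative.
options = {"PretrainedModel","","DetectableCellDiameter",17};

% Run training only if the function and the data folder are available.
if ~isempty(which("trainCellpose")) && isfolder(dataFolder)
    trainCellpose(dataFolder,outputModelFile,options{:});
end
```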

`ExecutionEnvironment` — Hardware resource used for training
`"auto"` (default) | `"cpu"` | `"gpu"`

Hardware resource used for training, specified as one of these values.

"auto" — Use a GPU if one is available. Otherwise, use the CPU.
"cpu" — Use the CPU.
"gpu" — Use the GPU.

The "gpu" option requires Parallel Computing Toolbox™. To use a GPU for deep learning, you must also have a supported GPU device. For information on supported devices, see GPU Computing Requirements (Parallel Computing Toolbox). If you choose the "gpu" option and Parallel Computing Toolbox or a suitable GPU is not available, then the function returns an error.

Data Types: char | string
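
If you prefer to choose the hardware explicitly rather than rely on "auto", one option is to test for a usable GPU first. This sketch uses the MATLAB canUseGPU function. The variable name env is illustrative, and the result is intended to be passed as the ExecutionEnvironment value.

```matlab
% canUseGPU returns true only when a supported GPU and the software
% needed to use it are available.
if canUseGPU
    env = "gpu";
else
    env = "cpu";
end
disp("Requesting ExecutionEnvironment=" + env);
```

Selecting "cpu" in this way avoids the error that the "gpu" option produces when no suitable GPU is available.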

`LearningRate` — Initial learning rate
`0.2` (default) | numeric scalar

Initial learning rate used for training, specified as a numeric scalar. This argument corresponds to the learning_rate parameter in the Cellpose Library.

`WeightDecay` — Weight decay
`0.00001` (default) | numeric scalar

Weight decay, specified as a numeric scalar. This argument corresponds to the weight_decay parameter in the Cellpose Library.

`CheckpointPath` — Path for saving checkpoint model files
string scalar | character vector

Path for saving the checkpoint model files, specified as a string scalar or character vector. By default, trainCellpose saves intermediate model files in the same parent folder as outputModelFile, within a subfolder named model. This argument corresponds to the save_path parameter in the Cellpose Library.

Data Types: char | string
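
The default checkpoint location described above can be derived from outputModelFile. This sketch assumes a hypothetical output path under the temporary folder and only constructs the folder name. It does not create any files.

```matlab
% Hypothetical output model file.
outputModelFile = fullfile(string(tempdir),"cellposeModels","retrainedCyto2Model");

% Default described above: a subfolder named "model" in the same parent
% folder as outputModelFile.
parentFolder = fileparts(outputModelFile);
defaultCheckpointFolder = fullfile(parentFolder,"model");
disp(defaultCheckpointFolder);
```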

`CheckpointFrequency` — Frequency for saving checkpoint model files
`100` (default) | positive integer

Frequency for saving checkpoint model files during training, specified as a positive integer, in epochs. The function saves model files every CheckpointFrequency epochs. This argument corresponds to the save_every parameter in the Cellpose Library.

`MaxEpochs` — Maximum number of epochs for training
`500` (default) | positive integer

Maximum number of epochs to use for training, specified as a positive integer. This argument corresponds to the n_epochs parameter in the Cellpose Library.

`GPUBatchSize` — GPU batch size
`8` (default) | positive integer

GPU batch size, specified as a positive integer. This argument has an effect only when training on a GPU. The batch size specifies the number of images per batch. Increasing the batch size increases speed, but also increases memory requirements. This argument corresponds to the batchsize parameter in the Cellpose Library.
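
As a rough guide to how the batch size divides a data set, this sketch computes the number of batches needed to cover a hypothetical set of 50 training images once. It assumes that each pass visits every image exactly once. How the Cellpose Library samples images within an epoch is not stated here, so treat the numbers as an illustration of the arithmetic only.

```matlab
numImages = 50;      % hypothetical training set size
gpuBatchSize = 8;    % default GPUBatchSize value

% Batches needed to cover all images once; the last batch can be partial.
batchesPerPass = ceil(numImages/gpuBatchSize);
imagesInLastBatch = numImages - (batchesPerPass - 1)*gpuBatchSize;
fprintf("%d batches, %d images in the last batch\n", ...
    batchesPerPass,imagesInLastBatch);
```

With these values, the sketch reports 7 batches with 2 images in the last batch.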

References

[1] Stringer, Carsen, Tim Wang, Michalis Michaelos, and Marius Pachitariu. “Cellpose: A Generalist Algorithm for Cellular Segmentation.” Nature Methods 18, no. 1 (January 2021): 100–106. https://doi.org/10.1038/s41592-020-01018-x.

[2] Pachitariu, Marius, and Carsen Stringer. “Cellpose 2.0: How to Train Your Own Model.” Nature Methods 19, no. 12 (December 2022): 1634–41. https://doi.org/10.1038/s41592-022-01663-4.

Version History

Introduced in R2023b
