flagx.gating
This module provides classification methods for automated flow cytometry gating.
SomClassifier
MLPClassifier
- class flagx.gating.MLPClassifier(*args: Any, **kwargs: Any)
Bases:
BaseEstimator, ClassifierMixin
A three-layer perceptron (MLP) classifier.
This classifier wraps a fully connected neural network implemented in PyTorch while exposing a scikit-learn–style API. The model supports multi-class classification and automatic device selection (CPU or GPU), and provides the methods fit(), predict(), predict_proba(), score(), save(), and load(). For model training, CrossEntropyLoss and the Adam optimizer are used.
- classes_
Class labels after re-indexing to integers starting from 0.
- Type:
np.ndarray or None
- class_counts_
Class counts from the training data.
- Type:
np.ndarray or None
- og_classes_
Original class labels before re-indexing.
- Type:
np.ndarray or None
- class_priors_
Empirical class priors.
- Type:
np.ndarray or None
- new_to_og_classes_dict_
Mapping from new integer labels back to original labels.
- Type:
dict[int, Any] or None
- data_set_
PyTorch tensor dataset constructed during fitting.
- Type:
TensorDataset or None
- data_loader_
PyTorch DataLoader used for minibatch training.
- Type:
DataLoader or None
- model_
Neural network model.
- Type:
nn.Module or None
- criterion_
Loss function, PyTorch CrossEntropyLoss.
- Type:
nn.Module or None
- optimizer_
PyTorch Adam optimizer. The learning rate is 0.001 unless overridden through optimizer_params.
- Type:
Optimizer or None
- training_log_
Logged losses of the training run.
- Type:
dict[str, int | list[int | float]] or None
- is_fitted_
Whether the classifier has been fitted.
- Type:
bool
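The label-related attributes above can be pictured with a small NumPy sketch. It is an illustration only and assumes that re-indexing follows the sorted order of the original labels, which the documentation does not state. The example labels and the local variable names (without the trailing underscore) are hypothetical.

```python
import numpy as np

# Hypothetical training labels; the original labels need not be integers.
y = np.array(["T cell", "B cell", "T cell", "NK cell", "T cell", "B cell"])

# np.unique returns the sorted original labels, the labels re-indexed to
# 0..n_classes-1, and the per-class counts in a single call.
og_classes, y_reindexed, class_counts = np.unique(
    y, return_inverse=True, return_counts=True
)

# Integer class labels starting from 0 (analogous to classes_).
classes = np.arange(len(og_classes))

# Empirical class priors: relative frequency of each class in the training data.
class_priors = class_counts / class_counts.sum()

# Mapping from the new integer labels back to the original labels.
new_to_og_classes_dict = {int(i): og for i, og in zip(classes, og_classes)}
```

Keeping the mapping around is what allows predictions made on the integer labels to be reported in the original label encoding.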
- Parameters:
layer_sizes (Tuple[int, ...]) – Sizes of the hidden layers in the fully connected neural network.
n_epochs (int) – Number of training epochs.
loss_params (dict[str, Any] or None) – Parameters passed to PyTorch’s CrossEntropyLoss.
optimizer_params (dict[str, Any] or None) – Parameters passed to PyTorch’s Adam optimizer. If None, defaults to {'lr': 0.001}.
data_loader_params (dict[str, Any] or None) – Parameters passed to the PyTorch DataLoader. If None, defaults to {'batch_size': 128, 'shuffle': True, 'num_workers': 1}.
validation_fraction (float) – Fraction of the training data used as validation set. Defaults to 0.1.
early_stopping (bool) – Whether early stopping is used or not. If early_stopping is True and validation_fraction=0.0, the training loss is used as an early stopping criterion. Defaults to False.
tol (float) – Tolerance for early stopping. When the validation/training loss is not improving by at least tol for n_iter_no_change consecutive iterations, training is stopped early.
n_iter_no_change (int) – Number of consecutive epochs without an improvement of at least tol after which training is stopped early.
device (str or None) – Device to use for training (e.g., 'cpu', 'cuda', 'cuda:0'). If None, CUDA is used when available, otherwise falls back to CPU.
verbosity (int) – Verbosity level for training logs.
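The interaction of tol and n_iter_no_change can be sketched in plain Python. This is a minimal illustration of the stopping rule as described above, not the library's training loop. The function name early_stopping_epoch and the default values shown are hypothetical, and whether the comparison is made against the best loss so far is an assumption.

```python
def early_stopping_epoch(losses, tol=1e-4, n_iter_no_change=10):
    """Return the epoch index at which training would stop, or None.

    `losses` holds one validation (or training) loss per epoch.
    The defaults for `tol` and `n_iter_no_change` are placeholders.
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses):
        if loss <= best_loss - tol:
            # The loss improved by at least tol: reset the counter.
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        # Track the best loss seen so far, even for small improvements.
        best_loss = min(best_loss, loss)
        if epochs_without_improvement >= n_iter_no_change:
            return epoch
    return None


# The loss plateaus at 0.4 from epoch 2 onwards; with a patience of
# 3 epochs, training stops at epoch 5.
stop_epoch = early_stopping_epoch(
    [1.0, 0.5, 0.4, 0.4, 0.4, 0.4], tol=0.01, n_iter_no_change=3
)
```

With validation_fraction=0.0 the same rule would be applied to the training loss instead of the validation loss.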
- fit(X: numpy.ndarray, y: numpy.ndarray) typing_extensions.Self
Fit the MLP classifier to the provided training data.
- Parameters:
X (np.ndarray) – Feature matrix of shape (n_samples, n_features).
y (np.ndarray) – Target labels of shape (n_samples,).
- Returns:
The fitted classifier instance.
- Return type:
Self
- Raises:
ValueError – If X and y have incompatible shapes.
- predict(X: numpy.ndarray) numpy.ndarray
Predict class labels for the given input samples.
- Parameters:
X (np.ndarray) – Feature matrix of shape (n_samples, n_features).
- Returns:
Predicted class labels using the original label encoding.
- Return type:
np.ndarray
- Raises:
NotFittedError – If predict() is used before calling fit().
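Returning labels in the original encoding can be illustrated as follows. The sketch assumes that the predicted class is the one with the largest logit and that the result is translated through a mapping like new_to_og_classes_dict_. The logits and the mapping below are made-up values.

```python
import numpy as np

# Hypothetical raw model outputs (logits) for 4 samples and 3 classes.
logits = np.array([
    [2.0, 0.1, -1.0],
    [0.3, 0.2, 1.5],
    [-0.5, 2.2, 0.0],
    [1.1, 1.0, 0.9],
])

# Hypothetical mapping from internal integer labels to original labels.
new_to_og_classes_dict = {0: "B cell", 1: "NK cell", 2: "T cell"}

# Pick the class with the largest logit for each sample ...
internal_labels = logits.argmax(axis=1)

# ... and translate it back to the original label encoding.
predicted = np.array([new_to_og_classes_dict[int(i)] for i in internal_labels])
```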
- predict_proba(X: numpy.ndarray) numpy.ndarray
Predict class probabilities for the given samples.
- Parameters:
X (np.ndarray) – Feature matrix of shape (n_samples, n_features).
- Returns:
Array of shape (n_samples, n_classes) containing class probabilities.
- Return type:
np.ndarray
- Raises:
NotFittedError – If predict_proba() is used before calling fit().
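Because the network outputs raw logits, class probabilities have to be derived from them. A common way to do this is a row-wise softmax. The NumPy sketch below shows that transformation under the assumption that predict_proba() works along these lines. The helper name softmax and the input values are illustrative.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax that turns logits into probabilities."""
    # Subtracting the row maximum avoids overflow in exp() without
    # changing the result.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


# Hypothetical logits for 2 samples and 3 classes.
logits = np.array([[2.0, 0.1, -1.0], [0.0, 0.0, 0.0]])
proba = softmax(logits)  # shape (n_samples, n_classes), rows sum to 1
```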
- score(X: numpy.ndarray, y: numpy.ndarray, sample_weight: numpy.ndarray | None = None)
Compute the macro F1 score of the classifier on the given dataset.
- Parameters:
X (np.ndarray) – Feature matrix of shape (n_samples, n_features).
y (np.ndarray) – True labels.
sample_weight (np.ndarray or None) – Optional sample weights.
- Returns:
Macro-averaged F1 score.
- Return type:
float
- Raises:
NotFittedError – If score() is used before calling fit().
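The macro F1 score is the unweighted mean of the per-class F1 scores, so rare cell populations count as much as abundant ones. The following NumPy sketch spells out that definition without sample weights. It is not the code used by score(), and the function name macro_f1 is hypothetical.

```python
import numpy as np


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # Labels come from the union of true and predicted labels, so every
    # label has at least one TP, FP or FN and the denominator is never zero.
    labels = np.unique(np.concatenate([y_true, y_pred]))
    f1_scores = []
    for label in labels:
        tp = np.sum((y_pred == label) & (y_true == label))
        fp = np.sum((y_pred == label) & (y_true != label))
        fn = np.sum((y_pred != label) & (y_true == label))
        # F1 = 2*TP / (2*TP + FP + FN)
        f1_scores.append(2 * tp / (2 * tp + fp + fn))
    return float(np.mean(f1_scores))


# Per-class F1 scores here are 1, 2/3 and 4/5.
example_score = macro_f1([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2])
```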
- save(filename: str = 'mlp_classifier.pkl', filepath: str | None = None) None
Save the fitted classifier to disk using torch.save.
- Parameters:
filename (str) – Name of the file to save the model to.
filepath (str or None) – Directory where the file will be saved. Defaults to current working directory.
- Returns:
None
- classmethod load(filename: str = 'mlp_classifier.pkl', filepath: str | None = None, map_location: str | torch.device = 'cpu') typing_extensions.Self
Load a previously saved classifier from disk.
- Parameters:
filename (str) – Name of the saved file.
filepath (str or None) – Directory containing the saved file. Defaults to current working directory.
map_location (str or torch.device) – Device mapping for loading the model (e.g., 'cpu' or 'cuda').
- Returns:
The loaded classifier instance.
- Return type:
Self
Neural Network Models
- class flagx.gating.FCNNModel(*args: Any, **kwargs: Any)
Bases:
Module
Fully connected neural network with an arbitrary number of hidden linear layers of arbitrary size.
All layers except the output layer use ReLU activations. Softmax is intentionally omitted from the final layer because torch.nn.CrossEntropyLoss expects raw logits.
The default parameters follow the configuration described in:
DGCyTOF: Deep learning with graphic cluster visualization to predict cell types of single cell mass cytometry data (Cheng et al., 2022).
Their implementation can be found at:
https://github.com/lijcheng12/DGCyTOF/blob/main/Code_Study/DGCyTOF/CyTOF2/CyTOF2.ipynb (22/27/2025).
- Parameters:
in_size (int) – Number of input features.
out_size (int) – Number of output classes.
layer_sizes (Tuple[int, ...], optional) – Sizes of the hidden layers. Defaults to (128, 64, 32).
- layers
List of fully connected linear layers.
- Type:
nn.ModuleList
- forward(x: torch.Tensor) torch.Tensor
Forward pass of the FCNN model. ReLU activation is applied after each layer except after the output layer.
- Parameters:
x (torch.Tensor) – Input data tensor.
- Returns:
Raw output logits with shape (batch_size, out_size).
- Return type:
torch.Tensor
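The architecture described for FCNNModel can be sketched in a few lines of PyTorch. The class below, FCNNSketch, is a hypothetical stand-in written from the description above (linear layers in an nn.ModuleList, ReLU after every layer except the last, raw logits as output). It is not the library's implementation, and the input dimensions used in the example are arbitrary.

```python
from typing import Tuple

import torch
from torch import nn


class FCNNSketch(nn.Module):
    """Illustrative stand-in for FCNNModel, not the library's code."""

    def __init__(
        self,
        in_size: int,
        out_size: int,
        layer_sizes: Tuple[int, ...] = (128, 64, 32),
    ) -> None:
        super().__init__()
        # Chain the sizes: in_size -> hidden sizes -> out_size.
        sizes = (in_size, *layer_sizes, out_size)
        self.layers = nn.ModuleList(
            [nn.Linear(n_in, n_out) for n_in, n_out in zip(sizes[:-1], sizes[1:])]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # ReLU after every layer except the output layer.
        for layer in self.layers[:-1]:
            x = torch.relu(layer(x))
        # No softmax: CrossEntropyLoss expects raw logits.
        return self.layers[-1](x)


# Hypothetical example: 10 input features, 4 classes, batch of 8 samples.
model = FCNNSketch(in_size=10, out_size=4)
logits = model(torch.randn(8, 10))
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 4, (8,)))
```

Passing the logits straight to CrossEntropyLoss works because that loss applies log-softmax internally.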