cytopy.flow.cell_classifier.utils

Functions:

`assert_population_labels`(ref, expected_labels)	Given some reference FileGroup and the expected population labels, check the validity of the labels and return list of valid populations only.
`auto_weights`(y)	Estimate optimal weights from a list of class labels.
`calc_metrics`(metrics, y_true[, y_pred, y_score])	Given a list of Scikit-Learn supported metrics (https://scikit-learn.org/stable/modules/model_evaluation.html) or callable functions with signature ‘y_true’, ‘y_pred’ and ‘y_score’, return a dictionary of results after checking that the required inputs are provided.
`check_downstream_populations`(ref, …)	Check that in the ordered list of population labels, all populations are downstream of the given ‘root’ population.
`confusion_matrix_plots`(classifier, x, y, …)	Generate a figure of two heatmaps showing a classifier's confusion matrix: one normalised by support and one showing raw values.
`multilabel`(ref, root_population, …)	Load the root population DataFrame from the reference FileGroup (assumed to be the first population in ‘population_labels’).
`singlelabel`(ref, root_population, …)	Load the root population DataFrame from the reference FileGroup (assumed to be the first population in ‘population_labels’).

cytopy.flow.cell_classifier.utils.assert_population_labels(ref, expected_labels: list)

Given some reference FileGroup and the expected population labels, check the validity of the labels and return list of valid populations only.

Parameters

ref (FileGroup) –
expected_labels (list) –

Returns

List of valid populations

Return type

List

Raises

AssertionError – Ref missing expected populations

cytopy.flow.cell_classifier.utils.auto_weights(y: numpy.ndarray)

Estimate optimal weights from a list of class labels.

Parameters: y (numpy.ndarray) –
Returns: Dictionary of class weights {label: weight}
Return type: dict
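
The description above does not say how the weights are estimated. As a minimal sketch, assuming the common "balanced" heuristic (weight = n_samples / (n_classes × class count)), which may differ from what `auto_weights` actually computes, the returned {label: weight} dictionary could be built as follows. `balanced_weights` is a hypothetical name, not part of cytopy.

```python
from collections import Counter


def balanced_weights(y):
    """Hypothetical stand-in for auto_weights using the 'balanced' heuristic."""
    counts = Counter(y)                 # number of samples per class label
    n_samples = sum(counts.values())
    n_classes = len(counts)
    # Rare classes receive proportionally larger weights
    return {label: n_samples / (n_classes * count) for label, count in counts.items()}


weights = balanced_weights([0, 0, 0, 1])
print(weights)
```

With three samples of class 0 and one of class 1, the minority class ends up with the larger weight.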

cytopy.flow.cell_classifier.utils.calc_metrics(metrics: list, y_true: numpy.array, y_pred: Optional[numpy.array] = None, y_score: Optional[numpy.array] = None) → dict

Given a list of Scikit-Learn supported metrics (https://scikit-learn.org/stable/modules/model_evaluation.html) or callable functions with signature ‘y_true’, ‘y_pred’ and ‘y_score’, return a dictionary of results after checking that the required inputs are provided.

Parameters

metrics (list) – Names of the required metrics (strings), or callables with signature ‘y_true’, ‘y_pred’ and ‘y_score’
y_true (numpy.ndarray) – True labels or binary label indicators. The binary and multiclass cases expect labels with shape (n_samples,) while the multilabel case expects binary label indicators with shape (n_samples, n_classes).
y_pred (numpy.ndarray) – Estimated targets as returned by a classifier
y_score (numpy.ndarray) – Target scores. In the binary and multilabel cases, these can be either probability estimates or non-thresholded decision values (as returned by decision_function on some classifiers). In the multiclass case, these must be probability estimates which sum to 1. The binary case expects a shape (n_samples,), and the scores must be the scores of the class with the greater label. The multiclass and multilabel cases expect a shape (n_samples, n_classes). In the multiclass case, the order of the class scores must correspond to the order of labels, if provided, or else to the numerical or lexicographical order of the labels in y_true.

Returns

Dictionary of performance metrics

Return type

dict

Raises

AssertionError – F1 score requested yet y_pred is missing
AttributeError – Requested metric requires probability scores and y_score is None
ValueError – Invalid metric provided; possibly missing signatures: ‘y_true’, ‘y_score’ or ‘y_pred’
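
To illustrate the callable form of a metric, here is a minimal sketch covering only the callable branch; the named Scikit-Learn metrics are not handled. `calc_metrics_sketch` and `accuracy` are hypothetical names, and keying the results by the callable's `__name__` is an assumption rather than documented behaviour.

```python
def calc_metrics_sketch(metrics, y_true, y_pred=None, y_score=None):
    """Hypothetical sketch: evaluate callables taking y_true, y_pred and y_score."""
    results = {}
    for metric in metrics:
        if not callable(metric):
            raise ValueError(f"Invalid metric: {metric!r}")
        # Assumption: each result is stored under the callable's name
        results[metric.__name__] = metric(y_true=y_true, y_pred=y_pred, y_score=y_score)
    return results


def accuracy(y_true, y_pred, y_score=None):
    # Needs hard predictions, so fail early if they were not supplied
    assert y_pred is not None, "accuracy requested yet y_pred is missing"
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


result = calc_metrics_sketch([accuracy], y_true=[1, 0, 1, 1], y_pred=[1, 0, 0, 1])
print(result)
```

Accepting all three keyword arguments, even when one goes unused, lets every metric be called the same way.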

cytopy.flow.cell_classifier.utils.check_downstream_populations(ref, root_population: str, population_labels: list) → None

Check that in the ordered list of population labels, all populations are downstream of the given ‘root’ population.

Parameters

ref (FileGroup) –
root_population (str) –
population_labels (list) –

Returns

Return type

None

Raises

AssertionError – One or more populations not downstream of root
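
The downstream check can be pictured as walking up a population tree. The sketch below assumes the tree is available as a plain child → parent dictionary; this is a simplification for illustration and not how a FileGroup stores its populations. All names are hypothetical.

```python
def is_downstream(parents, root, population):
    """Walk up a child -> parent map; True if `root` is an ancestor of `population`."""
    node = parents.get(population)
    while node is not None:
        if node == root:
            return True
        node = parents.get(node)
    return False


def check_downstream_sketch(parents, root, labels):
    # Collect every label that cannot be traced back to the root population
    invalid = [p for p in labels if not is_downstream(parents, root, p)]
    assert not invalid, f"Populations not downstream of {root}: {invalid}"


parents = {"T cells": "root", "CD4+": "T cells", "CD8+": "T cells", "B cells": "root"}
check_downstream_sketch(parents, "T cells", ["CD4+", "CD8+"])  # passes silently
```

Passing `["CD4+", "B cells"]` instead would raise an AssertionError, because "B cells" does not descend from "T cells" in this toy tree.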

cytopy.flow.cell_classifier.utils.confusion_matrix_plots(classifier, x: pandas.core.frame.DataFrame, y: numpy.ndarray, class_labels: list, cmap: Optional[str] = None, figsize: tuple = (8, 20), **kwargs)

Generate a figure of two heatmaps showing a classifier's confusion matrix: one normalised by support and one showing raw values. Returns a Matplotlib.Figure object.

Parameters

classifier (object) – Scikit-Learn classifier
x (Pandas.DataFrame) – Feature space
y (numpy.ndarray) – Labels
class_labels (list) – Class labels (as they should be displayed on the axis)
cmap (str) – Colour scheme, defaults to Matplotlib Blues
figsize (tuple (default=(8, 20))) – Size of the figure
kwargs – Additional keyword arguments passed to sklearn.metrics.plot_confusion_matrix

Returns

Return type

Matplotlib.Figure
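
For reference, the two matrices behind the heatmaps can be sketched without any plotting. This assumes "normalised by support" means dividing each row by the number of true samples in that class; `confusion_matrices` is a hypothetical helper, not part of cytopy or Scikit-Learn.

```python
def confusion_matrices(y_true, y_pred, labels):
    """Return (raw counts, rows normalised by support) as nested lists."""
    index = {label: i for i, label in enumerate(labels)}
    raw = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        raw[index[t]][index[p]] += 1    # rows: true class, columns: predicted class
    normalised = []
    for row in raw:
        support = sum(row)              # number of true samples in this class
        normalised.append([c / support if support else 0.0 for c in row])
    return raw, normalised


raw, norm = confusion_matrices([0, 0, 1, 1, 1, 1], [0, 1, 1, 1, 1, 0], labels=[0, 1])
print(raw)
print(norm)
```

The raw matrix shows absolute error counts, while the normalised one makes per-class recall comparable across classes of very different size.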

cytopy.flow.cell_classifier.utils.multilabel(ref, root_population: str, population_labels: list, features: list) → (pandas.core.frame.DataFrame, pandas.core.frame.DataFrame)

Load the root population DataFrame from the reference FileGroup (assumed to be the first population in ‘population_labels’). Then iterate over the remaining populations, creating a dummy matrix of population affiliations for each row of the root population.

Parameters

ref (FileGroup) –
root_population (str) –
population_labels (list) –
features (list) –

Returns

Root population fluorescent intensity values, population affiliations (dummy matrix)

Return type

(Pandas.DataFrame, Pandas.DataFrame)
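
As an illustration of the dummy matrix, the sketch below assumes each population is known by the row indexes of the events it contains. `dummy_matrix` and the example data are hypothetical; the real function works on a FileGroup and returns DataFrames.

```python
def dummy_matrix(root_index, population_indexes):
    """One row per root event, one 0/1 column per population (hypothetical helper)."""
    members = {name: set(idx) for name, idx in population_indexes.items()}
    return [
        {name: int(event in idx) for name, idx in members.items()}
        for event in root_index
    ]


rows = dummy_matrix(
    root_index=[0, 1, 2, 3],
    population_indexes={"T cells": [0, 1, 2], "CD4+": [0, 1]},
)
for row in rows:
    print(row)
```

An event can belong to several populations at once, which is what distinguishes this multilabel encoding from a single label per event. The list of dictionaries can be passed to `pandas.DataFrame(rows, index=root_index)` if a DataFrame is wanted.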

cytopy.flow.cell_classifier.utils.singlelabel(ref, root_population: str, population_labels: list, features: list) → (pandas.core.frame.DataFrame, numpy.ndarray)

Load the root population DataFrame from the reference FileGroup (assumed to be the first population in ‘population_labels’). Then iterate over the remaining populations, creating an array of population affiliations; each cell (row) is associated with its terminal leaf node in the FileGroup population tree.

Parameters

ref (FileGroup) –
root_population (str) –
population_labels (list) –
features (list) –

Returns

Root population fluorescent intensity values, labels

Return type

(Pandas.DataFrame, numpy.ndarray)
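
One way to picture the terminal-leaf assignment is sketched below. It assumes populations are listed from upstream to downstream, that later populations overwrite earlier ones, and that label 0 marks events belonging to none of the listed populations. These are illustrative assumptions, and `single_labels` is a hypothetical helper rather than the library's implementation.

```python
def single_labels(root_index, ordered_populations):
    """Assign each root event one integer label; 0 means 'no listed population'."""
    position = {event: i for i, event in enumerate(root_index)}
    labels = [0] * len(root_index)
    for code, (_, idx) in enumerate(ordered_populations, start=1):
        for event in idx:
            labels[position[event]] = code  # later (deeper) populations overwrite
    return labels


labels = single_labels(
    root_index=[10, 11, 12, 13],
    ordered_populations=[("T cells", [10, 11, 12]), ("CD4+", [10, 11])],
)
print(labels)
```

Events 10 and 11 end up labelled with the deeper "CD4+" population, event 12 keeps the "T cells" label, and event 13 stays unlabelled.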