cytopy.flow.cell_classifier.utils
Functions:
| assert_population_labels | Given some reference FileGroup and the expected population labels, check the validity of the labels and return list of valid populations only. |
| auto_weights | Estimate optimal weights from a list of class labels. |
| calc_metrics | Given a list of Scikit-Learn supported metrics (https://scikit-learn.org/stable/modules/model_evaluation.html) or callable functions with signature ‘y_true’, ‘y_pred’ and ‘y_score’, return a dictionary of results after checking that the required inputs are provided. |
| check_downstream_populations | Check that in the ordered list of population labels, all populations are downstream of the given ‘root’ population. |
| confusion_matrix_plots | Generate a figure of two heatmaps showing a confusion matrix, one normalised by support and one showing raw values, displaying a classifier's performance. |
| multilabel | Load the root population DataFrame from the reference FileGroup (assumed to be the first population in ‘population_labels’). |
| singlelabel | Load the root population DataFrame from the reference FileGroup (assumed to be the first population in ‘population_labels’). |
-
cytopy.flow.cell_classifier.utils.assert_population_labels(ref, expected_labels: list)
Given some reference FileGroup and the expected population labels, check the validity of the labels and return list of valid populations only.
- Parameters
ref (FileGroup) –
expected_labels (list) –
- Returns
- Return type
List
- Raises
AssertionError – Ref missing expected populations
-
cytopy.flow.cell_classifier.utils.auto_weights(y: numpy.ndarray)
Estimate optimal weights from a list of class labels.
- Parameters
y (numpy.ndarray) –
- Returns
Dictionary of class weights {label: weight}
- Return type
dict
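The entry above does not say how the weights are estimated. A minimal sketch, assuming the common "balanced" heuristic (weight = n_samples / (n_classes * class_count)), could look like the following. The helper name `balanced_class_weights` is hypothetical and is not part of cytopy.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return {label: weight}, giving rarer classes larger weights.

    Assumption: weight = n_samples / (n_classes * class_count); this is an
    illustration and may differ from what auto_weights actually computes.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


# Three events of class 0 and one of class 1: the rare class is up-weighted.
weights = balanced_class_weights([0, 0, 0, 1])
```

A dictionary in this {label: weight} form matches the return value documented above.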
-
cytopy.flow.cell_classifier.utils.calc_metrics(metrics: list, y_true: numpy.array, y_pred: Optional[numpy.array] = None, y_score: Optional[numpy.array] = None) → dict
Given a list of Scikit-Learn supported metrics (https://scikit-learn.org/stable/modules/model_evaluation.html) or callable functions with signature ‘y_true’, ‘y_pred’ and ‘y_score’, return a dictionary of results after checking that the required inputs are provided.
- Parameters
metrics (list) – List of string values; names of required metrics
y_true (numpy.ndarray) – True labels or binary label indicators. The binary and multiclass cases expect labels with shape (n_samples,) while the multilabel case expects binary label indicators with shape (n_samples, n_classes).
y_pred (numpy.ndarray) – Estimated targets as returned by a classifier
y_score (numpy.ndarray) – Target scores. In the binary and multilabel cases, these can be either probability estimates or non-thresholded decision values (as returned by decision_function on some classifiers). In the multiclass case, these must be probability estimates which sum to 1. The binary case expects a shape (n_samples,), and the scores must be the scores of the class with the greater label. The multiclass and multilabel cases expect a shape (n_samples, n_classes). In the multiclass case, the order of the class scores must correspond to the order of labels, if provided, or else to the numerical or lexicographical order of the labels in y_true.
- Returns
Dictionary of performance metrics
- Return type
dict
- Raises
AssertionError – F1 score requested yet y_pred is missing
AttributeError – Requested metric requires probability scores and y_score is None
ValueError – Invalid metric provided; possibly missing signatures: ‘y_true’, ‘y_score’ or ‘y_pred’
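The input checking described above can be illustrated with a small dispatcher that inspects each callable's signature and passes only the inputs it asks for. This is a sketch under stated assumptions, not the cytopy implementation: `evaluate`, `accuracy` and `mean_positive_score` are hypothetical names, and only callables (not metric name strings) are handled.

```python
import inspect


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_positive_score(y_true, y_score):
    """Average score given to samples whose true label is 1."""
    scores = [s for t, s in zip(y_true, y_score) if t == 1]
    return sum(scores) / len(scores)


def evaluate(metrics, y_true, y_pred=None, y_score=None):
    """Call each metric with only the inputs named in its signature."""
    available = {"y_true": y_true, "y_pred": y_pred, "y_score": y_score}
    results = {}
    for fn in metrics:
        kwargs = {}
        for name in inspect.signature(fn).parameters:
            if name not in available:
                # The callable asks for something other than the three inputs.
                raise ValueError(f"{fn.__name__}: unsupported argument '{name}'")
            if available[name] is None:
                # The callable needs an input that the caller did not provide.
                raise AttributeError(f"{fn.__name__} requires '{name}'")
            kwargs[name] = available[name]
        results[fn.__name__] = fn(**kwargs)
    return results


results = evaluate([accuracy], y_true=[1, 0, 1, 1], y_pred=[1, 0, 0, 1])
```

Requesting `mean_positive_score` without supplying `y_score` raises AttributeError in this sketch, mirroring the error listed above for metrics that need probability scores.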
-
cytopy.flow.cell_classifier.utils.check_downstream_populations(ref, root_population: str, population_labels: list) → None
Check that in the ordered list of population labels, all populations are downstream of the given ‘root’ population.
- Parameters
ref (FileGroup) –
root_population (str) –
population_labels (list) –
- Returns
- Return type
None
- Raises
AssertionError – One or more populations not downstream of root
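To make "downstream" concrete, the sketch below walks parent links in a population tree until it reaches the root population or runs out of ancestors. It assumes the tree is available as a plain {population: parent} mapping. That mapping and the helper names are hypothetical stand-ins for the FileGroup population tree.

```python
def is_downstream(parents, root, population):
    """Return True if `root` is an ancestor of `population`."""
    node = parents.get(population)
    while node is not None:
        if node == root:
            return True
        node = parents.get(node)
    return False


def check_downstream(parents, root, population_labels):
    """Raise AssertionError if any label is not downstream of `root`."""
    invalid = [p for p in population_labels if not is_downstream(parents, root, p)]
    assert not invalid, f"Not downstream of {root}: {invalid}"


# Hypothetical gating tree, stored as child -> parent.
parents = {
    "root": None,
    "Live": "root",
    "Debris": "root",
    "T cells": "Live",
    "CD4": "T cells",
    "CD8": "T cells",
}

check_downstream(parents, "T cells", ["CD4", "CD8"])  # passes silently
```

Calling `check_downstream(parents, "T cells", ["CD4", "Debris"])` would raise AssertionError, because "Debris" hangs off "root" rather than "T cells".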
-
cytopy.flow.cell_classifier.utils.confusion_matrix_plots(classifier, x: pandas.core.frame.DataFrame, y: numpy.ndarray, class_labels: list, cmap: Optional[str] = None, figsize: tuple = (8, 20), **kwargs)
Generate a figure of two heatmaps showing a confusion matrix, one normalised by support and one showing raw values, displaying a classifier's performance. Returns Matplotlib.Figure object.
- Parameters
classifier (object) – Scikit-Learn classifier
x (Pandas.DataFrame) – Feature space
y (numpy.ndarray) – Labels
class_labels (list) – Class labels (as they should be displayed on the axis)
cmap (str) – Colour scheme, defaults to Matplotlib Blues
figsize (tuple (default=(8, 20))) – Size of the figure
kwargs – Additional keyword arguments passed to sklearn.metrics.plot_confusion_matrix
- Returns
- Return type
Matplotlib.Figure
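The two heatmaps show the same confusion matrix at different scales. A minimal sketch of the underlying numbers follows, assuming "normalised by support" means dividing each row by the number of true samples in that class. The helper names are hypothetical and no plotting is attempted.

```python
def confusion_counts(y_true, y_pred, labels):
    """Raw confusion matrix: rows are true classes, columns are predictions."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def normalise_by_support(matrix):
    """Divide each row by its total (the support of that true class)."""
    normalised = []
    for row in matrix:
        support = sum(row)
        normalised.append([c / support if support else 0.0 for c in row])
    return normalised


y_true = ["B", "B", "T", "T", "T", "T"]
y_pred = ["B", "T", "T", "T", "T", "B"]

raw = confusion_counts(y_true, y_pred, labels=["B", "T"])
normalised = normalise_by_support(raw)
```

Here `raw` is [[1, 1], [1, 3]] and `normalised` is [[0.5, 0.5], [0.25, 0.75]]. The normalised view makes per-class recall comparable when class sizes differ.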
-
cytopy.flow.cell_classifier.utils.multilabel(ref, root_population: str, population_labels: list, features: list) → (pandas.core.frame.DataFrame, pandas.core.frame.DataFrame)
Load the root population DataFrame from the reference FileGroup (assumed to be the first population in ‘population_labels’). Then iterate over the remaining populations, creating a dummy matrix of population affiliations for each row of the root population.
- Parameters
ref (FileGroup) –
root_population (str) –
population_labels (list) –
features (list) –
- Returns
Root population fluorescent intensity values, population affiliations (dummy matrix)
- Return type
(Pandas.DataFrame, Pandas.DataFrame)
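The dummy matrix described above has one row per event in the root population and one 0/1 column per population. A minimal sketch with plain Python containers follows. The function name, the event indices and the membership sets are hypothetical, and nested lists stand in for the DataFrame the function returns.

```python
def dummy_matrix(root_index, members):
    """Build a 0/1 affiliation matrix.

    root_index: event identifiers in the root population, in row order.
    members: {population name: set of event identifiers in that population}.
    Returns (column names, rows), with one row per root event.
    """
    columns = list(members)
    rows = [[int(event in members[name]) for name in columns] for event in root_index]
    return columns, rows


root_index = [0, 1, 2, 3]
members = {
    "T cells": {0, 1, 2},
    "CD4": {0, 1},
    "CD8": {2},
}

columns, rows = dummy_matrix(root_index, members)
```

In this example an event can belong to several populations at once (event 0 is both a "T cells" and a "CD4" member), which is what makes the target multi-label.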
-
cytopy.flow.cell_classifier.utils.singlelabel(ref, root_population: str, population_labels: list, features: list) → (pandas.core.frame.DataFrame, numpy.ndarray)
Load the root population DataFrame from the reference FileGroup (assumed to be the first population in ‘population_labels’). Then iterate over the remaining populations, creating an array of population affiliations; each cell (row) is associated with its terminal leaf node in the FileGroup population tree.
- Parameters
root_population (str) –
ref (FileGroup) –
population_labels (list) –
features (list) –
- Returns
Root population fluorescent intensity values, labels
- Return type
(Pandas.DataFrame, numpy.ndarray)
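One way to obtain a terminal-leaf label for each event is to visit populations from shallowest to deepest and let later (deeper) populations overwrite earlier ones. The sketch below assumes that ordering, uses 0 for events that fall in none of the listed populations, and encodes each population as its position in the list plus one. These conventions and all names are assumptions for illustration, not the documented behaviour of singlelabel.

```python
def terminal_labels(root_index, ordered_populations, members):
    """Assign each root event the code of the deepest population it belongs to.

    ordered_populations: population names ordered from shallowest to deepest.
    members: {population name: set of event identifiers in that population}.
    """
    position = {event: i for i, event in enumerate(root_index)}
    labels = [0] * len(root_index)  # 0 = not in any listed population
    for code, name in enumerate(ordered_populations, start=1):
        for event in members[name]:
            # Deeper populations are visited later and overwrite parents.
            labels[position[event]] = code
    return labels


root_index = [0, 1, 2, 3]
members = {
    "T cells": {0, 1, 2},
    "CD4": {0, 1},
    "CD8": {2},
}

labels = terminal_labels(root_index, ["T cells", "CD4", "CD8"], members)
```

With this input `labels` is [2, 2, 3, 0]: events 0 and 1 end at "CD4", event 2 ends at "CD8", and event 3 stays unlabelled.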