cytopy.flow.transform

Cytometry data has to be transformed prior to analysis. There are multiple techniques for transformation of data, the most popular being the biexponential transform. cytopy employs multiple methods using the FlowUtils package (https://github.com/whitews/FlowUtils), including the Logicle transform, a modified version of the biexponential transform.

Copyright 2020 Ross Burton

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Classes:

AsinhTransformer(m, a, t)

Implementation of inverse hyperbolic sine function, authored by Scott White (FlowUtils v0.8).

HyperlogTransformer(w, m, a, t)

Implementation of the Hyperlog transform, authored by Scott White (FlowUtils v0.8).

LogTransformer(base, m, t, **kwargs)

Apply log transform to data, either using parametrized log transform as defined in GatingML 2.0 specification (implemented by Scott White in FlowUtils v0.8) or using natural log, base 2 or base 10.

LogicleTransformer(w, m, a, t)

Implementation of the Logicle transform, authored by Scott White (FlowUtils v0.8).

Normalise(norm, axis)

Normalise data using the Scikit-Learn normalize function (https://bit.ly/2YBfe3o). Norms are stored in the attribute ‘norms’, and normalisation is reversed by passing the transformed data to the ‘inverse’ method.

Scaler(method, **kwargs)

Utility object for applying Scikit-Learn transformers to a chosen dataset.

Transformer(transform_function, …)

Base class for Transformer object.

Exceptions:

TransformError

Functions:

apply_transform(data, features[, method, …])

Apply a transformation to the given DataFrame and the chosen columns (features).

apply_transform_map(data, feature_method[, …])

Wrapper around cytopy.flow.transform.apply_transform; takes a dictionary (feature_method) in which each key is the name of a feature and each value is the transform to apply to that feature.

remove_negative_values(data, features)

For each feature (as given in ‘features’) in data, check for negative values and replace them with the minimum value in range for that feature.

safe_range(data, x)

Return the minimum and maximum values in a range, ignoring negative values.

class cytopy.flow.transform.AsinhTransformer(m: float = 4.5, a: float = 0.0, t: int = 262144)

Implementation of inverse hyperbolic sine function, authored by Scott White (FlowUtils v0.8).

Parameters
  • m (float (default=4.5)) – Number of decades the true logarithmic scale approaches at the high end of the scale

  • a (float (default=0)) – Additional number of negative decades

  • t (int (default=262144)) – Top of the linear scale
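To show roughly how m, a and t shape the scale, the sketch below evaluates the inverse hyperbolic sine transform as written in the GatingML 2.0 specification, using only the standard library. The function names asinh_scale and asinh_scale_inverse are hypothetical, and this is not the FlowUtils implementation.

```python
import math


def asinh_scale(x: float, t: float = 262144, m: float = 4.5, a: float = 0.0) -> float:
    """GatingML 2.0 style asinh scale: maps x = t to 1 and x = 0 to a / (m + a)."""
    ln10 = math.log(10.0)
    return (math.asinh(x * math.sinh(m * ln10) / t) + a * ln10) / ((m + a) * ln10)


def asinh_scale_inverse(y: float, t: float = 262144, m: float = 4.5, a: float = 0.0) -> float:
    """Algebraic inverse of asinh_scale."""
    ln10 = math.log(10.0)
    return t * math.sinh(y * (m + a) * ln10 - a * ln10) / math.sinh(m * ln10)


top = asinh_scale(262144.0)      # top of the linear scale
zero = asinh_scale(0.0)          # data value zero
negative = asinh_scale(-500.0)   # negative values remain defined
```

With the default parameters the top of the scale maps to 1 and zero maps to 0; negative inputs map below zero rather than raising an error.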

class cytopy.flow.transform.HyperlogTransformer(w: float = 0.5, m: float = 4.5, a: float = 0.0, t: int = 262144)

Implementation of the Hyperlog transform, authored by Scott White (FlowUtils v0.8).

Hyperlog transformation, implemented as defined in the GatingML 2.0 specification:

hyperlog(x, T, W, M, A) = root(EH(y, T, W, M, A) − x)

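where root denotes the value of y at which the expression is zero, and EH is defined as:

EH(y, T, W, M, A) = ae^(by) + cy − f

with coefficients a, b, c and f derived from T, W, M and A.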

The Hyperlog transformation was originally defined in the publication: Bagwell CB. Hyperlog-a flexible log-like transform for negative, zero, and positive valued data. Cytometry A., 2005:64(1):34–42.

Parameters
  • w (float (default=0.5)) – Approximate number of decades in the linear region

  • m (float (default=4.5)) – Number of decades the true logarithmic scale approaches at the high end of the scale

  • a (float (default=0)) – Additional number of negative decades

  • t (int (default=262144)) – Top of the linear scale
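The definition above can be read as: for a data value x, find the scale value y at which EH(y) equals x. The sketch below illustrates that idea with plain bisection. The coefficients are placeholders chosen only for illustration (they are not derived from T, W, M and A), the function names are hypothetical, and FlowUtils uses its own solver.

```python
import math

# Placeholder coefficients for illustration only; in the real transform
# a, b, c and f are derived from T, W, M and A.
A_COEF, B_COEF, C_COEF, F_COEF = 1.0, 5.0, 10.0, 1.0


def eh(y: float) -> float:
    """EH(y) = a*e^(b*y) + c*y - f, strictly increasing when a, b, c > 0."""
    return A_COEF * math.exp(B_COEF * y) + C_COEF * y - F_COEF


def hyperlog_by_bisection(x: float, lo: float = -10.0, hi: float = 10.0,
                          tol: float = 1e-12) -> float:
    """Return y such that EH(y) = x, assuming the root lies in [lo, hi]."""
    if not eh(lo) <= x <= eh(hi):
        raise ValueError("x is outside the bracketed range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if eh(mid) < x:
            lo = mid   # root is to the right of mid
        else:
            hi = mid   # root is at or to the left of mid
    return 0.5 * (lo + hi)


y_pos = hyperlog_by_bisection(100.0)
y_neg = hyperlog_by_bisection(-50.0)
```

Because EH contains a linear term, the function stays invertible for negative and zero inputs, which is the property the Hyperlog transform is designed around.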

class cytopy.flow.transform.LogTransformer(base: str = 'parametrized', m: float = 4.5, t: int = 262144, **kwargs)

Apply log transform to data, either using parametrized log transform as defined in GatingML 2.0 specification (implemented by Scott White in FlowUtils v0.8) or using natural log, base 2 or base 10.

Parameters
  • base (str or int (default="parametrized")) – Method to use; one of ‘parametrized’, 10, 2, or ‘natural’

  • m (float (default=4.5)) – Number of decades the true logarithmic scale approaches at the high end of the scale

  • t (int (default=262144)) – Top of the linear scale

Raises

TransformError – If ‘base’ is not a valid LogTransformer method

Methods:

inverse_scale(data, features)

Apply inverse of log transform to features (columns) of given dataframe, under the assumption that these features have previously been transformed with LogTransformer

scale(data, features)

Scale features (columns) of given dataframe using log transform

inverse_scale(data: pandas.core.frame.DataFrame, features: list)

Apply inverse of log transform to features (columns) of given dataframe, under the assumption that these features have previously been transformed with LogTransformer

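Parameters
  • data (Pandas.DataFrame) – DataFrame containing the previously transformed features

  • features (list) – Names of the columns to inverse transform

Returns

DataFrame with the inverse log transform applied to the chosen features

Return type

Pandas.DataFrame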

scale(data: pandas.core.frame.DataFrame, features: list)

Scale features (columns) of given dataframe using log transform

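Parameters
  • data (Pandas.DataFrame) – DataFrame containing the features to transform

  • features (list) – Names of the columns to transform

Returns

DataFrame with the log transform applied to the chosen features

Return type

Pandas.DataFrame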
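For reference, the parametrized log transform in the GatingML 2.0 specification is flog(x, T, M) = (1/M) · log10(x/T) + 1. The sketch below evaluates it and its inverse for single values with the standard library; the function names are hypothetical and this is not the FlowUtils code that LogTransformer calls.

```python
import math


def parametrized_log(x: float, t: float = 262144, m: float = 4.5) -> float:
    """GatingML 2.0 style parametrized log: maps x = t to 1."""
    if x <= 0:
        raise ValueError("log transform is undefined for non-positive values")
    return math.log10(x / t) / m + 1.0


def parametrized_log_inverse(y: float, t: float = 262144, m: float = 4.5) -> float:
    """Algebraic inverse of parametrized_log."""
    return t * 10.0 ** (m * (y - 1.0))


top = parametrized_log(262144.0)
low = parametrized_log(262144.0 / 10.0 ** 4.5)   # m decades below the top
```

The top of the scale maps to 1 and a value m decades below it maps to approximately 0. Non-positive values have no logarithm, so they need handling (for example with remove_negative_values) before a log transform is applied.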

class cytopy.flow.transform.LogicleTransformer(w: float = 0.5, m: float = 4.5, a: float = 0.0, t: int = 262144)

Implementation of the Logicle transform, authored by Scott White (FlowUtils v0.8).

Logicle transformation, implemented as defined in the GatingML 2.0 specification:

logicle(x, T, W, M, A) = root(B(y, T, W, M, A) − x)

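where root denotes the value of y at which the expression is zero, and B is a modified bi-exponential function defined as:

B(y, T, W, M, A) = ae^(by) − ce^(−dy) − f

with coefficients a, b, c, d and f derived from T, W, M and A.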

The Logicle transformation was originally defined in the publication: Moore WA and Parks DR. Update for the logicle data scale including operational code implementations. Cytometry A., 2012:81A(4):273–277.

Parameters
  • w (float (default=0.5)) – Approximate number of decades in the linear region

  • m (float (default=4.5)) – Number of decades the true logarithmic scale approaches at the high end of the scale

  • a (float (default=0)) – Additional number of negative decades

  • t (int (default=262144)) – Top of the linear scale
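As a rough illustration of how the parameters determine the bi-exponential B, the sketch below derives the coefficients and evaluates B at two landmark points. The derivation is assumed to follow the reference implementation that accompanies Moore and Parks (2012); treat it as an assumption to be checked against FlowUtils rather than a copy of it. All function names are hypothetical.

```python
import math


def logicle_coefficients(t: float = 262144.0, w: float = 0.5,
                         m: float = 4.5, a: float = 0.0) -> dict:
    """Derive a, b, c, d, f and the scale position x1 of data value zero."""
    w_n = w / (m + a)            # linear width as a fraction of the scale
    x2 = a / (m + a)
    x1 = x2 + w_n                # scale position of data value zero
    x0 = x2 + 2.0 * w_n
    b = (m + a) * math.log(10.0)

    # Solve 2*(ln d - ln b) + w_n*(b + d) = 0 for d in (0, b) by bisection.
    lo, hi = 1e-12, b
    for _ in range(200):
        d = 0.5 * (lo + hi)
        if 2.0 * (math.log(d) - math.log(b)) + w_n * (b + d) < 0.0:
            lo = d
        else:
            hi = d
    d = 0.5 * (lo + hi)

    c_a = math.exp(x0 * (b + d))
    mf_a = math.exp(b * x1) - c_a / math.exp(d * x1)
    a_coef = t / ((math.exp(b) - mf_a) - c_a / math.exp(d))
    return {"a": a_coef, "b": b, "c": c_a * a_coef, "d": d,
            "f": mf_a * a_coef, "x1": x1}


def biexponential(y: float, k: dict) -> float:
    """B(y) = a*e^(b*y) - c*e^(-d*y) - f."""
    return k["a"] * math.exp(k["b"] * y) - k["c"] * math.exp(-k["d"] * y) - k["f"]


coef = logicle_coefficients()
at_zero = biexponential(coef["x1"], coef)   # data value at scale position x1
at_top = biexponential(1.0, coef)           # data value at scale position 1
```

Under these assumptions B is zero at x1 and equals t at scale position 1, so the Logicle transform (the inverse of B) maps data value 0 to x1 and the top of scale to 1.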

class cytopy.flow.transform.Normalise(norm: str = 'l2', axis: int = 1)

Normalise data using the Scikit-Learn normalize function (https://bit.ly/2YBfe3o). Norms are stored in the attribute ‘norms’, and normalisation is reversed by passing the transformed data to the ‘inverse’ method.

Parameters
  • norm (str (default='l2')) – Norm used to normalise the data; passed to the Scikit-Learn normalize function, which accepts ‘l1’, ‘l2’ or ‘max’

  • axis (int (default=1)) – Axis to apply normalisation along. If 1, independently normalise each sample; otherwise (if 0), normalise each feature.

Methods:

inverse(data, features)

Perform inverse of normalisation to given dataframe.

inverse(data: pandas.core.frame.DataFrame, features: list)

Perform inverse of normalisation to given dataframe. Returns copy of DataFrame with inverse normalisation applied to chosen columns (features)

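Parameters
  • data (Pandas.DataFrame) – DataFrame containing the previously normalised features

  • features (List) – Names of the columns to apply the inverse normalisation to

Returns

Copy of the DataFrame with inverse normalisation applied to the chosen features

Return type

Pandas.DataFrame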

Raises
  • AssertionError – If the shape of the given dataframe does not match the data

  • ValueError – If inverse is called prior to normalisation
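The idea of storing norms so that normalisation can be reversed can be sketched with NumPy alone. The example below covers only the ‘l2’ norm with axis=1 and does not call Scikit-Learn; the variable names are hypothetical and the real class operates on DataFrame columns.

```python
import numpy as np

x = np.array([[3.0, 4.0],
              [6.0, 8.0]])

# L2 norm of each row; axis=1 normalises each sample independently.
norms = np.linalg.norm(x, ord=2, axis=1)

# Divide each row by its norm so every sample has unit length.
normalised = x / norms[:, np.newaxis]

# Reversing the operation needs the stored norms and data of the same shape.
restored = normalised * norms[:, np.newaxis]
```

This also shows why an inverse cannot be computed before normalisation has run, or on data of a different shape: the stored norms would either not exist or not line up with the rows.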

class cytopy.flow.transform.Scaler(method: str = 'standard', **kwargs)

Utility object for applying Scikit-Learn transformers to a chosen dataset. The following transformations are supported, listed as method name and corresponding Scikit-Learn class:

  • “standard” - sklearn.preprocessing.StandardScaler

  • “minmax” - sklearn.preprocessing.MinMaxScaler

  • “robust” - sklearn.preprocessing.RobustScaler

  • “maxabs” - sklearn.preprocessing.MaxAbsScaler

  • “quantile” - sklearn.preprocessing.QuantileTransformer

  • “yeo_johnson” - sklearn.preprocessing.PowerTransformer

  • “box_cox” - sklearn.preprocessing.PowerTransformer

(PowerTransformer method argument will be ‘yeo-johnson’ or ‘box-cox’ according to the chosen method)

See relevant Scikit-Learn documentation for guidance on a particular method: https://scikit-learn.org/stable/modules/classes.html#module-sklearn.preprocessing

User should initialise object with ‘method’ according to the above and provide any additional keyword arguments, relevant to the chosen object, as kwargs.

method

Name of scikit-learn transformer to use

Type

str

Methods:

inverse(data, features)

Given a dataframe and a list of columns (features) that have previously been transformed, apply the inverse transform.

set_params(**kwargs)

Sets parameters of underlying Scikit-Learn method

inverse(data: pandas.core.frame.DataFrame, features: list)

Given a dataframe and a list of columns (features) that have previously been transformed, apply the inverse transform. Returns a copy of the DataFrame with the transformation reversed.

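Parameters
  • data (Pandas.DataFrame) – DataFrame containing the previously transformed features

  • features (List) – Names of the columns to inverse transform

Returns

Copy of the DataFrame with the transformation reversed

Return type

Pandas.DataFrame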

Raises

TransformError – If the chosen Scikit-Learn method does not support inverse transform

set_params(**kwargs)

Sets parameters of underlying Scikit-Learn method

Parameters

kwargs – Additional keyword arguments passed to ‘set_params’ call

Returns

Return type

None
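The pattern Scaler describes, choosing a Scikit-Learn class by name, fitting it to selected columns and later reversing it, can be sketched directly against Scikit-Learn. The SCALERS mapping below is a hypothetical subset written for this example and the column names are invented; this is not cytopy's internal code.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler, PowerTransformer, StandardScaler

# Hypothetical subset of the name-to-class mapping listed above.
SCALERS = {
    "standard": lambda **kw: StandardScaler(**kw),
    "minmax": lambda **kw: MinMaxScaler(**kw),
    "yeo_johnson": lambda **kw: PowerTransformer(method="yeo-johnson", **kw),
}

df = pd.DataFrame({
    "CD3": [1.0, 2.0, 3.0],
    "CD4": [10.0, 20.0, 30.0],
    "label": ["a", "b", "c"],
})
features = ["CD3", "CD4"]

# Fit the chosen scaler on the selected columns only.
scaler = SCALERS["minmax"]()
scaled = df.copy()
scaled[features] = scaler.fit_transform(df[features])

# Reverse the scaling with the same fitted object.
restored = scaled.copy()
restored[features] = scaler.inverse_transform(scaled[features])
```

Working on copies keeps the original DataFrame untouched, and columns outside the feature list (here, label) pass through unchanged.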

exception cytopy.flow.transform.TransformError

class cytopy.flow.transform.Transformer(transform_function: callable, inverse_function: callable, **kwargs)

Base class for Transformer object.

Parameters
  • transform_function (callable) – Transform function

  • inverse_function (callable) – Inverse transformation function

  • kwargs – Keyword arguments passed to transform/inverse transform

transform
Type

callable

inverse
Type

callable

kwargs
Type

dict

Methods:

inverse_scale(data, features)

Apply inverse scale to features (columns) of given dataframe, under the assumption that these features have previously been transformed with this Transformer

scale(data, features)

Scale features (columns) of given dataframe

inverse_scale(data: pandas.core.frame.DataFrame, features: list)

Apply inverse scale to features (columns) of given dataframe, under the assumption that these features have previously been transformed with this Transformer

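Parameters
  • data (Pandas.DataFrame) – DataFrame containing the previously transformed features

  • features (list) – Names of the columns to inverse transform

Returns

DataFrame with the inverse transform applied to the chosen features

Return type

Pandas.DataFrame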

Raises

TransformError – Chosen inverse transform function is missing the arguments channel_indices or channels. cytopy uses the FlowUtils class for transformations. See FlowUtils documentation for details.

scale(data: pandas.core.frame.DataFrame, features: list)

Scale features (columns) of given dataframe

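Parameters
  • data (Pandas.DataFrame) – DataFrame containing the features to transform

  • features (list) – Names of the columns to transform

Returns

DataFrame with the chosen features scaled

Return type

Pandas.DataFrame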

Raises

TransformError – Chosen transform function is missing the arguments channel_indices or channels. cytopy uses the FlowUtils class for transformations. See FlowUtils documentation for details.
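The base class pairs a transform function with its inverse and applies either one to selected DataFrame columns. A minimal sketch of that pattern follows. SketchTransformer and the two cofactor functions are hypothetical stand-ins written for this example; the real FlowUtils functions expect channel indices, which this sketch does not model.

```python
from typing import Callable, List

import numpy as np
import pandas as pd


class SketchTransformer:
    """Hypothetical stand-in for a base transformer; not cytopy's class."""

    def __init__(self, transform_function: Callable, inverse_function: Callable, **kwargs):
        self.transform = transform_function
        self.inverse = inverse_function
        self.kwargs = kwargs

    def scale(self, data: pd.DataFrame, features: List[str]) -> pd.DataFrame:
        out = data.copy()
        out[features] = self.transform(data[features].to_numpy(dtype=float), **self.kwargs)
        return out

    def inverse_scale(self, data: pd.DataFrame, features: List[str]) -> pd.DataFrame:
        out = data.copy()
        out[features] = self.inverse(data[features].to_numpy(dtype=float), **self.kwargs)
        return out


def asinh_cofactor(values: np.ndarray, cofactor: float = 5.0) -> np.ndarray:
    return np.arcsinh(values / cofactor)


def sinh_cofactor(values: np.ndarray, cofactor: float = 5.0) -> np.ndarray:
    return np.sinh(values) * cofactor


df = pd.DataFrame({"CD3": [-10.0, 0.0, 150.0], "CD4": [5.0, 50.0, 500.0]})
tf = SketchTransformer(asinh_cofactor, sinh_cofactor, cofactor=5.0)
scaled = tf.scale(df, ["CD3", "CD4"])
restored = tf.inverse_scale(scaled, ["CD3", "CD4"])
```

Storing the keyword arguments on the object means the inverse is always called with the same parameters as the forward transform, which is what makes the round trip reliable.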

cytopy.flow.transform.apply_transform(data: pandas.core.frame.DataFrame, features: list, method: str = 'logicle', return_transformer: bool = False, **kwargs)

Apply a transformation to the given DataFrame and the chosen columns (features). Transformation method is specified using the ‘method’ argument and should be one of:

  • logicle: see cytopy.flow.transform.LogicleTransformer

  • hyperlog: see cytopy.flow.transform.HyperlogTransformer

  • asinh: see cytopy.flow.transform.AsinhTransformer

  • log: see cytopy.flow.transform.LogTransformer

Parameters
  • data (Pandas.DataFrame) – DataFrame containing the features to transform

  • features (List or dict) – Column names to be transformed

  • method (str (default='logicle')) – Transformation method

  • return_transformer (bool (default=False)) – If True, Transformer object is also returned

  • kwargs – Additional keyword arguments passed to respective Transformer

Returns

Copy of the DataFrame with chosen features transformed and Transformer object if return_transformer is True

Return type

Pandas.DataFrame or (Pandas.DataFrame and Transformer)

Raises

TransformError – Raised if invalid transform method requested

cytopy.flow.transform.apply_transform_map(data: pandas.core.frame.DataFrame, feature_method: dict, kwargs: Optional[dict] = None)

Wrapper around cytopy.flow.transform.apply_transform; takes a dictionary (feature_method) in which each key is the name of a feature and each value is the transform to apply to that feature.

Parameters
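  • data (Pandas.DataFrame) – DataFrame containing the features to transform

  • feature_method (dict) – Mapping of feature name to the transform to apply to that feature

  • kwargs (dict) – Additional keyword arguments passed to apply_transform

Returns

DataFrame with features transformed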

Return type

Pandas.DataFrame
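One way to picture a per-feature transform map is to group the features by method and apply each method once to its group. The sketch below does this with NumPy stand-ins for the transforms; METHODS and both function names are hypothetical, and whether cytopy groups features this way internally is an assumption.

```python
from collections import defaultdict

import numpy as np
import pandas as pd

# Hypothetical stand-ins for the real transforms, keyed by method name.
METHODS = {
    "asinh": np.arcsinh,
    "log": np.log10,
}


def apply_transform_sketch(data: pd.DataFrame, features: list, method: str) -> pd.DataFrame:
    """Apply one named transform to the chosen columns of a copy of data."""
    if method not in METHODS:
        raise ValueError(f"Invalid transform method: {method}")
    out = data.copy()
    out[features] = METHODS[method](out[features].to_numpy(dtype=float))
    return out


def apply_transform_map_sketch(data: pd.DataFrame, feature_method: dict) -> pd.DataFrame:
    """Group features by method, then apply each method once to its group."""
    grouped = defaultdict(list)
    for feature, method in feature_method.items():
        grouped[method].append(feature)
    out = data.copy()
    for method, features in grouped.items():
        out = apply_transform_sketch(out, features, method)
    return out


df = pd.DataFrame({"CD3": [0.0, 10.0], "CD4": [1.0, 100.0], "FSC-A": [5.0, 7.0]})
result = apply_transform_map_sketch(df, {"CD3": "asinh", "CD4": "log"})
```

Features that do not appear in the dictionary (FSC-A here) are left untransformed, which is convenient when scatter channels should stay on a linear scale.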

cytopy.flow.transform.remove_negative_values(data: pandas.core.frame.DataFrame, features: list)

For each feature (as given in ‘features’) in data, check for negative values and replace them with the minimum value in range for that feature.

Parameters
  • data (Pandas.DataFrame) – DataFrame containing the features to check

  • features (list) – Names of the columns to check for negative values

Returns

Modified DataFrame without negative values

Return type

Pandas.DataFrame

Raises

TransformError – All values for a given feature are negative

cytopy.flow.transform.safe_range(data: pandas.core.frame.DataFrame, x: str)

Return the minimum and maximum values in a range, ignoring negative values.

Parameters
  • data (Pandas.DataFrame) – DataFrame containing the column of interest

  • x (str) – Column of interest

Returns

Min, max

Return type

(float, float)

Raises

AssertionError – If all values in x are negative
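The behaviour described for safe_range and remove_negative_values can be sketched with pandas as follows. This is one plausible reading of the descriptions, not the cytopy source: the function names are hypothetical, zero is assumed to count as non-negative, and the sketch raises AssertionError in both functions where the real remove_negative_values raises TransformError.

```python
import pandas as pd


def safe_range_sketch(data: pd.DataFrame, x: str) -> tuple:
    """Return (min, max) of column x, ignoring negative values."""
    valid = data.loc[data[x] >= 0, x]
    assert len(valid) > 0, f"All values in {x} are negative"
    return float(valid.min()), float(valid.max())


def remove_negative_values_sketch(data: pd.DataFrame, features: list) -> pd.DataFrame:
    """Replace negative values in each feature with that feature's safe minimum."""
    out = data.copy()
    for feature in features:
        lowest, _ = safe_range_sketch(out, feature)
        out.loc[out[feature] < 0, feature] = lowest
    return out


df = pd.DataFrame({"CD3": [-5.0, 2.0, 8.0], "CD4": [1.0, -1.0, 3.0]})
low, high = safe_range_sketch(df, "CD3")
cleaned = remove_negative_values_sketch(df, ["CD3", "CD4"])
```

Clamping negatives to the smallest non-negative value keeps every event in the dataset while making the column safe for transforms that are undefined below zero.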