cytopy.flow.transform
Cytometry data must be transformed prior to analysis. Several transformation techniques exist, the most popular being the biexponential transform. cytopy provides multiple methods through the FlowUtils package (https://github.com/whitews/FlowUtils), including the Logicle transform, a modified version of the biexponential transform.
Copyright 2020 Ross Burton
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Classes:
AsinhTransformer | Implementation of inverse hyperbolic sine function, authored by Scott White (FlowUtils v0.8).
HyperlogTransformer | Implementation of the Hyperlog transform, authored by Scott White (FlowUtils v0.8).
LogTransformer | Apply log transform to data, either using the parametrized log transform as defined in the GatingML 2.0 specification (implemented by Scott White in FlowUtils v0.8) or using natural log, base 2 or base 10.
LogicleTransformer | Implementation of the Logicle transform, authored by Scott White (FlowUtils v0.8).
Normalise | Normalise data using the Scikit-Learn normalize function (https://bit.ly/2YBfe3o). Norms are stored in the attribute ‘norms’ and normalisation is reversed by passing the transformed data to the ‘inverse’ method.
Scaler | Utility object for applying Scikit-Learn transformers to a chosen dataset.
Transformer | Base class for Transformer object.
Exceptions:
TransformError
Functions:
apply_transform | Apply a transformation to the given DataFrame and the chosen columns (features).
apply_transform_map | Wrapper function to cytopy.flow.transform.apply_transform; takes a dictionary (feature_method) where each key is the name of a feature and the value is the transform to be applied to that feature.
remove_negative_values | For each feature (as given in ‘features’) in data, check for negative values and replace them with the minimum value in range for that feature.
safe_range | Return the minimum and maximum values in a range, ignoring negative values.
-
class cytopy.flow.transform.AsinhTransformer(m: float = 4.5, a: float = 0.0, t: int = 262144)
Implementation of inverse hyperbolic sine function, authored by Scott White (FlowUtils v0.8).
- Parameters
m (float (default=4.5)) – Number of decades the true logarithmic scale approaches at the high end of the scale
a (float (default=0)) – Additional number of negative decades
t (int (default=262144)) – Top of the linear scale
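The roles of m, a and t can be illustrated with the parametrized inverse hyperbolic sine as defined in the GatingML 2.0 specification. The following is a minimal sketch using only the standard library, not the FlowUtils code itself; the function names asinh_scale and asinh_inverse are hypothetical.

```python
import math


def asinh_scale(x, t=262144, m=4.5, a=0.0):
    """Parametrized asinh as given in GatingML 2.0 (illustrative sketch).

    The top of scale t maps to 1.0 and, when a == 0, zero maps to 0.0.
    """
    ln10 = math.log(10)
    return (math.asinh(x * math.sinh(m * ln10) / t) + a * ln10) / ((m + a) * ln10)


def asinh_inverse(y, t=262144, m=4.5, a=0.0):
    """Algebraic inverse of asinh_scale."""
    ln10 = math.log(10)
    return t * math.sinh(y * (m + a) * ln10 - a * ln10) / math.sinh(m * ln10)
```

Unlike a plain logarithm, this function is defined for negative inputs, which is why it suits compensated cytometry data that can fall below zero.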
-
class cytopy.flow.transform.HyperlogTransformer(w: float = 0.5, m: float = 4.5, a: float = 0.0, t: int = 262144)
Implementation of the Hyperlog transform, authored by Scott White (FlowUtils v0.8).
Hyperlog transformation, implemented as defined in the GatingML 2.0 specification:
hyperlog(x, T, W, M, A) = root(EH(y, T, W, M, A) − x)
where EH is defined as:
EH(y, T, W, M, A) = ae^(by) + cy − f
The Hyperlog transformation was originally defined in the publication: Bagwell CB. Hyperlog-a flexible log-like transform for negative, zero, and positive valued data. Cytometry A., 2005:64(1):34–42.
- Parameters
w (float (default=0.5)) – Approximate number of decades in the linear region
m (float (default=4.5)) – Number of decades the true logarithmic scale approaches at the high end of the scale
a (float (default=0)) – Additional number of negative decades
t (int (default=262144)) – Top of the linear scale
-
class cytopy.flow.transform.LogTransformer(base: str = 'parametrized', m: float = 4.5, t: int = 262144, **kwargs)
Apply log transform to data, either using the parametrized log transform as defined in the GatingML 2.0 specification (implemented by Scott White in FlowUtils v0.8) or using natural log, base 2 or base 10.
- Parameters
base (str or int (default="parametrized")) – Method to be used, should either be ‘parametrized’, 10, 2, or ‘natural’
m (float (default=4.5)) – Number of decades the true logarithmic scale approaches at the high end of the scale
t (int (default=262144)) – Top of the linear scale
- Raises
TransformError – Invalid LogTransformer method
Methods:
inverse_scale(data, features) | Apply inverse of log transform to features (columns) of given dataframe, under the assumption that these features have previously been transformed with LogTransformer
scale(data, features) | Scale features (columns) of given dataframe using log transform
-
inverse_scale(data: pandas.core.frame.DataFrame, features: list)
Apply inverse of log transform to features (columns) of given dataframe, under the assumption that these features have previously been transformed with LogTransformer
- Parameters
data (Pandas.DataFrame) –
features (list) –
- Returns
- Return type
Pandas.DataFrame
-
scale(data: pandas.core.frame.DataFrame, features: list)
Scale features (columns) of given dataframe using log transform
- Parameters
data (Pandas.DataFrame) –
features (list) –
- Returns
- Return type
Pandas.DataFrame
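For the ‘parametrized’ option of LogTransformer, the GatingML 2.0 specification defines the log transform as flog(x, T, M) = (1/M) · log10(x/T) + 1. A minimal sketch of that formula and its inverse follows; the function names are hypothetical and this is not the FlowUtils implementation.

```python
import math


def log_scale(x, t=262144, m=4.5):
    """GatingML 2.0 parametrized log (illustrative sketch).

    x == t maps to 1.0; a value m decades below t maps to 0.0.
    """
    if x <= 0:
        raise ValueError("The log transform is only defined for positive values")
    return math.log10(x / t) / m + 1.0


def log_inverse(y, t=262144, m=4.5):
    """Algebraic inverse of log_scale."""
    return t * 10 ** (m * (y - 1.0))
```

Because the transform is undefined at zero and below, negative events need handling before it is applied, for example with remove_negative_values.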
-
class cytopy.flow.transform.LogicleTransformer(w: float = 0.5, m: float = 4.5, a: float = 0.0, t: int = 262144)
Implementation of the Logicle transform, authored by Scott White (FlowUtils v0.8). Logicle transformation, implemented as defined in the GatingML 2.0 specification:
logicle(x, T, W, M, A) = root(B(y, T, W, M, A) − x)
where B is a modified bi-exponential function defined as:
B(y, T, W, M, A) = ae^(by) − ce^(−dy) − f
The Logicle transformation was originally defined in the publication: Moore WA and Parks DR. Update for the logicle data scale including operational code implementations. Cytometry A., 2012:81A(4):273–277.
- Parameters
w (float (default=0.5)) – Approximate number of decades in the linear region
m (float (default=4.5)) – Number of decades the true logarithmic scale approaches at the high end of the scale
a (float (default=0)) – Additional number of negative decades
t (int (default=262144)) – Top of the linear scale
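The definition logicle(x) = root(B(y) − x) means that the scaled value y is found numerically as the point where the biexponential B equals the raw value x. The sketch below illustrates that idea with plain bisection. It assumes the coefficient parameterisation used in the reference code accompanying Moore and Parks (2012), as far as it is recalled here, and every function name is hypothetical; FlowUtils uses its own implementation, which should be preferred for real analysis.

```python
import math


def logicle_coefficients(t=262144, w=0.5, m=4.5, a=0.0):
    """Coefficients (a, b, c, d, f) of B(y) = a*e^(b*y) - c*e^(-d*y) - f.

    Assumed parameterisation: B(1) = t and B(x1) = 0, where x1 is the
    scale position of data zero.
    """
    w_n = w / (m + a)   # width of the linear region on the 0-1 scale
    x2 = a / (m + a)
    x1 = x2 + w_n       # scale position where B(y) = 0
    x0 = x2 + 2 * w_n
    b = (m + a) * math.log(10)
    # Solve 2*(ln d - ln b) + w_n*(b + d) = 0 for d by bisection on (0, b].
    lo, hi = 1e-12, b
    for _ in range(200):
        mid = (lo + hi) / 2
        if 2 * (math.log(mid) - math.log(b)) + w_n * (b + mid) > 0:
            hi = mid
        else:
            lo = mid
    d = (lo + hi) / 2
    c_a = math.exp(x0 * (b + d))
    f_a = math.exp(b * x1) - c_a * math.exp(-d * x1)
    coef_a = t / (math.exp(b) - c_a * math.exp(-d) - f_a)
    return coef_a, b, c_a * coef_a, d, f_a * coef_a


def biexponential(y, coef):
    """Evaluate B(y) for coefficients from logicle_coefficients."""
    a, b, c, d, f = coef
    return a * math.exp(b * y) - c * math.exp(-d * y) - f


def logicle_scale(x, t=262144, w=0.5, m=4.5, a=0.0):
    """Find y such that B(y) = x; B is increasing, so bisection suffices."""
    coef = logicle_coefficients(t, w, m, a)
    lo, hi = -1.0, 2.0
    while biexponential(lo, coef) > x:   # widen the bracket if needed
        lo -= 1.0
    while biexponential(hi, coef) < x:
        hi += 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if biexponential(mid, coef) < x:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

Under these assumptions the top of scale t maps to 1 and data zero maps to (a + w) / (m + a), so with the defaults zero sits at roughly 0.11 on the display scale and negative values fall below it.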
-
class cytopy.flow.transform.Normalise(norm: str = 'l2', axis: int = 1)
Normalise data using the Scikit-Learn normalize function (https://bit.ly/2YBfe3o). Norms are stored in the attribute ‘norms’ and normalisation is reversed by passing the transformed data to the ‘inverse’ method.
- Parameters
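norm (str (default='l2')) – Norm used to normalise each sample (or each feature if axis is 0); passed to the Scikit-Learn normalize function, which accepts ‘l1’, ‘l2’ or ‘max’
axis (int (default=1)) – Axis to apply normalisation along. If 1, independently normalize each sample, otherwise (if 0) normalize each feature.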
Methods:
inverse(data, features) | Perform inverse of normalisation to given dataframe.
-
inverse(data: pandas.core.frame.DataFrame, features: list)
Perform inverse of normalisation to given dataframe. Returns copy of DataFrame with inverse normalisation applied to chosen columns (features)
- Parameters
data (Pandas.DataFrame) –
features (List) –
- Returns
- Return type
Pandas.DataFrame
- Raises
AssertionError – Shape of given dataframe does not match the data
ValueError – Inverse called prior to normalisation
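Storing the norms is what makes the normalisation reversible: each row is divided by its norm going forward and multiplied by the same norm going back. The sketch below shows this for the ‘l2’ norm with axis=1 on plain nested lists. RowNormaliser is a hypothetical stand-in, not the cytopy class, and it does not call Scikit-Learn.

```python
import math


class RowNormaliser:
    """Hypothetical stand-in for Normalise(norm='l2', axis=1)."""

    def __init__(self):
        self.norms = None  # filled in on the first call

    def __call__(self, rows):
        # One Euclidean norm per row (sample).
        self.norms = [math.sqrt(sum(v * v for v in row)) for row in rows]
        # Rows with a zero norm are left as zeros rather than divided.
        return [[v / n if n else 0.0 for v in row]
                for row, n in zip(rows, self.norms)]

    def inverse(self, rows):
        if self.norms is None:
            raise ValueError("Inverse called prior to normalisation")
        assert len(rows) == len(self.norms), "Shape does not match the data"
        return [[v * n for v in row] for row, n in zip(rows, self.norms)]
```

The two checks in inverse mirror the errors documented above: calling it before normalising, or with data of a different shape, cannot be undone meaningfully.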
-
class cytopy.flow.transform.Scaler(method: str = 'standard', **kwargs)
Utility object for applying Scikit-Learn transformers to a chosen dataset. The following transformations are supported, listed as method name and corresponding Scikit-Learn class:
“standard” - sklearn.preprocessing.StandardScaler
“minmax” - sklearn.preprocessing.MinMaxScaler
“robust” - sklearn.preprocessing.RobustScaler
“maxabs” - sklearn.preprocessing.MaxAbsScaler
“quantile” - sklearn.preprocessing.QuantileTransformer
“yeo_johnson” - sklearn.preprocessing.PowerTransformer
“box_cox” - sklearn.preprocessing.PowerTransformer
(PowerTransformer method argument will be ‘yeo-johnson’ or ‘box-cox’ according to the chosen method)
See relevant Scikit-Learn documentation for guidance on a particular method: https://scikit-learn.org/stable/modules/classes.html#module-sklearn.preprocessing
User should initialise object with ‘method’ according to the above and provide any additional keyword arguments, relevant to the chosen object, as kwargs.
-
method
Name of Scikit-Learn transformer to use
- Type
str
Methods:
inverse(data, features) | Given dataframe and a list of columns (features) that has been previously transformed, apply inverse transform.
set_params(**kwargs) | Sets parameters of underlying Scikit-Learn method
-
inverse(data: pandas.core.frame.DataFrame, features: list)
Given dataframe and a list of columns (features) that has been previously transformed, apply inverse transform. Returns copy of DataFrame with transformation reversed.
- Parameters
data (Pandas.DataFrame) –
features (List) –
- Returns
- Return type
Pandas.DataFrame
- Raises
TransformError – If the chosen Scikit-Learn method does not support inverse transform
-
set_params(**kwargs)
Sets parameters of underlying Scikit-Learn method
- Parameters
kwargs – Additional keyword arguments passed to ‘set_params’ call
- Returns
- Return type
None
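The mapping from Scaler method names to Scikit-Learn classes listed above can be pictured as a lookup table. The sketch below is an assumption about how such a lookup might be organised, not cytopy's actual code; resolve_scaler and build_scaler are hypothetical names, and Scikit-Learn is only imported when build_scaler is called.

```python
# Method name -> (class name in sklearn.preprocessing, fixed keyword arguments)
SCALER_METHODS = {
    "standard": ("StandardScaler", {}),
    "minmax": ("MinMaxScaler", {}),
    "robust": ("RobustScaler", {}),
    "maxabs": ("MaxAbsScaler", {}),
    "quantile": ("QuantileTransformer", {}),
    "yeo_johnson": ("PowerTransformer", {"method": "yeo-johnson"}),
    "box_cox": ("PowerTransformer", {"method": "box-cox"}),
}


def resolve_scaler(method="standard", **kwargs):
    """Return the class name and merged keyword arguments for a method."""
    if method not in SCALER_METHODS:
        raise ValueError(f"Unknown method {method!r}; "
                         f"expected one of {sorted(SCALER_METHODS)}")
    class_name, fixed = SCALER_METHODS[method]
    # Fixed arguments win, so 'box_cox' always yields method='box-cox'.
    return class_name, {**kwargs, **fixed}


def build_scaler(method="standard", **kwargs):
    """Instantiate the Scikit-Learn object (requires scikit-learn)."""
    import sklearn.preprocessing as preprocessing  # imported lazily

    class_name, params = resolve_scaler(method, **kwargs)
    return getattr(preprocessing, class_name)(**params)
```

For example, resolve_scaler("box_cox", standardize=False) yields the PowerTransformer class name together with both the user's keyword argument and the fixed ‘box-cox’ method.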
-
exception cytopy.flow.transform.TransformError
-
class cytopy.flow.transform.Transformer(transform_function: callable, inverse_function: callable, **kwargs)
Base class for Transformer object.
- Parameters
transform_function (callable) – Transform function
inverse_function (callable) – Inverse transformation function
kwargs – Keyword arguments passed to transform/inverse transform
-
transform
- Type
callable
-
inverse
- Type
callable
-
kwargs
- Type
dict
Methods:
inverse_scale(data, features) | Apply inverse scale to features (columns) of given dataframe, under the assumption that these features have previously been transformed with this Transformer
scale(data, features) | Scale features (columns) of given dataframe
-
inverse_scale(data: pandas.core.frame.DataFrame, features: list)
Apply inverse scale to features (columns) of given dataframe, under the assumption that these features have previously been transformed with this Transformer
- Parameters
data (Pandas.DataFrame) –
features (list) –
- Returns
- Return type
Pandas.DataFrame
- Raises
TransformError – Chosen inverse transform function is missing the arguments channel_indices or channels. cytopy uses the FlowUtils package for transformations. See FlowUtils documentation for details.
-
scale(data: pandas.core.frame.DataFrame, features: list)
Scale features (columns) of given dataframe
- Parameters
data (Pandas.DataFrame) –
features (list) –
- Returns
- Return type
Pandas.DataFrame
- Raises
TransformError – Chosen transform function is missing the arguments channel_indices or channels. cytopy uses the FlowUtils package for transformations. See FlowUtils documentation for details.
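The Transformer pattern described here pairs a forward function with its inverse, stores shared keyword arguments, and applies either function to selected columns of a copy of the data. The sketch below is a deliberate simplification: it uses a dict of lists instead of a DataFrame, and its callables take one value at a time rather than the channel_indices argument that FlowUtils functions expect. All names are hypothetical.

```python
import math


class SimpleTransformer:
    """Hypothetical, simplified illustration of the Transformer pattern."""

    def __init__(self, transform_function, inverse_function, **kwargs):
        self.transform = transform_function
        self.inverse = inverse_function
        self.kwargs = kwargs  # passed to both functions

    def _apply(self, func, data, features):
        # Work on a copy so the caller's data is never modified.
        out = {name: list(values) for name, values in data.items()}
        for name in features:
            out[name] = [func(v, **self.kwargs) for v in out[name]]
        return out

    def scale(self, data, features):
        return self._apply(self.transform, data, features)

    def inverse_scale(self, data, features):
        return self._apply(self.inverse, data, features)


def asinh_cofactor(value, cofactor):
    """Forward function for the example: asinh with a cofactor."""
    return math.asinh(value / cofactor)


def sinh_cofactor(value, cofactor):
    """Inverse of asinh_cofactor."""
    return math.sinh(value) * cofactor
```

A subclass in this style would only need to supply the two functions and default keyword arguments; scale and inverse_scale stay the same for every transform.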
-
cytopy.flow.transform.apply_transform(data: pandas.core.frame.DataFrame, features: list, method: str = 'logicle', return_transformer: bool = False, **kwargs)
Apply a transformation to the given DataFrame and the chosen columns (features). The transformation method is specified using the ‘method’ argument and should be one of:
logicle: see cytopy.flow.transform.LogicleTransformer
hyperlog: see cytopy.flow.transform.HyperlogTransformer
asinh: see cytopy.flow.transform.AsinhTransformer
log: see cytopy.flow.transform.LogTransformer
- Parameters
data (Pandas.DataFrame) –
features (List or dict) – Column names to be transformed
method (str (default='logicle')) – Transformation method
return_transformer (bool (default=False)) – If True, Transformer object is also returned
kwargs – Additional keyword arguments passed to respective Transformer
- Returns
Copy of the DataFrame with chosen features transformed and Transformer object if return_transformer is True
- Return type
Pandas.DataFrame or (Pandas.DataFrame and Transformer)
- Raises
TransformError – Raised if invalid transform method requested
-
cytopy.flow.transform.apply_transform_map(data: pandas.core.frame.DataFrame, feature_method: dict, kwargs: Optional[dict] = None)
Wrapper function to cytopy.flow.transform.apply_transform; takes a dictionary (feature_method) where each key is the name of a feature and the value is the transform to be applied to that feature.
- Parameters
data (Pandas.DataFrame) –
feature_method (dict) –
kwargs (dict) – Additional keyword arguments passed to apply_transform
- Returns
DataFrame with features transformed
- Return type
Pandas.DataFrame
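A feature_method dictionary such as {"CD3": "logicle", "FSC-A": "log"} assigns one transform per feature. One plausible way to process such a map, sketched below, is to group the features by method and apply each transform once per group. Whether cytopy groups features this way internally is an assumption; the function names, the dict-of-lists data and the ‘transforms’ lookup are all hypothetical.

```python
import math
from collections import defaultdict


def group_features_by_method(feature_method):
    """Invert {feature: method} into {method: [features]}."""
    groups = defaultdict(list)
    for feature, method in feature_method.items():
        groups[method].append(feature)
    return dict(groups)


def apply_map(data, feature_method, transforms):
    """Apply transforms[method] to every feature mapped to that method.

    data is a dict of column name -> list of floats (a stand-in for a
    DataFrame); transforms maps a method name to a one-argument function.
    """
    out = {name: list(values) for name, values in data.items()}  # copy
    for method, features in group_features_by_method(feature_method).items():
        func = transforms[method]
        for feature in features:
            out[feature] = [func(v) for v in out[feature]]
    return out
```

Columns that do not appear in the map are copied through unchanged, which matches the documented behaviour of transforming only the chosen features.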
-
cytopy.flow.transform.remove_negative_values(data: pandas.core.frame.DataFrame, features: list)
For each feature (as given in ‘features’) in data, check for negative values and replace them with the minimum value in range for that feature.
- Parameters
data (Pandas.DataFrame) –
features (list) –
- Returns
Modified DataFrame without negative values
- Return type
Pandas.DataFrame
- Raises
TransformError – All values for a given feature are negative
-
cytopy.flow.transform.safe_range(data: pandas.core.frame.DataFrame, x: str)
Return the minimum and maximum values in a range, ignoring negative values
- Parameters
data (Pandas.DataFrame) –
x (str) – Column of interest
- Returns
Min, max
- Return type
(float, float)
- Raises
AssertionError – If all values in x are negative
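The behaviour described for safe_range and remove_negative_values can be sketched on plain lists as follows. This is an illustration under two assumptions: zero is treated as non-negative, and a ValueError stands in for the AssertionError and TransformError that the cytopy functions are documented to raise. The function names carry a _sketch suffix to mark them as hypothetical.

```python
def safe_range_sketch(values):
    """Return (min, max) of the non-negative values in a list."""
    non_negative = [v for v in values if v >= 0]
    if not non_negative:
        raise ValueError("All values are negative")
    return min(non_negative), max(non_negative)


def remove_negative_values_sketch(data, features):
    """Replace negatives in the chosen columns with the smallest
    non-negative value of that column; returns a modified copy."""
    out = {name: list(values) for name, values in data.items()}
    for name in features:
        lowest, _ = safe_range_sketch(out[name])
        out[name] = [lowest if v < 0 else v for v in out[name]]
    return out
```

Clamping negatives in this way makes a column safe for transforms that are undefined at or below zero, at the cost of piling the affected events onto the lowest observed value.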