cytopy.data.experiment¶

The experiment module houses the Experiment class, used to define cytometry-based experiments that can consist of one or more biological specimens. An experiment should be defined for each cytometry staining panel used in your analysis, and the single cell data (contained in *.fcs files) added to the experiment using the ‘add_new_sample’ method. Experiments should be created using the Project class (see cytopy.data.project). All functionality for Experiments and Panels is housed within this module.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Classes:

`Experiment`(*args, **values)	Container for Cytometry experiment.
`NormalisedName`(*args, **kwargs)	Defines a standardised name for a channel or marker and provides a method for testing whether a channel/marker should be associated with that standard.
`Panel`(*args, **kwargs)	Document representation of channel/marker definition for an experiment.

Functions:

`check_duplication`(x)	Internal method.
`check_excel_template`(path)	Check excel template and if valid return pandas dataframes
`check_pairing`(channel_marker, ref_mappings)	Internal method.
`compenstate`(x, spill_matrix)	Compensate the given data, x, using the spillover matrix by solving for their linear combination.
`duplicate_mappings`(mappings)	Check for duplicates in a list of dictionaries describing channel/marker mappings.
`load_control_population_from_experiment`(…)	Load Population from a given control from samples in the given Experiment and generate a standard exploration dataframe that contains the columns ‘sample_id’, ‘subject_id’, and initialises additional columns with null values if specified (additional_columns).
`load_population_data_from_experiment`(…[, …])	Load Population from samples in the given Experiment and generate a standard exploration dataframe that contains the columns ‘sample_id’, ‘subject_id’, ‘meta_label’ and initialises additional columns with null values if specified (additional_columns).
`missing_channels`(mappings, channels[, errors])	Check a list of channel/marker dictionaries for missing channels according to the reference channels given.
`query_normalised_list`(x, ref)	Internal method for querying a channel/marker against a reference list of NormalisedName’s
`standardise_names`(channel_marker, …)	Given a dictionary detailing a channel/marker pair ({“channel”: str, “marker”: str}) standardise its contents using the reference material provided.

class cytopy.data.experiment.Experiment(*args, **values)¶

Container for Cytometry experiment. The correct way to generate and load these objects is using the Project.add_experiment method (see cytopy.data.project.Project). This object provides access to all experiment-wide functionality. New files can be added to an experiment using the add_new_sample method.

experiment_id¶

Unique identifier for experiment

Type: str, required

panel¶

Panel object describing associated channel/marker pairs

Type: ReferenceField, required

fcs_files¶

Reference field for associated files

Type: ListField

flags¶

Warnings associated to experiment

Type: str, optional

notes¶

Additional free text comments

Type: str, optional

Miscellaneous:

`DoesNotExist`
`MultipleObjectsReturned`

Methods:

`control_counts`([ax])	Generates a barplot of total counts of each control in the Experiment's FileGroups
`delete`([signal_kwargs])	Delete Experiment; will delete all associated FileGroups.
`delete_all_populations`(sample_id)	Delete population data associated to experiment.
`filter_samples_by_subject`(query)	Filter FileGroups associated to this experiment based on some subject meta-data
`generate_panel`(panel_definition)	Associate a panel to this Experiment, either by fetching an existing panel using the given panel name or by generating a new panel using the panel definition provided (path to a valid template).
`get_sample`(sample_id)	Given a sample ID, return the corresponding FileGroup object
`list_samples`([valid_only])	Generate a list of IDs of file groups associated to experiment
`merge_populations`(mergers)	For each FileGroup in sequence, merge populations.
`population_statistics`([populations])	Generates a Pandas DataFrame of population statistics for all FileGroups of an Experiment, for the given populations or all available populations if ‘populations’ is None.
`remove_sample`(sample_id)	Remove sample (FileGroup) from experiment.
`sample_exists`(sample_id)	Returns True if the given sample_id exists in Experiment

exception DoesNotExist¶

exception MultipleObjectsReturned¶

control_counts(ax: Optional[matplotlib.axes._axes.Axes] = None) → matplotlib.axes._axes.Axes¶

Generates a barplot of total counts of each control in the Experiment's FileGroups

Parameters: ax (Matplotlib.Axes, optional) –
Returns
Return type: Matplotlib.Axes

delete(signal_kwargs=None, **write_concern)¶

Delete Experiment; will delete all associated FileGroups.

Returns
Return type: None

delete_all_populations(sample_id: str) → None¶

Delete population data associated to experiment. Give a value of ‘all’ for sample_id to remove all population data for every sample.

Parameters: sample_id (str) – Name of sample to remove populations from; give a value of ‘all’ for sample_id to remove all population data for every sample.
Returns
Return type: None

filter_samples_by_subject(query: str) → list¶

Filter FileGroups associated to this experiment based on some subject meta-data

Parameters: query (str or mongoengine.queryset.visitor.Q) – Query to make on Subject
Returns
Return type: List

generate_panel(panel_definition: str) → None¶

Associate a panel to this Experiment, either by fetching an existing panel using the given panel name or by generating a new panel using the panel definition provided (path to a valid template).

Parameters: panel_definition (str) – Path to a panel definition
Returns
Return type: None
Raises: ValueError – Panel definition is not a string or dict

get_sample(sample_id: str) → cytopy.data.fcs.FileGroup ¶

Given a sample ID, return the corresponding FileGroup object

Parameters: sample_id (str) – Sample ID for search
Returns
Return type: FileGroup
Raises: MissingSampleError – If requested sample is not found in the experiment

list_samples(valid_only: bool = True) → list¶

Generate a list of IDs of file groups associated to experiment

Parameters: valid_only (bool) – If True, returns only valid samples (samples without ‘invalid’ flag)
Returns: List of IDs of file groups associated to experiment
Return type: List

merge_populations(mergers: dict)¶

For each FileGroup in sequence, merge populations. Given dictionary should contain a key corresponding to the new population name and value being a list of populations to merge. If one or more populations are missing, then available populations will be merged.

Parameters: mergers (dict) –
Returns
Return type: None
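
The shape of the mergers dictionary and the "merge whatever is available" rule can be illustrated without a database. The following is a minimal sketch, not CytoPy's implementation: `merge_available` is a hypothetical helper, and populations are assumed to be plain sets of event indices so that merging is a set union.

```python
from typing import Dict, List, Set


def merge_available(mergers: Dict[str, List[str]],
                    populations: Dict[str, Set[int]]) -> Dict[str, Set[int]]:
    """Union the event indices of each requested group, skipping missing populations."""
    merged = {}
    for new_name, targets in mergers.items():
        # Keep only the populations that actually exist for this sample
        present = [t for t in targets if t in populations]
        if not present:
            continue  # nothing available to merge under this name
        merged[new_name] = set().union(*(populations[t] for t in present))
    return merged


# Hypothetical populations for one sample (assumption: sets of event indices)
sample_populations = {"CD4+ T cells": {0, 1, 2}, "CD8+ T cells": {2, 3}}
# Key is the new population name, value is the list of populations to merge
mergers = {"T cells": ["CD4+ T cells", "CD8+ T cells", "gdT cells"]}
result = merge_available(mergers, sample_populations)
```

Here "gdT cells" is absent from the sample, so only the two available populations contribute to the merged "T cells" entry.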

population_statistics(populations: Optional[list] = None) → pandas.core.frame.DataFrame¶

Generates a Pandas DataFrame of population statistics for all FileGroups of an Experiment, for the given populations or all available populations if ‘populations’ is None.

Parameters: populations (list, optional) –
Returns
Return type: Pandas.DataFrame

remove_sample(sample_id: str)¶

Remove sample (FileGroup) from experiment.

Parameters: sample_id (str) – ID of sample to remove
Returns
Return type: None

sample_exists(sample_id: str) → bool¶

Returns True if the given sample_id exists in Experiment

Parameters: sample_id (str) – Name of sample to search for
Returns: True if exists, else False
Return type: bool

class cytopy.data.experiment.NormalisedName(*args, **kwargs)¶

Defines a standardised name for a channel or marker and provides a method for testing whether a channel/marker should be associated with that standard.

standard¶

the “standard” name i.e. the nomenclature we used for a channel/marker in this panel

Type: str, required

regex_str¶

regular expression used to test if a term corresponds to this standard

Type: str

permutations¶

String values that have direct association to this standard (comma separated values)

Type: str

case_sensitive¶

is the nomenclature case sensitive? This would be false for something like ‘CD3’ for example, where ‘cd3’ and ‘CD3’ are synonymous

Type: bool, (default=False)

Methods:

`query`(x)	Given a term ‘x’, determine if ‘x’ is synonymous to this standard.

query(x: str) → str¶

Given a term ‘x’, determine if ‘x’ is synonymous to this standard. If so, return the standardised name.

Parameters: x (str) – search term
Returns: Standardised name if synonymous to standard, else None
Return type: str or None
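
To make the interplay of regex_str, permutations and case_sensitive concrete, here is a minimal sketch of how such a query could behave. `NameRule` is a hypothetical stand-in written with a dataclass, not the CytoPy document class, and the order of checks (regular expression first, then permutations) is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class NameRule:
    """Hypothetical stand-in for NormalisedName (not the CytoPy class)."""
    standard: str
    regex_str: Optional[str] = None
    permutations: Optional[str] = None  # comma separated values
    case_sensitive: bool = False

    def query(self, x: str) -> Optional[str]:
        """Return the standard name if x matches the regex or a permutation, else None."""
        flags = 0 if self.case_sensitive else re.IGNORECASE
        if self.regex_str and re.search(self.regex_str, x, flags):
            return self.standard
        if self.permutations:
            options = [p.strip() for p in self.permutations.split(",")]
            term = x
            if not self.case_sensitive:
                # Compare in lower case when case is not significant
                options = [p.lower() for p in options]
                term = x.lower()
            if term in options:
                return self.standard
        return None


# Illustrative rule: the pattern and permutations are made up for this example
cd3 = NameRule(standard="CD3", regex_str=r"^\s*CD[\-\s]*3\s*$", permutations="CD3e,T3")
```

With this rule, terms such as "cd-3" or "t3" resolve to the standard name "CD3", while "CD4" returns None.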

class cytopy.data.experiment.Panel(*args, **kwargs)¶

Document representation of channel/marker definition for an experiment. A panel, once associated to an experiment will standardise data upon input; when an fcs file is created in the database, it will be associated to an experiment and the channel/marker definitions in the fcs file will be mapped to the associated panel.

markers¶

list of marker names; see NormalisedName

Type: EmbeddedDocListField

channels¶

list of channels; see NormalisedName

Type: EmbeddedDocListField

mappings¶

list of channel/marker mappings; see ChannelMap

Type: EmbeddedDocListField

initiation_date¶

date of creation

Type: DateTime

Methods:

`create_from_dict`(x)	Populate panel attributes from a python dictionary
`create_from_excel`(path)	Populate panel attributes from an excel template
`list_channels`()	List of channels associated to panel
`list_markers`()	List of markers associated to panel

create_from_dict(x: dict)¶

Populate panel attributes from a python dictionary

Parameters: x (dict) – dictionary object containing panel definition
Returns
Return type: None
Raises: AssertionError – If invalid dictionary template

create_from_excel(path: str) → None¶

Populate panel attributes from an excel template

Parameters: path (str) – path of file
Returns
Return type: None
Raises: AssertionError – If file path is invalid

list_channels() → list¶

List of channels associated to panel

Returns
Return type: List

list_markers() → list¶

List of markers associated to panel

Returns
Return type: List

cytopy.data.experiment.check_duplication(x: list) → bool¶

Internal method. Given a list check for duplicates. Warning generated for duplicates.

Parameters: x (list) –
Returns: True if duplicates are found, else False
Return type: bool

cytopy.data.experiment.check_excel_template(path: str) → (pandas.core.frame.DataFrame, pandas.core.frame.DataFrame)¶

Check excel template and if valid return pandas dataframes

Parameters: path (str) – file path for excel template
Returns: tuple of pandas dataframes (nomenclature, mappings) or None
Return type: (Pandas.DataFrame, Pandas.DataFrame) or None
Raises: AssertionError – If duplicate entries or missing entries in excel template

cytopy.data.experiment.check_pairing(channel_marker: dict, ref_mappings: List[cytopy.data.mapping.ChannelMap]) → bool¶

Internal method. Given a channel and marker check that a valid pairing exists in the list of given mappings.

Parameters

channel_marker (dict) –
ref_mappings (list) – List of ChannelMap objects

Returns

True if pairing exists, else False

Return type

bool

cytopy.data.experiment.compenstate(x: numpy.ndarray, spill_matrix: numpy.ndarray) → numpy.ndarray¶

Compensate the given data, x, using the spillover matrix by solving for their linear combination.

Parameters

x (numpy.ndarray) –
spill_matrix (numpy.ndarray) –

Returns

Return type

numpy.ndarray
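
For readers unfamiliar with compensation, the conventional model is that the observed signal equals the true signal multiplied by the spillover matrix, so the true signal is recovered by solving that linear system. The sketch below illustrates this under stated assumptions: x is laid out as events × channels, the spillover matrix is square with fluorochromes as rows and detectors as columns, and `compensate_sketch` is a hypothetical name, not the library function.

```python
import numpy as np


def compensate_sketch(x: np.ndarray, spill_matrix: np.ndarray) -> np.ndarray:
    """Solve true @ spill_matrix = x for `true` without forming an explicit inverse."""
    # true @ S = x  is equivalent to  S.T @ true.T = x.T
    return np.linalg.solve(spill_matrix.T, x.T).T


# Made-up 2-channel spillover matrix: 20% of dye 1 leaks into detector 2, and so on
spill = np.array([[1.0, 0.2],
                  [0.1, 1.0]])
true_signal = np.array([[100.0, 0.0],
                        [0.0, 50.0],
                        [30.0, 30.0]])
observed = true_signal @ spill          # what the cytometer would record
recovered = compensate_sketch(observed, spill)
```

Using `numpy.linalg.solve` rather than inverting the matrix is a common choice for numerical stability; whether CytoPy does exactly this is not stated here.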

cytopy.data.experiment.duplicate_mappings(mappings: List[dict])¶

Check for duplicates in a list of dictionaries describing channel/marker mappings. Raise AssertionError if duplicates found.

Parameters: mappings (list) –
Returns
Return type: None
Raises: AssertionError – If duplicate channel/marker found
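
As an illustration of the check described above, the following sketch looks for repeated channels or markers in a list of {“channel”: str, “marker”: str} dictionaries. `assert_no_duplicate_mappings` is a hypothetical name, and ignoring empty values is a choice made for this sketch rather than documented behaviour.

```python
from collections import Counter
from typing import Dict, List


def assert_no_duplicate_mappings(mappings: List[Dict[str, str]]) -> None:
    """Raise AssertionError if any channel or marker appears more than once."""
    for key in ("channel", "marker"):
        # Empty or missing values are ignored here (an assumption of this sketch)
        values = [m[key] for m in mappings if m.get(key)]
        repeated = [v for v, n in Counter(values).items() if n > 1]
        assert not repeated, f"Duplicate {key} found: {repeated}"


unique = [{"channel": "FITC", "marker": "CD3"},
          {"channel": "PE", "marker": "CD4"}]
clash = [{"channel": "FITC", "marker": "CD3"},
         {"channel": "FITC", "marker": "CD8"}]
```

Calling the helper on `unique` returns silently, whereas `clash` raises an AssertionError naming the repeated channel.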

cytopy.data.experiment.load_control_population_from_experiment(experiment: cytopy.data.experiment.Experiment, population: str, ctrl: str, transform: str = 'logicle', transform_kwargs: Optional[dict] = None, sample_ids: Optional[list] = None, verbose: bool = True, additional_columns: Optional[list] = None)¶

Load Population from a given control from samples in the given Experiment and generate a standard exploration dataframe that contains the columns ‘sample_id’, ‘subject_id’, and initialises additional columns with null values if specified (additional_columns).

Parameters

experiment (Experiment) –
population (str) –
ctrl (str) –
transform (str) –
transform_kwargs (dict, optional) –
sample_ids (list, optional) –
verbose (bool (default=True)) –
additional_columns (list, optional) –

Returns

Return type

Pandas.DataFrame

cytopy.data.experiment.load_population_data_from_experiment(experiment: cytopy.data.experiment.Experiment, population: str, transform: str = 'logicle', transform_kwargs: Optional[dict] = None, sample_ids: Optional[list] = None, verbose: bool = True, additional_columns: Optional[list] = None)¶

Load Population from samples in the given Experiment and generate a standard exploration dataframe that contains the columns ‘sample_id’, ‘subject_id’, ‘meta_label’ and initialises additional columns with null values if specified (additional_columns).

Parameters

experiment (Experiment) –
population (str) –
transform (str) –
transform_kwargs (dict, optional) –
sample_ids (list, optional) –
verbose (bool (default=True)) –
additional_columns (list, optional) –

Returns

Return type

Pandas.DataFrame

cytopy.data.experiment.missing_channels(mappings: List[dict], channels: List[cytopy.data.experiment.NormalisedName], errors: str = 'raise')¶

Check a list of channel/marker dictionaries for missing channels according to the reference channels given.

Parameters

mappings (list) –
channels (list) –
errors (str) –

Returns

Return type

None

Raises

KeyError – If channel is missing
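
A rough sketch of this kind of check is given below. It departs from the documented function in several labelled ways: reference channels are plain strings rather than NormalisedName objects, the helper returns the list of missing channels instead of None, and the ‘warn’ alternative to errors=‘raise’ is an assumption, since only the default is documented here.

```python
import warnings
from typing import Dict, List


def find_missing_channels(mappings: List[Dict[str, str]],
                          reference: List[str],
                          errors: str = "raise") -> List[str]:
    """Report reference channels that do not appear in the given mappings."""
    present = {m["channel"] for m in mappings}
    missing = [c for c in reference if c not in present]
    if missing and errors == "raise":
        raise KeyError(f"Missing channels: {missing}")
    if missing:
        # Assumed behaviour for any other value of `errors`: warn and carry on
        warnings.warn(f"Missing channels: {missing}")
    return missing


panel_mappings = [{"channel": "FITC", "marker": "CD3"}]
reference_channels = ["FITC", "PE"]
```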

cytopy.data.experiment.query_normalised_list(x: str, ref: List[cytopy.data.experiment.NormalisedName]) → str¶

Internal method for querying a channel/marker against a reference list of NormalisedName’s

Parameters

x (str or None) – channel/marker to query
ref (list) – list of NormalisedName objects for reference search

Returns

Standardised name

Return type

str

Raises

AssertionError – If no or multiple matches found in query
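
The "exactly one match" rule can be sketched as follows. This is an illustration only: the reference list is simplified to (standard, pattern) tuples rather than NormalisedName objects, matching is case-insensitive by assumption, and `query_reference` is a hypothetical name.

```python
import re
from typing import List, Tuple


def query_reference(x: str, ref: List[Tuple[str, str]]) -> str:
    """Return the single standard name whose pattern matches x."""
    matches = [standard for standard, pattern in ref
               if re.search(pattern, x, re.IGNORECASE)]
    assert len(matches) > 0, f"No match found for {x!r}"
    assert len(matches) == 1, f"Multiple matches found for {x!r}: {matches}"
    return matches[0]


# Made-up reference patterns for two markers
reference = [("CD3", r"^CD[\-\s]?3$"), ("CD4", r"^CD[\-\s]?4$")]
```

A term such as "cd-4" resolves to "CD4", while a term matching none of the patterns (for example "CD8") raises an AssertionError.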

cytopy.data.experiment.standardise_names(channel_marker: dict, ref_channels: List[cytopy.data.experiment.NormalisedName], ref_markers: List[cytopy.data.experiment.NormalisedName], ref_mappings: List[cytopy.data.mapping.ChannelMap]) → dict¶

Given a dictionary detailing a channel/marker pair ({“channel”: str, “marker”: str}) standardise its contents using the reference material provided.

Parameters

channel_marker (dict) –
ref_channels (list) –
ref_markers (list) –
ref_mappings (list) –

Returns

Return type

dict

Raises

ValueError – If channel and marker are missing