alleleTools.format.alleleTable module

Allele Table Data Structure Module.

This module defines the AlleleTable class, which provides a standardized data structure for storing and manipulating allele data along with associated phenotype and covariate information.

class alleleTools.format.alleleTable.AlleleTable(alleles: pandas.DataFrame = <empty DataFrame>, phenotype: pandas.Series = <empty Series>, covariates: pandas.DataFrame = <empty DataFrame>)

Bases: object

A standardized data structure for storing allele data with metadata.

This class provides a unified interface for handling allele data from polymorphic genes, along with associated phenotype and covariate information. It serves as the core data structure for allele analysis workflows.

alleles

Main allele data with samples as rows and genes/alleles as columns

Type:

pd.DataFrame

phenotype

Phenotype information indexed by sample ID

Type:

pd.Series

covariates

Covariate data with samples as rows and covariates as columns

Type:

pd.DataFrame

Example

>>> import pandas as pd
>>> from alleleTools.format.alleleTable import AlleleTable
>>> table = AlleleTable()
>>> # Load allele data
>>> table.alleles = pd.DataFrame(...)
>>> # Add phenotype information
>>> table.phenotype = pd.Series(...)

load_phenotype(phenotype_file: str) → None

Load phenotype information from a file into the AlleleTable.

Parameters:

phenotype_file (str) – The path to the phenotype file: a whitespace-separated file with a header row that includes the columns “IID” and “phenotype”.
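The file layout described above can be illustrated with a small standard-library parser. This is only a sketch of the expected format, not the implementation of load_phenotype; the FID column, the sample IDs, the 0/1/2 values, and the helper name read_phenotypes are all hypothetical.

```python
import io

# Hypothetical phenotype file contents: whitespace-separated, with a header.
# Only the "IID" and "phenotype" columns are required by the documentation;
# the FID column and the values shown here are illustrative assumptions.
PHENOTYPE_TEXT = """\
FID IID phenotype
fam1 sample1 2
fam2 sample2 1
fam3 sample3 0
"""


def read_phenotypes(handle):
    """Parse a whitespace-separated phenotype file into {IID: phenotype}."""
    header = handle.readline().split()
    for required in ("IID", "phenotype"):
        if required not in header:
            raise ValueError(f"missing required column: {required}")
    iid_col = header.index("IID")
    pheno_col = header.index("phenotype")

    phenotypes = {}
    for line in handle:
        fields = line.split()  # splits on any run of spaces or tabs
        if not fields:  # skip blank lines
            continue
        phenotypes[fields[iid_col]] = fields[pheno_col]
    return phenotypes


phenotypes = read_phenotypes(io.StringIO(PHENOTYPE_TEXT))
print(phenotypes)
```

Values are kept as strings here; how load_phenotype converts them is not stated in this reference.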

classmethod open(filename: str, sep: str = '\t') → AlleleTable

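Load allele data from a delimited text file into the AlleleTable.

Parameters:
  • filename (str) – The path to the input file.

  • sep (str) – The field separator. Defaults to '\t' (tab); pass ',' to read a comma-separated file.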

remove_phenotype_zero() → None

Remove samples with phenotype value equal to zero from the AlleleTable.
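The effect of this method can be sketched with plain pandas objects. The snippet assumes that removal applies to all three attributes (alleles, phenotype, and covariates) and that they share sample IDs as their index; the sample IDs, column names, and values are hypothetical, and this is not the library's own code.

```python
import pandas as pd

# Hypothetical data: three samples sharing the same index of sample IDs.
samples = ["s1", "s2", "s3"]
alleles = pd.DataFrame(
    {"A_1": ["A*01:01", "A*02:01", "A*03:01"],
     "A_2": ["A*24:02", "A*02:01", "A*11:01"]},
    index=samples,
)
phenotype = pd.Series([2, 0, 1], index=samples, name="phenotype")
covariates = pd.DataFrame({"age": [34, 51, 47]}, index=samples)

# Sample IDs whose phenotype is not zero.
keep = phenotype[phenotype != 0].index

# Apply the same filter to every attribute so they stay aligned
# (assumption: the real method keeps all three in sync).
phenotype = phenotype.loc[keep]
alleles = alleles[alleles.index.isin(keep)]
covariates = covariates[covariates.index.isin(keep)]

print(list(phenotype.index))
```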

set_phenotype(phenotype: Series) → None

Store the phenotype series in the AlleleTable.

Parameters:

phenotype (pd.Series) – A series with the phenotypes for the AlleleTable. It should be indexed by sample ID, with the phenotype values as its data.

to_csv(filename: str, header: bool = True, population: str = '')

Export the allele table to a CSV file.

Parameters:
  • filename (str) – The name of the output CSV file.

  • header (bool) – Whether to write the column names to the file.

  • population (str) – Adds an extra column, immediately to the left of the phenotype column, containing the population name. Currently, only one population per allele table is supported.
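The placement of the population column can be sketched with pandas alone. The output layout used here (phenotype as the first data column, followed by the allele columns) is an assumption, as is treating an empty population string as "no extra column"; the sample data and the "EUR" label are hypothetical. It is meant only to illustrate "one column to the left of phenotype", not the actual file format written by to_csv.

```python
import io

import pandas as pd

# Hypothetical allele and phenotype data for two samples.
samples = ["s1", "s2"]
alleles = pd.DataFrame(
    {"A_1": ["A*01:01", "A*02:01"], "A_2": ["A*03:01", "A*24:02"]},
    index=samples,
)
phenotype = pd.Series([2, 1], index=samples, name="phenotype")

# Assumed layout: phenotype first, then the allele columns.
out = alleles.copy()
out.insert(0, "phenotype", phenotype)

# A single population name is broadcast to every row, which matches the
# "one population per allele table" restriction.
population = "EUR"
if population:  # assumption: an empty string means no population column
    out.insert(out.columns.get_loc("phenotype"), "population", population)

# Write to an in-memory buffer instead of a file for demonstration.
buffer = io.StringIO()
out.to_csv(buffer, header=True)
csv_text = buffer.getvalue()
print(csv_text)
```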