alleleTools.format.alleleTable module
Allele Table Data Structure Module.
This module defines the AlleleTable class, which provides a standardized data structure for storing and manipulating allele data along with associated phenotype and covariate information.
- class alleleTools.format.alleleTable.AlleleTable(alleles: pandas.DataFrame = <empty DataFrame>, phenotype: pandas.Series = <empty Series, dtype: object>, covariates: pandas.DataFrame = <empty DataFrame>)
Bases: object
A standardized data structure for storing allele data with metadata.
This class provides a unified interface for handling allele data from polymorphic genes, along with associated phenotype and covariate information. It serves as the core data structure for allele analysis workflows.
- alleles
Main allele data with samples as rows and genes/alleles as columns
- Type:
pd.DataFrame
- phenotype
Phenotype information indexed by sample ID
- Type:
pd.Series
- covariates
Covariate data with samples as rows and covariates as columns
- Type:
pd.DataFrame
Example
>>> table = AlleleTable()
>>> # Load allele data
>>> table.alleles = pd.DataFrame(...)
>>> # Add phenotype information
>>> table.phenotype = pd.Series(...)
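The three attributes described above can be pictured with plain pandas objects. The following is a minimal sketch that does not use the AlleleTable class itself; the sample IDs, column names, and values are hypothetical and chosen only to show that all three structures are keyed by sample ID.

```python
import pandas as pd

# Hypothetical sample IDs shared by all three structures.
samples = ["S1", "S2", "S3"]

# Allele data: samples as rows, genes/alleles as columns (column names are assumptions).
alleles = pd.DataFrame(
    {
        "A_1": ["A*01:01", "A*02:01", "A*03:01"],
        "A_2": ["A*02:01", "A*02:01", "A*11:01"],
    },
    index=samples,
)

# Phenotype: one value per sample, indexed by sample ID.
phenotype = pd.Series([1, 2, 0], index=samples, name="phenotype")

# Covariates: samples as rows, covariates as columns.
covariates = pd.DataFrame({"age": [34, 51, 47], "sex": [1, 2, 1]}, index=samples)

# Because every structure is indexed by sample ID, they can be aligned by label.
shared = alleles.index.intersection(phenotype.index).intersection(covariates.index)
print(list(shared))
```

Keeping the sample ID as the index in every structure means that rows can be matched by label rather than by position.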
- load_phenotype(phenotype_file: str) → None
Load phenotype information from a file into the AlleleTable.
- Parameters:
phenotype_file (str) – The path to the phenotype file. It should be a whitespace-separated file with a header row that includes the columns “IID” and “phenotype”.
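To make the expected file layout concrete, the sketch below parses a small whitespace-separated table with pandas. It illustrates the format only and is not the library's own loading code; the file content is hypothetical and is read from an in-memory buffer instead of a path.

```python
import io

import pandas as pd

# Hypothetical phenotype file: a header row, then one line per sample.
# Columns may be separated by any mix of spaces and tabs.
text = "IID phenotype\nS1\t1\nS2   2\nS3 0\n"

# sep=r"\s+" treats any run of whitespace as a single delimiter.
df = pd.read_csv(io.StringIO(text), sep=r"\s+")

# Index the phenotype values by sample ID.
phenotype = df.set_index("IID")["phenotype"]
print(phenotype)
```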
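- classmethod open(filename: str, sep: str = '\t') → AlleleTable
Load allele data from a delimited text file (tab-separated by default) into a new AlleleTable.
- Parameters:
filename (str) – The path to the input file.
sep (str) – The field separator used in the input file. Defaults to a tab character.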
- remove_phenotype_zero() → None
Remove samples with phenotype value equal to zero from the AlleleTable.
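The filtering described here can be sketched with plain pandas. This is an illustration under assumptions, not the library's implementation: the data are hypothetical, and whether the method also drops the matching covariate rows is not stated in this documentation.

```python
import pandas as pd

samples = ["S1", "S2", "S3"]

# Hypothetical allele data and phenotype values, both indexed by sample ID.
alleles = pd.DataFrame({"A_1": ["A*01:01", "A*02:01", "A*03:01"]}, index=samples)
phenotype = pd.Series([1, 0, 2], index=samples, name="phenotype")

# Keep only the samples whose phenotype is not zero.
phenotype = phenotype[phenotype != 0]

# Restrict the allele data to the remaining samples so both stay aligned.
alleles = alleles.loc[phenotype.index]
print(list(alleles.index))
```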
- set_phenotype(phenotype: Series) → None
Store the phenotype series in the AlleleTable.
- Parameters:
phenotype (pd.Series) – The phenotypes for the AlleleTable. The index should hold the sample IDs and the values should hold the phenotype of each sample.
- to_csv(filename: str, header: bool = True, population: str = '')
Export the allele table to a CSV file.
- Parameters:
filename (str) – The name of the output CSV file.
header (bool) – Whether to write the column names as the first row of the file.
population (str) – If given, adds an extra column containing the population name immediately to the left of the phenotype column. Currently, only one population per allele table is supported.
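The effect of the population parameter can be sketched with plain pandas. This is a hypothetical illustration, not the library's export code: the column order, the index name "IID", the comma delimiter, and the population label "POP1" are all assumptions.

```python
import io

import pandas as pd

# Hypothetical table with a phenotype column followed by one allele column.
table = pd.DataFrame(
    {"phenotype": [1, 2], "A_1": ["A*01:01", "A*02:01"]},
    index=pd.Index(["S1", "S2"], name="IID"),
)

population = "POP1"  # assumed label; an empty string would skip the extra column

if population:
    # Insert the population column at the phenotype's position,
    # which places it immediately to the left of the phenotype column.
    position = table.columns.get_loc("phenotype")
    table.insert(position, "population", population)

# Write to an in-memory buffer instead of a file for demonstration.
buffer = io.StringIO()
table.to_csv(buffer, header=True)
print(buffer.getvalue())
```

Because a single string is broadcast to every row, this approach naturally matches the stated limit of one population per table.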