alleleTools.allele module

class alleleTools.allele.Allele(gene: str, fields: List[str], confidence: str = '', gene_delimiter: str = '*', field_delimiter: str = ':', suffix: str = '')[source]

Bases: object

Class representing an HLA allele with parsing and comparison capabilities.

This class handles the parsing of HLA allele nomenclature and provides methods for comparison, resolution management, and string representation. It supports confidence scores from algorithms like HiSat.

gene

Gene name (e.g., ‘A’, ‘B’, ‘DRB1’)

Type:

str

fields

List of allele fields (e.g., [‘01’, ‘01’, ‘01’])

Type:

List[str]

confidence

Optional confidence score from genotyping algorithm

Type:

float

Parameters:
  • code (str | list) – Allele string to parse (e.g., ‘A*01:01:01’ or ‘A*01:01(0.95)’). Alternatively, pass a list of fields together with the gene argument.

  • gene (str) – Gene name. Pass it here if it is not present in code.

Raises:

Exception – If the allele string cannot be parsed

Example

>>> allele = Allele('A*01:01:01')
>>> print(allele.gene)  # 'A'
>>> print(allele.fields)  # ['01', '01', '01']
>>> print(len(allele))  # 3
compare(allele: Allele) AlleleMatchStatus[source]

Compare this allele with another allele.

Performs detailed comparison considering gene name, field values, and resolution levels.

Parameters:

allele (Allele) – The allele to compare with

Returns:

The result of the comparison

Return type:

AlleleMatchStatus

Example

>>> a1 = Allele('A*01:01')
>>> a2 = Allele('A*01:01:01')
>>> a1.compare(a2)  # AlleleMatchStatus.LESS_RESOLUTION (a1 has fewer fields than a2)
get_fields() List[str][source]
truncate(new_resolution: int) Allele[source]

Reduce the resolution of the allele to the specified level.

Parameters:

new_resolution (int) – Target resolution level (number of fields)

Note

If new_resolution is greater than current resolution, no change is made.
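
The truncation rule described above can be sketched on a plain list of fields. This is a minimal illustration, not the library's implementation: the helper name truncate_fields is hypothetical, and rejecting negative values is an assumption that the documentation does not state.

```python
from typing import List


def truncate_fields(fields: List[str], new_resolution: int) -> List[str]:
    """Hypothetical helper: keep only the first new_resolution fields."""
    if new_resolution < 0:
        # Assumption: a negative resolution is treated as an error.
        raise ValueError("new_resolution must be non-negative")
    if new_resolution >= len(fields):
        # Requested resolution is not lower than the current one: no change.
        return list(fields)
    return fields[:new_resolution]


three_fields = ["01", "01", "01"]
two_fields = truncate_fields(three_fields, 2)   # drops the third field
unchanged = truncate_fields(["01", "01"], 4)    # already below 4 fields
```

Returning a new list rather than mutating the input keeps the sketch free of side effects; whether Allele.truncate modifies the allele in place is not stated here.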

class alleleTools.allele.AlleleMatchStatus(value)[source]

Bases: Enum

Enumeration for allele comparison results.

Defines the possible outcomes when comparing two alleles:
  • NOT_EQUAL: Alleles are different (different genes or field values)

  • EQUAL: Alleles are identical in gene and all fields

  • LESS_RESOLUTION: Current allele has fewer fields than the compared allele

  • MORE_RESOLUTION: Current allele has more fields than the compared allele

EQUAL = 2
LESS_RESOLUTION = 3
MORE_RESOLUTION = 4
NOT_EQUAL = 1
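
The four outcomes can be illustrated with a small standalone comparison function. This is a sketch under assumptions, not the code behind Allele.compare: MatchStatus and compare_fields are hypothetical stand-ins, and the rule that two alleles agreeing on all shared fields differ only in resolution is an assumed reading of the definitions above.

```python
from enum import Enum
from typing import List


class MatchStatus(Enum):
    """Hypothetical stand-in mirroring the documented names and values."""
    NOT_EQUAL = 1
    EQUAL = 2
    LESS_RESOLUTION = 3
    MORE_RESOLUTION = 4


def compare_fields(gene_a: str, fields_a: List[str],
                   gene_b: str, fields_b: List[str]) -> MatchStatus:
    """Compare allele A (the current allele) against allele B."""
    if gene_a != gene_b:
        return MatchStatus.NOT_EQUAL
    # Assumption: only the fields present in both alleles must agree.
    shared = min(len(fields_a), len(fields_b))
    if fields_a[:shared] != fields_b[:shared]:
        return MatchStatus.NOT_EQUAL
    if len(fields_a) < len(fields_b):
        return MatchStatus.LESS_RESOLUTION
    if len(fields_a) > len(fields_b):
        return MatchStatus.MORE_RESOLUTION
    return MatchStatus.EQUAL


status = compare_fields("A", ["01", "01"], "A", ["01", "01", "01"])
```

Here status is MatchStatus.LESS_RESOLUTION, because the first allele has two fields and the second has three.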
class alleleTools.allele.AlleleParser(gene_family: str, config_file: str = '')[source]

Bases: object

Configurable allele parser that supports multiple parsing strategies. This class loads parsing configurations from a JSON file.

The configuration can be overwritten by providing a custom config file during initialization (config_file parameter).

Use:

parser = AlleleParser(gene_family="hla", config_file="custom_config.json")
allele = parser.parse("A*01:02:03")

get_delimiters() Tuple[str, str][source]
parse(code: str) Allele[source]
set_gene_family(gene_family: str)[source]
class alleleTools.allele.DelimitedParser(gene_delimiter: str, field_delimiter: str)[source]

Bases: ParsingStrategy

Parser for alleles using simple delimiters. Two parameters are needed:
  • gene_delimiter: character that separates the gene name from the fields

  • field_delimiter: character that separates the different fields

Example

For allele “A*01:02:03”, gene_delimiter=”*”, field_delimiter=”:”

Results:
  1. gene = “A”

  2. fields = [“01”, “02”, “03”]
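
The delimiter-based split described above can be sketched with plain string operations. This is a minimal illustration that assumes the first occurrence of the gene delimiter ends the gene name; split_allele is a hypothetical helper and returns a tuple rather than an Allele object.

```python
from typing import List, Tuple


def split_allele(text: str, gene_delimiter: str = "*",
                 field_delimiter: str = ":") -> Tuple[str, List[str]]:
    """Hypothetical helper: split an allele string into gene and fields."""
    # partition() splits on the first occurrence of the gene delimiter.
    gene, separator, rest = text.partition(gene_delimiter)
    if not separator or not rest:
        # Assumption: a missing delimiter or empty field part is an error.
        raise ValueError(f"cannot parse allele: {text!r}")
    return gene, rest.split(field_delimiter)


gene, fields = split_allele("A*01:02:03")
```

For the allele in the example, gene is "A" and fields is ["01", "02", "03"].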

parse(text: str) Allele[source]
class alleleTools.allele.FieldTree(name: str)[source]

Bases: object

Tree structure to represent the hierarchical organization of allele fields.

Each node in the tree corresponds to a field value at a particular position in the allele nomenclature. The tree is used to count and organize the occurrence of each field value across a set of alleles.

field

The field value at this node.

Type:

str

support

Number of times this field value has been added at this position, which indicates how many times it has been genotyped by different tools.

Type:

int

children

List of child nodes representing subsequent fields.

Type:

List[FieldTree]

Example

For alleles A*01:01 and A*01:02, the tree will have a root ‘A’, a child ‘01’, and two children ‘01’ and ‘02’ under it.

add(fields: list, weight: float = 1.0)[source]

Add a sequence of fields to the tree, incrementing counts and creating nodes as needed.

Parameters:
  • fields (list) – List of field values (str) to add as a path in the tree.

Example

tree.add([‘01’, ‘01’]) will add/increment nodes for ‘01’ at two levels.
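
The way add builds a path through the tree can be sketched with a small node class. This is an illustration under assumptions, not the FieldTree source: Node and its child helper are hypothetical, and the sketch only increments support on the nodes along the added path, leaving the root untouched.

```python
from typing import List, Optional


class Node:
    """Hypothetical stand-in for a tree node: field value, support, children."""

    def __init__(self, field: str) -> None:
        self.field = field
        self.support = 0.0
        self.children: List["Node"] = []

    def child(self, field: str) -> Optional["Node"]:
        """Return the child holding this field value, if there is one."""
        for node in self.children:
            if node.field == field:
                return node
        return None

    def add(self, fields: List[str], weight: float = 1.0) -> None:
        """Walk down the tree, creating missing nodes and adding weight."""
        node = self
        for field in fields:
            nxt = node.child(field)
            if nxt is None:
                nxt = Node(field)
                node.children.append(nxt)
            nxt.support += weight
            node = nxt


root = Node("A")
root.add(["01", "01"])  # A*01:01
root.add(["01", "02"])  # A*01:02
first_field = root.children[0]
```

After both calls the root has a single child ‘01’ with support 2.0, and that child has two children, ‘01’ and ‘02’, each with support 1.0.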

add_batch(batch: List[list], weight: float = 1.0)[source]

Helper method to append a list of field lists to the current tree.

Parameters:

batch (List[list]) – A list containing lists of field values (str).

get_consensus(min_support: float, gene_delimiter: str = '*', field_delimiter: str = ':', max_support: float = 0) Tuple[List[str], List[float]][source]

Gets a list of up to two possible consensus solutions that meet the criteria of the minimum support. This is basically a tree search algorithm.

Two additional parameters can be provided to format the allele strings.

Parameters:
  • min_support (float) – Minimum proportion of support required. Each program contributes votes equally; the number of times the allele has been genotyped is stored in each node of the tree. This value is used to filter consensus alleles based on their support values.

  • gene_delimiter (str) – Delimiter between gene name and fields.

  • field_delimiter (str) – Delimiter between fields.

Returns:

Consensus alleles and their support values.

Return type:

Tuple[List[str], List[float]]
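
One way to picture the search is a depth-first walk that follows only children whose share of the votes reaches min_support, and stops at the deepest field that still qualifies. The sketch below is an assumption-heavy illustration, not the library's algorithm: the nested-dict tree, the consensus function, the total_votes argument and the ranking of the top two candidates by support are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical tree layout: field value -> (vote count, child nodes).
Tree = Dict[str, Tuple[float, "Tree"]]


def consensus(gene: str, tree: Tree, total_votes: float, min_support: float,
              gene_delimiter: str = "*",
              field_delimiter: str = ":") -> Tuple[List[str], List[float]]:
    """Return up to two allele strings and their support proportions."""
    found: List[Tuple[str, float]] = []

    def walk(node: Tree, path: List[str], support: float) -> None:
        # Keep only the children with enough support to continue.
        kept = [(field, votes, kids) for field, (votes, kids) in node.items()
                if votes / total_votes >= min_support]
        if not kept:
            if path:  # deepest qualifying field reached
                name = gene + gene_delimiter + field_delimiter.join(path)
                found.append((name, support))
            return
        for field, votes, kids in kept:
            walk(kids, path + [field], votes / total_votes)

    walk(tree, [], 0.0)
    # Assumption: the best two candidates are those with the highest support.
    found.sort(key=lambda item: item[1], reverse=True)
    top = found[:2]
    return [name for name, _ in top], [share for _, share in top]


# Three tools voted: two called A*01:01 and one called A*01:02.
votes: Tree = {"01": (3, {"01": (2, {}), "02": (1, {})})}
alleles, supports = consensus("A", votes, total_votes=3, min_support=0.5)
strict_alleles, strict_supports = consensus("A", votes, total_votes=3, min_support=0.7)
```

With min_support=0.5 the walk reaches A*01:01 (two of three votes). With min_support=0.7 neither second field qualifies, so the sketch falls back to the lower-resolution A*01.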

merge_tree(tree: FieldTree)[source]

Merges a foreign tree into the current tree. The top level should match with this tree.

Parameters:

tree (FieldTree) – The foreign tree.

set_support(new_weight: float, recursive: bool = True) None[source]

Change the support value. When recursive is True, it will change the support value for all children.

class alleleTools.allele.ParsingStrategy[source]

Bases: ABC

Abstract base class for allele parsing strategies.

abstract parse(text: str) Allele[source]
class alleleTools.allele.RegexParser(pattern: str, field_delimiter: str, gene_delimiter: str)[source]

Bases: ParsingStrategy

Parser for alleles using regular expressions. It only requires the regex pattern. It should contain named groups for ‘gene’, ‘field1’, ‘field2’, etc. An optional group ‘confidence’ can also be included to capture confidence scores.

field_delimiter and gene_delimiter can also be specified. These parameters are only used to format the allele string representation.

This parser is more flexible and robust than the DelimitedParser, as it can handle more complex allele formats. However, it requires knowledge of regular expressions.

Example

For allele "A*01", pattern = r"(?:(?P<gene>\w+)\*)?(?P<field1>\d{2})"

Results:
  1. gene = “A”

  2. fields = [“01”]
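
Named groups can be read back with Python's re module. The sketch below is a standalone illustration rather than the RegexParser source: parse_with_regex and EXTENDED_PATTERN are hypothetical, the pattern from the example is written with its backslash escapes, and requiring the whole string to match (re.fullmatch) is an assumption.

```python
import re
from typing import List, Optional, Tuple


def parse_with_regex(pattern: str,
                     text: str) -> Tuple[Optional[str], List[str], Optional[str]]:
    """Hypothetical helper: return (gene, fields, confidence) from named groups."""
    match = re.fullmatch(pattern, text)
    if match is None:
        raise ValueError(f"cannot parse allele: {text!r}")
    groups = match.groupdict()
    # Order groups named field1, field2, ... by their numeric suffix.
    names = sorted((name for name in groups if name.startswith("field")),
                   key=lambda name: int(name[len("field"):]))
    fields = [groups[name] for name in names if groups[name] is not None]
    return groups.get("gene"), fields, groups.get("confidence")


# Pattern from the example above.
PATTERN = r"(?:(?P<gene>\w+)\*)?(?P<field1>\d{2})"
# Hypothetical pattern with an optional second field and confidence score.
EXTENDED_PATTERN = (r"(?P<gene>\w+)\*(?P<field1>\d+)"
                    r"(?::(?P<field2>\d+))?(?:\((?P<confidence>[\d.]+)\))?")

simple = parse_with_regex(PATTERN, "A*01")
scored = parse_with_regex(EXTENDED_PATTERN, "A*01:01(0.95)")
```

In this sketch simple is ("A", ["01"], None) and scored is ("A", ["01", "01"], "0.95"); the confidence is returned as the captured string and left unconverted.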

parse(text: str) Allele[source]
class alleleTools.allele.Solution(allele: str, support: float)[source]

Bases: object