alleleTools.format.vcf2allele module

VCF to Allele Table Conversion Module.

This module converts VCF (Variant Call Format) files containing HLA or KIR allele data into allele tables. It supports several output formats, including pyHLA- and PyPop-compatible formats.

Usage:

To generate the input file from imputation, run:

    # Extract only relevant alleles
    bcftools view --include 'ID~"HLA"' IMPUTED.vcf > HLA.vcf
    # Convert the extracted alleles to a table
    altools convert vcf2allele HLA.vcf --output out.alt
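The extraction step keeps only the records whose ID column matches "HLA". A minimal standard-library sketch of that filter follows, for readers who want to see what the step does without running bcftools. The function name `filter_vcf_lines` and the example records are hypothetical and are not part of alleleTools or bcftools.

```python
import re


def filter_vcf_lines(lines, pattern="HLA"):
    """Keep header lines and records whose ID column matches `pattern`.

    Hypothetical helper: it only mimics the effect of the ID filter
    on plain-text VCF lines and is not a replacement for bcftools.
    """
    regex = re.compile(pattern)
    kept = []
    for line in lines:
        if line.startswith("#"):
            kept.append(line)  # meta-information and column header lines pass through
            continue
        fields = line.rstrip("\n").split("\t")
        # The third tab-separated column (index 2) of a VCF record is the ID field
        if len(fields) > 2 and regex.search(fields[2]):
            kept.append(line)
    return kept


# Made-up records for illustration only
example = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n",
    "6\t29910247\tHLA_A*01:01\tA\tT\t.\tPASS\t.\tGT\t0|1\n",
    "6\t29910300\trs123\tG\tC\t.\tPASS\t.\tGT\t0|0\n",
]
filtered = filter_vcf_lines(example)
```

In this sketch the two header lines and the HLA record are kept, and the `rs123` record is dropped.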

Author: Nicolás Mendoza Mejía (2023)

class alleleTools.format.vcf2allele.VCFalleles(alleles: DataFrame, formats: DataFrame)

Bases: object

Class for processing and analyzing VCF allele data.

This class handles the parsing and analysis of allele information from VCF format data, including genotype determination, resolution analysis, and ploidy-based filtering.

df

Processed allele data with genotype information indexed by gene and allele names.

Type:

pd.DataFrame

Parameters:
  • alleles (pd.DataFrame) – Raw allele data from VCF

  • formats (pd.DataFrame) – Format information from VCF header

sort_and_fill(extensive=False)

Sort alleles by gene and determine final genotype calls.

For each gene, determines whether the genotype is homozygous or heterozygous and selects the appropriate alleles based on confidence scores and resolution.

Parameters:

extensive (bool) – Whether to perform an extensive search for low-confidence alleles (default: False)

Returns:

List of selected allele names for all genes

Return type:

List[str]
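The per-gene selection described for sort_and_fill can be illustrated with a simplified sketch. This is not the actual implementation: the function `pick_alleles`, the `(gene, allele, dosage)` input shape, and the `hom_threshold` value are all assumptions made for illustration. The real method works on the `df` attribute and also takes resolution into account.

```python
from collections import defaultdict


def pick_alleles(calls, hom_threshold=1.5):
    """Return two allele names per gene from (gene, allele, dosage) tuples.

    Hypothetical rule: a gene is called homozygous when its best allele
    has a dosage of at least `hom_threshold`; otherwise the two
    highest-scoring alleles are reported as a heterozygous call.
    """
    by_gene = defaultdict(list)
    for gene, allele, dosage in calls:
        by_gene[gene].append((dosage, allele))

    selected = []
    for gene in sorted(by_gene):  # genes are processed in sorted order
        ranked = sorted(by_gene[gene], reverse=True)  # highest dosage first
        best_dosage, best = ranked[0]
        if best_dosage >= hom_threshold or len(ranked) == 1:
            # Homozygous call, or a single candidate: fill both slots with it
            selected += [best, best]
        else:
            selected += [best, ranked[1][1]]
    return selected


# Made-up dosages for illustration only
calls = [
    ("A", "A*01:01", 1.9),
    ("A", "A*02:01", 0.1),
    ("B", "B*07:02", 1.0),
    ("B", "B*08:01", 0.9),
]
result = pick_alleles(calls)
```

With these made-up dosages, gene A is called homozygous for `A*01:01` and gene B heterozygous for `B*07:02` and `B*08:01`, so the result is one flat list of four names, matching the `List[str]` return type documented above.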

alleleTools.format.vcf2allele.call_function(args)

Main function to execute VCF to allele table conversion.

This function orchestrates the conversion process by:

  1. Loading and preprocessing the VCF file
  2. Extracting genotype information
  3. Converting to allele format
  4. Adding phenotype data if provided
  5. Writing the output file

Parameters:

args – Parsed command line arguments containing:

  • input: Path to input VCF file
  • output: Path to output file
  • rm_prefix: Prefix to remove from allele names
  • separator: Separator between gene and allele names
  • extensive_search: Whether to perform extensive allele search
  • phe: Optional phenotype file path
  • output_header: Whether to include header in output
  • population: Population identifier for PyPop compatibility
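When calling the function from Python rather than from the command line, an equivalent arguments object could be built by hand. The sketch below uses the attribute names listed above. Every value is a placeholder assumption, including the prefix and separator, and the real defaults may differ. The actual call is left commented out so the snippet runs without alleleTools installed.

```python
from argparse import Namespace

# Attribute names follow the documented list; all values are placeholders
args = Namespace(
    input="HLA.vcf",         # path to the input VCF file
    output="out.alt",        # path to the output allele table
    rm_prefix="HLA_",        # assumed prefix, for illustration only
    separator="*",           # assumed separator between gene and allele
    extensive_search=False,  # skip the extensive allele search
    phe=None,                # no phenotype file
    output_header=True,      # include a header row in the output
    population=None,         # no PyPop population identifier
)

# With alleleTools installed, the conversion would then be started with:
# from alleleTools.format.vcf2allele import call_function
# call_function(args)
```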

alleleTools.format.vcf2allele.setup_parser(subparsers)

Set up the argument parser for the vcf2allele command.

Parameters:

subparsers – The subparsers object to add this command to.

Returns:

The configured parser for vcf2allele.

Return type:

argparse.ArgumentParser
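The general argparse pattern behind a function of this shape is sketched below: a subcommand is registered on a shared subparsers object and a handler is attached through `set_defaults`. The function `setup_parser_sketch`, the option spellings, and the stand-in handler are hypothetical. The real command is nested under `altools convert` and may define its options differently.

```python
import argparse


def setup_parser_sketch(subparsers):
    """Register a hypothetical vcf2allele subcommand and return its parser."""
    parser = subparsers.add_parser(
        "vcf2allele", help="convert a VCF file into an allele table"
    )
    parser.add_argument("input", help="path to the input VCF file")
    parser.add_argument("--output", default="out.alt", help="path to the output file")
    parser.add_argument("--extensive_search", action="store_true")
    # Stand-in handler; the real module would attach call_function here
    parser.set_defaults(func=lambda a: f"would convert {a.input} -> {a.output}")
    return parser


# A single-level command line is used here to keep the sketch short
root = argparse.ArgumentParser(prog="altools-sketch")
subparsers = root.add_subparsers(dest="command")
sub = setup_parser_sketch(subparsers)

args = root.parse_args(["vcf2allele", "HLA.vcf", "--output", "out.alt"])
message = args.func(args)  # dispatch to the handler chosen by the subcommand
```

Attaching the handler with `set_defaults(func=...)` lets the top-level program dispatch with `args.func(args)` without needing to know which subcommand was selected.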