CAPalyzer

The CAPalyzer is a package to parse and process the results produced by the main CAP pipeline. It performs two key tasks:

- builds data tables that summarize the results of modules for multiple samples
- parses those data tables and provides processing tools

API

Table Builder

class cap2.capalyzer.table_builder.cap_table_builder.CAPFileSource

This abstract class provides an interface to get filepaths and other raw data.

metadata(): Return a DataFrame containing metadata for the samples.

module_files(module_name, field_name): Return an iterable of 2-tuples of (sample_name, local_path) for modules of the specified type.

sample_names(): Return a list of sample names (strings).
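
As a rough illustration of this interface, the sketch below defines a stand-in abstract base class and one hypothetical in-memory implementation. The names FileSourceSketch and InMemoryFileSource, the constructor argument, the module and field names, and the example paths are all assumptions made for illustration; none of them come from cap2. metadata() is left out so that the sketch has no pandas dependency.

```python
from abc import ABC, abstractmethod


class FileSourceSketch(ABC):
    """Stand-in mirroring two documented CAPFileSource methods (not the cap2 class)."""

    @abstractmethod
    def sample_names(self):
        """Return a list of sample names (strings)."""

    @abstractmethod
    def module_files(self, module_name, field_name):
        """Return an iterable of (sample_name, local_path) 2-tuples."""


class InMemoryFileSource(FileSourceSketch):
    """Hypothetical source backed by a dict: {(sample, module, field): path}."""

    def __init__(self, paths):
        self._paths = dict(paths)

    def sample_names(self):
        # Sorted, de-duplicated sample names taken from the dict keys.
        return sorted({sample for sample, _, _ in self._paths})

    def module_files(self, module_name, field_name):
        # Yield only the entries matching the requested module and field.
        for (sample, module, field), path in sorted(self._paths.items()):
            if module == module_name and field == field_name:
                yield sample, path


# Made-up samples, module names, and paths, for illustration only.
source = InMemoryFileSource({
    ("sample_a", "example_module", "report"): "/tmp/sample_a.report.tsv",
    ("sample_b", "example_module", "report"): "/tmp/sample_b.report.tsv",
    ("sample_b", "other_module", "report"): "/tmp/sample_b.other.tsv",
})
print(source.sample_names())
print(list(source.module_files("example_module", "report")))
```

A real implementation would presumably subclass CAPFileSource itself and resolve paths from wherever the pipeline results are stored.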

class cap2.capalyzer.table_builder.CAPTableBuilder(name, file_source)

This class builds summary tables for a set of samples.

covid_fast_detect_read_counts(): Return a table of read counts by taxa from covid fast detect.

fast_taxa_read_counts(): Return a table of read counts by taxa from fast taxa.

metadata(): Return a metadata table for these samples.

sample_names(): Return a list of sample names (strings).

strain_pileup(organism, sparse=1): Return a table of pileups for a strain.

taxa_read_counts(): Return a table of read counts by taxa.
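
To make the idea of a summary table across samples concrete, here is a minimal, dependency-free sketch of how per-sample read counts might be combined into one table. The function name build_count_table, the samples-as-rows layout, the zero fill for missing taxa, and the counts themselves are assumptions for illustration; the source does not describe the layout of the tables that CAPTableBuilder returns.

```python
def build_count_table(per_sample_counts):
    """Combine {sample: {taxon: count}} into (taxa, rows).

    `taxa` is the sorted list of column names; `rows` maps each sample
    to its counts in that column order. Taxa absent from a sample get 0.
    """
    taxa = sorted({t for counts in per_sample_counts.values() for t in counts})
    rows = {
        sample: [counts.get(t, 0) for t in taxa]
        for sample, counts in sorted(per_sample_counts.items())
    }
    return taxa, rows


# Made-up read counts for two samples.
counts = {
    "sample_a": {"Escherichia coli": 120, "Bacillus subtilis": 5},
    "sample_b": {"Escherichia coli": 80, "Staphylococcus aureus": 12},
}
taxa, rows = build_count_table(counts)
print(taxa)
for sample, row in rows.items():
    print(sample, row)
```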

cap2.capalyzer.table_builder.parsers.parse_pileup(local_path, sparse=1)

Return a pandas DataFrame with info from a pileup file.

sparse is an int >= 1. If sparse is > 1, values will be averaged, making the table smaller.
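
The effect of sparse can be pictured with a small sketch. The source does not specify how values are grouped before averaging, so the version below assumes consecutive, non-overlapping windows of sparse positions; the function name average_in_windows and the coverage values are hypothetical, and this is not the parse_pileup implementation.

```python
def average_in_windows(values, sparse=1):
    """Average consecutive, non-overlapping windows of `sparse` values.

    The last window may be shorter if len(values) is not a multiple of sparse.
    """
    if not isinstance(sparse, int) or sparse < 1:
        raise ValueError("sparse must be an int >= 1")
    averaged = []
    for start in range(0, len(values), sparse):
        window = values[start:start + sparse]
        averaged.append(sum(window) / len(window))
    return averaged


coverage = [10, 12, 14, 20, 22, 30, 40]  # made-up per-position values
print(average_in_windows(coverage, sparse=1))  # same length as the input
print(average_in_windows(coverage, sparse=3))  # roughly one third as many rows
```

Under this assumption, a larger sparse trades positional resolution for a smaller table.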

cap2.capalyzer.table_builder.parsers.parse_taxa_report(local_path, **kwargs)