CAPalyzer
The CAPalyzer is a package to parse and process the results produced by the main CAP pipeline. It performs two key tasks:
- builds data tables that summarize the results of modules for multiple samples
- parses those data tables and provides processing tools
API
Table Builder
- class cap2.capalyzer.table_builder.cap_table_builder.CAPFileSource
This abstract class provides an interface to get filepaths and other raw data.
- metadata()
Return a DataFrame containing metadata for the samples.
- module_files(module_name, field_name)
Return an iterable of 2-tuples (sample_name, local_path) for modules of the specified type.
- sample_names()
Return a list of sample names (strings).
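To make the interface above concrete, here is a minimal sketch of an in-memory file source. Everything in it is an assumption for illustration: `InMemoryFileSource`, its constructor arguments, and the module and field names are hypothetical. It deliberately does not subclass `CAPFileSource`, so it runs without cap2 installed, and its `metadata` returns plain dicts rather than the DataFrame the real interface documents.

```python
class InMemoryFileSource:
    """Hypothetical stand-in mirroring the three CAPFileSource methods."""

    def __init__(self, files, metadata=None):
        # files maps (sample_name, module_name, field_name) -> local_path
        self._files = dict(files)
        # metadata maps sample_name -> dict of metadata fields
        self._metadata = dict(metadata or {})

    def sample_names(self):
        """Return a sorted list of sample names (strings)."""
        return sorted({sample for sample, _, _ in self._files})

    def metadata(self):
        """Return one dict per sample (the real interface returns a DataFrame)."""
        return [
            {"sample_name": name, **self._metadata.get(name, {})}
            for name in self.sample_names()
        ]

    def module_files(self, module_name, field_name):
        """Yield (sample_name, local_path) pairs for one module and field."""
        for (sample, module, field), path in sorted(self._files.items()):
            if module == module_name and field == field_name:
                yield sample, path


# Placeholder module name, field name, and paths, not real CAP outputs.
source = InMemoryFileSource(
    {
        ("sample_a", "example_module", "report"): "/tmp/sample_a.report.tsv",
        ("sample_b", "example_module", "report"): "/tmp/sample_b.report.tsv",
    },
    metadata={"sample_a": {"city": "example"}},
)
for sample_name, local_path in source.module_files("example_module", "report"):
    print(sample_name, local_path)
```

A real implementation would presumably subclass `CAPFileSource` and resolve paths from wherever the pipeline results are stored.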
- class cap2.capalyzer.table_builder.CAPTableBuilder(name, file_source)
This class builds summary tables for a set of samples.
- covid_fast_detect_read_counts()
Return a table of read counts by taxa from covid fast detect.
- fast_taxa_read_counts()
Return a table of read counts by taxa from fast taxa.
- metadata()
Return a metadata table for these samples.
- sample_names()
Return a list of sample names (strings).
- strain_pileup(organism, sparse=1)
Return a table of pileups for a strain.
- taxa_read_counts()
Return a table of read counts by taxa.
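The idea of summarizing per-sample results into one table can be sketched without cap2 itself. The helper below is hypothetical (`build_count_table` is not part of the package), and the layout it produces, samples as rows, taxa as columns, and zero for taxa missing from a sample, is an assumption rather than the documented shape of the tables returned by `CAPTableBuilder`.

```python
def build_count_table(per_sample_counts):
    """Combine {sample: {taxon: reads}} into a dense samples-by-taxa table.

    Assumption: taxa absent from a sample are filled with a count of 0.
    """
    # Collect every taxon seen in any sample, in a stable order.
    taxa = sorted({taxon for counts in per_sample_counts.values() for taxon in counts})
    table = {
        sample: {taxon: counts.get(taxon, 0) for taxon in taxa}
        for sample, counts in per_sample_counts.items()
    }
    return taxa, table


# Made-up read counts, for illustration only.
per_sample = {
    "sample_a": {"Escherichia coli": 120, "Bacillus subtilis": 4},
    "sample_b": {"Escherichia coli": 7},
}
taxa, table = build_count_table(per_sample)
print(taxa)
print(table["sample_b"])
```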
- cap2.capalyzer.table_builder.parsers.parse_pileup(local_path, sparse=1)
Return a pandas DataFrame with info from a pileup file.
sparse is an int >= 1. If sparse is > 1, values will be averaged, making the table smaller.
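One way to picture the effect of sparse is averaging over consecutive windows of that many positions. The sketch below is an assumption about the mechanism, not the package's actual code: `average_windows` is a hypothetical helper, and the source only states that values are averaged when sparse is greater than 1.

```python
def average_windows(values, sparse=1):
    """Average consecutive windows of `sparse` values (hypothetical helper).

    With sparse == 1 every value is kept as is. With sparse > 1 the output
    has roughly len(values) / sparse entries. A shorter final window is
    averaged over the values it actually contains.
    """
    if sparse < 1:
        raise ValueError("sparse must be an int >= 1")
    averaged = []
    for start in range(0, len(values), sparse):
        window = values[start:start + sparse]
        averaged.append(sum(window) / len(window))
    return averaged


# Made-up per-position coverage values.
coverage = [1, 2, 3, 4, 5, 6]
print(average_windows(coverage, sparse=2))
```

Larger values of sparse trade positional resolution for a smaller table.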
- cap2.capalyzer.table_builder.parsers.parse_taxa_report(local_path, **kwargs)