epitopepredict package¶

Submodules¶

epitopepredict.analysis module¶

epitopepredict analysis methods Created September 2013 Copyright (C) Damien Farrell This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

epitopepredict.analysis.align_blast_results(df, aln=None, idkey='accession', productkey='definition')[source]¶: Get gapped alignment from blast results using muscle aligner.

epitopepredict.analysis.alignment_to_dataframe(aln)[source]¶

epitopepredict.analysis.create_nmers(df, genome, length=20, seqkey='translation', key='nmer', how='split', margin=0)[source]¶

Get n-mer peptides surrounding a set of sequences using the host protein sequence.

Parameters:

df – input dataframe with sequence name and start/end coordinates
genome – genome dataframe with host sequences
length – length of nmer to return
seqkey – column name of sequence to be processed
how – method to create the n-mer: ‘split’ will try to split the sequence into overlapping n-mers if its length is larger than the n-mer size; ‘center’ will center the peptide
margin – do not split sequences below length+margin

Returns:

pandas Series with nmer values
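The ‘center’ behaviour described above can be pictured with a minimal sketch. It does not use epitopepredict itself; `center_nmer` is a hypothetical helper, and the coordinate convention (0-based, end-exclusive) is an assumption.

```python
def center_nmer(host_seq, start, end, length=20):
    """Return a window of `length` residues centred on host_seq[start:end].

    Hypothetical helper: coordinates are assumed 0-based and end-exclusive.
    """
    if length >= len(host_seq):
        return host_seq
    mid = (start + end) // 2
    left = mid - length // 2
    # Clamp the window so it stays inside the host sequence.
    left = max(0, min(left, len(host_seq) - length))
    return host_seq[left:left + length]

host = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQ"
nmer = center_nmer(host, start=20, end=29, length=20)
```

The window is shifted rather than truncated near the ends of the host sequence, so the result always has the requested length when the host is long enough.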

epitopepredict.analysis.dbscan(B=None, x=None, dist=7, minsize=4)[source]¶: Use dbscan algorithm to cluster binder positions

epitopepredict.analysis.epitope_conservation(peptides, alnrows=None, proteinseq=None, blastresult=None, blastdb=None, perc_ident=50, equery='srcdb_refseq[Properties]')[source]¶

Find and visualise conserved peptides in a set of aligned sequences.

Parameters:

peptides – a list of peptides/epitopes
alnrows – a dataframe of previously aligned sequences, e.g. custom strains
proteinseq – a sequence to blast and get an alignment for
blastresult – a file of saved blast results in plain csv format
equery – blast query string

Returns:	Matrix of 0 or 1 for conservation for each epitope/protein variant

epitopepredict.analysis.find_clusters(binders, dist=None, min_binders=2, min_size=12, max_size=50, genome=None, colname='peptide')[source]¶

Get clusters of binders for a set of binders.

Parameters:

binders – dataframe of binders
dist – distance over which to apply clustering
min_binders – minimum binders to be considered a cluster
min_size – smallest cluster length to return
max_size – largest cluster length to return
colname – name for cluster sequence column

Returns:	a pandas Series with the new n-mers (may be longer than the initial dataframe if splitting)
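As a rough illustration of clustering binder positions by distance, the sketch below groups sorted positions whenever neighbouring positions lie within `dist` of each other. This is a simple gap-based grouping swapped in for clarity, not the DBSCAN routine the module exposes, and `group_positions` is a hypothetical name.

```python
def group_positions(positions, dist=7, min_binders=2):
    """Group sorted positions whose neighbours lie within `dist` of each other."""
    clusters, current = [], []
    for pos in sorted(positions):
        # A gap larger than `dist` closes the current group.
        if current and pos - current[-1] > dist:
            if len(current) >= min_binders:
                clusters.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_binders:
        clusters.append(current)
    return clusters

clusters = group_positions([3, 5, 9, 40, 44, 100], dist=7, min_binders=2)
```

Groups with fewer than `min_binders` members, such as the isolated position 100 here, are dropped.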

epitopepredict.analysis.find_conserved_peptide(peptide, recs)[source]¶: Find sequences where a peptide is conserved

epitopepredict.analysis.find_conserved_sequences(seqs, alnrows)[source]¶

Find if sub-sequences are conserved in a given set of aligned sequences.

Parameters:

seqs – a list of sequences to find
alnrows – a dataframe of aligned protein sequences

Returns:	a pandas DataFrame of 1 or 0 values for each protein/search sequence
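The 1/0 result can be pictured with plain dictionaries. This is only a sketch under assumptions: `conservation_matrix` and the `aligned` mapping are hypothetical, and conservation is taken to mean an exact substring match after removing alignment gaps.

```python
def conservation_matrix(seqs, aligned):
    """Map each search sequence to {record name: 1 or 0}.

    `aligned` is a hypothetical dict of record name -> aligned sequence;
    alignment gaps ('-') are removed before the substring test.
    """
    ungapped = {name: s.replace("-", "") for name, s in aligned.items()}
    return {
        q: {name: int(q in s) for name, s in ungapped.items()}
        for q in seqs
    }

aligned = {"strainA": "MKT-AYIAKQR", "strainB": "MKTSAYLAKQR"}
matrix = conservation_matrix(["AYIAK", "MKT"], aligned)
```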

epitopepredict.analysis.get_AAcontent(df, colname, amino_acids=None)[source]¶: Amino acid composition for dataframe with sequences

epitopepredict.analysis.get_orthologs(seq, db=None, expect=1, hitlist_size=400, equery=None, email='')[source]¶

Fetch orthologous sequences using remote or local blast and return the records as a dataframe.

Parameters:

seq – sequence to blast
db – the name of a local blast db
expect – expect value
equery – Entrez Gene Advanced Search options (see http://www.ncbi.nlm.nih.gov/books/NBK3837/)

Returns: blast results in a pandas dataframe

epitopepredict.analysis.get_overlaps(df1, df2, label='overlap', how='inside')[source]¶

Overlaps for 2 sets of sequences where the positions in host sequence are stored in each dataframe as ‘start’ and ‘end’ columns

Parameters:

df1 – first set of sequences, a pandas dataframe with columns called start/end or pos
df2 – second set of sequences
label – label for overlaps column
how – may be ‘any’ or ‘inside’

Returns: first DataFrame with no. of overlaps stored in a new column
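The difference between the two `how` modes might look like the following sketch, which works on plain (start, end) tuples instead of dataframes. The exact semantics are an assumption: ‘inside’ is taken to mean that a region from the second set lies entirely within a region of the first, and coordinates are treated as inclusive. `count_overlaps` is a hypothetical name.

```python
def count_overlaps(regions1, regions2, how="inside"):
    """For each (start, end) in regions1, count the matching regions in regions2.

    'inside': the second region lies entirely within the first.
    'any':    the two regions share at least one position.
    """
    counts = []
    for s1, e1 in regions1:
        if how == "inside":
            n = sum(1 for s2, e2 in regions2 if s2 >= s1 and e2 <= e1)
        elif how == "any":
            n = sum(1 for s2, e2 in regions2 if s2 <= e1 and e2 >= s1)
        else:
            raise ValueError("how must be 'inside' or 'any'")
        counts.append(n)
    return counts

clusters = [(10, 30), (50, 60)]
epitopes = [(12, 20), (25, 35), (58, 70)]
inside_counts = count_overlaps(clusters, epitopes, how="inside")
any_counts = count_overlaps(clusters, epitopes, how="any")
```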

epitopepredict.analysis.get_seqdepot(seq)[source]¶: Fetch seqdepot annotation for sequence

epitopepredict.analysis.get_species_name(s)[source]¶: Find [species name] in blast result definition
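BLAST definition lines usually carry the organism in square brackets. A minimal sketch of pulling it out with a regular expression follows; `species_from_definition` is a hypothetical stand-in, not the library function.

```python
import re

def species_from_definition(definition):
    """Return the text inside the first [...] of a BLAST definition, or None."""
    match = re.search(r"\[([^\]]+)\]", definition)
    return match.group(1) if match else None

name = species_from_definition("hypothetical protein [Mycobacterium bovis]")
```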

epitopepredict.analysis.isoelectric_point(df)[source]¶

epitopepredict.analysis.net_charge(df, colname)[source]¶: Net peptide charge for dataframe with sequences

epitopepredict.analysis.peptide_properties(df, colname='peptide')[source]¶: Find hydrophobicity and net charge for peptides

epitopepredict.analysis.prediction_coverage(expdata, binders, key='sequence', perc=50, verbose=False)[source]¶

Determine hit rate of predictions in experimental data by finding how many top peptides are needed to cover % positives.

Parameters:

expdata – dataframe of experimental data with peptide sequence and name column
binders – dataframe of ranked binders created from predictor
key – column name in expdata for sequence

Returns:	fraction of predicted binders required to find perc total response
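One way to read this measure is sketched below: walk down the ranked predictions until the requested percentage of experimental positives has been seen, then report the fraction of the list consumed. This is a simplification that counts every positive equally (the library may weight by response), and `coverage_fraction` is a hypothetical name.

```python
import math

def coverage_fraction(ranked, positives, perc=50):
    """Fraction of the ranked list needed to recover perc% of the positives.

    Returns None if the ranked list never reaches the target.
    """
    needed = math.ceil(len(positives) * perc / 100)
    found = 0
    for i, peptide in enumerate(ranked, start=1):
        if peptide in positives:
            found += 1
        if found >= needed:
            return i / len(ranked)
    return None

ranked = ["A", "B", "C", "D", "E", "F", "G", "H"]
positives = {"B", "D", "G", "H"}
half = coverage_fraction(ranked, positives, perc=50)
```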

epitopepredict.analysis.randomize_dataframe(df, seed=8)[source]¶: Randomize order of dataframe

epitopepredict.analysis.save_to_excel(df, n=94, filename='peptide_lists')[source]¶: Save a dataframe to excel with option of writing in chunks.

epitopepredict.analysis.signalP(infile=None, genome=None)[source]¶: Get signal peptide predictions

epitopepredict.analysis.test()[source]¶

epitopepredict.analysis.test_conservation(label, gname)[source]¶: Conservation analysis

epitopepredict.analysis.test_features()[source]¶: test feature handling

epitopepredict.analysis.testrun(gname)[source]¶

epitopepredict.analysis.tmhmm(fastafile=None, infile=None)[source]¶

Get TMhmm predictions.

Parameters:

fastafile – fasta input file to run
infile – text file with tmhmm prediction output

epitopepredict.app module¶

epitopepredict.base module¶

MHC prediction base module for core classes Created November 2013 Copyright (C) Damien Farrell

class epitopepredict.base.BasicMHCIPredictor(data=None, scoring=None)[source]¶

Bases: epitopepredict.base.Predictor

Built-in basic MHC-I predictor. Should be used as a fallback if no other predictors available.

check_install()[source]¶

get_alleles()[source]¶: Get available alleles - override

predict(peptides, allele='HLA-A*01:01', name='temp', **kwargs)[source]¶: Encode and predict peptides with saved regressor

predict_peptides(peptides, **kwargs)[source]¶: Override so we can call train models before predictions.

predict_sequences(recs, **kwargs)[source]¶: Override so we can call train models before predictions.

prepare_data(df, name, allele)[source]¶: Put raw prediction data into a DataFrame and rank it. Can be overridden for custom processing or custom data.

supported_lengths()[source]¶: Return supported peptide lengths

class epitopepredict.base.DataFrameIterator(files)[source]¶

Bases: object

Simple iterator to load dataframes from a path one at a time, without holding them all in memory

next()[source]¶

class epitopepredict.base.DummyPredictor(data=None, scoring=None)[source]¶

Bases: epitopepredict.base.Predictor

Returns random scores. Used for testing

predict(peptides, allele='HLA-A*01:01', name='temp', **kwargs)[source]¶: Does the actual scoring of a sequence. Should be overridden and should return a pandas DataFrame.

class epitopepredict.base.IEDBMHCIIPredictor(data=None)[source]¶

Bases: epitopepredict.base.Predictor

Using IEDB MHC-II method, requires tools to be installed locally

check_install()[source]¶

get_alleles()[source]¶: Get available alleles - override

predict(sequence=None, peptides=None, length=15, overlap=None, show_cmd=False, allele='HLA-DRB1*01:01', method='IEDB_recommended', name='', **kwargs)[source]¶: Use IEDB MHC-II python module to get predictions. Requires that the IEDB MHC-II tools are installed locally. A sequence argument is provided since the cmd line only accepts whole sequence to be fragmented.

prepare_data(rows, name)[source]¶: Read data from raw output

class epitopepredict.base.IEDBMHCIPredictor(data=None, method='IEDB_recommended')[source]¶

Bases: epitopepredict.base.Predictor

Using IEDB tools method, requires iedb-mhc1 tools. Tested with version 2.17

check_install()[source]¶

get_allele_data()[source]¶

get_alleles()[source]¶: Get available alleles from model_list file and convert to standard names

predict(sequence=None, peptides=None, length=11, overlap=1, allele='HLA-A*01:01', name='', method=None, show_cmd=False, **kwargs)[source]¶

Use IEDB MHCI python module to get predictions. Requires that the IEDB MHC tools are installed locally.

Parameters:

sequence – a sequence to be predicted
peptides – a list of arbitrary peptides instead of single sequence

Returns:	pandas dataframe

prepare_data(rows, name)[source]¶: Prepare data from results

class epitopepredict.base.MHCFlurryPredictor(data=None, **kwargs)[source]¶

Bases: epitopepredict.base.Predictor

Predictor using MHCFlurry for MHC-I predictions. Requires the python package mhcflurry to be installed with its dependencies. See https://github.com/hammerlab/mhcflurry

check_install()[source]¶

convert_allele_name(r)[source]¶

get_alleles()[source]¶: Get available alleles - override

predict(peptides=None, overlap=1, show_cmd=False, allele='HLA-A0101', name='', **kwargs)[source]¶: Uses mhcflurry python classes for prediction

predict_peptides(peptides, **kwargs)[source]¶: Override so we can call train models before predictions.

predict_sequences(recs, **kwargs)[source]¶: Override so we can switch off multi threading.

class epitopepredict.base.NetMHCIIPanPredictor(data=None)[source]¶

Bases: epitopepredict.base.Predictor

netMHCIIpan v3.0 predictor

allele_mapping(allele)[source]¶

check_install()[source]¶

convert_allele_name(a)[source]¶: Convert allele names to internally used form

get_alleles()[source]¶: Get available alleles

predict(peptides, allele='HLA-DRB1*0101', name='temp', pseudosequence=None, show_cmd=False, **kwargs)[source]¶: Call netMHCIIpan command line.

prepare_data(df, name)[source]¶: Prepare netmhciipan results as a dataframe

read_result(temp)[source]¶: Read raw results from netMHCIIpan output

class epitopepredict.base.NetMHCPanPredictor(data=None, scoring='affinity')[source]¶

Bases: epitopepredict.base.Predictor

netMHCpan 4.1b predictor; see http://www.cbs.dtu.dk/services/NetMHCpan/. Default scoring is affinity predictions. To get the newer scoring behaviour, pass scoring=’ligand’ to the constructor.

check_install()[source]¶

convert_allele_name(a)[source]¶: Convert allele names to internally used form

get_alleles()[source]¶: Get available alleles

predict(peptides, allele='HLA-A*01:01', name='temp', pseudosequence=None, show_cmd=False, **kwargs)[source]¶: Call netMHCpan command line.

prepare_data(df, name)[source]¶: Prepare netmhcpan results

read_result(temp)[source]¶: Read raw results from netMHCpan 4.1b output

read_result_legacy(temp)[source]¶: Read raw results from netMHCpan 4.0 output

class epitopepredict.base.Predictor(data=None)[source]¶

Bases: object

Base class to handle generic predictor methods, usually these will wrap methods from other modules and/or call command line predictors. Subclass for specific functionality

allele_summary(cutoff=5)[source]¶: Allele based summary

check_alleles(alleles)[source]¶

check_install()[source]¶

cleanup()[source]¶: Remove temp files from predictions

evaluate(df, key, value, operator='<')[source]¶: Evaluate binders less than or greater than a cutoff. This method is called by all predictors to get binders
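The idea of comparing a column against a cutoff with a selectable operator can be sketched without pandas. Here rows are plain dicts and `filter_by_cutoff` is a hypothetical helper, so treat this as an illustration rather than the method's actual code.

```python
import operator

# Map the operator strings onto comparison functions.
OPS = {"<": operator.lt, ">": operator.gt}

def filter_by_cutoff(rows, key, value, op="<"):
    """Keep the rows whose `key` field compares true against `value`."""
    compare = OPS[op]
    return [row for row in rows if compare(row[key], value)]

rows = [
    {"peptide": "SIINFEKLA", "score": 35.0},
    {"peptide": "GILGFVFTL", "score": 820.0},
]
below = filter_by_cutoff(rows, "score", 500, op="<")
above = filter_by_cutoff(rows, "score", 500, op=">")
```

Which direction counts as a binder depends on the scoring scale of the predictor, which is why the operator is a parameter.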

format_row(x)[source]¶

get_allele_cutoffs(cutoff=0.95)[source]¶: Get per allele percentile cutoffs using precalculated quantile values.

get_alleles()[source]¶: Get available alleles - override

get_binders(cutoff=0.95, cutoff_method='default', path=None, name=None, drop_columns=False, limit=None, **kwargs)[source]¶

Get the top scoring binders. If using the default method, cutoffs are derived from the pre-defined percentile cutoffs for some known antigens. For per-protein cutoffs the rank can be used instead. This will give slightly different results.

Parameters:

path – use results in a path instead of loading at once, conserves memory
cutoff – percentile cutoff (default), absolute score or a rank value within each sequence
cutoff_method – ‘default’, ‘score’ or ‘rank’
name – name of a specific protein/sequence

Returns:	binders above cutoff in all alleles, pandas dataframe

get_global_rank(score, allele)[source]¶: Get an allele specific score percentile from precalculated quantile data.

get_names()[source]¶: Get names of sequences currently stored as predictions

get_quantile_data()[source]¶: Get peptide score rank from quantile data.

get_ranking(df)[source]¶: Add a ranking column according to scorekey

get_scores(allele)[source]¶: Return peptides and scores only for an allele

get_unique_cores(binders=False)[source]¶: Get only unique cores

load(path=None, names=None, compression='infer', file_limit=None)[source]¶: Load results from path or single file. See results_from_csv for args.

plot(name, **kwargs)[source]¶

Use module level plotting.mpl_plot_tracks method for predictor plot.

Parameters:

name – name of the protein/sequence to plot
n – min no. of alleles to be visible
perc – percentile cutoff for score
cutoff_method – method to use for cutoffs

predict(sequence=None, peptides=None, length=9, overlap=1, allele='', name='')[source]¶: Does the actual scoring of a sequence. Should be overridden and should return a pandas DataFrame.

predict_peptides(peptides, threads=1, path=None, overwrite=True, name=None, **kwargs)[source]¶

Predict a set of individual peptides without splitting them. This is a wrapper for _predict_peptides to allow multiprocessing.

Parameters:

peptides – list of peptides
alleles – list of alleles to predict
drop_columns – only keep default columns

Returns:	dataframe with results

predict_proteins(args, **kwargs)[source]¶: Alias to predict_sequences

predict_sequences(recs, alleles=[], path=None, verbose=False, names=None, key='locus_tag', seqkey='translation', threads=1, **kwargs)[source]¶

Get predictions for a set of proteins over multiple alleles that allows running in parallel using the threads parameter. This is a wrapper for _predictSequences with the same args.

Parameters:

recs – list or dataframe with sequences
path – if provided, save results to this file
threads – number of processors
key – seq/protein name key
seqkey – key for sequence column
length – length of peptide to split sequence into

Returns:

a dataframe of predictions over multiple proteins

prepare_data(result, name, allele)[source]¶: Put raw prediction data into a DataFrame and rank it. Can be overridden for custom processing or custom data.

print_heading()[source]¶

promiscuous_binders(binders=None, name=None, cutoff=0.95, cutoff_method='default', n=1, unique_core=True, limit=None, **kwargs)[source]¶

Get promiscuous binders. If no binders are provided, the cutoff parameters are passed to get_binders.

Parameters:

binders – can provide a precalculated list of binders
name – specific protein, optional
value – to pass to get_binders
cutoff_method – ‘rank’, ‘score’ or ‘global’
cutoff – cutoff for get_binders (rank, score or percentile)
n – min number of alleles
unique_core – removes peptides with duplicate cores and picks the most promiscuous and highest ranked, used for mhc-II predictions
limit – limit the number of peptides per protein, default None

Returns:	a pandas dataframe
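The notion of promiscuity used here, a peptide binding in at least `n` alleles, can be illustrated with a small counting sketch. The input format (a list of (peptide, allele) pairs that already passed the cutoff) and the name `promiscuous` are assumptions made for the example.

```python
from collections import defaultdict

def promiscuous(binders, n=2):
    """Return {peptide: number of alleles} for peptides bound by >= n alleles."""
    alleles = defaultdict(set)
    for peptide, allele in binders:
        # A set ignores repeated (peptide, allele) pairs.
        alleles[peptide].add(allele)
    return {p: len(a) for p, a in alleles.items() if len(a) >= n}

binders = [
    ("PEPA", "HLA-DRB1*01:01"),
    ("PEPA", "HLA-DRB1*03:01"),
    ("PEPB", "HLA-DRB1*01:01"),
    ("PEPA", "HLA-DRB1*01:01"),
]
result = promiscuous(binders, n=2)
```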

protein_summary()[source]¶

proteins()[source]¶

ranked_binders(names=None, how='median', cutoff=None)[source]¶

Get the median/mean rank of each binder over all alleles.

Parameters:

names – list of protein names, otherwise all current data used
how – method to use for rank selection: ‘median’ (default), ‘best’ or ‘mean’
cutoff – apply a rank cutoff if we want to filter (optional)
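The three `how` options can be compared on toy data. In this sketch ranks are held in a dict of lists, lower ranks are assumed to be better (so ‘best’ maps to `min`), and `summarise_ranks` is a hypothetical name.

```python
from statistics import mean, median

def summarise_ranks(ranks_by_peptide, how="median"):
    """Collapse per-allele ranks to one value per peptide, lowest first."""
    funcs = {"median": median, "mean": mean, "best": min}
    func = funcs[how]
    summary = [(pep, func(ranks)) for pep, ranks in ranks_by_peptide.items()]
    return sorted(summary, key=lambda item: item[1])

ranks = {"PEPA": [1, 5, 3], "PEPB": [2, 2, 8]}
by_median = summarise_ranks(ranks, how="median")
by_mean = summarise_ranks(ranks, how="mean")
by_best = summarise_ranks(ranks, how="best")
```

The ordering of the two peptides flips between the median and the mean, which shows why the choice of method matters when one allele gives an outlying rank.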

reshape(name=None)[source]¶: Return pivoted data over alleles for summary use

save(prefix='_', filename=None, compression=None)[source]¶

Save all current predictions dataframe with some metadata.

Parameters:

prefix – if writing to a path, the prefix name
filename – if saving all to a single file
compression – a string representing the compression to use; allowed values are ‘gzip’, ‘bz2’, ‘xz’

save_msgpack(filename=None)[source]¶: Save as msgpack format - experimental

seqs_to_dataframe(seqs)[source]¶

summarize()[source]¶: Summarise currently loaded data

supported_lengths()[source]¶: Return supported peptide lengths

class epitopepredict.base.TEpitopePredictor(data=None, **kwargs)[source]¶

Bases: epitopepredict.base.Predictor

Predictor using TepitopePan QM method

check_alleles(alleles)[source]¶

get_alleles()[source]¶: Get available alleles - override

predict(peptides=None, length=9, overlap=1, allele='HLA-DRB1*0101', name='', pseudosequence=None, **kwargs)[source]¶: Does the actual scoring of a sequence. Should be overridden and should return a pandas DataFrame.

supported_lengths()[source]¶: Return supported peptide lengths

epitopepredict.base.check_snap()[source]¶: Check if inside a snap

epitopepredict.base.clean_sequence(seq)[source]¶: clean a sequence of invalid characters before prediction

epitopepredict.base.compare_predictors(p1, p2, by='allele', cutoff=5, n=2)[source]¶

Compare predictions from 2 different predictors.

Parameters:

p1, p2 – predictors with prediction results for the same set of sequences and alleles
by – how to group the correlation plots

epitopepredict.base.first(x)[source]¶

epitopepredict.base.get_coords(df)[source]¶: Get start end coords from position and length of peptides

epitopepredict.base.get_dqp_list(a)[source]¶: Get DQP list in standard format

epitopepredict.base.get_drb_list(a)[source]¶: Get DRB list in standard format

epitopepredict.base.get_filenames(path, names=None, file_limit=None)[source]¶

epitopepredict.base.get_iedb_request(seq, alleles='HLA-DRB1*01:01', method='consensus3')[source]¶

epitopepredict.base.get_length(data)[source]¶: Get peptide length of a dataframe of predictions

epitopepredict.base.get_nearest(df)[source]¶: Get nearest binder

epitopepredict.base.get_overlapping(index, s, length=9, cutoff=25)[source]¶: Get all mutually overlapping kmers within a cutoff area

epitopepredict.base.get_pos(x)[source]¶

epitopepredict.base.get_predictor(name='tepitope', **kwargs)[source]¶: Get a predictor object using its name. Valid predictor names are held in the predictors attribute.

epitopepredict.base.get_predictor_classes()[source]¶: Get predictor classes in this module.

epitopepredict.base.get_preset_alleles(name)[source]¶: A list of the possible preset alleles

epitopepredict.base.get_quantiles(predictor)[source]¶

Get quantile score values per allele in a set of predictions. Used for making pre-defined cutoffs.

Parameters:

predictor – predictor with set of predictions

epitopepredict.base.get_sequence(seqfile)[source]¶: Get sequence from fasta file

epitopepredict.base.get_standard_mhc1(name)[source]¶: Taken from iedb mhc1 utils.py

epitopepredict.base.get_standard_mhc2(x)[source]¶

epitopepredict.base.plot_summary_heatmap(p, kind='default', name=None)[source]¶: Plot heatmap of binders using summary dataframe.

epitopepredict.base.predict_peptides_worker(P, recs, kwargs)[source]¶

epitopepredict.base.predict_proteins_worker(P, recs, kwargs)[source]¶

epitopepredict.base.protein_summary(pred, peptides, name)[source]¶: formatted protein summary table

epitopepredict.base.read_defaults()[source]¶: Get some global settings such as program paths from config file

epitopepredict.base.reshape_data(pred, peptides=None, name=None, values='score')[source]¶

Create summary table per binder/allele with cutoffs applied.

Parameters:

pred – predictor with data
cutoff – percentile cutoff
n – number of alleles

epitopepredict.base.results_from_csv(path=None, names=None, compression='infer', file_limit=None)[source]¶

Load results from multiple csv files in a folder or from a single file.

Parameters:

path – name of a csv file or directory with one or more csv files
names – names of proteins to load
file_limit – limit loading to this number of proteins

epitopepredict.base.seq_from_binders(df)[source]¶

epitopepredict.base.sequence_from_peptides(df)[source]¶: Derive sequence from set of peptides
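Deriving a sequence from overlapping peptides amounts to appending, for each peptide in position order, only the residues that extend past what has already been rebuilt. The sketch below assumes (pos, peptide) pairs with 0-based positions and no uncovered gaps; `rebuild_sequence` is a hypothetical name and not the library routine.

```python
def rebuild_sequence(peptides):
    """Rebuild a sequence from (pos, peptide) pairs covering it without gaps."""
    seq = ""
    for pos, pep in sorted(peptides):
        if pos > len(seq):
            raise ValueError("gap before position %d" % pos)
        # Append only the part of the peptide beyond the current end.
        seq += pep[len(seq) - pos:]
    return seq

rebuilt = rebuild_sequence([(2, "TAYIA"), (0, "MKTAY"), (1, "KTAYI")])
```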

epitopepredict.base.set_netmhcpan_cmd(path=None)[source]¶: Setup the netmhcpan command to point directly to the binary. This is a workaround for running inside snaps. Avoids using the tcsh script.

epitopepredict.base.split_peptides(df, length=9, seqkey='sequence', newcol='peptide')[source]¶: Split sequences in a dataframe into peptide fragments

epitopepredict.base.summarize(data)[source]¶: Summarise prediction data

epitopepredict.base.summarize_by_protein(pred, pb)[source]¶: Heatmaps or tables of binders per protein/allele

epitopepredict.base.write_fasta(sequences, id=None, filename='tempseq.fa')[source]¶: Write a fasta file of sequences
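For reference, the FASTA layout itself is simple: a `>` header line followed by the sequence, optionally wrapped. The following sketch writes to any file-like object; `write_fasta_records` and its `width` parameter are hypothetical and unrelated to the signature above.

```python
import io

def write_fasta_records(records, handle, width=60):
    """Write (name, sequence) pairs in FASTA format, wrapping sequence lines."""
    for name, seq in records:
        handle.write(">" + name + "\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")

buffer = io.StringIO()
write_fasta_records([("pep1", "SIINFEKL"), ("pep2", "GILGFVFTL")], buffer)
fasta_text = buffer.getvalue()
```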

epitopepredict.cluster module¶

epitopepredict.config module¶

epitopepredict config Created March 2016 Copyright (C) Damien Farrell

epitopepredict.config.check_options(opts)[source]¶: Check for missing default options in dict. Meant to handle incomplete config files

epitopepredict.config.create_config_parser_from_dict(data=None, sections=['base', 'iedbtools'], **kwargs)[source]¶: Helper method to create a ConfigParser from a dict of the form shown in baseoptions

epitopepredict.config.get_options(cp)[source]¶: Makes sure boolean opts are parsed
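The pattern behind these two helpers, building a ConfigParser from a nested dict and then converting boolean-like strings, can be sketched with the standard library alone. The option names and values in `example_options` are invented for illustration and are not the package defaults; `parser_from_dict` and `options_from_parser` are hypothetical names.

```python
import configparser

# Invented example values, not the package's real defaults.
example_options = {
    "base": {"predictors": "tepitope", "verbose": "no", "cutoff": "0.95"},
    "iedbtools": {"iedbmhc1_path": ""},
}

def parser_from_dict(data):
    """Create a ConfigParser with one section per top-level key."""
    cp = configparser.ConfigParser()
    cp.read_dict(data)
    return cp

def options_from_parser(cp):
    """Flatten all sections into one dict, converting yes/no style values to bool."""
    options = {}
    for section in cp.sections():
        for key, value in cp.items(section):
            if value.lower() in cp.BOOLEAN_STATES:
                options[key] = cp.getboolean(section, key)
            else:
                options[key] = value
    return options

opts = options_from_parser(parser_from_dict(example_options))
```

Note that `BOOLEAN_STATES` also contains `'0'` and `'1'`, so in this sketch an option holding exactly those strings would be turned into a bool.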

epitopepredict.config.parse_config(conffile=None)[source]¶: Parse a configparser file

epitopepredict.config.print_options(options)[source]¶: Print option key/value pairs

epitopepredict.config.write_config(conffile='default.conf', defaults={})[source]¶: Write a default config file

epitopepredict.config.write_default_config()[source]¶: Write a default config to users .config folder. Used to add global settings.

epitopepredict.neo module¶

Command line script for neo epitope prediction Created March 2018 Copyright (C) Damien Farrell

class epitopepredict.neo.NeoEpitopeWorkFlow(opts={})[source]¶

Bases: object

Class for implementing a neo epitope workflow.

combine_samples(labels)[source]¶: Put peptides from multiple files in one table

get_file_labels(files)[source]¶

run()[source]¶: Run workflow for multiple samples and prediction methods.

setup()[source]¶: Setup main parameters

epitopepredict.neo.anchor_mutated(x)[source]¶

epitopepredict.neo.check_ensembl(release='75')[source]¶: Check pyensembl ref genome cached. Needed for running in snap

epitopepredict.neo.check_imports()[source]¶

epitopepredict.neo.combine_wt_scores(x, y, key)[source]¶

Combine mutant peptide and matching wt/self binding scores from a set of predictions. Assumes both dataframes were run with the same alleles.

Parameters:

x, y – pandas dataframes with matching prediction results
key –

epitopepredict.neo.dataframe_to_vcf(df, outfile)[source]¶: Write a dataframe of variants to a simple vcf file. The dataframe requires the following columns: ‘#CHROM’, ‘POS’, ‘ID’, ‘REF’, ‘ALT’
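A stripped-down version of such a file can be produced as follows. The sketch writes only the five columns named above from a list of dicts; a fully spec-compliant VCF would also carry QUAL, FILTER and INFO columns. `variants_to_vcf` is a hypothetical helper, not the library function.

```python
import io

VCF_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT"]

def variants_to_vcf(rows, handle):
    """Write variant rows (dicts keyed by VCF_COLUMNS) as tab-separated text."""
    handle.write("##fileformat=VCFv4.2\n")
    handle.write("\t".join(VCF_COLUMNS) + "\n")
    for row in rows:
        handle.write("\t".join(str(row[c]) for c in VCF_COLUMNS) + "\n")

variants = [{"#CHROM": "1", "POS": 12345, "ID": ".", "REF": "A", "ALT": "G"}]
buffer = io.StringIO()
variants_to_vcf(variants, buffer)
vcf_lines = buffer.getvalue().splitlines()
```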

epitopepredict.neo.effects_to_dataframe(effects)[source]¶

epitopepredict.neo.effects_to_pickle(effects, filename)[source]¶: serialize variant effects collections

epitopepredict.neo.fetch_ensembl_release(path=None, release='75')[source]¶: Get pyensembl genome files

epitopepredict.neo.find_matches(df, blastdb, cpus=4, verbose=False)[source]¶

Get similarity measures for peptides to a self proteome. Does a local blast to the proteome and finds the most similar matches. These can then be scored.

Parameters:

df – dataframe of peptides
blastdb – path to protein blastdb

Returns: dataframe with extra columns ‘sseq’, ‘mismatch’

epitopepredict.neo.get_alleles(f)[source]¶: Get input alleles

epitopepredict.neo.get_closest_match(x)[source]¶: Create columns with closest matching peptide. If no wt peptide use self match. vector method

epitopepredict.neo.get_closest_matches(df, verbose=False, cpus=1)[source]¶: Find peptide similarity metrics

epitopepredict.neo.get_mutant_sequences(variants=None, effects=None, reference=None, peptides=True, drop_duplicates=True, length=11, verbose=False)[source]¶

Get mutant proteins or peptide fragments from a vcf or maf file.

Parameters:

variants – varcode variant collection
effects – non-synonymous effects, alternative to variants
peptides – get peptide fragments around mutation

Returns:	pandas dataframe with mutated peptide sequence and source information

epitopepredict.neo.get_variant_class(effect)[source]¶

epitopepredict.neo.get_variants_effects(variants, verbose=False, gene_expression_dict=None)[source]¶

Get all effects from a list of variants.

Returns: list of varcode variant effect objects

epitopepredict.neo.load_variants(vcf_file=None, maf_file=None, max_variants=None)[source]¶: Load variants from vcf file

epitopepredict.neo.make_blastdb(url, name=None, filename=None, overwrite=False)[source]¶: Download protein sequences and make a blast db. Uses datacache module.

epitopepredict.neo.make_human_blastdb()[source]¶: Human proteome blastdb

epitopepredict.neo.make_virus_blastdb()[source]¶: Human virus blastdb

epitopepredict.neo.pbmec_score(seq1, seq2)[source]¶: Score with PBMEC matrix

epitopepredict.neo.peptides_from_effect(eff, length=11, peptides=True, verbose=False)[source]¶

Get mutated peptides from a single effect object.

Returns: dataframe with peptides and variant info

epitopepredict.neo.plot_variant_summary(data)[source]¶

epitopepredict.neo.predict_binding(df, predictor='netmhcpan', alleles=[], verbose=False, cpus=1, cutoff=0.95, cutoff_method='default')[source]¶

Predict binding scores for mutated and wt peptides (if present) from supplied variants.

Parameters:

df – pandas dataframe with peptide sequences; requires at least 2 columns: ‘peptide’ (the mutant peptide) and ‘wt’ (a corresponding wild type peptide). This data could be generated from get_mutant_sequences or from an external program.
predictor – mhc binding prediction method
alleles – list of alleles

Returns: dataframe with mutant and wt binding scores for all alleles

epitopepredict.neo.print_help()[source]¶

epitopepredict.neo.read_names(filename)[source]¶: read plain text file of items

epitopepredict.neo.run_vep(vcf_file, out_format='vcf', assembly='GRCh38', cpus=4, path=None)[source]¶: Run ensembl VEP on a vcf file for use with pvacseq. See https://www.ensembl.org/info/docs/tools/vep/script/index.html

epitopepredict.neo.score_peptides(df, rf=None)[source]¶: Score peptides with a classifier. Returns a prediction probability.

epitopepredict.neo.self_matches(df, **kwargs)[source]¶

epitopepredict.neo.self_similarity(x, matrix='blosum62')[source]¶

epitopepredict.neo.show_predictors()[source]¶

epitopepredict.neo.summary_plots(df)[source]¶: summary plots for testing results

epitopepredict.neo.test_run()[source]¶: Test run for sample vcf file

epitopepredict.neo.varcode_test()[source]¶

epitopepredict.neo.variants_from_csv(csv_file, sample_id=None, reference=None)[source]¶

Variants from csv file.

Parameters:

csv_file – csv file with the following column names: chromosome, position, reference_allele, alt_allele, gene_name, transcript_id, sample_id
sample_id – if provided, select variants only for this id
reference – ref genome used for variant calling

epitopepredict.neo.virus_matches(df, **kwargs)[source]¶

epitopepredict.neo.virus_similarity(x, matrix='blosum62')[source]¶

epitopepredict.neo.wt_similarity(x, matrix='blosum62')[source]¶

epitopepredict.peptutils module¶

Module implementing peptide sequence/structure utilities. Created March 2013 Copyright (C) Damien Farrell

epitopepredict.peptutils.compare_anchor_positions(x1, x2)[source]¶: Check if anchor positions in 9-mers are mutated
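A check of this kind could look like the sketch below. It assumes the anchor residues of a 9-mer are positions 2 and 9 (0-based indices 1 and 8), which is a common MHC-I convention but not confirmed by this page; `anchors_mutated` is a hypothetical name.

```python
def anchors_mutated(wt, mutant, anchors=(1, 8)):
    """True if two 9-mers differ at any anchor index (0-based; P2 and P9 assumed)."""
    if len(wt) != 9 or len(mutant) != 9:
        raise ValueError("both peptides must be 9-mers")
    return any(wt[i] != mutant[i] for i in anchors)

at_anchor = anchors_mutated("SIINFEKLA", "SLINFEKLA")   # change at P2
off_anchor = anchors_mutated("SIINFEKLA", "SIIAFEKLA")  # change at P4
```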

epitopepredict.peptutils.create_fragments(protfile=None, seq=None, length=9, overlap=1, quiet=True)[source]¶: generate peptide fragments from a sequence

epitopepredict.peptutils.create_random_peptides(size=100, length=9)[source]¶: Create random peptide structures of given length

epitopepredict.peptutils.create_random_sequences(size=100, length=9)[source]¶: Create library of all possible peptides given length

epitopepredict.peptutils.get_AAfraction(seq, amino_acids=None)[source]¶: Get fraction of given amino acids in a sequence
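The calculation is a simple count divided by the sequence length. In the sketch below the amino acids must be passed explicitly, since the default set used by the library when `amino_acids=None` is not stated here; `aa_fraction` is a hypothetical name.

```python
def aa_fraction(seq, amino_acids):
    """Fraction of residues in `seq` that belong to `amino_acids`."""
    if not seq:
        return 0.0
    wanted = set(amino_acids)
    return sum(1 for aa in seq if aa in wanted) / len(seq)

basic = aa_fraction("KKAARH", "KRH")
```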

epitopepredict.peptutils.get_AAsubstitutions(template)[source]¶

Get all the possible sequences from substituting every AA into the given sequence at each position. This gives a total of 19 × n sequences for n amino acid positions.
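The 19 × n count follows from replacing each of the n positions with each of the 19 other standard amino acids. A minimal sketch, with `single_substitutions` as a hypothetical name:

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def single_substitutions(template):
    """Return every sequence that differs from `template` at exactly one position."""
    variants = []
    for i, original in enumerate(template):
        for aa in AMINO_ACIDS:
            if aa != original:
                variants.append(template[:i] + aa + template[i + 1:])
    return variants

variants = single_substitutions("SIINFEKL")
```

For this 8-residue template the list holds 19 × 8 = 152 distinct sequences.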

epitopepredict.peptutils.get_all_fragments(exp, length=11)[source]¶

epitopepredict.peptutils.get_fragments(seq=None, overlap=1, length=11, **kwargs)[source]¶

Generate peptide fragments from a sequence.

Returns: dataframe of peptides with position column
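Fragment generation is a sliding window over the sequence. The sketch below returns (position, peptide) tuples instead of a dataframe and uses a `step` parameter; whether the library's `overlap` argument plays exactly this role is an assumption. `sliding_fragments` is a hypothetical name.

```python
def sliding_fragments(seq, length=11, step=1):
    """Return (pos, peptide) pairs for every full-length window, `step` apart."""
    return [
        (i, seq[i:i + length])
        for i in range(0, len(seq) - length + 1, step)
    ]

frags = sliding_fragments("ABCDEFGH", length=5, step=1)
```

Windows that would run past the end of the sequence are not emitted, so a sequence shorter than `length` yields an empty list.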

epitopepredict.peptutils.main()[source]¶

epitopepredict.peptutils.net_charge(seq)[source]¶: Get net charge of a peptide sequence
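A crude approximation of net charge at neutral pH counts basic residues as +1 and acidic residues as -1. The sketch below ignores histidine, the termini and pKa values, so it should be read as an illustration of the idea and not as the formula this function uses; `simple_net_charge` is a hypothetical name.

```python
def simple_net_charge(seq):
    """Count K/R as +1 and D/E as -1; everything else is treated as neutral."""
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative

charge = simple_net_charge("SIINFEKL")
```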

epitopepredict.plotting module¶

epitopepredict plotting Created February 2016 Copyright (C) Damien Farrell

epitopepredict.plotting.binders_to_coords(df)[source]¶: Convert binder results to dict of coords for plotting

epitopepredict.plotting.bokeh_pie_chart(df, title='', radius=0.5, width=400, height=400, palette='Spectral')[source]¶: Bokeh pie chart

epitopepredict.plotting.bokeh_plot_bar(preds, name=None, allele=None, title='', width=None, height=100, palette='Set1', tools=True, x_range=None)[source]¶: Plot bars combining one or more prediction results for a set of peptides in a protein/sequence

epitopepredict.plotting.bokeh_plot_grid(pred, name=None, width=None, palette='Blues', **kwargs)[source]¶: Plot heatmap of binding results for a predictor.

epitopepredict.plotting.bokeh_plot_sequence(preds, name=None, n=2, cutoff=0.95, cutoff_method='default', width=1000, color_sequence=False, title='')[source]¶: Plot sequence view of binders

epitopepredict.plotting.bokeh_plot_tracks(preds, title='', n=2, name=None, cutoff=0.95, cutoff_method='default', width=None, height=None, x_range=None, tools=True, palette='Set1', seqdepot=None, exp=None)[source]¶

Plot binding predictions as parallel tracks of blocks for each allele. This uses Bokeh.

Parameters:

title – plot title
n – min alleles to display
name – name of protein to show if more than one in data

Returns:

a bokeh figure for embedding or displaying in a notebook

epitopepredict.plotting.bokeh_summary_plot(df, savepath=None)[source]¶: Summary plot

epitopepredict.plotting.bokeh_test(n=20, height=400)[source]¶

epitopepredict.plotting.bokeh_vbar(x, height=200, title='', color='navy')[source]¶

epitopepredict.plotting.draw_labels(labels, coords, ax)[source]¶: Add labels on axis

epitopepredict.plotting.get_bokeh_colors(palette='Set1')[source]¶

epitopepredict.plotting.get_seq_from_binders(P, name=None)[source]¶: Get sequence from binder data. Probably better to store the sequences in the object?

epitopepredict.plotting.get_seqdepot_annotation(genome, key='pfam27')[source]¶: Get seqdepot annotations for a set of proteins in dataframe.

epitopepredict.plotting.get_sequence_colors(seq)[source]¶: Get colors for a sequence

epitopepredict.plotting.plot_bars(P, name, chunks=1, how='median', cutoff=20, color='black')[source]¶: Bar plots for sequence using median/mean/total scores. :param P: predictor with data :param name: name of protein sequence :param chunks: break sequence up into 1 or more chunks :param how: method to calculate score bar value :param cutoff: percentile cutoff to show peptide

epitopepredict.plotting.plot_bcell(plot, pred, height, ax=None)[source]¶: Line plot of iedb bcell results

epitopepredict.plotting.plot_binder_map(P, name, values='rank', cutoff=20, chunks=1, cmap=None)[source]¶: Plot heatmap of binders above a cutoff by rank or score. :param P: predictor object with data :param name: name of protein to plot :param values: data column to use for plot data, ‘score’ or ‘rank’ :param cutoff: cutoff if using rank as values :param chunks: number of plots to split the sequence into

epitopepredict.plotting.plot_heatmap(df, ax=None, figsize=(6, 6), **kwargs)[source]¶: Plot a generic heatmap

epitopepredict.plotting.plot_multiple(preds, names, kind='tracks', regions=None, genome=None, **kwargs)[source]¶: Plot results for multiple proteins

epitopepredict.plotting.plot_overview(genome, coords=None, cols=2, colormap='Paired', legend=True, figsize=None)[source]¶: Plot regions of interest in a group of protein sequences. Useful for seeing how your binders/epitopes are distributed in a small genome or subset of genes. :param genome: dataframe with protein sequences :param coords: a list/dict of tuple lists of the form {protein name: [(start,length)..]} :param cols: number of columns for plot, integer
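
The `coords` structure described above can be written out as follows. The protein names and numbers are made up for illustration, and the helper `to_intervals`, which converts (start, length) pairs to (start, end) pairs, is hypothetical rather than part of the package.

```python
# Regions of interest keyed by protein name, each given as (start, length).
coords = {
    "proteinA": [(10, 9), (45, 15)],
    "proteinB": [(3, 11)],
}


def to_intervals(coords):
    """Convert {name: [(start, length), ...]} to {name: [(start, end), ...]}."""
    return {name: [(start, start + length) for start, length in regions]
            for name, regions in coords.items()}


print(to_intervals(coords))
```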

epitopepredict.plotting.plot_regions(coords, ax, color='red', label='', alpha=0.6)[source]¶: Highlight regions in a prot binder plot

epitopepredict.plotting.plot_seqdepot(annotation, ax)[source]¶: Plot seqdepot annotations - replace with generic plot coords track

epitopepredict.plotting.plot_tracks(preds, name, n=1, cutoff=0.95, cutoff_method='default', regions=None, legend=False, colormap='Paired', figsize=None, ax=None, **kwargs)[source]¶: Plot binders as bars per allele using matplotlib. :param preds: list of one or more predictors :param name: name of protein to plot :param n: number of alleles binder should be found in to be displayed :param cutoff: percentile cutoff to determine binders to show

epitopepredict.plotting.seqdepot_to_coords(sd, key='pfam27')[source]¶: Convert seqdepot annotations to coords for plotting

epitopepredict.sequtils module¶

Sequence utilities and genome annotation methods. Created November 2013. Copyright (C) Damien Farrell.

epitopepredict.sequtils.alignment_to_dataframe(aln)[source]¶: Sequence alignment to dataframe

epitopepredict.sequtils.blast_sequences(database, seqs, labels=None, **kwargs)[source]¶

Blast a set of sequences to a local or remote blast database

Parameters:

database – local or remote blast db name; ‘nr’, ‘refseq_protein’, ‘pdb’ and ‘swissprot’ are valid remote dbs
seqs – sequences to query, list of strings or Bio.SeqRecords
labels – list of id names for sequences, optional but recommended

Returns:

pandas dataframe with top blast results

epitopepredict.sequtils.check_tags(df)[source]¶: Check genbank tags to make sure they are not empty

epitopepredict.sequtils.clustal_alignment(filename=None, seqs=None, command='clustalw')[source]¶: Align 2 sequences with clustal

epitopepredict.sequtils.convert_sequence_format(infile, outformat='embl')[source]¶: convert sequence files using SeqIO

epitopepredict.sequtils.dataframe_to_fasta(df, seqkey='translation', idkey='locus_tag', descrkey='description', outfile='out.faa')[source]¶: Genbank features to fasta file

epitopepredict.sequtils.dataframe_to_seqrecords(df, seqkey='sequence', idkey='id')[source]¶: dataframe to list of Bio.SeqRecord objects

epitopepredict.sequtils.distance_tree(filename=None, seqs=None, ref=None)[source]¶: Basic phylogenetic tree for an alignment

epitopepredict.sequtils.draw_genome_map(infile, filename=None)[source]¶: Draw whole circular genome

epitopepredict.sequtils.embl_to_dataframe(infile, cds=False)[source]¶

epitopepredict.sequtils.ete_tree(aln)[source]¶: Tree showing alleles

epitopepredict.sequtils.fasta_format_from_feature(feature)[source]¶: Get fasta formatted sequence from a genome feature

epitopepredict.sequtils.fasta_to_dataframe(infile, header_sep=None, key='locus_tag', seqkey='translation')[source]¶: Get fasta proteins into dataframe

epitopepredict.sequtils.features_summary(df)[source]¶: Genbank dataframe summary

epitopepredict.sequtils.features_to_dataframe(recs, cds=False, select='all')[source]¶: Get genome records from a biopython features object into a dataframe returns a dataframe with a row for each cds/entry. :param recs: seqrecords object :param cds: only return cds :param select: ‘first’ record or ‘all’

epitopepredict.sequtils.fetch_protein_sequences(searchterm, filename='found.fa')[source]¶

Fetch protein seqs using ncbi esearch and save results to a fasta file.

Parameters:

searchterm – entrez search term
filename – fasta file name to save results

Returns:

sequence records as a dataframe

epitopepredict.sequtils.find_keyword(f)[source]¶: Get keyword from a field

epitopepredict.sequtils.format_alignment(aln)[source]¶

epitopepredict.sequtils.genbank_to_dataframe(infile, cds=False)[source]¶: Get genome records from a genbank file into a dataframe returns a dataframe with a row for each cds/entry

epitopepredict.sequtils.get_blast_results(filename)[source]¶: Get blast results into dataframe. Assumes column names from local_blast method. :returns: dataframe

epitopepredict.sequtils.get_cds(df)[source]¶: Get CDS with translations from genbank dataframe

epitopepredict.sequtils.get_feature_qualifier(f, qualifier)[source]¶

epitopepredict.sequtils.get_genes_by_location(genome, feature, within=20)[source]¶: Gets all features within a given distance of a gene

epitopepredict.sequtils.get_identity(aln)[source]¶: Get sequence identity of alignment for overlapping region only
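
A minimal sketch of identity over the overlapping region, assuming two gapped rows of equal length from an alignment and taking "overlapping" to mean columns where neither row has a gap. The library may define the overlap differently (for example, by trimming only terminal gaps), and the name `overlap_identity` is hypothetical.

```python
def overlap_identity(a, b, gap="-"):
    """Percent identity over columns where both aligned rows have a residue."""
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        return 0.0  # no overlapping columns at all
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Four overlapping columns (A/A, C/C, D/D, E/F), three of which match.
print(overlap_identity("--ACDE", "MKACDF"))  # 75.0
```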

epitopepredict.sequtils.get_sequence(genome, name)[source]¶: Get the sequence for a protein in a dataframe with genbank/sequence data

epitopepredict.sequtils.get_translation(feature, genome, cds=True)[source]¶: Check the translation of a cds feature

epitopepredict.sequtils.index_genbank_features(gb_record, feature_type, qualifier)[source]¶: Index features by qualifier value for easy access

epitopepredict.sequtils.local_blast(database, query, output=None, maxseqs=50, evalue=0.001, compress=False, cmd='blastp', cpus=2, show_cmd=False, **kwargs)[source]¶

Blast a local database.

Parameters:

database – local blast db name
query – sequences to query, list of strings or Bio.SeqRecords

Returns:

pandas dataframe with top blast results

epitopepredict.sequtils.muscle_alignment(filename=None, seqs=None)[source]¶: Align 2 sequences with muscle

epitopepredict.sequtils.needle_alignment(seq1, seq2, outfile='needle.txt')[source]¶: Align 2 sequences with needle

epitopepredict.sequtils.pairwise_alignment(rec1, rec2)[source]¶

epitopepredict.sequtils.remote_blast(db, query, maxseqs=50, evalue=0.001, **kwargs)[source]¶: Remote blastp. :param query: fasta file with sequence to blast :param db: database to use - nr, refseq_protein, pdb, swissprot

epitopepredict.sequtils.show_alignment(aln, diff=False, offset=0)[source]¶

Show a sequence alignment

Parameters:

aln – alignment
diff – whether to show differences

epitopepredict.sequtils.show_alignment_html(alnrows, seqs, width=80, fontsize=15, label='name')[source]¶

Get html display of sub-sequences on multiple protein alignment.

Parameters:

alnrows – a dataframe of aligned sequences
seqs – sub-sequences/epitopes to draw if present
label – key from dataframe to use as label for sequences

Returns:

html code

epitopepredict.tepitope module¶

Module that implements the TEPITOPEpan method. Includes methods for pickpocket and pseudosequence similarity calculation. References: [1] L. Zhang, Y. Chen, H.-S. Wong, S. Zhou, H. Mamitsuka, and S. Zhu, “TEPITOPEpan: extending TEPITOPE for peptide binding prediction covering over 700 HLA-DR molecules,” PLoS One, vol. 7, no. 2, p. e30483, Jan. 2012. [2] H. Zhang, O. Lund, and M. Nielsen, “The PickPocket method for predicting binding specificities for receptors based on receptor pocket similarities: application to MHC-peptide binding,” Bioinformatics, vol. 25, no. 10, pp. 1293-9, May 2009. Created January 2014. Copyright (C) Damien Farrell.

epitopepredict.tepitope.allelenumber(x)[source]¶

epitopepredict.tepitope.benchmark()[source]¶

epitopepredict.tepitope.compare(file1, file2, alnindex, reduced=True)[source]¶: All vs all for 2 sets of sequence files

epitopepredict.tepitope.compare_alleles(alleles1, alleles2, alnindex, reduced=True, cutoff=0.25, matrix=None, matrix_name='blosum62')[source]¶: Compare 2 sets of alleles for pseudo-seq distances

epitopepredict.tepitope.compare_ref(query1, query2, ref, alnindex)[source]¶: Compare different alleles distances to reference

epitopepredict.tepitope.compare_tepitope_alleles(alnindex)[source]¶: Compare a set of alleles to Tepitope library HLAs

epitopepredict.tepitope.convert_allele_names(seqfile)[source]¶

Convert long IPD names to common form. :param seqfile: fasta sequence file

Returns:	new list of seqrecords

epitopepredict.tepitope.create_virtual_pssm(allele)[source]¶: Create virtual matrix from pickpocket profile weights

epitopepredict.tepitope.generate_pssm(expdata)[source]¶: Create pssm for known binding data given a set of n-mers and binding score

epitopepredict.tepitope.get_allele_pocket_sequences(allele)[source]¶: Convenience for getting an allele pocket aas

epitopepredict.tepitope.get_alleles()[source]¶: Get all alleles covered by this method.

epitopepredict.tepitope.get_matrix(name)[source]¶

epitopepredict.tepitope.get_pocket_positions()[source]¶

epitopepredict.tepitope.get_pockets_pseudo_sequence(query, offset=28)[source]¶: Get pockets pseudo-seq from sequence and pocket residues. :param query: query sequence :param offset: sequence numbering offset from alignment numbering to pickpocket residue values

epitopepredict.tepitope.get_pseudo_sequence(query, positions=None, offset=28)[source]¶: Get non redundant pseudo-sequence for a query. Assumes input is a sequence from alignment of MHC genes.

epitopepredict.tepitope.get_pssm_score(seq, pssm)[source]¶: Get sequence score for a given pssm

epitopepredict.tepitope.get_pssms()[source]¶: Get tepitope pssm data

epitopepredict.tepitope.get_scores(pssm, sequence=None, peptides=None, length=11, overlap=1)[source]¶: Score a sequence as multiple separate fragments

epitopepredict.tepitope.get_similarities(allele, refalleles, alnindex, matrix)[source]¶: Get distances between a query and set of ref pseudo-seqs

epitopepredict.tepitope.main()[source]¶

epitopepredict.tepitope.pickpocket(pos, allele)[source]¶

Derive weights for a query allele using pickpocket method. This uses the pocket pseudosequences to determine similarity to the reference. This relies on the DRB alignment present in the tepitope folder.

Parameters:

pos – pocket position
allele – query allele

Returns:

set of weights for library alleles at this position

epitopepredict.tepitope.reduce_alleles(alleles)[source]¶: Reduce alleles to a representative set based on names

epitopepredict.tepitope.score_peptide(seq, pssm)[source]¶: Score a single sequence in 9-mer frames
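
Scoring in 9-mer frames can be sketched as sliding a 9-residue window along the peptide, summing the per-position matrix values for each window and keeping the best one. The matrix layout used here (a dict mapping each residue to nine position scores) and the names `pssm_score`, `best_frame` and `toy_pssm` are assumptions for illustration; the library's own data structures may differ.

```python
CORE = 9  # length of the binding core scored by the matrix


def pssm_score(frame, pssm):
    """Sum the matrix values for one 9-mer; unknown residues contribute 0."""
    return sum(pssm.get(aa, [0.0] * CORE)[i] for i, aa in enumerate(frame))


def best_frame(seq, pssm):
    """Return (score, offset) of the highest-scoring 9-mer frame, or None."""
    if len(seq) < CORE:
        return None
    scored = [(pssm_score(seq[i:i + CORE], pssm), i)
              for i in range(len(seq) - CORE + 1)]
    return max(scored)


# Toy matrix: leucine favoured at position 1, lysine at position 9.
toy_pssm = {
    "L": [1.0] + [0.0] * 8,
    "K": [0.0] * 8 + [0.5],
}
seq = "A" + "L" + "A" * 7 + "K" + "A"  # 11 residues, so three 9-mer frames
print(best_frame(seq, toy_pssm))  # (1.5, 1)
```

On ties, `max` over the tuples picks the frame with the larger offset; a real implementation might prefer a different tie-break.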

epitopepredict.tepitope.show_pocket_residues(pdbfile)[source]¶: Test to show the pocket residues in a pdb structure

epitopepredict.tepitope.similarity_score(matrix, ref, query)[source]¶

Similarity for pseudosequences using a substitution matrix.

Parameters:

matrix – subs matrix as dictionary
ref – reference sequence
query – query sequence

Returns:

a similarity value normalized to matrix
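
One common way to normalize such a score, described for the PickPocket method, is to divide the raw similarity by the geometric mean of the two self-similarities, so that identical sequences score 1. The sketch below assumes that normalization and a nested-dict matrix; whether the library does exactly this is an assumption, and the toy matrix values are invented.

```python
import math


def normalized_similarity(matrix, ref, query):
    """Similarity of two equal-length sequences, scaled so identical ones give 1."""
    def raw(a, b):
        # Sum substitution scores column by column.
        return sum(matrix[x][y] for x, y in zip(a, b))
    return raw(ref, query) / math.sqrt(raw(ref, ref) * raw(query, query))


# Toy two-letter substitution matrix as a dictionary of dictionaries.
toy_matrix = {
    "A": {"A": 4, "L": -1},
    "L": {"A": -1, "L": 4},
}
print(normalized_similarity(toy_matrix, "AL", "AL"))  # 1.0
print(normalized_similarity(toy_matrix, "AL", "AA"))  # 0.375
```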

epitopepredict.tepitope.test()[source]¶

epitopepredict.tests module¶

class epitopepredict.tests.PredictorTests(methodName='runTest')[source]¶

Bases: unittest.case.TestCase

Basic tests for predictor

quit()[source]¶

setUp()[source]¶: Hook method for setting up the test fixture before exercising it.

test_basicmhc1()[source]¶

test_classes()[source]¶

test_cutoffs()[source]¶

test_fasta()[source]¶: Test fasta predictions

test_features()[source]¶: Test genbank feature handling

test_iedbmhc1()[source]¶: iedbmhc1 test

test_load()[source]¶: Test re-loading predictions

test_mhcflurry()[source]¶: Test mhcflurry predictor

test_multiproc()[source]¶

test_netmhcpan()[source]¶: netMHCpan test

test_peptide_prediction()[source]¶

test_peptide_utils()[source]¶

test_tepitope()[source]¶: Tepitope test

epitopepredict.tests.run()[source]¶

epitopepredict.utilities module¶

epitopepredict.utilities.add_dicts(a, b)[source]¶

epitopepredict.utilities.compress(filename, remove=False)[source]¶: Compress a file with gzip
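
A gzip helper of this shape can be written with the standard library alone. The sketch below is not the package's code: the name `gzip_file` is hypothetical, and writing the output next to the input with a `.gz` suffix is an assumption.

```python
import gzip
import os
import shutil


def gzip_file(filename, remove=False):
    """Write filename + '.gz' and optionally delete the original file."""
    out = filename + ".gz"
    with open(filename, "rb") as src, gzip.open(out, "wb") as dst:
        shutil.copyfileobj(src, dst)  # stream the bytes without loading them all
    if remove:
        os.remove(filename)
    return out
```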

epitopepredict.utilities.copyfile(source, dest, newname=None)[source]¶: Helper method to copy files

epitopepredict.utilities.copyfiles(path, files)[source]¶

epitopepredict.utilities.filter_iedb_file(filename, field, search)[source]¶: Return filtered iedb data

epitopepredict.utilities.find_filefrom_string(files, string)[source]¶

epitopepredict.utilities.find_files(path, ext='txt')[source]¶: List files in a dir of a specific type
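
Listing files by extension is a short glob pattern in the standard library. This sketch searches only the given directory, not its subfolders, which may or may not match the library's behaviour; the name `list_files` is hypothetical.

```python
import glob
import os


def list_files(path, ext="txt"):
    """Return the sorted paths of files in path that end with the extension."""
    return sorted(glob.glob(os.path.join(path, "*." + ext)))
```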

epitopepredict.utilities.find_folders(path)[source]¶

epitopepredict.utilities.get_sequencefrom_pdb(pdbfile, chain='C', index=0)[source]¶: Get AA sequence from PDB

epitopepredict.utilities.get_symmetric_data_frame(m)[source]¶

epitopepredict.utilities.read_iedb(filename, key='Epitope ID')[source]¶: Load iedb peptidic csv file and return dataframe

epitopepredict.utilities.reorder_filenames(files, order)[source]¶: reorder filenames by another list order(seqs)

epitopepredict.utilities.rmse(ar1, ar2)[source]¶: Root mean squared error
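
The name `rmse` conventionally stands for root mean squared error, the square root of the mean of the squared differences. A minimal sketch of that calculation for two equal-length sequences of numbers follows; the function name `root_mean_squared_error` is hypothetical and this is not the library's code.

```python
import math


def root_mean_squared_error(ar1, ar2):
    """Square root of the mean squared difference between paired values."""
    if len(ar1) != len(ar2) or len(ar1) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - b) ** 2 for a, b in zip(ar1, ar2))
    return math.sqrt(squared / len(ar1))


print(root_mean_squared_error([1, 2, 3], [1, 2, 5]))
```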

epitopepredict.utilities.search_pubmed(term, max_count=100)[source]¶

epitopepredict.utilities.symmetrize(m, lower=True)[source]¶: Return symmetric array
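
Making a square matrix symmetric usually means mirroring one triangle onto the other. The sketch below uses nested lists and takes `lower=True` to mean that the lower triangle is the source; that reading of the flag is an assumption, and the name `symmetrize_lists` is hypothetical.

```python
def symmetrize_lists(m, lower=True):
    """Return a symmetric copy of square matrix m (a list of row lists)."""
    n = len(m)
    out = [row[:] for row in m]  # copy so the input is left untouched
    for i in range(n):
        for j in range(i):
            if lower:
                out[j][i] = m[i][j]  # mirror lower triangle into upper
            else:
                out[i][j] = m[j][i]  # mirror upper triangle into lower
    return out


print(symmetrize_lists([[1, 0, 0], [2, 3, 0], [4, 5, 6]]))
```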

epitopepredict.utilities.test()[source]¶

epitopepredict.utilities.venndiagram(names, labels, ax=None, colors=('r', 'g', 'b'), **kwargs)[source]¶: Plot a venn diagram

epitopepredict.web module¶

epitopepredict, methods for supporting web app. Created Sep 2017. Copyright (C) Damien Farrell.

epitopepredict.web.aggregate_summary(data)[source]¶

epitopepredict.web.column_to_url(df, field, path)[source]¶: Add urls to specified field in a dataframe by prepending the supplied path.

epitopepredict.web.create_bokeh_table(path, name)[source]¶: Create table of prediction data

epitopepredict.web.create_figures(preds, name='', kind='tracks', cutoff=5, n=2, cutoff_method='default', **kwargs)[source]¶: Get plots of binders for single protein/sequence

epitopepredict.web.create_sequence_html(preds, name='', classes='', **kwargs)[source]¶

epitopepredict.web.create_widgets()[source]¶

epitopepredict.web.dataframes_to_html(data, classes='')[source]¶: Convert dictionary of dataframes to html tables

epitopepredict.web.dict_to_html(data)[source]¶

epitopepredict.web.get_alleles(preds)[source]¶: get available alleles

epitopepredict.web.get_file_lists(path)[source]¶: Get list of available prediction results in the given path. Tries to check for each possible predictor.

epitopepredict.web.get_predictors(path, name=None)[source]¶: Get a set of predictors under a results path for all or a specific protein.

epitopepredict.web.get_readme()[source]¶

epitopepredict.web.get_results_info(P)[source]¶: Info on sequence used for prediction

epitopepredict.web.get_results_tables(path, name=None, promiscuous=True, limit=None, **kwargs)[source]¶: Get binder results from a results path. :param path: path to results :param name: name of particular protein/sequence :param promiscuous: get all binders or just promiscuous

epitopepredict.web.get_scrollable_table(df)[source]¶: Return a scrollable table as a div element to be placed in web page

epitopepredict.web.get_sequences(pred)[source]¶: Get set of sequences from loaded data

epitopepredict.web.get_summary_tables(path, limit=None, **kwargs)[source]¶: Get binder results summary for all proteins in path. :param path: path to results

epitopepredict.web.sequence_to_html_grid(preds, classes='', **kwargs)[source]¶: Put aligned or multiple identical rows in dataframe and convert to grid of aas as html table

epitopepredict.web.sequences_to_html_table(seqs, classes='')[source]¶: Convert seqs to html

epitopepredict.web.tabbed_html(items)[source]¶: Create html for a set of tabbed divs from dict of html code, one for each tab. Uses css classes defined in static/custom.css

epitopepredict.web.test()[source]¶
