Most popular bioinformatics repositories and open source projects

psmc

Implementation of the Pairwise Sequentially Markovian Coalescent (PSMC...

56   131   131  

pydna

Clone with Python! Data structures for double stranded DNA & simulatio...

39   131   131  

OBOFoundry.github.io

Metadata and website for the Open Bio Ontologies Foundry Ontology Regi...

198   130   130  

rasusa

Randomly subsample sequencing reads to a specified coverage

14   130   130  

MSnbase

Base Classes and Functions for Mass Spectrometry and Proteomics

46   130   130  

mag

Assembly and binning of metagenomes

81   130   130  

metaGEM

💎 An easy-to-use workflow for generating context specific genome-s...

33   130   130  

harmonypy

🎼 Integrate multiple high-dimensional datasets with fuzzy k-means and...

21   129   129  

SigProfilerExtractor

SigProfilerExtractor allows de novo extraction of mutational signature...

45   129   129  

weblogo

WebLogo 3: Sequence Logos redrawn

37   128   128  

VariantSpark

Machine learning for genomic variants

44   127   127  

Multi-BioNER

Cross-type Biomedical Named Entity Recognition with Deep Multi-task Le...

27   127   127  

apbs-pdb2pqr

APBS - software for biomolecular electrostatics and solvation

68   126   126  

fqtools

An efficient FASTQ manipulation suite

16   125   125  

lineage

Tools for genetic genealogy and the analysis of consumer DNA test resu...

23   125   125  

rnaseqc

Fast, efficient RNA-Seq metrics for quality control and process optimi...

19   125   125  

souporcell

Clustering scRNAseq by genotypes

38   125   125  

SibeliaZ

A fast whole-genome aligner based on de Bruijn graphs

19   124   124  

gencore

Generate duplex/single consensus reads to reduce sequencing noises and...

31   123   123  

seq2science

Automated and customizable preprocessing of Next-Generation Sequencing...

26   123   123  

gatk-sv

A structural variation pipeline for short-read sequencing

58   123   123  

pandora

💹 PANDORA: Revolutionizing Biomedical Research with Advanced Mac...

18   123   123  

GEOparse

Python library to access Gene Expression Omnibus Database (GEO)

50   122   122  

decontam

Simple statistical identification and removal of contaminants in marke...

23   122   122  

pybel

🌶️ An ecosystem in Python for working with the Biological Expression L...

34   121   121  

ccs

CCS: Generate Highly Accurate Single-Molecule Consensus Reads (HiFi Re...

30   121   121  

Assemblytics

Assemblytics is a bioinformatics tool to detect and analyze structural...

27   120   120  

tskit

Population-scale genomics

63   120   120  

gubbins

Rapid phylogenetic analysis of large samples of recombinant bacterial...

43   120   120  

sage

Proteomics search & quantification so fast that it feels like magic

20   119   119  

BuddySuite

Bioinformatics toolkits for manipulating sequence, alignment, and phyl...

24   118   118  

bio4j

Bio4j abstract model and general entry point to the project

19   117   117  

peddy

genotype :: ped correspondence check, ancestry check, sex check. direc...

36   117   117  

snakefmt

The uncompromising Snakemake code formatter

20   117   117  

muscle

Multiple sequence alignment with top benchmark scores scalable to thou...

14   117   117  

FAMSA

Algorithm for ultra-scale multiple sequence alignments (3M protein seq...

25   117   117  

atropos

An NGS read trimming tool that is specific, sensitive, and speedy. (pr...

14   116   116  

vdjdb-db

🗂️ Git-based TCR database storage & management. Submissions welcome!

28   116   116  

bedtk

A simple toolset for BED files (warning: CLI may change before bedtk b...

13   115   115  

ksw2

Global alignment and alignment extension

24   115   115  

DNAnalyzer

Revolutionizing DNA analysis and making it accessible to all through i...

55   115   115  

kana

Single cell analysis in the browser

10   115   115  

vdjtools

Post-analysis of immune repertoire sequencing data

39   114   114  

svtyper

Bayesian genotyper for structural variants

52   113   113  

CalliNGS-NF

GATK RNA-Seq Variant Calling in Nextflow

50   113   113  

cLoops

Accurate and flexible loops calling tool for 3D genomic data.

17   112   112  

SquiggleKit

SquiggleKit: A toolkit for manipulating nanopore signal data

21   112   112  

Deep-Learning-for-Clustering-in-Bioinformatics

Deep Learning-based Clustering Approaches for Bioinformatics

30   112   112  

graph-network-explainability

Explainability techniques for Graph Networks, applied to a synthetic d...

14   111   111  

awesome-expression-browser

😎 A curated list of software and resources for exploring and visualiz...

37   111   111