Topic

bioinformatics

Repositories (1438)

Awesome-LLMs-meet-genomes
Awesome-LLMs-meet-genomes ychuest

Explore a comprehensive collection of basic theories, applications, papers, and best practices about Large Language Models (LLMs) in genomes.

137
MSnbase
MSnbase lgatto R

Base Classes and Functions for Mass Spectrometry and Proteomics

137
Kun-peng
Kun-peng eric9n Rust

Kun-peng: an ultra-fast, low-memory footprint and accurate taxonomy classifier for all

137
pandora
pandora genular Vue

PANDORA :computer:

137
svtyper
svtyper hall-lab Python

Bayesian genotyper for structural variants

136
plascad
plascad David-OConnor Rust

Plasmid and primer design software

136
BWA-MEME
BWA-MEME kaist-ina C++

BWA-MEME: Faster BWA-MEM2 using learned-index

136
folddisco
folddisco steineggerlab Rust

Fast indexing and search of discontinuous motifs in protein structures

136
ScienceAgentBench
ScienceAgentBench OSU-NLP-Group Python

[ICLR'25] ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery

135
Multi-BioNER
Multi-BioNER yuzhimanhua Python

Cross-type Biomedical Named Entity Recognition with Deep Multi-task Learning (Bioinformatics'19)

135
sinto
sinto timoast Python

Tools for single-cell data processing

135
usearch12
usearch12 rcedgar C++

Open-source usearch

134
cogent3
cogent3 cogent3 Python

Comparative Genomics Toolkit 3

134
apbs-pdb2pqr
apbs-pdb2pqr Electrostatics

APBS - software for biomolecular electrostatics and solvation

132
ontobio
ontobio biolink Python

Python library for working with ontologies and ontology associations

132
GeneFuse
GeneFuse OpenGene C

Gene fusion detection and visualization

132
Terpene-Profile-Parser-for-Cannabis-Strains
Terpene-Profile-Parser-for-Cannabis-Strains MaxValue Python

Parser and database to index the terpene profile of different strains of Cannabis from online databases

132
full_spectrum_bioinformatics
full_spectrum_bioinformatics zaneveld Jupyter Notebook

An open-access bioinformatics text

132
blasr
blasr PacificBiosciences C++

BLASR: The PacBio® long read aligner

131
dockstore
dockstore dockstore Java

An app store for scientific workflows, tools, notebooks, and services

131
mygene.info
mygene.info biothings Python

MyGene.info: A BioThings API for gene annotations

131
pairtools
pairtools open2c Python

Extract 3D contacts (.pairs) from sequencing alignments

131
BuddySuite
BuddySuite biologyguy Python

Bioinformatics toolkits for manipulating sequence, alignment, and phylogenetic tree files

129
BASALT
BASALT EMBL-PKU Python

Nature Communications | BASALT (Binning Across a Series of Assemblies Toolkit) for binning and refinement of short- and long-read sequencing data

129
bio-forge
bio-forge TKanX Rust

A high-performance, pure Rust toolkit for standardizing and preparing biomolecular systems (proteins & nucleic acids). It heals missing atoms, resolve...

129
cute-nucleotides
cute-nucleotides Daniel-Liu-c0deb0t Rust

Cute tricks for SIMD vectorized binary encoding and decoding of nucleotides, in Rust.

129
pymol-color-alphafold
pymol-color-alphafold cbalbin-bio Python

PyMOL extension to color AlphaFold structures by confidence (pLDDT).

129
MSFragger
MSFragger Nesvilab HTML

Ultrafast, comprehensive peptide identification for mass spectrometry–based proteomics

129
SquiggleKit
SquiggleKit Psy-Fer Python

SquiggleKit: A toolkit for manipulating nanopore signal data

128
apbs
apbs Electrostatics C

Software for biomolecular electrostatics and solvation calculations

128
graph-network-explainability
graph-network-explainability baldassarreFe Jupyter Notebook

Explainability techniques for Graph Networks, applied to a synthetic dataset and an organic chemistry task. Code for the workshop paper "Explainabilit...

127
ropebwt3
ropebwt3 lh3 C

BWT construction and search

127
gencore
gencore OpenGene C++

Generate duplex/single consensus reads to reduce sequencing noise and remove duplications

127
OmicsClaw
OmicsClaw TianGzlab Python

Conversational & memory-enabled AI research partner for multi-omics analysis. From biological idea to full research paper.

127
atropos
atropos jdidion Python

An NGS read trimming tool that is specific, sensitive, and speedy. (production)

126
ccs
ccs PacificBiosciences

CCS: Generate Highly Accurate Single-Molecule Consensus Reads (HiFi Reads)

126
SciAgent-Skills
SciAgent-Skills jaechang-hits Python

197 bioinformatics & life science skills for Claude Code and AI agents — BixBench 92.0% accuracy. RNA-seq, single-cell, drug discovery, proteomics, an...

126
ctakes
ctakes apache Java

Apache cTAKES is a Natural Language Processing (NLP) platform for clinical text.

126
seqfu2
seqfu2 telatin Nim

:rocket: seqfu - Sequence Fastx Utilities

126
arpeggio
arpeggio PDBeurope Python

Calculation of interatomic interactions in molecular structures

126
MicrobiomeBestPracticeReview
MicrobiomeBestPracticeReview grimmlab Shell

Current Challenges and Best Practice Protocols for Microbiome Analysis using Amplicon and Metagenomic Sequencing

126
philosopher
philosopher Nesvilab Go

PeptideProphet, PTMProphet, ProteinProphet, iProphet, Abacus, and FDR filtering

125
panseg
panseg kreshuklab Python

A tool for cell instance aware segmentation in densely packed 3D volumetric images

125
BioT5
BioT5 QizhiPei Python

BioT5 (EMNLP 2023) and BioT5+ (ACL 2024 Findings)

124
pangraph
pangraph neherlab C

A bioinformatic toolkit to align genome assemblies into pangenome graphs

124
assembly-stats
assembly-stats sanger-pathogens C++

Get assembly statistics from FASTA and FASTQ files

124
arv
arv cslarsen C++

A fast 23andMe DNA parser and inferrer for Python

123
bio4j
bio4j bio4j Java

Bio4j abstract model and general entry point to the project

122
molgenis
molgenis molgenis Java

MOLGENIS - for scientific data: management, exploration, integration and analysis.

122
minced
minced ctSkennerton Java

Mining CRISPRs in Environmental Datasets

122