Most popular NLP repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of intelligence now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

nlp-architect

A model library for exploring state-of-the-art deep learning topologie...

467   2910   2910  

SimCSE

EMNLP'2021: SimCSE: Simple Contrastive Learning of Sentence Embeddings...

455   2877   2877  

GPT2-chitchat

GPT2 for Chinese chitchat (implements DialoGPT's MMI...

669   2789   2789  

texthero

Text preprocessing, representation and visualization from zero to hero...

232   2750   2750  

text-generation-inference

Large Language Model Text Generation Inference

249   2743   2743  

paper-qa

LLM Chain for answering questions from documents with citations

245   2741   2741  

lingvo

Lingvo

434   2737   2737  

thinc

🔮 A refreshing functional take on deep learning, compatible with your...

276   2731   2731  

neuralcoref

✨Fast Coreference Resolution in spaCy with Neural Networks

470   2711   2711  

modelscope

ModelScope: bring the notion of Model-as-a-Service to life.

288   2692   2692  

eli5

A library for debugging/inspecting machine learning classifiers and ex...

328   2670   2670  

awesome-deeplearning-resources

Deep Learning and deep reinforcement learning research papers and some...

661   2667   2667  

ml-surveys

📋 Survey papers summarizing advances in deep learning, NLP, CV, graphs...

280   2639   2639  

llm-foundry

LLM training code for MosaicML foundation models

263   2638   2638  

knockknock

🚪✊Knock Knock: Get notified when your training ends with only two addi...

225   2625   2625  

Familia

A Toolkit for Industrial Topic Modeling

612   2606   2606  

textlint

The pluggable natural language linter for text and markdown.

158   2594   2594  

sentiment

AFINN-based sentiment analysis for Node.js.

318   2592   2592  

awesome-pretrained-chinese-nlp-models

Awesome Pretrained Chinese NLP Models, a collection of high-quality Chinese pretrained models

290   2537   2537  

go-openai

OpenAI ChatGPT, GPT-3, GPT-4, DALL·E, Whisper API wrapper for Go

323   2526   2526  

gluon-nlp

NLP made easy

540   2505   2505  

AiLearning-Theory-Applying

Quick start on AI theory and hands-on application: fundamentals, ML, DL, NLP-BERT, competitions. Includes extensive annotations...

382   2504   2504  

papers

Summaries of machine learning papers

451   2468   2468  

picoGPT

An unnecessarily tiny implementation of GPT-2 in NumPy.

313   2466   2466  

TextAttack

TextAttack 🐙 is a Python framework for adversarial attacks, data augm...

324   2409   2409  

text2vec

text2vec, text to vector. A text vector representation tool that converts text into vector matrices, implementing...

249   2384   2384  

Kashgari

Kashgari is a production-level NLP Transfer learning framework built o...

438   2362   2362  

ml-road

Machine Learning Resources, Practice and Research

904   2327   2327  

rasa_core

Rasa Core is now part of the Rasa repo: An open source machine learnin...

1039   2316   2316  

awesome-DeepLearning

Introductory, advanced, and specialty deep learning courses, academic case studies, industry practice cases, deep learning knowledge...

768   2314   2314  

argilla

✨Argilla: the open-source data curation platform for LLMs

213   2280   2280  

gse

Go efficient multilingual NLP and text segmentation; support English,...

200   2261   2261  

JioNLP

Chinese NLP preprocessing and parsing toolkit: accurate, efficient, and easy to use. A Chinese NLP Preprocess...

299   2244   2244  

Linly

Chinese-LLaMA and Chinese-Falcon base models; ChatFlow Chinese dialogue model; Chinese Ope...

180   2233   2233  

aeneas

aeneas is a Python/C library and a set of tools to automagically synch...

217   2216   2216  

electra

ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Gene...

340   2205   2205  

Promptify

Prompt Engineering | Use GPT or other prompt based models to get struc...

156   2188   2188  

PyTorch-NLP

Basic Utilities for PyTorch Natural Language Processing (NLP)

260   2181   2181  

spacy-course

👩‍🏫 Advanced NLP with spaCy: A free online course

361   2177   2177  

awesome-sentence-embedding

A curated list of pretrained sentence and word embedding models

259   2142   2142  

uda

Unsupervised Data Augmentation (UDA)

315   2130   2130  

mt-dnn

Multi-Task Deep Neural Networks for Natural Language Understanding

408   2127   2127  

lazynlp

Library to scrape and clean web pages to create massive datasets.

311   2119   2119  

Information-Extraction-Chinese

Chinese Named Entity Recognition with IDCNN/biLSTM+CRF, and Relation E...

820   2112   2112  

scattertext

Beautiful visualizations of how language differs among document types.

281   2105   2105  

nlp

By 兜哥: an open-source introductory book on NLP

534   2095   2095  

DeepKE

An Open Toolkit for Knowledge Graph Extraction and Construction publis...

539   2092   2092  

sru

Training RNNs as Fast as CNNs (https://arxiv.org/abs/1709.02755)

315   2087   2087  

kcws

Deep Learning Chinese Word Segment

672   2086   2086  

textacy

NLP, before and after spaCy

256   2079   2079