Most popular NLP repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies how computers interact with human language. In 1950, Alan Turing published an article that proposed a measure of machine intelligence, now called the Turing test. More recent techniques, such as deep learning, have produced state-of-the-art results in language modeling, parsing, and many other natural-language tasks.

refinery

The data scientist's open-source choice to scale, assess and maintain...

51   1229   1229  

DataProfiler

What's in your data? Extract schema, statistics and entities from data...

125   1228   1228  

course

The Hugging Face course on Transformers

412   1221   1221  

trafilatura

Python & command-line tool to gather text on the Web: web crawling/scr...

130   1206   1206  

nlg-eval

Evaluation code for various unsupervised automated metrics for Natural...

203   1205   1205  

textrank

TextRank implementation for Python 3.

262   1194   1194  

underthesea

Underthesea - Vietnamese NLP Toolkit

256   1182   1182  

transformers_tasks

⭐️ NLP Algorithms with transformers lib. Supporting Text-Classificatio...

256   1182   1182  

Repo-2017

My first Python repo with codes in Machine Learning, NLP and Deep Lear...

695   1176   1176  

hmtl

🌊HMTL: Hierarchical Multi-Task Learning - A State-of-the-Art neural ne...

146   1176   1176  

one-pixel-attack-keras

Keras implementation of "One pixel attack for fooling deep neural netw...

199   1176   1176  

fastText_multilingual

Multilingual word vectors in 78 languages

125   1173   1173  

similarity

similarity: Text similarity calculation Toolkit for Java. Text similarity calc...

289   1160   1160  

tribuo

Tribuo - A Java machine learning library

168   1158   1158  

ktrain

ktrain is a Python library that makes deep learning and AI more access...

265   1150   1150  

bpemb

Pre-trained subword embeddings in 275 languages, based on Byte-Pair En...

93   1136   1136  

nltk_data

NLTK Data

958   1133   1133  

KoBERT

Korean BERT pre-trained cased (KoBERT)

346   1133   1133  

projects

🪐 End-to-end NLP workflows from prototype to production

450   1121   1121  

awesome-relation-extraction

📖 A curated list of awesome resources dedicated to Relation Extraction...

137   1092   1092  

datumbox-framework

Datumbox is an open-source Machine Learning framework written in Java...

290   1087   1087  

docspell

Assist in organizing your piles of documents, resulting from scanners,...

83   1074   1074  

nlp-in-practice

Starter code to solve real world text data problems. Includes: Gensim...

778   1070   1070  

nlp-library

curated collection of papers for the nlp practitioner 📖👩‍🔬

90   1067   1067  

contextualized-topic-models

A python package to run contextualized topic modeling. CTMs combine co...

134   1067   1067  

xmnlp

xmnlp: Provides Chinese word segmentation, part-of-speech tagging, named entity recognition, sentiment analysis, text correction, text to...

183   1066   1066  

PPLM

Plug and Play Language Model implementation. Allows to steer topic and...

187   1064   1064  

natasha

Solves basic Russian NLP tasks, API for lower level Natasha projects

95   1063   1063  

zemberek-nlp

NLP tools for Turkish.

207   1061   1061  

AutoGPTQ

An easy-to-use LLMs quantization package with user-friendly apis, base...

111   1061   1061  

PyTorchText

1st Place Solution for Zhihu Machine Learning Challenge. Implementati...

368   1059   1059  

learn-nlp-with-transformers

we want to create a repo to illustrate usage of transformers in chines...

221   1055   1055  

nlp

📝 This repository recorded my NLP journey.

323   1052   1052  

transformers-interpret

Model explainability that works seamlessly with 🤗 transformers. Explai...

90   1051   1051  

languagemodels

Explore large language models on any computer with 512MB of RAM

74   1031   1031  

Awesome-Chinese-LLM

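A collection of open-source Chinese large language models, focusing on models that are smaller in scale, privately deployable, and cheaper to train...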

107   1028   1028  

Deep-Learning-Experiments

Videos, notes and experiments to understand deep learning

745   1020   1020  

wink-nlp

Developer friendly Natural Language Processing ✨

54   1008   1008  

basaran

Basaran is an open-source alternative to the OpenAI text completion AP...

55   1006   1006  

nlp-with-ruby

Curated List: Practical Natural Language Processing done in Ruby

69   1002   1002  

KGQA-Based-On-medicine

Intelligent question-answering system based on a medical knowledge graph

277   998   998  

QANet

A Tensorflow implementation of QANet for machine reading comprehension

310   989   989  

GPT2-NewsTitle

Chinese NewsTitle Generation Project by GPT2. With extremely detailed comments, a Chinese GPT...

164   986   986  

nlp-paper

Papers in the field of natural language processing (with reading notes), reproduced models, data processing, etc. (...

167   979   979  

beir

A Heterogeneous Benchmark for Information Retrieval. Easy to use, eval...

127   975   975  

paperai

📄 🤖 Semantic search and workflows for medical/scientific papers

85   974   974  

sequence-labeling-BiLSTM-CRF

The BiLSTM-CRF model implementation in Tensorflow, for sequence labeli...

261   970   970  

plato-research-dialogue-system

This is the Plato Research Dialogue System, a flexible platform for de...

196   968   968  

question_generation

Neural question generation using transformers

324   967   967  

awesome-knowledge-graph

A curated list of Knowledge Graph related learning materials, database...

96   964   964