Most popular natural-language-processing repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies how computers and human language interact. In 1950, Alan Turing published an article proposing a measure of machine intelligence, now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and other natural-language tasks.

entity-recognition-datasets

A collection of corpora for named entity recognition (NER) and entity...

236   1350   1350  

nlp-tutorial

A list of NLP (Natural Language Processing) tutorials

269   1343   1343  

awesome-ai-ml-dl

Awesome Artificial Intelligence, Machine Learning and Deep Learning as...

331   1305   1305  

TAADpapers

Must-read Papers on Textual Adversarial Attack and Defense

179   1305   1305  

jieba-php

"Jieba" Chinese word segmentation: aiming to be the best PHP component for Chinese word segmentation. / "Jieba" (Chine...

258   1267   1267  

WikiSQL

A large annotated semantic parsing corpus for developing natural langu...

301   1267   1267  

spacy-transformers

🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy

161   1264   1264  

refinery

The data scientist's open-source choice to scale, assess and maintain...

51   1229   1229  

awesome-text-summarization

A guide to tackling text summarization

208   1215   1215  

trafilatura

Python & command-line tool to gather text on the Web: web crawling/scr...

130   1206   1206  

nlg-eval

Evaluation code for various unsupervised automated metrics for Natural...

203   1205   1205  

textrank

TextRank implementation for Python 3.

262   1194   1194  

pororo

PORORO: Platform Of neuRal mOdels for natuRal language prOcessing

209   1189   1189  

underthesea

Underthesea - Vietnamese NLP Toolkit

256   1182   1182  

Repo-2017

My first Python repo with code in Machine Learning, NLP and Deep Lear...

695   1176   1176  

hmtl

🌊HMTL: Hierarchical Multi-Task Learning - A State-of-the-Art neural ne...

146   1176   1176  

fastText_multilingual

Multilingual word vectors in 78 languages

125   1173   1173  

budou

Budou is an automatic organizer tool for beautiful line breaking in CJ...

61   1139   1139  

bpemb

Pre-trained subword embeddings in 275 languages, based on Byte-Pair En...

93   1136   1136  

nltk_data

NLTK Data

958   1133   1133  

natural-language-processing

Resources for "Natural Language Processing" Coursera course.

1976   1125   1125  

projects

🪐 End-to-end NLP workflows from prototype to production

450   1121   1121  

tidytext

Text mining using tidy tools ✨📄✨

188   1114   1114  

Awesome-Rust-MachineLearning

This repository is a list of machine learning libraries written in Rus...

58   1098   1098  

conv-emotion

This repo contains implementations of different architectures for emoti...

307   1093   1093  

awesome-relation-extraction

📖 A curated list of awesome resources dedicated to Relation Extraction...

137   1092   1092  

awesome-search

Awesome Search - this is all about the (e-commerce, but not only) sear...

96   1084   1084  

FreeML

A List of Data Science/Machine Learning Resources (Mostly Free)

524   1072   1072  

Speech-Emotion-Analyzer

The neural network model is capable of detecting five different male/f...

391   1071   1071  

nlp-in-practice

Starter code to solve real world text data problems. Includes: Gensim...

778   1070   1070  

bert_score

BERT score for text generation

176   1067   1067  

PPLM

Plug and Play Language Model implementation. Lets you steer topic and...

187   1064   1064  

transformers-interpret

Model explainability that works seamlessly with 🤗 transformers. Explai...

90   1051   1051  

ml-course

Open Machine Learning course

817   1037   1037  

wink-nlp

Developer friendly Natural Language Processing ✨

54   1008   1008  

basaran

Basaran is an open-source alternative to the OpenAI text completion AP...

55   1006   1006  

nlp-with-ruby

Curated List: Practical Natural Language Processing done in Ruby

69   1002   1002  

practical-nlp-code

Official Repository for Code associated with 'Practical Natural Langua...

481   993   993  

this-word-does-not-exist

This Word Does Not Exist

80   975   975  

question_generation

Neural question generation using transformers

324   967   967  

lingua-go

The most accurate natural language detection library for Go, suitable...

55   945   945  

awesome-transformer-nlp

A curated list of NLP resources focused on Transformer networks, atten...

114   939   939  

TextBox

TextBox 2.0 is a text generation library with pre-trained language mod...

98   931   931  

torchdistill

A coding-free framework built on PyTorch for reproducible deep learnin...

100   929   929  

insuranceqa-corpus-zh

🚁 Insurance industry corpus, chatbot

334   928   928  

Summarization-Papers

Summarization Papers

139   919   919  

seqeval

A Python framework for sequence labeling evaluation (named-entity recog...

120   908   908  

obsei

Obsei is a low-code, AI-powered automation tool. It can be used in vari...

121   899   899  

YouTokenToMe

Unsupervised text tokenizer focused on computational efficiency

81   889   889  

jcseg

Jcseg is a lightweight NLP framework developed with Java. Provides CJK...

216   880   880