Most popular NLP repositories and open-source projects

Natural language processing (NLP) is a field of computer science that studies the interaction between computers and human language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence," which proposed a measure of intelligence now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

vibrato

🎤 vibrato: Viterbi-based accelerated tokenizer

12   255   255  

rnn.wgan

Code for training and evaluation of the model from "Language Generatio...

76   254   254  

spacyr

R wrapper to spaCy NLP

38   253   253  

nlp-labelling

Labelling platform for text using weak supervision.

18   253   253  

LasUIE

Universal Information Extraction, codes for the NeurIPS-2022 paper: Un...

3   253   253  

papers_we_read

Summaries for exciting works in the field of Deep Learning.

32   252   252  

character-based-cnn

Implementation of character based convolutional neural network

54   252   252  

KOMORAN

Korean Morphological Analyzer by shineware

59   252   252  

corus

Links to Russian corpora + Python functions for loading and parsing

19   252   252  

ChemDataExtractor

Automatically extract chemical information from scientific documents

104   252   252  

BERT-AttributeExtraction

Using BERT for Attribute Extraction in KnowledgeGraph. fine-tuning and...

65   250   250  

awesome-nlp-polish

A curated list of resources dedicated to Natural Language Processing (...

33   250   250  

wikipron

Massively multilingual pronunciation mining

59   250   250  

neologdn

Japanese text normalizer for mecab-neologd

18   249   249  

text-cnn-tensorflow

Convolutional Neural Networks for Sentence Classification (TextCNN) imp...

69   248   248  

practical-1

Oxford Deep NLP 2017 course - Practical 1: word2vec

142   248   248  

neat-vision

Neat (Neural Attention) Vision, is a visualization tool for the attent...

29   248   248  

masakhane-mt

Machine Translation for Africa

214   248   248  

gpl

Powerful unsupervised domain adaptation method for dense retrieval. Re...

31   248   248  

LiLT

Official PyTorch implementation of LiLT: A Simple yet Effective Langua...

30   248   248  

parsbert

🤗 ParsBERT: Transformer-based Model for Persian Language Understandin...

35   247   247  

ontogpt

GPT-based ontological extraction tools, including SPIRES

32   247   247  

torchnlp

Easy to use NLP library built on PyTorch and TorchText

45   247   247  

multi_rake

Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python

36   247   247  

lemmatization-lists

Machine-readable lists of lemma-token pairs in 23 languages.

93   247   247  

mead-baseline

Deep-Learning Model Exploration and Development for NLP

74   245   245  

code-switching-papers

A curated list of research papers and resources on code-switching

31   245   245  

gpttools

gpttools extends gptstudio for package development to help you documen...

19   244   244  

nlp_made_easy

Explains NLP building blocks in a simple manner.

33   244   244  

chinese_ulmfit

Chinese ULMFiT: sentiment analysis and text classification

38   243   243  

gpt-2-tensorflow2.0

OpenAI GPT2 pre-training and sequence prediction implementation in Ten...

78   242   242  

text2text

Text2Text: Crosslingual NLP/G toolkit

31   242   242  

Siamese-LSTM

Siamese LSTM for evaluating semantic similarity between sentences of t...

68   241   241  

AIND-NLP

Coding exercises for the Natural Language Processing concentration, pa...

384   241   241  

backprop

Backprop makes it simple to use, finetune, and deploy state-of-the-art...

12   241   241  

spacy-services

💫 REST microservices for various spaCy-related tasks

76   240   240  

caml-mimic

multilabel classification of EHR notes

109   240   240  

Deep-Learning-Specialization-Coursera

This repo contains the updated version of all the assignments/labs (do...

263   240   240  

cnn-text-classification-tf-chinese

CNN for Chinese Text Classification in Tensorflow

111   239   239  

dmn-tensorflow

Dynamic Memory Networks (https://arxiv.org/abs/1603.01417) in Tensorfl...

86   239   239  

monpa

MONPA 罔拍 is a multi-task model that provides Traditional Chinese word segmentation, part-of-speech tagging, and named entity recognition

25   239   239  

pyrouge

A Python wrapper for the ROUGE summarization evaluation package

70   238   238  

openfoodfacts-ai

This is a tracking repo for all our AI projects. 🍕 🤖🍼

54   238   238  

mmt

Multi-Modal Transformer for Video Retrieval

39   237   237  

prosodic

Prosodic: a metrical-phonological parser, written in Python. For Engli...

40   237   237  

webanno

🆕 Work continues on INCEpTION 👉 https://github.com/inception-project...

96   236   236  

fairseq-gec

Source code for paper: Improving Grammatical Error Correction via Pre-...

68   236   236  

GermanWordEmbeddings

Toolkit to obtain and preprocess German text corpora, train models and...

51   236   236  

snow-owl

🦉 Snow Owl Terminology Server - production-ready, scalable, suppor...

26   236   236  

chatbot

Russian-language generative chatbot with a profile and facts

61   235   235