Most popular NLP repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies the interactions between computers and human (natural) languages. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of intelligence now called the Turing test. More modern techniques, such as deep learning, have produced state-of-the-art results in language modeling, parsing, and many other natural-language tasks.

corus

Links to Russian corpora + Python functions for loading and parsing

19   252   252  

ChemDataExtractor

Automatically extract chemical information from scientific documents

104   252   252  

BERT-AttributeExtraction

Using BERT for attribute extraction in a knowledge graph. Fine-tuning and...

65   250   250  

awesome-nlp-polish

A curated list of resources dedicated to Natural Language Processing (...

33   250   250  

wikipron

Massively multilingual pronunciation mining

59   250   250  

neologdn

Japanese text normalizer for mecab-neologd

18   249   249  

text-cnn-tensorflow

Convolutional Neural Networks for Sentence Classification (TextCNN) imp...

69   248   248  

practical-1

Oxford Deep NLP 2017 course - Practical 1: word2vec

142   248   248  

neat-vision

Neat (Neural Attention) Vision, is a visualization tool for the attent...

29   248   248  

masakhane-mt

Machine Translation for Africa

214   248   248  

gpl

Powerful unsupervised domain adaptation method for dense retrieval. Re...

31   248   248  

LiLT

Official PyTorch implementation of LiLT: A Simple yet Effective Langua...

30   248   248  

torchnlp

Easy to use NLP library built on PyTorch and TorchText

45   247   247  

multi_rake

Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python

36   247   247  

lemmatization-lists

Machine-readable lists of lemma-token pairs in 23 languages.

93   247   247  

parsbert

🤗 ParsBERT: Transformer-based Model for Persian Language Understanding

35   247   247  

ontogpt

GPT-based ontological extraction tools, including SPIRES

32   247   247  

mead-baseline

Deep-Learning Model Exploration and Development for NLP

74   245   245  

code-switching-papers

A curated list of research papers and resources on code-switching

31   245   245  

nlp_made_easy

Explains NLP building blocks in a simple manner.

33   244   244  

gpttools

gpttools extends gptstudio for package development to help you documen...

19   244   244  

chinese_ulmfit

Chinese ULMFiT: sentiment analysis and text classification

38   243   243  

gpt-2-tensorflow2.0

OpenAI GPT2 pre-training and sequence prediction implementation in Ten...

78   242   242  

text2text

Text2Text: Crosslingual NLP/G toolkit

31   242   242  

Siamese-LSTM

Siamese LSTM for evaluating semantic similarity between sentences of t...

68   241   241  

AIND-NLP

Coding exercises for the Natural Language Processing concentration, pa...

384   241   241  

backprop

Backprop makes it simple to use, finetune, and deploy state-of-the-art...

12   241   241  

spacy-services

💫 REST microservices for various spaCy-related tasks

76   240   240  

caml-mimic

multilabel classification of EHR notes

109   240   240  

Deep-Learning-Specialization-Coursera

This repo contains the updated version of all the assignments/labs (do...

263   240   240  

cnn-text-classification-tf-chinese

CNN for Chinese Text Classification in Tensorflow

111   239   239  

dmn-tensorflow

Dynamic Memory Networks (https://arxiv.org/abs/1603.01417) in Tensorfl...

86   239   239  

monpa

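MONPA (罔拍) is a multi-task model that provides Traditional Chinese word segmentation, part-of-speech tagging, and named entity recognition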

25   239   239  

pyrouge

A Python wrapper for the ROUGE summarization evaluation package

70   238   238  

prosodic

Prosodic: a metrical-phonological parser, written in Python. For Engli...

40   237   237  

mmt

Multi-Modal Transformer for Video Retrieval

39   237   237  

webanno

🆕 Work continues on INCEpTION 👉 https://github.com/inception-project/i...

96   236   236  

fairseq-gec

Source code for paper: Improving Grammatical Error Correction via Pre-...

68   236   236  

snow-owl

🦉 Snow Owl Terminology Server - production-ready, scalable, suppor...

26   236   236  

chatbot

Russian-language generative chatbot with a profile and facts

61   235   235  

tableQA

AI Tool for querying natural language on tabular data.

44   235   235  

llm_training_handbook

An open collection of methodologies to help with successful training o...

15   235   235  

fastRAG

Efficient Retrieval Augmentation and Generation Framework

20   235   235  

spacy-lookup

Named Entity Recognition based on dictionaries

41   234   234  

bnlp

BNLP is a natural language processing toolkit for Bengali Language.

49   234   234  

SummerTime

An open-source text summarization toolkit for non-experts. EMNLP'2021...

24   234   234  

open-semantic-etl

Python based Open Source ETL tools for file crawling, document process...

67   234   234  

spacyr

R wrapper to spaCy NLP

41   233   233  

nlp_classification

Implementing NLP papers relevant to classification with PyTorch, gluon...

41   231   231  

MetaLearning4NLP-Papers

A list of recent papers about Meta / few-shot learning methods applied...

25   231   231