Most popular natural-language-processing repositories and open-source projects

Natural language processing (NLP) is a field of computer science that studies how computers process, analyze, and generate human language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of machine intelligence now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

spacy-lookup

Named Entity Recognition based on dictionaries

41   234   234  

ua-gec

UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrain...

23   232   232  

Web-Database-Analytics

Web scraping and related analytics using Python tools

163   231   231  

zshot

Zero and Few shot named entity & relationships recognition

14   231   231  

scientific-paper-summarisation

Machine learning models to automatically summarise scientific papers

59   230   230  

question_generator

An NLP system for generating reading comprehension questions

66   229   229  

nlp_profiler

A simple NLP library that allows profiling datasets with one or more text c...

35   228   228  

nlvr

Cornell NLVR and NLVR2 are natural language grounding datasets. Each e...

57   227   227  

bert-vocab-builder

Builds a WordPiece (subword) vocabulary compatible with Google Research's...

47   226   226  

mishkal

Mishkal is Arabic text vocalization software

68   226   226  

AIDL_KB

A Knowledge Base for the FB Group Artificial Intelligence and Deep Lea...

45   225   225  

MAMS-for-ABSA

A Multi-Aspect Multi-Sentiment Dataset for aspect-based sentiment anal...

60   225   225  

FedNLP

FedNLP: An Industry and Research Integrated Platform for Federated Lea...

42   225   225  

concise-concepts

This repository contains an easy and intuitive approach to few-shot NE...

19   225   225  

neuralqa

NeuralQA: A Usable Library for Question Answering on Large Datasets wi...

33   223   223  

vec4ir

Word Embeddings for Information Retrieval

41   223   223  

hunspell-dict-ko

Korean spellchecking dictionary for Hunspell

40   223   223  

visdial

[CVPR 2017] Torch code for Visual Dialog

66   221   221  

awesome-emotion-recognition-in-conversations

A comprehensive reading list for Emotion Recognition in Conversations

36   217   217  

claf

CLaF: Open-Source Clova Language Framework

37   216   216  

Coursera-Deep-Learning

My notes / works on deep learning from Coursera

229   216   216  

TEXTOIR

TEXTOIR is the first open-source toolkit for text open intent recogniti...

30   216   216  

fromage

🧀 Code and models for the paper "Grounding Language Models to Images...

11   216   216  

KeyphraseVectorizers

Set of vectorizers that extract keyphrases with part-of-speech pattern...

32   215   215  

data-science-toolkit

Collection of stats, modeling, and data science tools in Python and R.

42   215   215  

NLP4Rec-Papers

Paper list of NLP for recommender systems

48   215   215  

udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and...

33   215   215  

Speech_Signal_Processing_and_Classification

Front-end speech processing aims at extracting proper features from sh...

61   215   215  

AGI-Papers

Papers and books to look at when starting AGI 📚

28   215   215  

forte

Forte is a flexible and powerful ML workflow builder. This is part of...

60   215   215  

open-sesame

A frame-semantic parsing system based on a softmax-margin SegRNN.

65   214   214  

MedQuAD

Medical Question Answering Dataset of 47,457 QA pairs created from 12...

43   214   214  

PIE

Fast + Non-Autoregressive Grammatical Error Correction using BERT. Cod...

40   214   214  

graph-convolution-nlp

Graph Convolution Network for NLP

36   213   213  

Awesome-NLP-Resources

This repository contains landmark research papers in Natural Language...

54   213   213  

delbot

It understands your voice commands, searches news and knowledge source...

68   212   212  

fixy

Our goal is to solve many different problems in the Turkish NLP literature together...

18   211   211  

turkish-stemmer-python

🐍 Turkish Language Stemmer for Python

31   211   211  

AdaSeq

AdaSeq: An All-in-One Library for Developing State-of-the-Art Sequence...

19   209   209  

rnn_lstm_from_scratch

How to build RNNs and LSTMs from scratch with NumPy.

65   207   207  

Tokenizer

Fast and customizable text tokenization library with BPE and SentenceP...

57   207   207  

polish-nlp-resources

Pre-trained models and language resources for Natural Language Process...

19   207   207  

gpt-j

A GPT-J API to use with python3 to generate text, blogs, code, and mor...

53   206   206  

deeplearning.ai

96   203   203  

prosody

Helsinki Prosody Corpus and A System for Predicting Prosodic Prominenc...

38   202   202  

DAMO-ConvAI

DAMO-ConvAI: The official repository which contains the codebase for A...

45   202   202  

LAMDA-SSL

30 Semi-Supervised Learning Algorithms

16   201   201  

DeepLearning.AI-TensorFlow-Developer-Professional-Certificate

DeepLearning.AI TensorFlow Developer Professional Certificate

144   201   201  

phrasal

A large-scale statistical machine translation system written in Java.

89   201   201  

paraphrase_identification

Examine two sentences and determine whether they have the same meaning...

78   201   201