Most popular natural-language-processing repositories and open-source projects

Natural language processing (NLP) is a field of computer science and linguistics concerned with the interactions between computers and human language, in particular how to program computers to process and analyze natural-language data. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a test of machine intelligence now called the Turing test. More recent techniques, such as deep learning, have achieved state-of-the-art results in language modeling, parsing, and many other natural-language tasks.

mishkal

Mishkal is Arabic text vocalization software

68   226   226  

MAMS-for-ABSA

A Multi-Aspect Multi-Sentiment Dataset for aspect-based sentiment anal...

60   225   225  

FedNLP

FedNLP: An Industry and Research Integrated Platform for Federated Lea...

42   225   225  

concise-concepts

This repository contains an easy and intuitive approach to few-shot NE...

19   225   225  

GermanWordEmbeddings

Toolkit to obtain and preprocess German corpora, train models using wo...

51   224   224  

neuralqa

NeuralQA: A Usable Library for Question Answering on Large Datasets wi...

33   223   223  

vec4ir

Word Embeddings for Information Retrieval

41   223   223  

hunspell-dict-ko

Korean spellchecking dictionary for Hunspell

40   223   223  

visdial

[CVPR 2017] Torch code for Visual Dialog

66   221   221  

AIDL_KB

A Knowledge Base for the FB Group Artificial Intelligence and Deep Lea...

43   217   217  

awesome-emotion-recognition-in-conversations

A comprehensive reading list for Emotion Recognition in Conversations

36   217   217  

coursera-natural-language-processing-specialization

Programming assignments from all courses in the Coursera Natural Langu...

273   217   217  

claf

CLaF: Open-Source Clova Language Framework

37   216   216  

Coursera-Deep-Learning

My notes / works on deep learning from Coursera

229   216   216  

fromage

🧀 Code and models for the paper "Grounding Language Models to Images f...

11   216   216  

NLP4Rec-Papers

Paper list of NLP for recommender systems

48   215   215  

Speech_Signal_Processing_and_Classification

Front-end speech processing aims at extracting proper features from sh...

61   215   215  

AGI-Papers

Papers and Book to look at when starting AGI 📚

28   215   215  

forte

Forte is a flexible and powerful ML workflow builder. This is part of...

60   215   215  

KeyphraseVectorizers

Set of vectorizers that extract keyphrases with part-of-speech pattern...

32   215   215  

open-sesame

A frame-semantic parsing system based on a softmax-margin SegRNN.

65   214   214  

MedQuAD

Medical Question Answering Dataset of 47,457 QA pairs created from 12...

43   214   214  

PIE

Fast + Non-Autoregressive Grammatical Error Correction using BERT. Cod...

40   214   214  

graph-convolution-nlp

Graph Convolution Network for NLP

36   213   213  

hmni

📛 Fuzzy Name Matching with Machine Learning

43   213   213  

turkish-stemmer-python

🐍 Turkish Language Stemmer for Python

31   211   211  

LM-reasoning

This repository contains a collection of papers and resources on Reaso...

17   211   211  

AdaSeq

AdaSeq: An All-in-One Library for Developing State-of-the-Art Sequence...

19   209   209  

rnn_lstm_from_scratch

How to build RNNs and LSTMs from scratch with NumPy.

65   207   207  

Tokenizer

Fast and customizable text tokenization library with BPE and SentenceP...

57   207   207  

polish-nlp-resources

Pre-trained models and language resources for Natural Language Process...

19   207   207  

delbot

It understands your voice commands, searches news and knowledge source...

70   206   206  

Multi-Type-TD-TSR

Extracting Tables from Document Images using a Multi-stage Pipeline fo...

44   204   204  

udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and...

31   203   203  

SpeechTransProgress

Tracking the progress in end-to-end speech translation

21   203   203  

prosody

Helsinki Prosody Corpus and A System for Predicting Prosodic Prominenc...

38   202   202  

DAMO-ConvAI

DAMO-ConvAI: The official repository which contains the codebase for A...

45   202   202  

phrasal

A large-scale statistical machine translation system written in Java.

89   201   201  

paraphrase_identification

Examine two sentences and determine whether they have the same meaning...

78   201   201  

vntk

Vietnamese NLP Toolkit for Node

59   200   200  

fixy

Our goal is to solve many different problems in the Turkish NLP literature together...

20   199   199  

markup

A web-based document annotation tool, powered by GPT-4 🚀

32   199   199  

nl2sql

Top-6 entry in the first Alibaba Tianchi Chinese NL2SQL Challenge

49   198   198  

arXivNotes

Summaries of papers I have read related to NLP (natural language processing), written up in the Issues...

8   197   197  

data-science-toolkit

Collection of stats, modeling, and data science tools in Python and R.

42   197   197  

Awesome-NLP-Resources

This repository contains landmark research papers in Natural Language...

53   197   197  

Black-Box-Tuning

ICML'2022: Black-Box Tuning for Language-Model-as-a-Service & EMNLP'20...

20   197   197  

notebooks

Jupyter Notebooks with Deep Learning Tutorials

123   196   196  

displacy-ent

💥 displaCy-ent.js: An open-source named entity visualiser for the...

43   196   196  

dkpro-core

Collection of software components for natural language processing (NLP...

71   196   196