Most popular natural-language-processing repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies the interactions between computers and human language. In 1950, Alan Turing published an article, "Computing Machinery and Intelligence," that proposed a measure of intelligence now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

GermanWordEmbeddings

Toolkit to obtain and preprocess German text corpora, train models and...

51   239   239  

tensorflow_qrnn

QRNN implementation for TensorFlow

38   236   236  

open-sesame

A frame-semantic parsing system based on a softmax-margin SegRNN.

67   236   236  

word2vec-pytorch

Implementation of the first paper on word2vec

94   236   236  

KnowAgent

[NAACL 2025] KnowAgent: Knowledge-Augmented Planning for LLM-Based Ag...

17   235   235  

neuralqa

NeuralQA: A Usable Library for Question Answering on Large Datasets wi...

31   233   233  

GLiREL

Generalist and Lightweight Model for Relation Extraction (Extract any...

16   232   232  

natml-unity

High performance, cross-platform machine learning for Unity Engine.

25   231   231  

TEXTOIR

TEXTOIR is the first open-source toolkit for text open intent recogniti...

30   231   231  

encodechka

The tiniest sentence encoder for Russian language

12   231   231  

presidio-research

This package features data-science related tasks for developing new re...

66   230   230  

visdial

[CVPR 2017] Torch code for Visual Dialog

69   230   230  

bert-vocab-builder

Builds a WordPiece (subword) vocabulary compatible with Google Research's...

48   230   230  

PIE

Fast + Non-Autoregressive Grammatical Error Correction using BERT. Cod...

39   230   230  

NLP4Rec-Papers

Paper list of NLP for recommender systems

50   229   229  

TabularSemanticParsing

Translating natural language questions to structured query language (S...

52   229   229  

AutoAct

[ACL 2024] AutoAct: Automatic Agent Learning from Scratch for QA via S...

14   229   229  

AIDL_KB

A Knowledge Base for the FB Group Artificial Intelligence and Deep Lea...

47   228   228  

turkish-stemmer-python

Turkish Language Stemmer for Python

31   228   228  

vec4ir

Word Embeddings for Information Retrieval

41   225   225  

paraphrase_identification

Examine two sentences and determine whether they have the same meaning...

85   224   224  

FedNLP

FedNLP: An Industry and Research Integrated Platform for Federated Lea...

44   223   223  

Awesome-Biomolecule-Language-Cross-Modeling

Awesome-Biomolecule-Language-Cross-Modeling: a curated list of resourc...

14   220   220  

llama-2-jax

JAX implementation of the Llama 2 model

24   219   219  

data-science-toolkit

Collection of stats, modeling, and data science tools in Python and R.

42   219   219  

claf

CLaF: Open-Source Clova Language Framework

36   218   218  

classy-classification

This repository contains an easy and intuitive approach to few-shot cl...

15   218   218  

vntk

Vietnamese NLP Toolkit for Node

63   217   217  

nl2sql

Top-6 solution for the first Alibaba Tianchi Chinese NL2SQL Challenge

53   217   217  

awesome-NLP-resources

A collection of NLP projects and tools.

36   216   216  

Awesome-NLP-Resources

This repository contains landmark research papers in Natural Language...

54   215   215  

AGI-Papers

Papers and Book to look at when starting AGI 📚

28   215   215  

udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and...

34   214   214  

Tree-Transformer

Implementation of the paper Tree Transformer

35   214   214  

fixy

Our aim is to solve many different problems in the Turkish NLP literature in one place...

18   213   213  

awesome-llm-courses

A curated list of awesome online courses about Large Language Models (L...

16   213   213  

SimplyRetrieve

Lightweight chat AI platform featuring custom knowledge, open-source L...

14   212   212  

graph-convolution-nlp

Graph Convolution Network for NLP

36   212   212  

delbot

It understands your voice commands, searches news and knowledge source...

68   212   212  

phrasal

A large-scale statistical machine translation system written in Java.

88   211   211  

deeplearning.ai

102   211   211  

PersianQA

Persian (Farsi) Question Answering Dataset (+ Models)

18   210   210  

XFUND

XFUND: A Multilingual Form Understanding Benchmark

21   208   208  

LAMDA-SSL

30 Semi-Supervised Learning Algorithms

16   208   208  

DeepLearning.AI-TensorFlow-Developer-Professional-Certificate

DeepLearning.AI TensorFlow Developer Professional Certificate

148   208   208  

PaperScraper

A web scraping tool to systematically extract the text of scientific p...

68   207   207  

awesome-ukrainian-nlp

Curated list of Ukrainian natural language processing (NLP) resources...

20   207   207  

CRF-Layer-on-the-Top-of-BiLSTM

The CRF Layer was implemented by using Chainer 2.0. Please see more de...

51   207   207  

NewsRecommender

A news recommendation system tailored for user communities

88   206   206  

persian-stopwords

Persian (Farsi) Stop Words List

117   206   206