Most popular natural-language-processing repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published an article that proposed a measure of machine intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and other natural-language tasks.

stminsights

A Shiny Application for Inspecting Structural Topic Models

16   116   116  

Pre-modern_Chinese_corpus_dataset

Pre-modern Chinese corpus dataset: natural language processing, corpus, ancient Chinese, Old Chinese, Classical Chinese, 数字人...

14   115   115  

klue-transformers-tutorial

A HuggingFace Transformers tutorial using KLUE data

16   115   115  

datalinguist

Stanford CoreNLP in idiomatic Clojure.

5   115   115  

XFUND

XFUND: A Multilingual Form Understanding Benchmark

15   115   115  

AI_ChatBot_Python

AI ChatBot using Python Tensorflow and Natural Language Processing (NL...

86   115   115  

greek-bert

A Greek edition of the BERT pre-trained language model

8   114   114  

Zemberek-Python-Examples

Zemberek Turkish NLP examples written in Python using the JPype packag...

14   113   113  

UniRE

Source code for "UniRE: A Unified Label Space for Entity Relation Extr...

20   113   113  

Sentiment

An example project using a feed-forward neural network for text sentim...

13   112   112  

Personal-Emotional-Stylized-Dialog

A Paper List for Personalized, Emotional, and Stylized Dialog

15   112   112  

tamnun-ml

An easy to use open-source library for advanced Deep Learning and Natu...

10   111   111  

lima

The Libre Multilingual Analyzer, a Natural Language Processing (NLP) C...

21   111   111  

mindspore-nlp-tutorial

Natural Language Processing Tutorial for MindSpore Users

23   111   111  

dataset

Darija <-> English dataset

48   111   111  

mtdata

A tool that locates, downloads, and extracts machine translation corpo...

16   110   110  

tanl

Structured Prediction as Translation between Augmented Natural Languag...

21   110   110  

mzutils

8   110   110  

chatbot-samples

🤖 Chatbots, dialogue templates

39   110   110  

toiro

A comparison tool for Japanese tokenizers

6   109   109  

practical-2

Oxford Deep NLP 2017 course - Practical 2: Text Classification

92   109   109  

COCO-LM

[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for...

14   109   109  

Kadot

Natural language processing using unsupervised vector representations.

9   108   108  

pynlp

A pythonic wrapper for Stanford CoreNLP.

11   108   108  

Hierarchical-Sentiment

Hierarchical Models for Sentiment Analysis in Pytorch

21   107   107  

meProp

meProp: Sparsified Back Propagation for Accelerated Deep Learning (ICM...

18   107   107  

spark-nlp-models

Models and Pipelines for the Spark NLP library

44   107   107  

crossnorm-selfnorm

CrossNorm and SelfNorm for Generalization under Distribution Shifts, I...

7   107   107  

keyATM

An R package for Keyword Assisted Topic Models

13   106   106  

Repo-2016

R, Python and Mathematica Codes in Machine Learning, Deep Learning, Ar...

117   106   106  

quantulum3

Library for unit extraction under active development - fork of quantul...

49   106   106  

awesome-papers

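A collection of papers from top journals and conferences in machine learning, deep learning, natural language processing, and computer vision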

34   106   106  

core

Clojure wrapper for the Stanford CoreNLP Java library

32   105   105  

MSR-NLP-Projects

This is a list of open-source projects at Microsoft Research NLP Group

8   105   105  

Python_NLP_Tutorial

This repository provides everything to get started with Python for Tex...

58   105   105  

lingfeat

LingFeat - A Comprehensive Linguistic Features Extraction ToolKit for...

11   105   105  

OKD-Reading-List

Papers for Open Knowledge Discovery

20   104   104  

nlp-with-pytorch

For the source code of <파이토치로 배우는 자연어 처리> (Learning Natural Language Processing with PyTorch; Hanbit Media, 2021)...

50   104   104  

gptsh

GPT.sh is a CLI tool built with Node.js and powered by OpenAI's GPT-3....

10   104   104  

edgar-crawler

Download financial reports from SEC's EDGAR quickly. Extract clean tex...

33   104   104  

sentence-similarity

PyTorch implementations of various deep learning models for paraphrase...

25   104   104  

lda-topic-modeling

A PureScript, browser-based implementation of LDA topic modeling.

17   103   103  

ja.text8

Japanese text8 corpus for word embedding.

8   103   103  

pymlask

Emotion analyzer for Japanese text

18   103   103  

spring

SPRING is a seq2seq model for Text-to-AMR and AMR-to-Text (AAAI2021).

23   103   103  

falcon2.0

Falcon 2.0 is a joint entity and relation linking tool over Wikidata.

21   102   102  

TeachingDataScience

Course notes for Data Science related topics, prepared in LaTeX

63   102   102  

meena-chatbot

Google's Meena transformer chatbot implementation

21   102   102  

HumanPrompt

A framework for human-readable prompt-based method with large language...

8   102   102  

HPSG-Neural-Parser

Source code for "Head-Driven Phrase Structure Grammar Parsing on Penn...

25   102   102