Most popular natural-language-processing repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published an article, "Computing Machinery and Intelligence", that proposed a measure of intelligence now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and other natural-language tasks.

NeuroNLP2

Deep neural models for core NLP tasks (PyTorch version)

90   441   441  

medaCy

🏥 Medical Text Mining and Information Extraction with spaCy

92   439   439  

Deep-Learning-NLP

📡 Organized Resources for Deep Learning in Natural Language...

124   436   436  

cmrc2018

A Span-Extraction Dataset for Chinese Machine Reading Comprehension (...

88   436   436  

pykakasi

Lightweight converter from Japanese Kana-kanji sentences into Kana-Rom...

54   434   434  

inseq

Interpretability for sequence generation models 🐛 🔍

38   432   432  

nlp-papers-with-arxiv

Statistics and accepted paper list of NLP conferences with arXiv link

55   431   431  

machine-learning-resources

A curated list of awesome machine learning frameworks, libraries, cour...

126   431   431  

Awesome-Distributed-Deep-Learning

A curated list of awesome Distributed Deep Learning resources.

84   428   428  

awesome-financial-nlp

Research on Natural Language Processing for the Financial Domain

64   426   426  

textaugment

TextAugment: Text Augmentation Library

60   425   425  

ChineseBLUE

Chinese Biomedical Language Understanding Evaluation benchmark (Chines...

82   423   423  

low-resource-languages

Resources for conservation, development, and documentation of low reso...

57   422   422  

ResourceBank_CV_NLP_MLOPS_2022

This repository offers a goldmine of materials for students of compute...

91   421   421  

USC-DS-RelationExtraction

Distantly Supervised Relation Extraction

108   420   420  

contextualSpellCheck

✔️ Contextual word checker for better suggestions (not actively maintai...

64   417   417  

whichlang

A blazingly fast and lightweight language detection library for Rust

20   417   417  

NLP-Natural-Language-Processing

Projects and useful articles / links

78   416   416  

ArticutAPI

API of Articut Chinese word segmentation (with semantic part-of-speech tagging): "word segmentation" (斷詞), also called 分詞, is...

39   413   413  

MedQuAD

Medical Question Answering Dataset of 47,457 QA pairs created from 12...

65   412   412  

dialogflow-javascript-client

JavaScript Web SDK for Dialogflow

171   411   411  

adaptnlp

An easy to use Natural Language Processing library and framework for p...

39   411   411  

edgar-crawler

The only open-source toolkit that can download SEC EDGAR financial rep...

108   411   411  

anlp19

Course repo for Applied Natural Language Processing (Spring 2019)

100   408   408  

nlpnet

A neural network architecture for NLP tasks, using Cython for fast per...

104   408   408  

clause

🏇 Chatbot, natural language understanding, semantic understanding

119   407   407  

nagisa

A Japanese tokenizer based on recurrent neural networks

23   402   402  

link-grammar

The CMU Link Grammar natural language parser

118   399   399  

FakeNewsCorpus

A dataset of millions of news articles scraped from a curated list of...

98   398   398  

awesome-python

🐍 Hand-picked awesome Python libraries and frameworks, organised by c...

30   398   398  

airy

💬 Open Source App Framework to build streaming apps with real-time d...

49   395   395  

customizable-gpt-chatbot

A dynamic, scalable AI chatbot built with Django REST framework, suppo...

88   394   394  

NLP101

NLP 101: a resource repository for Deep Learning and Natural Language...

57   392   392  

trade-dst

Source code for transferable dialogue state generator (TRADE, Wu et al...

114   392   392  

Deep-Generative-Models-for-Natural-Language-Processing

DGMs for NLP. A roadmap.

32   391   391  

tf-seq2seq

Sequence to sequence learning using TensorFlow.

108   389   389  

nlp

[UNMAINTAINED] Extract values from strings and fill your structs with n...

32   389   389  

HugNLP

CIKM2023 Best Demo Paper Award. HugNLP is a unified and comprehensive...

48   389   389  

dynalang

Code for "Learning to Model the World with Language." ICML 2024 Oral.

28   388   388  

pycantonese

Cantonese Linguistics and NLP

43   388   388  

OmniEvent

A comprehensive, unified and modular event extraction toolkit.

37   387   387  

korean-hate-speech

Korean HateSpeech Dataset

39   386   386  

awesome-bioie

🧫 A curated list of resources relevant to doing Biomedical Informatio...

34   385   385  

beginner_nlp

A curated list of beginner resources in Natural Language Processing

81   384   384  

zshot

Zero and Few shot named entity & relationships recognition

25   383   383  

scriptum

No-Frills Functional Programming Lib Augmenting JavaScript/Node.js

20   383   383  

DeCLUTR

The corresponding code from our paper "DeCLUTR: Deep Contrastive Learn...

33   380   380  

FinNLP-Progress

NLP progress in Fintech. A repository to track the progress in Natural...

60   374   374  

gcn-over-pruned-trees

Graph Convolution over Pruned Dependency Trees Improves Relation Extra...

70   373   373  

OPUS-MT-train

Training open neural machine translation models

45   371   371