Most popular NLP repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies how computers process and analyze human language. In 1950, Alan Turing published "Computing Machinery and Intelligence", an article that proposed a measure of intelligence now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and other natural-language tasks.

data-science-portfolio

Portfolio of data science projects completed by me for academic, self...

424   959   959  

bert_language_understanding

Pre-training of Deep Bidirectional Transformers for Language Understan...

214   954   954  

lingua-go

The most accurate natural language detection library for Go, suitable...

55   945   945  

budoux

23   945   945  

chatgpt-comparison-detection

Human ChatGPT Comparison Corpus (HC3), Detectors, and more! 🔥

80   943   943  

awesome-transformer-nlp

A curated list of NLP resources focused on Transformer networks, atten...

114   939   939  

kogpt

KakaoBrain KoGPT (Korean Generative Pre-trained Transformer)

135   935   935  

torchdistill

A coding-free framework built on PyTorch for reproducible deep learnin...

100   929   929  

tutorials

AI-related tutorials. Access any of them for free → https://towardsai....

341   920   920  

rasa-ui

Rasa UI is a frontend for the Rasa Framework

315   919   919  

Summarization-Papers

Summarization Papers

139   919   919  

awesome-sentiment-analysis

😀😄😂😭 A curated list of Sentiment Analysis methods, implementations and...

166   899   899  

obsei

Obsei is a low code AI powered automation tool. It can be used in vari...

121   899   899  

YouTokenToMe

Unsupervised text tokenizer focused on computational efficiency

81   889   889  

pointer_summarizer

pytorch implementation of "Get To The Point: Summarization with Pointe...

248   886   886  

K-BERT

Source code of K-BERT (AAAI2020)

203   884   884  

KGQA_HLM

A knowledge-graph-based character relationship visualization and question-answering system for "Dream of the Red Chamber"

266   884   884  

jcseg

Jcseg is a lightweight NLP framework developed in Java. Provides CJK...

216   880   880  

whatlang-rs

Natural language detection library for Rust. Try demo online: https://...

116   877   877  

Transformers4Rec

Transformers4Rec is a flexible and efficient library for sequential an...

127   867   867  

wikipedia2vec

A tool for learning vector representations of words and entities from...

97   866   866  

gpt-2-Pytorch

Simple Text-Generator with OpenAI gpt-2 Pytorch Implementation

211   866   866  

soynlp

A Python library for Korean natural language processing. Word extraction/ tokeni...

184   853   853  

nlp-notebooks

A collection of notebooks for Natural Language Processing from NLP Tow...

344   847   847  

wit

WIT (Wikipedia-based Image Text) Dataset is a large multimodal multili...

36   838   838  

Chatito

🎯🗯 Dataset generation for AI chatbots, NLP tasks, named entity recogni...

163   837   837  

bert4torch

An elegant PyTorch implementation of transformers

104   837   837  

clean-text

🧹 Python package for text cleaning

71   836   836  

rpaframework

Collection of open-source libraries and tools for Robotic Process Auto...

142   830   830  

seq2seq-chatbot

Chatbot in 200 lines of code using TensorLayer

317   829   829  

bolt

Bolt is a deep learning library with high performance and heterogeneou...

154   829   829  

awesome-document-understanding

A curated list of resources for Document Understanding (DU) topic

111   828   828  

MemN2N-tensorflow

"End-To-End Memory Networks" in Tensorflow

255   824   824  

WEB_KG

Crawls Chinese-language Baidu Baike pages, extracts triples, and builds a Chinese knowledge graph

188   823   823  

openai-kotlin

OpenAI API client for Kotlin with multiplatform and coroutines capabil...

84   821   821  

BERT-keras

Keras implementation of BERT with pre-trained weights

202   820   820  

LightAutoML

LAMA - automatic model creation framework

93   815   815  

NLP-Tutorials

Simple implementations of NLP models. Tutorials are written in Chinese...

300   810   810  

iowncode

A curated collection of iOS, ML, AR resources sprinkled with some UI a...

316   809   809  

awesome-gcn

Resources for graph convolutional networks

131   808   808  

lightNLP

A deep learning framework for natural language processing based on PyTorch and torchtext.

217   805   805  

TextGAN-PyTorch

TextGAN is a PyTorch framework for Generative Adversarial Networks (GA...

190   800   800  

CodeT5

Code for CodeT5: a new code-aware pre-trained encoder-decoder model.

146   800   800  

TextClassification-Keras

Text classification models implemented in Keras, including: FastText,...

188   799   799  

self-attentive-parser

High-accuracy NLP parser with models for 11 languages.

150   799   799  

aiva

AIVA (A.I. Virtual Assistant): General-purpose virtual assistant for d...

601   794   794  

gector

Official implementation of the papers "GECToR – Grammatical Error Corr...

202   793   793  

inltk

Natural Language Toolkit for Indic Languages aims to provide out of th...

165   791   791  

bigscience

Central place for the engineering/scaling WG: documentation, SLURM scr...

75   790   790  

cltk

The Classical Language Toolkit

318   789   789