Most popular NLP repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies the interaction between computers and human language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of machine intelligence now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

HSCRF-pytorch

ACL 2018: Hybrid semi-Markov CRF for Neural Sequence Labeling (http://...

70   304   304  

bert_distill

BERT distillation (distillation experiments based on BERT)

83   304   304  

nlp_newsletter

📰Natural language processing (NLP) newsletter

20   303   303  

sparsezoo

Neural network model repository for highly sparse and sparse-quantized...

20   302   302  

rebel

REBEL is a seq2seq model that simplifies Relation Extraction (EMNLP 20...

47   302   302  

prodigy-openai-recipes

✨ Bootstrap annotation with zero- & few-shot learning via OpenAI GPT-3

26   302   302  

textpipe

Textpipe: clean and extract metadata from text

27   301   301  

Chatette

A powerful dataset generator for Rasa NLU, inspired by Chatito

52   301   301  

news-emotion

📉 Sentiment analysis model for financial text

126   300   300  

pyss3

A Python package implementing a new interpretable machine learning mod...

41   300   300  

rasa_nlu_gq

Turn natural language into structured data (supports Chinese, with N custom models,...

99   298   298  

BERT-of-Theseus

⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressin...

38   298   298  

awesome-list-of-awesomes

A curated list of all the Awesome --Topic Name-- lists I've found till...

42   297   297  

XPretrain

Multi-modality pre-training

15   297   297  

simple-effective-text-matching-pytorch

A pytorch implementation of the ACL2019 paper "Simple and Effective Te...

54   296   296  

multi-criteria-cws

Simple Solution for Multi-Criteria Chinese Word Segmentation

85   295   295  

NER-pytorch

LSTM+CRF NER

102   295   295  

OpenUE

OpenUE is a lightweight knowledge graph extraction toolkit (An Open Toolkit for Universal Ext...

59   295   295  

ark-nlp

A private nlp coding package, which quickly implements the SOTA soluti...

62   295   295  

insight

Repository for Project Insight: NLP as a Service

44   294   294  

deepsegment

A sentence segmenter that actually works!

57   294   294  

tensor_parallel

Automatically split your PyTorch models on multiple GPUs for training...

14   294   294  

pycantonese

Cantonese Linguistics and NLP

36   293   293  

komputation

Komputation is a neural network framework for the Java Virtual Machine...

13   292   292  

yargy

Rule-based facts extraction for Russian language

41   292   292  

COVID-19-NLP-vis

An interactive website for visualizing and analyzing COVID-19 epidemic data, built with flask + pyecharts,...

72   292   292  

cherche

📑 Neural Search

12   292   292  

rc-cnn-dailymail

CNN/Daily Mail Reading Comprehension Task

59   291   291  

transfer-nlp

NLP library designed for reproducible experimentation management

20   291   291  

nlp-data-augmentation

Data Augmentation for NLP

40   291   291  

cargo-spellcheck

Checks all your documentation for spelling and grammar mistakes with h...

26   291   291  

hanlp-lucene-plugin

HanLP Chinese word segmentation plugin for Lucene, supporting Lucene-based systems including Solr

100   289   289  

text-classification

Machine Learning and NLP: Text Classification using python, scikit-lea...

212   289   289  

kss

Kss: A Toolkit for Korean sentence segmentation

49   289   289  

NSC

Neural Sentiment Classification

97   288   288  

awesome-nlprojects

List of projects related to Natural Language Processing (NLP) that mak...

92   287   287  

Text-Classification

A natural language processing project whose goal is to classify text.

232   287   287  

languagecrunch

LanguageCrunch NLP server docker image

29   286   286  

Kiwi

Kiwi (an intelligent Korean morphological analyzer)

36   286   286  

similarities

Similarities: a toolkit for similarity calculation and semantic search...

36   284   284  

behemoth

Behemoth is an open source platform for large scale document analysis...

60   283   283  

RNNSharp

RNNSharp is a toolkit for deep recurrent neural networks which is widely...

91   283   283  

pyate

PYthon Automated Term Extraction

37   283   283  

discopy

The Python toolkit for computing with string diagrams.

61   283   283  

multifit

The code to reproduce results from paper "MultiFiT: Efficient Multi-li...

56   282   282  

Customer-Chatbot

Chinese intelligent customer service chatbot demo with two parts, chitchat and domain-specific Q&A; supports custom components (Chi...

110   282   282  

pixel

Research code for pixel-based encoders of language (PIXEL)

19   282   282  

lda

LDA topic modeling for node.js

43   281   281  

RasaTalk

A chatbot framework for Rasa NLU

87   281   281  

bert-sklearn

A sklearn wrapper for Google's BERT model

70   279   279