Most popular NLP repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies how computers process, analyze, and generate human language. In 1950, Alan Turing published "Computing Machinery and Intelligence," which proposed a measure of machine intelligence now called the Turing test. More recent techniques, such as deep learning, have produced state-of-the-art results in language modeling, parsing, and many other natural-language tasks.

OpenAI-CLIP

Simple implementation of OpenAI CLIP model in PyTorch.

50   313   313  

BMList

A List of Big Models

10   312   312  

StruMatchDL

Code for the ICML 2022 paper: Matching Structure for Dual Learning

2   311   311  

stringi

Fast and portable character string processing in R (with the Unicode I...

47   309   309  

mlm-scoring

Python library & examples for Masked Language Model Scoring (ACL 2020)

60   308   308  

megabots

🤖 State-of-the-art, production ready LLM apps made mega-easy, so you...

29   307   307  

fugashi

A Cython MeCab wrapper for fast, pythonic Japanese tokenization and mo...

26   305   305  

PERT

PERT: Pre-training BERT with Permuted Language Model

22   305   305  

Transformers_for_Text_Classification

Text classification based on Transformers

66   305   305  

HSCRF-pytorch

ACL 2018: Hybrid semi-Markov CRF for Neural Sequence Labeling (http://...

70   304   304  

bert_distill

BERT distillation (distillation experiments based on BERT)

83   304   304  

nlp_newsletter

📰Natural language processing (NLP) newsletter

20   303   303  

sparsezoo

Neural network model repository for highly sparse and sparse-quantized...

20   302   302  

rebel

REBEL is a seq2seq model that simplifies Relation Extraction (EMNLP 20...

47   302   302  

prodigy-openai-recipes

✨ Bootstrap annotation with zero- & few-shot learning via OpenAI GPT-...

26   302   302  

textpipe

Textpipe: clean and extract metadata from text

27   301   301  

Chatette

A powerful dataset generator for Rasa NLU, inspired by Chatito

52   301   301  

news-emotion

📉 Sentiment analysis model for financial text

126   300   300  

RasaTalk

A chatbot framework for Rasa NLU

86   300   300  

rasa_nlu_gq

Turn natural language into structured data (supports Chinese, with N custom models, ...

99   298   298  

BERT-of-Theseus

⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressi...

38   298   298  

awesome-list-of-awesomes

A curated list of all the Awesome --Topic Name-- lists I've found till...

42   297   297  

XPretrain

Multi-modality pre-training

15   297   297  

simple-effective-text-matching-pytorch

A pytorch implementation of the ACL2019 paper "Simple and Effective Te...

54   296   296  

multi-criteria-cws

Simple Solution for Multi-Criteria Chinese Word Segmentation

85   295   295  

NER-pytorch

LSTM+CRF NER

102   295   295  

OpenUE

OpenUE is a lightweight knowledge graph extraction toolkit (An Open Toolkit for Universal Ext...

59   295   295  

ark-nlp

A private nlp coding package, which quickly implements the SOTA soluti...

62   295   295  

tensor_parallel

Automatically split your PyTorch models on multiple GPUs for training...

14   294   294  

insight

Repository for Project Insight: NLP as a Service

44   294   294  

lda

LDA topic modeling for node.js

49   294   294  

deepsegment

A sentence segmenter that actually works!

57   294   294  

komputation

Komputation is a neural network framework for the Java Virtual Machine...

11   293   293  

pycantonese

Cantonese Linguistics and NLP

36   293   293  

yargy

Rule-based facts extraction for Russian language

41   292   292  

cherche

📑 Neural Search

12   292   292  

rc-cnn-dailymail

CNN/Daily Mail Reading Comprehension Task

59   291   291  

transfer-nlp

NLP library designed for reproducible experimentation management

20   291   291  

nlp-data-augmentation

Data Augmentation for NLP

40   291   291  

cargo-spellcheck

Checks all your documentation for spelling and grammar mistakes with h...

26   291   291  

hanlp-lucene-plugin

HanLP Chinese word segmentation plugin for Lucene, supporting Lucene-based systems including Solr

100   289   289  

text-classification

Machine Learning and NLP: Text Classification using python, scikit-lea...

212   289   289  

kss

Kss: A Toolkit for Korean sentence segmentation

49   289   289  

NSC

Neural Sentiment Classification

97   288   288  

SAPConversationalAI

✨ 🤖 🤖 Build your own conversational bot on our Collaborative Bot Pl...

66   287   287  

awesome-nlprojects

List of projects related to Natural Language Processing (NLP) that mak...

92   287   287  

Text-Classification

A natural language processing project whose goal is to classify text.

232   287   287  

languagecrunch

LanguageCrunch NLP server docker image

29   286   286  

Kiwi

Kiwi (Intelligent Korean Morphological Analyzer)

36   286   286  

similarities

Similarities: a toolkit for similarity calculation and semantic search...

36   284   284