Most popular NLP repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies how computers process, analyze, and generate human language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of machine intelligence now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

ml-projects

ML based projects such as Spam Classification, Time Series Analysis, T...

106   222   222  

embeddings

Fast, DB-backed pretrained word embeddings for natural language proces...

31   222   222  

razdel

Rule-based token and sentence segmentation for the Russian language

28   222   222  

text-dedup

All-in-one text de-duplication

28   221   221  

lm-spanish

Official source for Spanish Language Models and resources made @ BSC-T...

18   220   220  

CNN-text-classification-keras

Text Classification by Convolutional Neural Network in Keras

94   220   220  

segmentit

Chinese word segmentation package usable in any JS environment, forked from leizongmin/node-segment

13   219   219  

KB-ALBERT

Korean ALBERT model specialized for the economics/finance domain, provided by KB Kookmin Bank

44   219   219  

label-sleuth

Open source no-code system for text annotation and building of text cl...

38   218   218  

claf

CLaF: Open-Source Clova Language Framework

36   218   218  

JFastText

Java interface for fastText

99   217   217  

nl2sql

Top-6 entry in the first Alibaba Tianchi Chinese NL2SQL Challenge

53   217   217  

AGI-Papers

Papers and Book to look at when starting AGI 📚

28   215   215  

Cornucopia-LLaMA-Fin-Chinese

Cornucopia (聚宝盆): LLaMA models fine-tuned on Chinese financial knowledge; covers SFT, RLHF, GP...

22   215   215  

OpenGPT

A framework for creating grounded instruction based datasets and train...

25   215   215  

Awesome-NLP-Resources

This repository contains landmark research papers in Natural Language...

54   215   215  

bert-chainer

Chainer implementation of "BERT: Pre-training of Deep Bidirectional Tr...

41   214   214  

radish

C++ model training and inference framework

36   214   214  

udpipe

R package for Tokenization, Parts of Speech Tagging, Lemmatization and...

34   214   214  

fixy

Our goal is to solve many different problems in the Turkish NLP literature together...

18   213   213  

KoGPT2-FineTuning

🔥 Korean GPT-2, KoGPT2 FineTuning cased. Trained on Korean song lyrics data 🔥

60   213   213  

graph-convolution-nlp

Graph Convolution Network for NLP

36   212   212  

triviaqa

Code for the TriviaQA reading comprehension dataset

40   212   212  

Persian-NER

Large labeled corpus for Persian named entity recognition

19   212   212  

sharingan

Tool to extract news articles from newspaper and give the context abou...

26   211   211  

unify-emotion-datasets

A Survey and Experiments on Annotated Corpora for Emotion Classificati...

46   211   211  

python-bpe

Byte Pair Encoding for Python!

38   211   211  

laserembeddings

LASER multilingual sentence embeddings as a pip package

27   211   211  

NSP-BERT

The code for our paper "NSP-BERT: A Prompt-based Zero-Shot Learner Thr...

39   209   209  

indonesian-NLP-resources

Data resources for Indonesian-language NLP

53   209   209  

emailGPT

a quick and easy interface to generate emails with ChatGPT

30   208   208  

RLHF

Implementation of Chinese ChatGPT

23   208   208  

bert_for_corrector

Chinese text error correction based on BERT

47   207   207  

doc-han-att

Hierarchical Attention Networks for Chinese Sentiment Classification

55   207   207  

danlp

DaNLP is a repository for Natural Language Processing resources for th...

34   206   206  

ml_things

This is where I put things I find useful that speed up my work with Ma...

60   206   206  

vnlp

State-of-the-art, lightweight NLP tools for Turkish language. Develope...

17   206   206  

Competition_CAIL

Individual entry for the 2018 China "Fa Yan Cup" Legal AI Challenge (CAIL2018)

61   205   205  

cutlet

Japanese to romaji converter in Python

19   205   205  

numerizer

A Python module to convert natural language numerics into ints and flo...

23   205   205  

mauve

Package to compute Mauve, a similarity score between neural text and h...

17   205   205  

programming-book-3

Programming books 3: Python, Machine-Learning, Deep-Learning, NLP

82   205   205  

vaporetto

🛥 Vaporetto: Very accelerated pointwise prediction based tokenizer

8   203   203  

gpt-j

A GPT-J API to use with python3 to generate text, blogs, code, and mor...

53   203   203  

neuro

🔮 Neuro.js is a machine learning library for building AI assistants and...

32   201   201  

spacy-clausie

Implementation of the ClausIE information extraction system for python...

30   201   201  

transition-amr-parser

SoTA Abstract Meaning Representation (AMR) parsing with word-node alig...

42   201   201  

examples

Analyze the unstructured data with Towhee, such as reverse image searc...

62   201   201  

ernie

Simple State-of-the-Art BERT-Based Sentence Classification with Keras...

31   201   201  

text-emotion-classification

Archived - not answering issues

82   200   200