Most popular natural-language-processing repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies how computers interact with human language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of intelligence now called the Turing test. More modern techniques, such as deep learning, have produced state-of-the-art results in language modeling, parsing, and many other natural-language tasks.

parallax

Tool for interactive embeddings visualization

24   315   315  

Tokenizer

Fast and customizable text tokenization library with BPE and SentenceP...

74   314   314  

KIVI

[ICML 2024] KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Ca...

32   313   313  

stringi

Fast and portable character string processing in R (with the Unicode I...

49   311   311  

pytorch-transformers-classification

Based on the Pytorch-Transformers library by HuggingFace. To be used a...

97   309   309  

stopwords

Default English stopword lists from many different sources

128   307   307  

insight

Repository for Project Insight: NLP as a Service

46   306   306  

NonAutoregGenProgress

Tracking the progress in non-autoregressive generation (translation, t...

27   306   306  

Contrastive_Learning_Papers

A list of contrastive learning papers

38   306   306  

naturalcc

NaturalCC: An Open-Source Toolkit for Code Intelligence

57   304   304  

bert-sklearn

A sklearn wrapper for Google's BERT model

70   301   301  

lda

LDA topic modeling for node.js

48   297   297  

deep-learning-nlp-rl-papers

Recent Deep Learning papers in NLU and RL

49   296   296  

mishkal

Mishkal is an Arabic text vocalization software

72   294   294  

ToD-BERT

Pre-Trained Models for ToD-BERT

55   293   293  

retvec

RETVec is an efficient, multilingual, and adversarially-robust text ve...

23   293   293  

revery

A personal semantic search engine capable of surfacing relevant bookma...

7   292   292  

WordGCN

ACL 2019: Incorporating Syntactic and Semantic Information in Word Emb...

64   291   291  

BOND

BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant S...

35   291   291  

question_generator

An NLP system for generating reading comprehension questions

74   291   291  

Kevinpro-NLP-demo

All NLP you Need Here. Currently includes PyTorch implementations of 15 NLP demos (much of the code borrowed...

55   288   288  

ineuron-full-stack-data-science-assignments

This repository features assignments and projects from the iNeuron ful...

210   287   287  

SWEM

The TensorFlow code for the ACL 2018 paper: "Baseline Needs More Love...

53   287   287  

knowledge-gpt

Extract knowledge from all information sources using gpt and other lan...

54   286   286  

languagecrunch

LanguageCrunch NLP server docker image

27   285   285  

id-nlp-resource

A list of Indonesian NLP resources.

49   285   285  

ContinualLM

An Extensible Continual Learning Framework Focused on Language Models...

21   284   284  

Mol-Instructions

[ICLR 2024] Mol-Instructions: A Large-Scale Biomolecular Instruction D...

15   283   283  

Good-Papers

I try my best to keep updated cutting-edge knowledge in Machine Learni...

57   283   283  

dialoglue

DialoGLUE: A Natural Language Understanding Benchmark for Task-Oriente...

27   283   283  

goodreads

Code samples for the Goodreads datasets

62   282   282  

shifterator

Interpretable data visualizations for understanding how texts differ a...

30   280   280  

Multi-Type-TD-TSR

Extracting Tables from Document Images using a Multi-stage Pipeline fo...

53   280   280  

TeachingDataScience

Course notes for Data Science related topics, prepared in LaTeX

146   279   279  

MentalLLaMA

This repository introduces MentaLLaMA, the first open-source instructi...

29   277   277  

Web-Database-Analytics

Web scraping and related analytics using Python tools

167   277   277  

transfomers-silicon-research

Research and Materials on Hardware implementation of Transformer Model

35   276   276  

nlp-tutorial

Tutorial: Natural Language Processing in Python

148   275   275  

bist-parser

Graph-based and Transition-based dependency parsers based on BiLSTMs

97   275   275  

pytorch_graph-rel

A PyTorch implementation of GraphRel

52   275   275  

KagNet

Knowledge-Aware Graph Networks for Commonsense Reasoning (EMNLP-IJCNLP...

55   274   274  

recurrent-entity-networks

TensorFlow implementation of "Tracking the World State with Recurrent...

65   273   273  

awesome-emotion-recognition-in-conversations

A comprehensive reading list for Emotion Recognition in Conversations

45   272   272  

ScanRefer

[ECCV 2020] ScanRefer: 3D Object Localization in RGB-D Scans using Nat...

28   272   272  

AI_ChatBot_Python

AI ChatBot using Python, TensorFlow, and Natural Language Processing (NL...

161   272   272  

Black-Box-Tuning

ICML'2022: Black-Box Tuning for Language-Model-as-a-Service & EMNLP'20...

31   270   270  

pytorch-question-answering

Important paper implementations for Question Answering using PyTorch

51   270   270  

awesome-tensorlayer

A curated list of dedicated resources and applications

59   269   269  

extreme-bert

ExtremeBERT is a toolkit that accelerates the pretraining of customize...

15   269   269  

InsTag

InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning

8   267   267