Most popular natural-language-processing repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of intelligence now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

word_forms

Accurately generate all possible forms of an English word, e.g. "electio...

72   634   634  

nlpia

Examples and libraries for "Natural Language Processing in Action" boo...

264   631   631  

fast_abs_rl

Code for ACL 2018 paper: "Fast Abstractive Summarization with Reinforc...

185   623   623  

small-text

Active Learning for Text Classification in Python

71   621   621  

BotLibre

An open platform for artificial intelligence, chat bots, virtual agent...

229   620   620  

cdQA

⛔ [NOT MAINTAINED] An End-To-End Closed Domain Question Answering Sys...

191   616   616  

graphbrain

Language, Knowledge, Cognition

70   614   614  

ML-ProjectKart

🙌 Kart of 234+ projects based on machine learning, deep learning, comp...

248   614   614  

Matterport3DSimulator

AI Research Platform for Reinforcement Learning from Real Panoramic Im...

136   611   611  

Multimodal-Toolkit

Multimodal model for text and tabular data with HuggingFace transforme...

90   605   605  

BERT-Relation-Extraction

PyTorch implementation for "Matching the Blanks: Distributional Simila...

135   598   598  

indic_nlp_library

Resources and tools for Indian language Natural Language Processing

167   597   597  

BioSentVec

BioWordVec & BioSentVec: pre-trained embeddings for biomedical words a...

100   595   595  

stealth

An open source Ruby framework for text and voice chatbots. 🤖

59   591   591  

NLP_Quickbook

NLP in Python with Deep Learning

231   591   591  

TrustLLM

[ICML 2024] TrustLLM: Trustworthiness in Large Language Models

58   589   589  

Awesome-Simultaneous-Translation

Paper list of simultaneous translation / streaming translation, includ...

8   588   588  

weixin_public_corpus

WeChat official account corpus

164   587   587  

xlnet-Pytorch

Simple XLNet implementation with PyTorch wrapper

106   581   581  

text_mining_resources

Resources for learning about Text Mining and Natural Language Processi...

198   581   581  

LeakGAN

The code for the paper "Long Text Generation via Adversarial Training with...

183   576   576  

text2sql-data

A collection of datasets that pair questions with SQL queries.

113   576   576  

bluebert

BlueBERT, pre-trained on PubMed abstracts and clinical notes (MIMIC-II...

82   575   575  

trainable-agents

Code and datasets for "Character-LLM: A Trainable Agent for Role-Play...

40   574   574  

CNSurvey

A list of Chinese-language survey articles (natural language processing & machine learning)

93   573   573  

MultiBench

[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Lea...

85   567   567  

LM-reasoning

This repository contains a collection of papers and resources on Reaso...

35   564   564  

Contrastive-Learning-NLP-Papers

Paper List for Contrastive Learning for Natural Language Processing

61   563   563  

UnifiedSKG

[EMNLP 2022] Unifying and multi-tasking structured knowledge grounding...

60   562   562  

ML-paper-notes

📓 Notes and summaries of various ML, Computer Vision & NLP pa...

79   560   560  

Book-SocialMediaMiningPython

Companion code for the book "Mastering Social Media Mining with Python...

265   560   560  

acl-anthology

Data and software for building the ACL Anthology.

352   559   559  

Transformers.jl

Julia Implementation of Transformer models

81   557   557  

Lemur

[ICLR 2024] Lemur: Open Foundation Models for Language Agents

33   555   555  

LMaaS-Papers

Awesome papers on Language-Model-as-a-Service (LMaaS)

32   554   554  

Sherlock

Natural-language event parser for JavaScript

34   554   554  

tensorflow-nlp-tutorial

Using TensorFlow: from text preprocessing to Topic Models, BERT, GPT, LLM...

286   552   552  

paperlists

Processed / Cleaned Data for Paper Copilot

23   549   549  

Comprehensive_DL_Tutor

Comprehensive Deep Learning Tutorial: From Zero To Hero

86   548   548  

CS224n-2019-solutions

Complete solutions for Stanford CS224n, winter, 2019

227   547   547  

NLP_bahasa_resources

A Curated List of Dataset and Usable Library Resources for NLP in Baha...

142   541   541  

rebel

REBEL is a seq2seq model that simplifies Relation Extraction (EMNLP 20...

73   540   540  

ner-lstm

Named Entity Recognition using multilayered bidirectional LSTM

181   538   538  

happy-transformer

Happy Transformer makes it easy to fine-tune and perform inference wit...

69   538   538  

Mengzi

Mengzi Pretrained Models

63   537   537  

nlp-notebook

Implementations of common NLP tasks, including new-word discovery and PyTorch-based word vectors, Chinese...

112   533   533  

dont-stop-pretraining

Code associated with the Don't Stop Pretraining ACL 2020 paper

73   533   533  

cookiecutter-spacy-fastapi

Cookiecutter API for creating Custom Skills for Azure Search using Pyt...

64   533   533  

MLInterview

A curated awesome list of AI Startups in India & Machine Lea...

170   532   532  

awesome-arabic

A curated list of awesome projects and dev/design resources for suppor...

97   521   521