Most popular natural-language-processing repositories and open-source projects

Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published an article, "Computing Machinery and Intelligence," that proposed a measure of machine intelligence now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

tensorflow-1.4-billion-password-analysis

Deep Learning model to analyze a large corpus of clear text passwords.

394   1945   1945  

sling

SLING - A natural language frame semantics parser

266   1931   1931  

NCRFpp

NCRF++, a Neural Sequence Labeling Toolkit. Easy use to any sequence l...

445   1898   1898  

awesome-semi-supervised-learning

😎 An up-to-date & curated list of awesome semi-supervised learning pa...

229   1846   1846  

oasis

🏝️ OASIS: Open Agent Social Interaction Simulations with One Million A...

199   1844   1844  

spago

Self-contained Machine Learning and Natural Language Processing librar...

88   1819   1819  

awesome-embedding-models

A curated list of awesome embedding models tutorials, projects and com...

250   1802   1802  

Awesome-FL

Comprehensive and timely academic information on federated learning (p...

202   1801   1801  

bert_score

BERT score for text generation

232   1790   1790  

spacy-models

💫 Models for the spaCy Natural Language Processing (NLP) library

306   1775   1775  

kaggle-CrowdFlower

1st Place Solution for CrowdFlower Product Search Results Relevance Co...

657   1770   1770  

WikiSQL

A large annotated semantic parsing corpus for developing natural langu...

329   1759   1759  

knowledge-graphs

A collection of research on knowledge graphs

298   1759   1759  

lightning-bolts

Toolbox of models, callbacks, and datasets for AI/ML researchers.

318   1735   1735  

LLMCompiler

[ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling

123   1729   1729  

nltk_data

NLTK Data

1096   1716   1716  

language

Shared repository for open-sourced projects from the Google AI Languag...

351   1697   1697  

graph4nlp

Graph4nlp is the library for the easy use of Graph Neural Networks for...

204   1688   1688  

transformer-deploy

Efficient, scalable and enterprise-grade CPU/GPU inference server for...

153   1687   1687  

kor

LLM(😽)

94   1684   1684  

text-analytics-with-python

Learn how to process, classify, cluster, summarize, understand syntax,...

851   1679   1679  

sense2vec

🦆 Contextually-keyed word vectors

242   1657   1657  

magnitude

A fast, efficient universal vector embedding utility package.

119   1650   1650  

Chinese-XLNet

Pre-Trained Chinese XLNet (pre-trained XLNet models for Chinese)

281   1650   1650  

Style-Transfer-in-Text

Paper List for Style Transfer in Text

195   1627   1627  

DAT8

General Assembly's 2015 Data Science course in Washington, DC

1066   1615   1615  

How-to-use-Transformers

Quick-start tutorial for the Transformers library

194   1615   1615  

Transformers-Recipe

🧠 A study guide to learn about Transformers

157   1604   1604  

usaddress

🇺🇸 A Python library for parsing unstructured United States address s...

304   1591   1591  

pke

Python Keyphrase Extraction module

292   1581   1581  

Macaw-LLM

Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and...

128   1580   1580  

holiday-cn

📅🇨🇳 Chinese statutory public holiday data, scraped automatically every day from State Council announcements

173   1574   1574  

awesome-ai-ml-dl

Awesome Artificial Intelligence, Machine Learning and Deep Learning as...

362   1569   1569  

underthesea

Underthesea - Vietnamese NLP Toolkit

287   1565   1565  

TAADpapers

Must-read Papers on Textual Adversarial Attack and Defense

194   1561   1561  

DeepMoji

State-of-the-art deep learning model for analyzing sentiment, emotion,...

318   1552   1552  

entity-recognition-datasets

A collection of corpora for named entity recognition (NER) and entity...

248   1547   1547  

torchdistill

A coding-free framework built on PyTorch for reproducible deep learnin...

135   1539   1539  

Semi-supervised-learning

A Unified Semi-Supervised Learning Codebase (NeurIPS'22)

200   1510   1510  

chatarena

ChatArena (or Chat Arena) is a Multi-Agent Language Game Environments...

146   1496   1496  

WikiChat

WikiChat is an improved RAG. It stops the hallucination of large langu...

132   1493   1493  

tensorflow-nlp

NLP and Text Generation Experiments in TensorFlow 2.x / 1.x

425   1487   1487  

anago

Bidirectional LSTM-CRF and ELMo for Named-Entity Recognition, Part-of-...

365   1485   1485  

awesome-search

Awesome Search - this is all about the (e-commerce, but not only) sear...

129   1475   1475  

curator

Synthetic data curation for post-training and structured data extracti...

119   1474   1474  

OpenNMT-tf

Neural machine translation and sequence learning using TensorFlow

381   1472   1472  

BotSharp

The Open Source Chatbot Framework in .NET

336   1466   1466  

conv-emotion

This repo contains implementation of different architectures for emoti...

336   1457   1457  

NeuronBlocks

NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego

194   1455   1455  

lingua-py

The most accurate natural language detection library for Python, suita...

49   1455   1455