Most popular natural-language-processing repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies how computers process, understand, and generate human language. In 1950, Alan Turing published an article that proposed a measure of machine intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

Thai-Sentence-Vector-Benchmark

Benchmark for Thai sentence representation

7   125   125  

awesome-ai-research-intern-list

List of AI Internships

8   125   125  

nlpbuddy

A text analysis application for performing common NLP tasks through a...

28   125   125  

Papers

Some computer vision papers I have read, covering image-to-text generation, weakly supervised segmentation, and more

20   125   125  

Python_NLP_Tutorial

This repository provides everything to get started with Python for Tex...

65   125   125  

unified-summarization

Official code for the paper: A Unified Model for Extractive and Abstr...

30   124   124  

weak-supervision-for-NER

Framework to learn Named Entity Recognition models without labelled da...

30   124   124  

cs224n

Stanford CS224n: Natural Language Processing with Deep Learning, Winte...

42   124   124  

slp3-zh

Chinese translation of "Speech and Language Processing", third edition.

10   124   124  

MKG_Analogy

[ICLR 2023] Multimodal Analogical Reasoning over Knowledge Graphs

12   124   124  

ruTS

Library for extracting statistics from Russian-language texts.

21   123   123  

Retrieval-Augmented-Generation-Engine-with-LangChain-and-Streamlit

Powerful web application that combines Streamlit, LangChain, and Pinec...

63   123   123  

UniRE

Source code for "UniRE: A Unified Label Space for Entity Relation Extr...

22   122   122  

Job-Resume-Matching

The idea is to calculate the similarity between the resume and the job...

34   122   122  

chariot

Deliver the ready-to-train data to your NLP model.

9   122   122  

Zemberek-Python-Examples

Zemberek Turkish NLP examples written in Python using the JPype packag...

15   122   122  

toiro

A comparison tool of Japanese tokenizers

9   121   121  

text_analytics

Basic text analytics and natural language processing in Python

55   121   121  

Automatic-Indian-Sign-Language-Translator-ISL

I created an application which takes in live speech or audio recording...

56   121   121  

EN-FA-CS-Dictionary

An English-Persian Dictionary of Computer Science and...

19   120   120  

OKD-Reading-List

Papers for Open Knowledge Discovery

24   120   120  

DocumentGPT

DocumentGPT is a web application that allows you to chat over your res...

35   120   120  

dynamic-coattention-network-plus

Dynamic Coattention Network Plus (DCN+) TensorFlow implementation. Que...

36   120   120  

stminsights

A Shiny Application for Inspecting Structural Topic Models

16   120   120  

quantulum

Python library for information extraction of quantities from unstructu...

24   119   119  

Personal-Emotional-Stylized-Dialog

A Paper List for Personalized, Emotional, and Stylized Dialog

16   119   119  

eyes

Public Opinion Mining System of Taiwanese Forums

18   119   119  

parallel-decoding

Repository of the paper "Accelerating Transformer Inference for Transl...

8   119   119  

LangPro

Tableau-based Theorem Prover for Natural Logic and Language

13   118   118  

asent

Asent is a Python library for performing efficient and transparent sen...

16   118   118  

MachineSoM

[ACL 2024] Exploring Collaboration Mechanisms for LLM Agents: A Socia...

11   118   118  

d2l-mindspore

MindSpore implementation of "Dive into Deep Learning". Intended for MindSpore learners to use alongside Mu Li's course...

27   118   118  

commonsense-rc

Code for Yuanfudao at SemEval-2018 Task 11: Three-way Attention and Re...

35   118   118  

estnltk

Open source tools for Estonian natural language processing

23   118   118  

deep-nlp-seminars

Materials for deep NLP course

65   117   117  

COCO-LM

[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for...

12   117   117  

CoLAKE

COLING'2020: CoLAKE: Contextualized Language and Knowledge Embedding

17   117   117  

awesome-early-exiting

A curated list of Early Exiting papers, benchmarks, and misc.

11   117   117  

panml

PanML is a high level generative AI/ML development and analysis librar...

15   117   117  

WorfBench

[ICLR 2025] Benchmarking Agentic Workflow Generation

7   117   117  

datalinguist

Stanford CoreNLP in idiomatic Clojure.

5   116   116  

SpokenNLP

A wide variety of research projects developed by the SpokenNLP team of...

11   116   116  

nlp-fluency

Evaluates the fluency of natural language

20   116   116  

scratchpad

This framework works as a form of user/machine calibration, with a foc...

15   116   116  

DGAI

Learn Generative AI with PyTorch (Manning Publications, 2024)

52   116   116  

fashion-assistant

Our idea is to combine the power of computer vision model and LLMs. We...

7   116   116  

AI-roadmap

A beginner's roadmap to getting started in Machine Learning, by COPS I...

18   115   115  

cogcomp-nlpy

CogComp's light-weight Python NLP annotators

26   115   115  

pymlask

Emotion analyzer for Japanese text

26   115   115  

practical-2

Oxford Deep NLP 2017 course - Practical 2: Text Classification

93   114   114