Most popular natural-language-processing repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies how computers interact with human language. In 1950, Alan Turing published an article that proposed a measure of intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and other natural-language tasks.

DaCy

DaCy: The state-of-the-art Danish NLP pipeline using spaCy

20   73   73  

react-client

A React client library for the Speechly API

1   72   72  

frog

Frog is an integration of memory-based natural language processing (NL...

10   72   72  

spanish-corpora

Unannotated Spanish 3 Billion Words Corpora

9   72   72  

keyATM

An R package for Keyword Assisted Topic Models

10   72   72  

huggingartists

Lyrics generation with GPT2-based Transformer

9   72   72  

Basic-Machine-Learning

A repo of the basic machine learning I am learning. More to come...

18   72   72  

wen-notes

My notes.

29   71   71  

textrank

Summarise text by finding relevant sentences and keywords using the Te...

11   71   71  

unihandecode

unihandecode is a transliteration library to convert all characters/wo...

9   71   71  

NLG-RL

Accelerated Reinforcement Learning for Sentence Generation by Vocabula...

8   71   71  

eznlp

Easy Natural Language Processing

15   71   71  

nlp-various-tutorials

A repository of various tutorials related to natural language processing

10   70   70  

hmrb

Python Rule Processing Engine 🏺

5   70   70  

rnn-from-scratch

A Recurrent Neural Network implemented from scratch (using only numpy)...

55   70   70  

jrte-corpus

Japanese Realistic Textual Entailment Corpus (NLP 2020, LREC 2020)

5   70   70  

indolem

IndoLEM is a comprehensive Indonesian NLU benchmark, comprising three...

26   70   70  

Chinese_Coreference_Resolution

Chinese coreference resolution based on SpanBERT, implemented in PyTorch

15   70   70  

Naive-Resume-Matching

Text Similarity Applied to resume, to compare Resumes with Job Descrip...

45   70   70  

kor2vec

Library for Korean morpheme and word vector representation

15   69   69  

node-synonyms

🎡 Chinese synonyms toolkit, chatbot

13   69   69  

nlp-akash

Natural Language Processing notes and implementations.

46   69   69  

summarize-radiology-findings

Code and pretrained model for paper "Learning to Summarize Radiology F...

23   69   69  

sister

SImple SenTence EmbeddeR

18   69   69  

doccano-client

A simple client for doccano API.

51   69   69  

htmldate

Fast and robust date extraction from web pages, with Python or on the...

20   69   69  

awesome-NLP-resources

A collection of NLP projects and tools.

21   69   69  

get_started_with_deep_learning_for_text_with_allennlp

Getting started with AllenNLP and PyTorch by training a tweet classifi...

17   68   68  

capsnet-nlp

CapsNet for NLP

10   68   68  

convai-bot-1337

NIPS Conversational Intelligence Challenge 2017 Winner System: Skill-b...

17   68   68  

virtual-assistant

Virtual Assistant

27   68   68  

lingua-franca

Mycroft's multilingual text parsing and formatting library

73   68   68  

MAX-Speech-to-Text-Converter

Converts spoken words into text form.

27   68   68  

Shakespearizing-Modern-English

Code for "Jhamtani H.*, Gangal V.*, Hovy E. and Nyberg E. Shakespeariz...

27   68   68  

machine-learning-notebooks

🤖 An authorial set of fundamental python recipes on Machine Learning a...

14   68   68  

FinBERT-QA

Financial Domain Question Answering with pre-trained BERT Language Mod...

26   68   68  

timestring

Parse a human readable time string into a time based value

15   68   68  

semeval22_structured_sentiment

SemEval-2022 Shared Task 10: Structured Sentiment Analysis

40   68   68  

contrastive-active-learning

Code for the EMNLP 2021 Paper "Active Learning by Acquiring Contrastiv...

9   68   68  

News-classification

News classification system and rumor handling system

38   67   67  

jina-financial-qa-search

12   67   67  

biome-text

Custom Natural Language Processing with big and small models 🌲🌱

7   67   67  

klpt

The Kurdish Language Processing Toolkit

12   67   67  

MLR

Machine Learning Research

13   67   67  

vietnamese-electra

Electra pre-trained model using Vietnamese corpus

11   66   66  

indosum

A benchmark dataset for Indonesian text summarization.

13   66   66  

long-doc-summarization

Long Document Summarization Papers

3   66   66  

qutrub

Qutrub: Arabic verb conjugator

21   66   66  

node-corenlp

CoreNLP @ NodeJS

12   65   65