Most popular natural-language-processing repositories and open source projects

Natural language processing (NLP) is a field of computer science and linguistics concerned with the interaction between computers and human language. In 1950, Alan Turing published "Computing Machinery and Intelligence", which proposed a criterion of intelligence now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

Scan2Cap

[CVPR 2021] Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

18   81   81  

responsibly

Toolkit for Auditing and Mitigating Bias and Fairness of Machine Learn...

16   80   80  

cdQA-annotator

⛔ [NOT MAINTAINED] A web-based annotator for closed-domain question an...

42   80   80  

TwitterScraper

Scrape a User's Twitter data! Bypass the 3,200 tweet API limit for a U...

13   80   80  

classy

classy is a simple-to-use library for building high-performance Machin...

3   80   80  

KoEDA

Korean Easy Data Augmentation

4   80   80  

SimKGC

ACL 2022, SimKGC: Simple Contrastive Knowledge Graph Completion with P...

16   80   80  

monkeylearn-ruby

Official Ruby client for the MonkeyLearn API. Build and consume machin...

13   79   79  

language-models

Build unigram and bigram language models, implement Laplace smoothing...

42   79   79  

spacy-lookups-data

📂 Additional lookup tables and data resources for spaCy

45   79   79  

Transformers-Domain-Adaptation

Adapt Transformer-based language models to new text domains

12   79   79  

CIKM-AnalytiCup-2018

[ACM-CIKM] 2nd place solution at CIKM AnalytiCup 2018, a task for dete...

15   78   78  

EmailParser

Remove signature blocks from emails

17   78   78  

chicksexer

A Python package for gender classification.

26   78   78  

slate

A Super-Lightweight Annotation Tool for Experts: Label text in a termi...

9   78   78  

nlp_bangla

Hands-on Natural Language Processing (NLP) - Introductory Concepts

59   78   78  

canrevan

A library for collecting large volumes of Naver news articles.

14   78   78  

healthsea

Healthsea is a spaCy pipeline for analyzing user reviews of supplement...

12   78   78  

OpusFilter

OpusFilter - Parallel corpus processing toolkit

15   78   78  

google-natural-language-php

PHP Client for Google Natural Language with Extras

12   77   77  

tickle

Natural language parser for recurring events

10   77   77  

lima

The Libre Multilingual Analyzer, a Natural Language Processing (NLP) C...

19   77   77  

touchdown

Cornell Touchdown natural language navigation and spatial reasoning da...

11   77   77  

Multilingual-Latent-Dirichlet-Allocation-LDA

A Multilingual Latent Dirichlet Allocation (LDA) Pipeline with Stop Wo...

26   77   77  

WARP

Code for ACL'2021 paper WARP 🌀 Word-level Adversarial ReProgramming. O...

14   77   77  

HBMP

Sentence Embeddings in NLI with Iterative Refinement Encoders

16   76   76  

tetre

TETRE: a Toolkit for Exploring Text for Relation Extraction

7   76   76  

stanford-nlp-tagger

PHP wrapper for the Stanford Natural Language Processing library. Supp...

12   76   76  

Awesome-NLP-Research

14   76   76  

CoARiJ

Corpus of Annual Reports in Japan

7   76   76  

big-phoney

Get phonetic spellings and syllable counts for any English word. Works...

13   76   76  

Writing-Styles-Classification-Using-Stylometric-Analysis

✍️ An intelligent system that takes a document and classifies differen...

30   76   76  

nervaluate

Full named-entity (i.e., not tag/token) evaluation metrics based on Se...

13   76   76  

keras-crf-layer

Implementation of CRF layer in Keras.

32   75   75  

vertikin

👓 Platform to automatically detect what user might be inter...

18   75   75  

image-recognition-and-information-extraction-from-image-documents

Image Recognition and Information Extraction from Image Documents usin...

31   75   75  

FromScratch

70   75   75  

Turkish-Lemmatizer

Lemmatization for Turkish Language

12   75   75  

pujangga

Pujangga - Indonesian Natural Language Processing Tool with REST API,...

32   75   75  

bert_experimental

code and supplementary materials for a series of Medium articles about...

29   74   74  

Twitter_Geolocation

Geolocating Twitter users by the content of their tweets

42   74   74  

pytorch_basic_nmt

A simple yet strong implementation of neural machine translation in py...

22   74   74  

TitleStylist

Source code for our "TitleStylist" paper at ACL 2020

5   74   74  

unsupervised_NER

Self-supervised NER prototype - updated version (69 entity types - 17...

21   74   74  

slp3-zh

Chinese translation of "Speech and Language Processing", third edition.

6   74   74  

spacy-experimental

🧪 Cutting-edge experimental spaCy components and features

17   74   74  

yelp_comments_classification_nlp

Yelp round-10 review comments classification using deep learning (LSTM...

55   73   73  

tupa

Transition-based UCCA Parser

22   73   73  

bookworm

📚 Social networks from novels

17   73   73  

How-to-mine-newsfeed-data-and-extract-interactive-insights-in-Python

A practical guide to topic mining and interactive visualizations

51   73   73