Most popular NLP repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a criterion of intelligence now called the Turing test. More recent techniques, such as deep learning, have produced state-of-the-art results in language modeling, parsing, and many other natural-language tasks.

word2vec_pipeline

NLP pipeline using word2vec (preprocessing/embedding/prediction/cluste...

14 107 107

FocusSeq2Seq

[EMNLP 2019] Mixture Content Selection for Diverse Sequence Generation...

19 107 107

kaggle-quora-question-pairs

My solution to Kaggle Quora Question Pairs competition (Top 2%, Privat...

39 106 106

Intra-Bag-and-Inter-Bag-Attentions

Code for NAACL 2019 paper: Distant Supervision Relation Extraction wit...

30 106 106

toeicbert

TOEIC (Test of English for International Communication) solving using p...

24 106 106

bert-stable-fine-tuning

On the Stability of Fine-tuning BERT: Misconceptions, Explanations, an...

15 106 106

Self-Attention-Keras

Self-attention and text classification

34 105 105

MSR-NLP-Projects

This is a list of open-source projects at Microsoft Research NLP Group

8 105 105

onnx_transformers

Accelerated NLP pipelines for fast inference on CPU. Built with Transf...

25 105 105

PyTorch_GBW_LM

PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset

19 104 104

twitter-sentiment-analysis

Streaming tweets with spark, language detection & sentiment analysis,...

72 104 104

perceiver-io

Unofficial implementation of Perceiver IO

6 104 104

ColXLM

Multilingual Retrieval on Yelp Search Engine ⚡

17 104 104

EN-FR-MLT-tensorflow

English-French Machine Language Translation in Tensorflow

47 103 103

lda-topic-modeling

A PureScript, browser-based implementation of LDA topic modeling.

17 103 103

anuvada

Interpretable Models for NLP using PyTorch

13 102 102

nalaf

NLP framework in Python for entity recognition and relationship extrac...

24 102 102

pytorch_notebooks

A collection of PyTorch notebooks for learning and practicing deep lea...

107 102 102

tldr

Text summarizer for golang using LexRank

20 102 102

flat

FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic ann...

15 102 102

HPSG-Neural-Parser

Source code for "Head-Driven Phrase Structure Grammar Parsing on Penn...

25 102 102

fastrtext

R wrapper for fastText

15 101 101

self_dialogue_corpus

The Self-dialogue Corpus - a collection of self-dialogues across music...

24 101 101

botfuel-dialog

Botfuel SDK to build highly conversational chatbots

18 101 101

wiki-split

One million English sentences, each split into two sentences that toge...

4 101 101

ApexNLP

A natural language event parser for Java and Android.

12 101 101

tweet-stance-prediction

Applying NLP transfer learning techniques to predict Tweet stance towa...

59 101 101

ruimtehol

R package to Embed All the Things! using StarSpace

13 101 101

TRE

[AKBC 19] Improving Relation Extraction by Pre-trained Language Repres...

13 100 100

lemmatizer

Lemmatizer for text in English. Inspired by Python's nltk.corpus.read...

15 100 100

txtai.js

Build AI-powered semantic search applications in JavaScript

3 100 100

texera

Big Data Analytics Using Interactive Workflows

46 99 99

PyTorch

PyTorch tutorials A to Z

47 99 99

transorthogonal-linguistics

Uses a distributed word representation to find words along the hyperc...

10 98 98

ulm-basenet

Implementation of ULMFit algorithm for text classification via transfe...

19 97 97

dialog-nlu

Tensorflow and Keras implementation of the state of the art researches...

40 96 96

Relation_Extraction

Relation Extraction using Deep Learning (CNN)

50 95 95

excelcy

Excel Integration with spaCy. Training NER using Excel/XLSX from PDF,...

9 94 94

nlp-models

NLP research experiments, built on PyTorch within the AllenNLP framewo...

9 94 94

TextMatching

Methods about Deep Learning for Text Matching

37 94 94

monkeylearn

⛔ ARCHIVED ⛔ Accesses the Monkeylearn API for Text C...

17 94 94

fingerprints

Make it easier to compare and cross-reference the names of companies a...

16 94 94

NLP

This is where I put all my work in Natural Language Processing

48 94 94

CrossLingualContextualEmb

Cross-Lingual Alignment of Contextual Word Embeddings

9 92 92

jargon

Tokenizers and lemmatizers for Go

2 92 92

lockebot

LockeBot: a demonstration of implementing a basic question answering b...

29 91 91

KeywordExtraction

Implementation of keyword extraction algorithms, including TextRank,...

35 90 90

self-attention-classification

document classification using LSTM + self attention

28 90 90

word2vec

word2vec++ is a Distributed Representations of Words (word2vec) librar...

19 90 90

dataflow-opinion-analysis

Opinion Analysis of News, Threaded Conversations, and User Generated C...

27 89 89