Topic

nlp

Natural language processing (NLP) is a field of computer science that studies how computers understand, process, and generate human language. In 1950, Alan Turing published an article proposing a measure of machine intelligence, now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

Repositories (1261)

word2vec_pipeline
word2vec_pipeline NIHOPA Python

NLP pipeline using word2vec (preprocessing/embedding/prediction/clustering)

107
FocusSeq2Seq
FocusSeq2Seq clovaai Python

[EMNLP 2019] Mixture Content Selection for Diverse Sequence Generation (Question Generation / Abstractive Summarization)

107
kaggle-quora-question-pairs
kaggle-quora-question-pairs YuriyGuts Jupyter Notebook

My solution to Kaggle Quora Question Pairs competition (Top 2%, Private LB log loss 0.13497).

106
Intra-Bag-and-Inter-Bag-Attentions
Intra-Bag-and-Inter-Bag-Attentions ZhixiuYe Python

Code for NAACL 2019 paper: Distant Supervision Relation Extraction with Intra-Bag and Inter-Bag Attentions

106
toeicbert
toeicbert graykode Python

TOEIC (Test of English for International Communication) solving using the pytorch-pretrained-BERT model.

106
bert-stable-fine-tuning
bert-stable-fine-tuning uds-lsv Python

On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines

106
Self-Attention-Keras
Self-Attention-Keras foamliu Python

Self-attention and text classification

105
MSR-NLP-Projects
MSR-NLP-Projects microsoft

This is a list of open-source projects at Microsoft Research NLP Group

105
onnx_transformers
onnx_transformers patil-suraj Jupyter Notebook

Accelerated NLP pipelines for fast inference on CPU. Built with Transformers and ONNX runtime.

105
PyTorch_GBW_LM
PyTorch_GBW_LM rdspring1 Python

PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset

104
twitter-sentiment-analysis
twitter-sentiment-analysis vspiewak Scala

Streaming tweets with Spark, language detection and sentiment analysis, dashboard with Kibana

104
perceiver-io
perceiver-io esceptico Python

Unofficial implementation of Perceiver IO

104
ColXLM
ColXLM hannawong Python

Multilingual Retrieval on Yelp Search Engine ⚡

104
EN-FR-MLT-tensorflow
EN-FR-MLT-tensorflow deep-diver HTML

English-French Machine Language Translation in TensorFlow

103
lda-topic-modeling
lda-topic-modeling lettier PureScript

A PureScript, browser-based implementation of LDA topic modeling.

103
anuvada
anuvada EdGENetworks Python

Interpretable Models for NLP using PyTorch

102
nalaf
nalaf Rostlab Python

NLP framework in Python for entity recognition and relationship extraction

102
pytorch_notebooks
pytorch_notebooks omarsar Jupyter Notebook

A collection of PyTorch notebooks for learning and practicing deep learning

102
tldr
tldr JesusIslam Go

Text summarizer for golang using LexRank

102
flat
flat proycon JavaScript

FoLiA Linguistic Annotation Tool -- Flat is a web-based linguistic annotation environment based around the FoLiA format (http://proycon.github.io/foli...

102
HPSG-Neural-Parser
HPSG-Neural-Parser DoodleJZ Python

Source code for "Head-Driven Phrase Structure Grammar Parsing on Penn Treebank" published at ACL 2019

102
fastrtext
fastrtext pommedeterresautee C++

R wrapper for fastText

101
self_dialogue_corpus
self_dialogue_corpus jfainberg Python

The Self-dialogue Corpus - a collection of self-dialogues across music, movies and sports

101
botfuel-dialog
botfuel-dialog Botfuel HTML

Botfuel SDK to build highly conversational chatbots

101
wiki-split
wiki-split google-research-datasets

One million English sentences, each split into two sentences that together preserve the original meaning, extracted from Wikipedia edits.

101
ApexNLP
ApexNLP 6thsolution Java

A natural language event parser for Java and Android.

101
tweet-stance-prediction
tweet-stance-prediction prrao87 Jupyter Notebook

Applying NLP transfer learning techniques to predict Tweet stance toward a topic

101
ruimtehol
ruimtehol bnosac C++

R package to Embed All the Things! using StarSpace

101
TRE
TRE DFKI-NLP Python

[AKBC 19] Improving Relation Extraction by Pre-trained Language Representations

100
lemmatizer
lemmatizer yohasebe Ruby

Lemmatizer for text in English. Inspired by Python's nltk.corpus.reader.wordnet.morphy

100
txtai.js
txtai.js neuml JavaScript

Build AI-powered semantic search applications in JavaScript

100
texera
texera Texera Java

Big Data Analytics Using Interactive Workflows

99
PyTorch
PyTorch gyunggyung Jupyter Notebook

PyTorch tutorials A to Z

99
transorthogonal-linguistics
transorthogonal-linguistics thoppe Python

Uses a distributed word representation to find words along the hyperchord of two input words.

98
ulm-basenet
ulm-basenet bkj Python

Implementation of ULMFit algorithm for text classification via transfer learning

97
dialog-nlu
dialog-nlu MahmoudWahdan Jupyter Notebook

TensorFlow and Keras implementations of state-of-the-art research in dialog system NLU

96
Relation_Extraction
Relation_Extraction wadhwasahil Python

Relation extraction using deep learning (CNN)

95
excelcy
excelcy kororo Python

Excel Integration with spaCy. Training NER using Excel/XLSX from PDF, DOCX, PPT, PNG or JPG.

94
nlp-models
nlp-models epwalsh Python

NLP research experiments, built on PyTorch within the AllenNLP framework.

94
TextMatching
TextMatching Accagain2014 Python

Deep learning methods for text matching

94
monkeylearn
monkeylearn ropensci-archive R

[ARCHIVED] Accesses the Monkeylearn API for Text Classifiers and Extractors

94
fingerprints
fingerprints alephdata Python

Make it easier to compare and cross-reference the names of companies and people by applying strong normalisation.

94
NLP
NLP ChunML Python

This is where I put all my work in Natural Language Processing

94
CrossLingualContextualEmb
CrossLingualContextualEmb TalSchuster Python

Cross-Lingual Alignment of Contextual Word Embeddings

92
jargon
jargon clipperhouse Go

Tokenizers and lemmatizers for Go

92
lockebot
lockebot nmstoker Python

LockeBot: a demonstration of a basic question answering bot built with Rasa and a database

91
KeywordExtraction
KeywordExtraction WuLC Java

Implementation of keyword extraction algorithms, including TextRank, TF-IDF, and a combination of both

90
self-attention-classification
self-attention-classification nn116003 Python

document classification using LSTM + self attention

90
word2vec
word2vec maxoodf C++

word2vec++ is a library and set of tools implementing Distributed Representations of Words (word2vec), written from scratch in C++11

90
spacy_hunspell
spacy_hunspell tokestermw Python

Hunspell extension for spaCy 2.0.

89