Topic

natural-language-processing

Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published an article that proposed a measure of machine intelligence, now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and other natural-language tasks.

Repositories (1431)

monkeylearn-ruby monkeylearn Ruby

Official Ruby client for the MonkeyLearn API. Build and consume machine learning models for language processing from your Ruby apps.

79
language-models ollie283 Python

Build unigram and bigram language models, implement Laplace smoothing and use the models to compute the perplexity of test corpora.

79
spacy-lookups-data explosion Python

📂 Additional lookup tables and data resources for spaCy

79
nlp-various-tutorials Huffon Jupyter Notebook

A repository of various tutorials related to natural language processing

79
Transformers-Domain-Adaptation georgian-io Jupyter Notebook

Adapt Transformer-based language models to new text domains

79
google-natural-language-php darrynten PHP

PHP Client for Google Natural Language with Extras

78
EmailParser mynameisvinn Python

Remove signature blocks from emails

78
chicksexer kensk8er Python

A Python package for gender classification.

78
nlp_bangla raqueeb Jupyter Notebook

Hands-on Natural Language Processing (NLP) - Introductory Concepts

78
canrevan affjljoo3581 Python

A library for collecting large volumes of Naver news articles.

78
healthsea explosion Python

Healthsea is a spaCy pipeline for analyzing user reviews of supplementary products for their effects on health.

78
OpusFilter Helsinki-NLP Python

OpusFilter - Parallel corpus processing toolkit

78
tickle yb66 Ruby

Natural language parser for recurring events

77
touchdown lil-lab Python

Cornell Touchdown natural language navigation and spatial reasoning dataset.

77
Multilingual-Latent-Dirichlet-Allocation-LDA ArtificiAI Python

A Multilingual Latent Dirichlet Allocation (LDA) Pipeline with Stop Words Removal, n-gram features, and Inverse Stemming, in Python.

77
MAX-Speech-to-Text-Converter IBM Python

Converts spoken words into text form.

77
WARP YerevaNN Python

Code for ACL'2021 paper WARP 🌀 Word-level Adversarial ReProgramming. Outperforming `GPT-3` on SuperGLUE Few-Shot text classification. https://aclanth...

77
HBMP Helsinki-NLP Python

Sentence Embeddings in NLI with Iterative Refinement Encoders

76
tetre aoldoni Python

TETRE: a Toolkit for Exploring Text for Relation Extraction

76
Awesome-NLP-Research Yale-LILY
76
big-phoney repp Python

Get phonetic spellings and syllable counts for any English word. Works with made-up and non-dictionary words.

76
Writing-Styles-Classification-Using-Stylometric-Analysis Hassaan-Elahi Python

✍️ An intelligent system that takes a document and classifies different writing styles within the document using stylometric techniques.

76
image-recognition-and-information-extraction-from-image-documents IBM Jupyter Notebook

Image Recognition and Information Extraction from Image Documents using Keras and Watson NLU

75
FromScratch bmtgoncalves Jupyter Notebook
75
Turkish-Lemmatizer akoksal Python

Lemmatization for Turkish Language

75
keras-crf-layer Hironsan Python

Implementation of CRF layer in Keras.

74
CIKM-AnalytiCup-2018 zake7749 Python

[ACM-CIKM] 2nd place solution at CIKM AnalytiCup 2018, a task for determining short text similarities.

74
stanford-nlp-tagger patrickschur PHP

PHP wrapper for the Stanford Natural Language Processing library. Supports POSTagger and CRFClassifier.

74
bert_experimental gaphex Jupyter Notebook

Code and supplementary materials for a series of Medium articles about the BERT model

74
pytorch_basic_nmt pcyin Python

A simple yet strong implementation of neural machine translation in PyTorch

74
TitleStylist jind11 Python

Source code for our "TitleStylist" paper at ACL 2020

74
unsupervised_NER ajitrajasekharan Python

Self-supervised NER prototype - updated version (69 entity types - 17 broad entity groups). Uses pretrained BERT models with no fine tuning. State-of-...

74
spacy-experimental explosion Python

🧪 Cutting-edge experimental spaCy components and features

74
yelp_comments_classification_nlp msahamed Jupyter Notebook

Yelp round-10 review comments classification using deep learning (LSTM and CNN) and natural language processing.

73
tupa danielhers Python

Transition-based UCCA Parser

73
bookworm harrisonpim Jupyter Notebook

📚 Social networks from novels

73
How-to-mine-newsfeed-data-and-extract-interactive-insights-in-Python ahmedbesbes HTML

A practical guide to topic mining and interactive visualizations

73
Free-Artificial-Intelligence-Resources aiwithqasim

Welcome to this Open Source Repository of FREE ARTIFICIAL INTELLIGENCE RESOURCES. Get benefit from the free resources mentioned & kindly five STA...

73
DaCy centre-for-humanities-computing Jupyter Notebook

DaCy: The State of the Art Danish NLP pipeline using spaCy

73
react-client speechly TypeScript

A React client library for the Speechly API

72
frog LanguageMachines C++

Frog is an integration of memory-based natural language processing (NLP) modules developed for Dutch. All NLP modules are based on Timbl, the Tilburg...

72
spanish-corpora josecannete Python

Unannotated Spanish 3 Billion Words Corpora

72
VideoSearchEngine AkshatSh Python

Semantically search through a database of videos (using generated summaries)

72
A_chronology_of_deep_learning Ravoxsg

Tracing back and exposing in chronological order the main ideas in the field of deep learning, to help everyone better understand the current intense...

72
huggingartists AlekseyKorshuk Jupyter Notebook

Lyrics generation with GPT2-based Transformer

72
wen-notes HughWen

My notes.

71
textrank bnosac R

Summarise text by finding relevant sentences and keywords using the Textrank algorithm

71
NLG-RL hassyGo Python

Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction

71
ucto LanguageMachines C++

Unicode tokeniser. Ucto tokenizes text files: it separates words from punctuation, and splits sentences. It offers several other basic preprocessing s...

70
hmrb babylonhealth Python

Python Rule Processing Engine 🏺

70