Most popular NLP repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence," which proposed a measure of intelligence now called the Turing test. More modern techniques, such as deep learning, have produced state-of-the-art results in language modeling, parsing, and other natural-language tasks.

hmni

📛 Fuzzy Name Matching with Machine Learning

50   264   264  

nlp-labelling

Labelling platform for text using weak supervision.

18   264   264  

VSUA-Captioning

Code for "Aligning Linguistic Words and Visual Semantic Units for Imag...

24   264   264  

DeepResearch

This repository is the collection of research papers in Deep learning...

108   263   263  

tensorflow-ml-nlp-tf2

Natural language processing starting with TensorFlow 2 and machine learning (from logistic regression to BERT and...

134   262   262  

markup

A web-based document annotation tool, powered by GPT-4 🚀

31   262   262  

nlp-tutorial

Natural language processing (NLP) tutorial, including: word vectors, lexical analysis, pre-trained language models, text...

50   262   262  

LagouJob

Job data mining repo for lagou.com

129   261   261  

character-based-cnn

Implementation of character based convolutional neural network

54   261   261  

summarizer

A Reddit bot that summarizes news articles written in Spanish or Engli...

31   260   260  

toxic

Toxic Comment Classification Challenge

75   259   259  

chatbot

Russian-language generative chatbot with a profile and facts

64   259   259  

mmt

Multi-Modal Transformer for Video Retrieval

40   259   259  

text-segmentation

Implementation of the paper: Text Segmentation as a Supervised Learnin...

57   258   258  

weibo_terminator_workflow

Updated version of weibo_terminator; this workflow version aims at Ge...

78   258   258  

negspacy

spaCy pipeline object for negating concepts in text

33   258   258  

genie-server

The home server version of Almond

40   258   258  

engtagger

English Part-of-Speech Tagger Library; a Ruby port of Lingua::EN::Tagg...

49   256   256  

spaczz

Fuzzy matching and more functionality for spaCy.

28   256   256  

Semantic-Retrieval-Models

A curated list of awesome papers for Semantic Retrieval (TOIS Accepted...

25   255   255  

kairon

Agentic AI platform that harnesses Visual LLM Chaining to build proact...

82   255   255  

vibrato

🎤 vibrato: Viterbi-based accelerated tokenizer

12   255   255  

scoper

Fuzzy and semantic search for captioned YouTube videos.

15   255   255  

Indic-BERT-v1

Indic-BERT-v1: BERT-based Multilingual Model for 11 Indic Languages an...

40   255   255  

rnn.wgan

Code for training and evaluation of the model from "Language Generatio...

76   254   254  

practical-1

Oxford Deep NLP 2017 course - Practical 1: word2vec

142   254   254  

awesome-hungarian-nlp

A curated list of NLP resources for Hungarian

19   254   254  

konoha

🌿 An easy-to-use Japanese Text Processing tool, which makes it possib...

28   253   253  

LasUIE

Universal Information Extraction, codes for the NeurIPS-2022 paper: Un...

3   253   253  

spacyr

R wrapper to spaCy NLP

38   253   253  

papers_we_read

Summaries for exciting works in the field of Deep Learning.

32   252   252  

KOMORAN

Korean Morphological Analyzer by shineware

59   252   252  

corus

Links to Russian corpora + Python functions for loading and parsing

19   252   252  

neat-vision

Neat (Neural Attention) Vision is a visualization tool for the attent...

24   251   251  

BERT-AttributeExtraction

Using BERT for attribute extraction in a knowledge graph. Fine-tuning and...

65   250   250  

awesome-nlp-polish

A curated list of resources dedicated to Natural Language Processing (...

33   250   250  

neologdn

Japanese text normalizer for mecab-neologd

18   249   249  

text-cnn-tensorflow

Convolutional Neural Networks for Sentence Classification (TextCNN) imp...

69   248   248  

masakhane-mt

Machine Translation for Africa

214   248   248  

gpl

Powerful unsupervised domain adaptation method for dense retrieval. Re...

31   248   248  

LiLT

Official PyTorch implementation of LiLT: A Simple yet Effective Langua...

30   248   248  

parsbert

🤗 ParsBERT: Transformer-based Model for Persian Language Understandin...

35   247   247  

ontogpt

GPT-based ontological extraction tools, including SPIRES

32   247   247  

torchnlp

Easy to use NLP library built on PyTorch and TorchText

45   247   247  

multi_rake

Multilingual Rapid Automatic Keyword Extraction (RAKE) for Python

36   247   247  

lemmatization-lists

Machine-readable lists of lemma-token pairs in 23 languages.

93   247   247  

Speech_Signal_Processing_and_Classification

Front-end speech processing aims at extracting proper features from sh...

66   247   247  

mead-baseline

Deep-Learning Model Exploration and Development for NLP

74   245   245  

nlp_made_easy

Explains NLP building blocks in a simple manner.

33   244   244  

concise-concepts

This repository contains an easy and intuitive approach to few-shot NE...

14   244   244