Most popular NLP repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies how computers process and analyze human language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a test of machine intelligence now called the Turing test. More recent techniques, such as deep learning, have produced state-of-the-art results in language modeling, parsing, and many other natural-language tasks.

Web-Database-Analytics

Web scraping and related analytics using Python tools

163   231   231  

onnxt5

Summarization, translation, sentiment-analysis, text-generation and mo...

30   231   231  

zshot

Zero- and few-shot named entity and relationship recognition

14   231   231  

headliner

🏖 Easy training and deployment of seq2seq models.

41   230   230  

machine-learning

A machine learning journey starting from scratch

88   230   230  

nlp_learning

Learn natural language processing (NLP) with Python: language models, HMM, PCFG, Word2vec, ...

91   230   230  

Persian-Swear-Words

Persian Swear Dataset - you can use it in production to filter unwan...

32   229   229  

question_generator

An NLP system for generating reading comprehension questions

66   229   229  

vert-papers

This repository contains code and datasets related to entity/knowledge...

87   229   229  

pyfasttext

Yet another Python binding for fastText

31   228   228  

nlp_profiler

A simple NLP library that allows profiling datasets with one or more text c...

35   228   228  

SOHU_competition

1st-place solution to Sohu's 2018 content recognition competition (Sohu content recognition...

75   227   227  

DL-for-Chatbot

Deep Learning / NLP tutorial for Chatbot Developers

64   227   227  

spaczz

Fuzzy matching and more functionality for spaCy.

26   227   227  

4675-scifi

Chinese NLP corpus of Chinese science fiction, Chinese science fiction...

37   226   226  

cs224n-2017-winter

All lecture notes, slides and assignments from CS224n: Natural Languag...

118   225   225  

fastPunct

Punctuation restoration and spell correction experiments.

34   225   225  

FedNLP

FedNLP: An Industry and Research Integrated Platform for Federated Lea...

42   225   225  

shared_colab_notebooks

A Repo to store the Google Colaboratory Notebooks that I have created...

59   225   225  

TextDescriptives

A Python library for calculating a large variety of metrics from text

19   225   225  

concise-concepts

This repository contains an easy and intuitive approach to few-shot NE...

19   225   225  

GermanWordEmbeddings

Toolkit to obtain and preprocess German corpora, train models using wo...

51   224   224  

TextCluster

Short-text clustering preprocessing module

57   224   224  

vec4ir

Word Embeddings for Information Retrieval

41   223   223  

LemmInflect

A Python module for English lemmatization and inflection.

23   223   223  

dialogbot

dialogbot provides search-based dialogue, task-based dialogue and gene...

50   223   223  

razdel

Rule-based token and sentence segmentation for the Russian language

28   222   222  

ml-projects

ML based projects such as Spam Classification, Time Series Analysis, T...

106   222   222  

nlplot

Visualization Module for Natural Language Processing

13   222   222  

text-dedup

All-in-one text de-duplication

28   221   221  

CNN-text-classification-keras

Text Classification by Convolutional Neural Network in Keras

94   220   220  

lm-spanish

Official source for Spanish Language Models and resources made @ BSC-T...

18   220   220  

segmentit

A Chinese word segmentation package usable in any JS environment, forked from leizongmin/node-segment

13   219   219  

KB-ALBERT

A Korean ALBERT model specialized for the economics/finance domain, provided by KB Kookmin Bank

44   219   219  

label-sleuth

Open source no-code system for text annotation and building of text cl...

38   218   218  

JFastText

Java interface for fastText

99   217   217  

coursera-natural-language-processing-specialization

Programming assignments from all courses in the Coursera Natural Langu...

273   217   217  

embeddings

Fast, DB Backed pretrained word embeddings for natural language proces...

28   216   216  

claf

CLaF: Open-Source Clova Language Framework

37   216   216  

ocrpy

OCR, Archive, Index and Search: Implementation agnostic OCR framework.

8   216   216  

Speech_Signal_Processing_and_Classification

Front-end speech processing aims at extracting proper features from sh...

61   215   215  

AGI-Papers

Papers and books to look at when starting AGI 📚

28   215   215  

KeyphraseVectorizers

Set of vectorizers that extract keyphrases with part-of-speech pattern...

32   215   215  

Cornucopia-LLaMA-Fin-Chinese

Cornucopia (聚宝盆): LLaMA models fine-tuned on Chinese financial knowledge; covers SFT, RLHF, GP...

22   215   215  

OpenGPT

A framework for creating grounded instruction based datasets and train...

25   215   215  

bert-chainer

Chainer implementation of "BERT: Pre-training of Deep Bidirectional Tr...

41   214   214  

open-sesame

A frame-semantic parsing system based on a softmax-margin SegRNN.

65   214   214  

radish

C++ model training and inference framework

36   214   214  

PIE

Fast + Non-Autoregressive Grammatical Error Correction using BERT. Cod...

40   214   214  

graph-convolution-nlp

Graph Convolution Network for NLP

36   213   213