Most popular NLP repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of machine intelligence now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and other natural-language tasks.

spikex

SpikeX - SpaCy Pipes for Knowledge Extraction

28   390   390  

FakeNewsCorpus

A dataset of millions of news articles scraped from a curated list of...

97   389   389  

medspacy

Library for clinical NLP with spaCy.

69   387   387  

lawyer-llama

Chinese legal LLaMA (LLaMA for the Chinese legal domain)

36   387   387  

paragraph-vectors

A PyTorch implementation of Paragraph Vectors (doc2ve...

75   386   386  

pen.el

Pen.el stands for Prompt Engineering in emacs. It facilitates the crea...

13   385   385  

pysentimiento

A Python multilingual toolkit for Sentiment Analysis and Social NLP ta...

55   385   385  

OpenICL

OpenICL is an open-source framework to facilitate research, developmen...

18   385   385  

holmes-extractor

Information extraction from English and German texts based on predicat...

41   384   384  

beginner_nlp

A curated list of beginner resources in Natural Language Processing

83   383   383  

customizable-gpt-chatbot

A dynamic, scalable AI chatbot built with Django REST framework, suppo...

83   383   383  

ChatIE

Official repository for the ChatIE paper and a tool for IE using ChatGPT....

31   381   381  

nlp

[UNMAINTAINED] Extract values from strings and fill your structs with n...

35   381   381  

BERT-CH-NER

BERT-based Chinese named entity recognition

96   378   378  

NLP_bahasa_resources

A Curated List of Dataset and Usable Library Resources for NLP in Baha...

114   378   378  

AzureML-BERT

End-to-End recipes for pre-training and fine-tuning BERT using Azure M...

129   377   377  

Awesome-Distributed-Deep-Learning

A curated list of awesome Distributed Deep Learning resources.

82   376   376  

airy

💬 Open Source App Framework to build streaming apps with real-time d...

44   376   376  

autocorrect

Spelling corrector in Python

66   373   373  

Python-AI

100 deep learning examples, deep learning (DL), image classification, object recognition, object detection, natural language processing n...

98   373   373  

pytorch_chinese_lm_pretrain

Chinese language model pre-training in PyTorch

78   372   372  

botonic

Build chatbots and conversational experiences using React

58   372   372  

nlp_highlights

The most important NLP highlights of 2018 (PDF Report)

70   371   371  

interpret-text

A library that incorporates state-of-the-art explainers for text-based...

66   369   369  

opencc4j

🇨🇳 Open Chinese Convert is an open-source project for conversion between...

56   369   369  

commit-autosuggestions

A tool that uses AI to automatically recommend commit messages.

15   368   368  

gcn-over-pruned-trees

Graph Convolution over Pruned Dependency Trees Improves Relation Extra...

71   366   366  

link-grammar

The CMU Link Grammar natural language parser

117   366   366  

delft

A Deep Learning Framework for Text

63   365   365  

studies

Notes on development, NLP, deep learning, algorithms, and LeetCode

71   365   365  

splade

SPLADE: sparse neural search (SIGIR21, SIGIR22)

57   365   365  

Data-Science-Hacks

Data Science Hacks consists of tips and tricks to help you become a bette...

301   364   364  

gomarkov

Markov chains in golang

34   363   363  

nlp_fundamentals

📘 Contains a series of hands-on notebooks for learning the fundamenta...

42   361   361  

bert4pytorch

An ultra-lightweight PyTorch version of BERT, with extensive Chinese comments, an easy-to-modify structure, and ongoing updates

67   359   359  

nlp_thai_resources

More than 50 collections of Thai Natural Language Processing librarie...

73   358   358  

ESIM

Implementation of the ESIM model for natural language inference with P...

103   358   358  

rat-sql

A relation-aware semantic parsing model from English to SQL

112   357   357  

large_language_model_training_playbook

An open collection of implementation tips, tricks and resources for tr...

12   357   357  

Hierarchical-attention-networks-pytorch

Hierarchical Attention Networks for document classification

98   355   355  

pytorch_RVAE

Recurrent Variational Autoencoder that generates sequential data imple...

91   353   353  

MyDataSciencePortfolio

Applying Data Science and Machine Learning to Solve Real World Busines...

224   353   353  

transformer-tensorflow

TensorFlow implementation of 'Attention Is All You Need (2017. 6)'

113   349   349  

aravec

AraVec is a pre-trained distributed word representation (word embeddin...

78   348   348  

LLamaSharp

C#/.NET binding of llama.cpp, including LLaMa/GPT model inference and...

42   348   348  

jumanpp

Juman++ (a Morphological Analyzer Toolkit)

40   347   347  

tensorlayer-tricks

How to use TensorLayer

62   346   346  

Few-NERD

Code and data of ACL 2021 paper "Few-NERD: A Few-shot Named Entity Rec...

57   346   346  

libai

LiBai (李白): A Toolbox for Large-Scale Distributed Parallel Training

47   346   346