Most popular NLP repositories and open source projects

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of intelligence now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and other natural-language tasks.

holmes-extractor

Information extraction from English and German texts based on predicat...

41   384   384  

beginner_nlp

A curated list of beginner resources in Natural Language Processing

83   383   383  

ArticutAPI

API of Articut Chinese word segmentation (with semantic part-of-speech tagging): "word segmentation" (斷詞), also called 分詞, is...

33   383   383  

nlp

[UNMAINTAINED] Extract values from strings and fill your structs with n...

35   381   381  

ChatIE

Official repository for the ChatIE paper and a tool for IE using ChatGPT....

31   381   381  

BERT-CH-NER

BERT-based Chinese named entity recognition

96   378   378  

NLP_bahasa_resources

A Curated List of Dataset and Usable Library Resources for NLP in Baha...

114   378   378  

AzureML-BERT

End-to-End recipes for pre-training and fine-tuning BERT using Azure M...

129   377   377  

Awesome-Distributed-Deep-Learning

A curated list of awesome Distributed Deep Learning resources.

82   376   376  

autocorrect

Spelling corrector in Python

66   373   373  

Python-AI

100 deep learning examples, deep learning (DL), image classification, object recognition, object detection, natural language processing n...

98   373   373  

pytorch_chinese_lm_pretrain

Chinese language model pre-training with PyTorch

78   372   372  

botonic

Build chatbots and conversational experiences using React

58   372   372  

nlp_highlights

The most important NLP highlights of 2018 (PDF Report)

75   370   370  

interpret-text

A library that incorporates state-of-the-art explainers for text-based...

66   369   369  

opencc4j

🇨🇳 Open Chinese Convert is an open-source project for conversion between...

56   369   369  

Octopii

An AI-powered Personally Identifiable Information (PII) scanner.

22   369   369  

commit-autosuggestions

A tool that uses AI to automatically recommend commit messages.

15   368   368  

gcn-over-pruned-trees

Graph Convolution over Pruned Dependency Trees Improves Relation Extra...

71   366   366  

link-grammar

The CMU Link Grammar natural language parser

117   366   366  

delft

A Deep Learning Framework for Text

63   365   365  

studies

Notes on development, NLP, deep learning, algorithms, and LeetCode

71   365   365  

splade

SPLADE: sparse neural search (SIGIR21, SIGIR22)

57   365   365  

Data-Science-Hacks

Data Science Hacks consists of tips and tricks to help you become a bette...

301   364   364  

gomarkov

Markov chains in golang

34   363   363  

nlp_fundamentals

📘 Contains a series of hands-on notebooks for learning the fundamental...

42   361   361  

ai-study

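A comprehensive collection of AI learning materials, covering machine learning fundamentals (ML), deep learning fundamentals (DL), computer vis...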

54   359   359  

bert4pytorch

Ultra-lightweight PyTorch version of BERT, with extensive Chinese comments, an easily modifiable structure, and ongoing updates

67   359   359  

nlp_thai_resources

More than 50 collections of Thai Natural Language Processing librarie...

73   358   358  

ESIM

Implementation of the ESIM model for natural language inference with P...

103   358   358  

rat-sql

A relation-aware semantic parsing model from English to SQL

112   357   357  

large_language_model_training_playbook

An open collection of implementation tips, tricks and resources for tr...

12   357   357  

Hierarchical-attention-networks-pytorch

Hierarchical Attention Networks for document classification

98   355   355  

pytorch_RVAE

Recurrent Variational Autoencoder that generates sequential data imple...

91   353   353  

MyDataSciencePortfolio

Applying Data Science and Machine Learning to Solve Real World Busines...

224   353   353  

FakeNewsCorpus

A dataset of millions of news articles scraped from a curated list of...

95   352   352  

chatgpt.js

🤖 A powerful client-side JavaScript library for ChatGPT

26   350   350  

transformer-tensorflow

TensorFlow implementation of 'Attention Is All You Need (2017. 6)'

113   349   349  

aravec

AraVec is a pre-trained distributed word representation (word embeddin...

78   348   348  

LLamaSharp

C#/.NET binding of llama.cpp, including LLaMa/GPT model inference and...

42   348   348  

tensorlayer-tricks

How to use TensorLayer

63   347   347  

jumanpp

Juman++ (a Morphological Analyzer Toolkit)

40   347   347  

Few-NERD

Code and data of ACL 2021 paper "Few-NERD: A Few-shot Named Entity Rec...

57   346   346  

libai

LiBai(李白): A Toolbox for Large-Scale Distributed Parallel Training

47   346   346  

a-PyTorch-Tutorial-to-Sequence-Labeling

Empower Sequence Labeling with Task-Aware Neural Language Model | a P...

80   345   345  

multi-task-NLP

multi_task_NLP is a utility toolkit enabling NLP developers to easily...

55   345   345  

displacy

💥 displaCy.js: An open-source NLP visualiser for the modern web

82   344   344  

tacred-relation

PyTorch implementation of the position-aware attention model for relat...

97   344   344  

memn2n

End-To-End Memory Network using TensorFlow

140   343   343