Topic

nlp

Natural language processing (NLP) is a field of computer science that studies how computers process, understand, and generate human language. In 1950, Alan Turing published an article proposing a measure of machine intelligence, now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

Repositories (1462)

caiss
caiss ChunelFeng C++

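A simple, easy-to-use, cross-platform, multi-language, high-performance retrieval engine for similar vectors, similar words, and similar sentences. Stars and forks welcome. Build together! Power another!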

547
camel_tools
camel_tools CAMeL-Lab Python

A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.

546
pinferencia
pinferencia underneathall Python

Python + Inference - Model Deployment library in Python. Simplest model inference server ever.

545
LMaaS-Papers
LMaaS-Papers txsun1997

Awesome papers on Language-Model-as-a-Service (LMaaS)

545
happy-transformer
happy-transformer EricFillion Python

Happy Transformer makes it easy to fine-tune and perform inference with NLP Transformer models.

545
Mengzi
Mengzi Langboat

Mengzi Pretrained Models

544
m3tl
m3tl JayYip Jupyter Notebook

BERT for Multitask Learning

544
japanese-pretrained-models
japanese-pretrained-models rinnakk Python

Code for producing Japanese pretrained models provided by rinna Co., Ltd.

543
codequestion
codequestion neuml Python

🔎 Semantic search for developers

543
chinese_dictionary
chinese_dictionary guotong1988

Synonym list, antonym list, and negation word list

542
ai-web-extensions
ai-web-extensions adamlui JavaScript

🤖 AI browser extensions & userscripts to augment your web experience

540
nlp-notebook
nlp-notebook jasoncao11 Python

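Implementations of common NLP tasks, including new word discovery and PyTorch-based word embeddings, Chinese text classification, entity recognition, summary text generation, sentence similarity, triple extraction, pretrained models, and more.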

536
Wordless
Wordless BLKSerene Python

An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation

536
HiRAG
HiRAG hhy-huang Python

[EMNLP'25 findings] This is the official repo for the paper, HiRAG: Retrieval-Augmented Generation with Hierarchical Knowledge.

535
MedCAT
MedCAT CogStack Python

Medical Concept Annotation Tool

531
php-text-analysis
php-text-analysis yooper PHP

PHP Text Analysis is a library for performing Information Retrieval (IR) and Natural Language Processing (NLP) tasks using the PHP language

531
Giveme5W1H
Giveme5W1H fhamborg HTML

Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?

530
text_summurization_abstractive_methods
text_summurization_abstractive_methods theamrzaki Jupyter Notebook

Multiple implementations of abstractive text summarization, using Google Colab

530
poplar
poplar synyi TypeScript

A web-based annotation tool for natural language processing (NLP)

529
similarity-search-kit
similarity-search-kit ZachNagengast Swift

🔎 SimilaritySearchKit is a Swift package providing on-device text embeddings and semantic search functionality for iOS and macOS applications.

524
awesome-tensorflow-2
awesome-tensorflow-2 Amin-Tgz

👉 TensorFlow 2.x resources such as tutorials, blogs, code, and videos

524
headlines
headlines udibr Jupyter Notebook

Automatically generate headlines for short articles

524
pytextclassifier
pytextclassifier shibing624 Python

pytextclassifier is a toolkit for text classification. It implements classification models including LR, XGBoost, TextCNN, FastText, TextRNN, and BERT, ready to use out of the box.

523
WebShop
WebShop princeton-nlp Python

[NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents

523
Deep-Semantic-Similarity-Model
Deep-Semantic-Similarity-Model airalcorn2 Python

My Keras implementation of the Deep Semantic Similarity Model (DSSM)/Convolutional Latent Semantic Model (CLSM) described here: http://research.micros...

521
examples
examples towhee-io Jupyter Notebook

Analyze the unstructured data with Towhee, such as reverse image search, reverse video search, audio classification, question and answer systems, mole...

520
German-NLP
German-NLP adbar

Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German

520
Goopt
Goopt jokenox JavaScript

🔍 Search Engine for a Procedural Simulation of the Web with GPT-3.

519
Teamlinker
Teamlinker Teamlinker TypeScript

Teamlinker is a team collaboration platform that integrates multi-functional modules. Users can process tasks in parallel, including six functional mo...

519
language_tool_python
language_tool_python jxmorris12 Python

A free, non-AI Python grammar checker 📝✅

518
fugashi
fugashi polm C++

A Cython MeCab wrapper for fast, pythonic Japanese tokenization and morphological analysis.

518
python-stanford-corenlp
python-stanford-corenlp stanfordnlp Python

Python interface to CoreNLP using a bidirectional server-client interface.

518
OmniNet
OmniNet subho406 Python

Official Pytorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agraw...

514
awesome-llms-fine-tuning
awesome-llms-fine-tuning Curated-Awesome-Lists

Explore a comprehensive collection of resources, tutorials, papers, tools, and best practices for fine-tuning Large Language Models (LLMs). Perfect fo...

513
mergoo
mergoo Leeroo-AI Python

A library for easily merging multiple LLM experts and efficiently training the merged LLM.

511
XPretrain
XPretrain microsoft Python

Multi-modality pre-training

511
edgar-crawler
edgar-crawler lefterisloukas Python

The only open-source toolkit that can download SEC EDGAR financial reports and extract textual data from specific item sections into nice & clean stru...

511
BertSimilarity
BertSimilarity Brokenwind Python

Computing the similarity of two sentences with Google's BERT algorithm. Semantic similarity and text similarity computation.

509
machine-learning-articles
machine-learning-articles Mybridge

Monthly Series - Top 10 Machine Learning Articles

507
prodigy-recipes
prodigy-recipes explosion Jupyter Notebook

🍳 Recipes for Prodigy, our fully scriptable annotation tool

507
agency
agency neurocult Go

🕵️‍♂️ Library designed for developers eager to explore the potential of Large Language Models (LLMs) and other generative AI through a clean, effectiv...

506
wego
wego ynqa Go

Word Embeddings in Go!

506
pytorch-bert-crf-ner
pytorch-bert-crf-ner eagle705 Jupyter Notebook

A BERT+CRF based named entity recognition model for Korean, built with KoBERT and CRF

505
oie-resources
oie-resources gkiril

A curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.

504
beto
beto dccuchile

BETO - Spanish version of the BERT model

503
fashion-clip
fashion-clip patrickjohncyh Python

FashionCLIP is a CLIP-like model fine-tuned for the fashion domain.

502
CPM-Live
CPM-Live OpenBMB Python

Live Training for Open-source Big Models

501
detecting-fake-text
detecting-fake-text HendrikStrobelt TypeScript

Giant Language Model Test Room

499
subreddit-analyzer
subreddit-analyzer PhantomInsights Python

A comprehensive Data and Text Mining workflow for submissions and comments from any given public subreddit.

499
relik
relik SapienzaNLP Python

Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget (ACL 2024)

499