Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published an article proposing a measure of machine intelligence, now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and other natural-language tasks.
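A simple, easy-to-use, cross-platform, multi-language, high-performance retrieval engine for similar vectors, similar words, and similar sentences. Stars and forks are welcome. Build together! Power another!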
A suite of Arabic natural language processing tools developed by the CAMeL Lab at New York University Abu Dhabi.
Python + Inference - Model Deployment library in Python. Simplest model inference server ever.
Awesome papers on Language-Model-as-a-Service (LMaaS)
Happy Transformer makes it easy to fine-tune and perform inference with NLP Transformer models.
Mengzi Pretrained Models
BERT for Multitask Learning
Code for producing Japanese pretrained models provided by rinna Co., Ltd.
🔎 Semantic search for developers
Synonym list, antonym list, and negation word list
🤖 AI browser extensions & userscripts to augment your web experience
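Implementations of common NLP tasks, including new word discovery and PyTorch-based word embeddings, Chinese text classification, entity recognition, summary text generation, sentence similarity, triple extraction, pretrained models, and more.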
An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
[EMNLP'25 findings] This is the official repo for the paper, HiRAG: Retrieval-Augmented Generation with Hierarchical Knowledge.
Medical Concept Annotation Tool
PHP Text Analysis is a library for performing Information Retrieval (IR) and Natural Language Processing (NLP) tasks using the PHP language
Extraction of the journalistic five W and one H questions (5W1H) from news articles: who did what, when, where, why, and how?
Multiple implementations of abstractive text summarization, using Google Colab
A web-based annotation tool for natural language processing (NLP)
🔎 SimilaritySearchKit is a Swift package providing on-device text embeddings and semantic search functionality for iOS and macOS applications.
👉 TensorFlow 2.x resources such as tutorials, blogs, code, and videos
Automatically generate headlines for short articles
pytextclassifier is a toolkit for text classification. It implements classification models including LR, XGBoost, TextCNN, FastText, TextRNN, and BERT, ready to use out of the box.
[NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents
My Keras implementation of the Deep Semantic Similarity Model (DSSM)/Convolutional Latent Semantic Model (CLSM) described here: http://research.micros...
Analyze unstructured data with Towhee, such as reverse image search, reverse video search, audio classification, question and answer systems, mole...
Curated list of open-access/open-source/off-the-shelf resources and tools developed with a particular focus on German
🔍 Search Engine for a Procedural Simulation of the Web with GPT-3.
Teamlinker is a team collaboration platform that integrates multi-functional modules. Users can process tasks in parallel, including six functional mo...
A free, non-AI Python grammar checker 📝✅
A Cython MeCab wrapper for fast, pythonic Japanese tokenization and morphological analysis.
Python interface to CoreNLP using a bidirectional server-client interface.
Official PyTorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agraw...
Explore a comprehensive collection of resources, tutorials, papers, tools, and best practices for fine-tuning Large Language Models (LLMs). Perfect fo...
A library for easily merging multiple LLM experts and efficiently training the merged LLM.
Multi-modality pre-training
The only open-source toolkit that can download SEC EDGAR financial reports and extract textual data from specific item sections into nice & clean stru...
Computing the similarity of two sentences with Google's BERT algorithm. Covers sentence similarity, semantic similarity, and text similarity computation with BERT.
Monthly Series - Top 10 Machine Learning Articles
🍳 Recipes for Prodigy, our fully scriptable annotation tool
🕵️♂️ Library designed for developers eager to explore the potential of Large Language Models (LLMs) and other generative AI through a clean, effectiv...
Word Embeddings in Go!
A Korean named entity recognizer built with KoBERT and CRF (BERT+CRF based Named Entity Recognition model for Korean)
A curated list of Open Information Extraction (OIE) resources: papers, code, data, etc.
BETO - Spanish version of the BERT model
FashionCLIP is a CLIP-like model fine-tuned for the fashion domain.
Live Training for Open-source Big Models
Giant Language Model Test Room
A comprehensive Data and Text Mining workflow for submissions and comments from any given public subreddit.
Retrieve, Read and LinK: Fast and Accurate Entity Linking and Relation Extraction on an Academic Budget (ACL 2024)