Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published an article, "Computing Machinery and Intelligence", that proposed a measure of machine intelligence now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.
A PyTorch-based knowledge distillation toolkit for natural language processing
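Pre-trained Chinese ELECTRA
A summary of research and applications in intelligent question answering by the NLP research team at Beihang University's Advanced Innovation Center for Big Data. Covers knowledge-graph-based question answering (KBQA), text-based question answering (TextQA), ...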
🪐 End-to-end NLP workflows from prototype to production
🛸 Use pretrained transformers like BERT, XLNet and GPT-2 in spaCy
📝 A collection of machine learning resources
The open-source virtual assistant for Ubuntu-based Linux distributions
Evaluation code for various unsupervised automated metrics for Natural Language Generation.
A list of NLP (Natural Language Processing) tutorials
📄 🤖 Semantic search and workflows for medical/scientific papers
Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code.
"Jieba" (結巴, Chinese for "to stutter") Chinese text segmentation: built to be the best PHP Chinese...
MNBVC (Massive Never-ending BT Vast Chinese corpus): a very large-scale Chinese corpus, benchmarked against the 40T of data used to train ChatGPT. The MNBVC dataset covers not only mainstream culture but also various niche cultures and even...
A persistent, network resilient, full text search library for the browser and Node.js
A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT-2, decoders, etc.) on CPU and GPU.
Python package for Korean natural language processing.
Collection of open-source libraries and tools for Robotic Process Automation (RPA), designed to be used with both Robot Framework and Python
Deprecated in favor of https://github.com/facebook/duckling
Tribuo - A Java machine learning library
A BERT model for scientific text.
Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.
Obsei is a low-code, AI-powered automation tool. It can be used in various business flows like social listening, AI-based alerting, brand image analysi...
🦙 Integrating LLMs into structured NLP pipelines
Basaran is an open-source alternative to the OpenAI text completion API. It provides a compatible streaming API for your Hugging Face Transformers-bas...
Overview of Modern Deep Learning Techniques Applied to Natural Language Processing
Developer friendly Natural Language Processing ✨
The most accurate natural language detection library for Go, suitable for short text and mixed-language text
GNES is Generic Neural Elastic Search, a cloud-native semantic search system based on deep neural networks.
TextRank implementation for Python 3.
Solves basic Russian NLP tasks, API for lower level Natasha projects
ktrain is a Python library that makes deep learning and AI more accessible and easier to apply
Graph Convolutional Networks for Text Classification. AAAI 2019
Apache OpenNLP
DeText: A Deep Neural Text Understanding Framework for Ranking and Classification Tasks
NLP tools for Turkish.
Keras implementation of "One pixel attack for fooling deep neural networks" using differential evolution on Cifar10 and ImageNet
Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE)
📖 A curated list of awesome resources dedicated to Relation Extraction, one of the most important tasks in Natural Language Processing (NLP).
Multilingual word vectors in 78 languages
🌊HMTL: Hierarchical Multi-Task Learning - A State-of-the-Art neural network model for several NLP tasks based on PyTorch and AllenNLP
My first Python repo, with code for machine learning, NLP, and deep learning using Keras and Theano
⭐️ NLP Algorithms with transformers lib. Supporting Text-Classification, Text-Generation, Information-Extraction, Text-Matching, RLHF, SFT etc.
Starter code to solve real world text data problems. Includes: Gensim Word2Vec, phrase embeddings, Text Classification with Logistic Regression, word...
Plug and Play Language Model implementation. Allows steering the topic and attributes of GPT-2 models.
Videos, notes and experiments to understand deep learning
Korean BERT pre-trained cased (KoBERT)
Neural question generation using transformers
A curated list of NLP resources focused on Transformer networks, attention mechanism, GPT, BERT, ChatGPT, LLMs, and transfer learning.
Datumbox is an open-source Machine Learning framework written in Java which allows the rapid development of Machine Learning and Statistical applicati...
Curated collection of papers for the NLP practitioner 📖👩‍🔬