Topic

nlp

Natural language processing (NLP) is a field of computer science that studies how computers process and interact with human language. In 1950, Alan Turing published the article "Computing Machinery and Intelligence", which proposed a measure of machine intelligence now called the Turing test. More modern techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

Repositories (1462)

Deeplearning.ai-Natural-Language-Processing-Specialization rust0258 Jupyter Notebook

This repository contains my full work and notes on Coursera's NLP Specialization (Natural Language Processing) taught by the instructor Younes Bensoud...

750
DNABERT jerryji1993 Python

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome

750
spacy-stanza explosion Python

💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy

747
pywsd alvations Python

Python Implementations of Word Sense Disambiguation (WSD) Technologies.

747
primeqa primeqa Python

The prime repository for state-of-the-art Multilingual Question Answering research and development.

739
COMET Unbabel Python

A Neural Framework for MT Evaluation

739
attention_sinks tomaarsen Python

Extend existing LLMs way beyond the original training length with constant memory usage, without retraining

736
PromptKG zjunlp Python

PromptKG Family: a Gallery of Prompt Learning & KG-related research works, toolkits, and paper-list.

734
awesome-open-data-centric-ai Renumics

Curated list of open source tooling for data-centric AI on unstructured data.

733
tensorflow-tutorial wagamamaz

TensorFlow and Deep Learning Tutorials

731
pubmed_parser titipata Python

📋 A Python Parser for PubMed Open-Access XML Subset and MEDLINE XML Dataset

730
Octopii redhuntlabs Python

An AI-powered Personally Identifiable Information (PII) scanner.

727
xgen salesforce Python

Salesforce open-source LLMs with 8k sequence length.

727
naacl_transfer_learning_tutorial huggingface Python

Repository of code for the tutorial on Transfer Learning in NLP held at NAACL 2019 in Minneapolis, MN, USA

722
awesome-generative-information-retrieval gabriben
721
poetry sheepzh Python

The most complete corpus of modern Chinese-language poetry on Earth: 3k+ poets, 80K+ poems, 15M+ characters

721
nlp-pytorch-zh apachecn JavaScript

Chinese translation of "Natural Language Processing with PyTorch"

720
OpenAI-CLIP moein-shariatnia Jupyter Notebook

Simple implementation of OpenAI CLIP model in PyTorch.

720
Legal-Text-Analytics Liquid-Legal-Institute

A list of selected resources, methods, and tools dedicated to Legal Text Analytics.

716
albert_pytorch lonePatient Python

A Lite BERT for Self-Supervised Learning of Language Representations

714
meta meta-toolkit C++

A Modern C++ Data Sciences Toolkit

713
neuspell neuspell Python

NeuSpell: A Neural Spelling Correction Toolkit

712
searchGPT michaelthwan Python

Grounded search engine (i.e., with source references) based on LLM / ChatGPT / OpenAI API. It supports web search, file content search, etc.

708
MacBERT ymcui

Revisiting Pre-trained Models for Chinese Natural Language Processing (MacBERT)

708
Annotated-Semantic-Relationships-Datasets davidsbatista

A collection of public and free annotated datasets of relationships between entities/nominals (Portuguese and English)

706
Kiwi bab2min C++

Kiwi (intelligent Korean morphological analyzer)

706
sequence-labeling-BiLSTM-CRF scofield7419 JavaScript

The BiLSTM-CRF model implementation in TensorFlow, for sequence labeling tasks.

703
chat Decalogue Python

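A chatbot based on natural language understanding and machine learning, supporting concurrent multi-user access and custom multi-turn dialogue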

701
vectorflow dgarnitz Python

VectorFlow is a high volume vector embedding pipeline that ingests raw data, transforms it into vectors and writes it to a vector DB of your choice.

701
prompt-tuning google-research Python

Original Implementation of Prompt Tuning from Lester et al., 2021

699
advanced-machine-learning-engineer-roadmap-2024 farukalamai

A Full Stack ML (Machine Learning) Roadmap involves learning the necessary skills and technologies to become proficient in all aspects of machine lear...

698
awesome-nlp-sentiment-analysis haiker2011

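📖 A collection of NLP datasets, papers, and open-source implementations, especially for sentiment analysis, emotion cause identification, and opinion target and opinion word extraction.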

690
mynlp jimichan Java

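A production-grade, high-performance, modular, extensible Chinese NLP toolkit (Chinese word segmentation, averaged perceptron, fastText, pinyin, new word discovery, segmentation correction, BM25, person name recognition, named entities, custom dictionaries).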

689
WeCron polyrabbit JavaScript

✔️ Scheduled reminders on WeChat - Cron on WeChat

689
magpie inspirehep Python

Deep neural network framework for multi-label text classification

688
whatlanggo abadojack Go

Natural language detection library for Go

687
voice-builder google JavaScript

An open-source text-to-speech (TTS) voice building tool

685
deep_research_bench Ayanami0730 Python

DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents

684
base-llm datawhalechina Jupyter Notebook

A full-stack algorithm tutorial from NLP to LLM; read online at https://datawhalechina.github.io/base-llm/

684
Blackstone ICLRandD Python

⚫ A spaCy pipeline and model for NLP on unstructured legal text.

683
AILearners aimi-cn Python

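Roadmaps, tutorials, and practical resources for AI technologies, including machine learning, deep learning, natural language processing, computer vision, and various algorithms. Notes include: Machine Learning in Action, 剑指Offer, cs231n, cs131, Andrew Ng's...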

682
stanford-openie-python philipperemy Python

Stanford Open Information Extraction made simple!

682
SpeedTorch Santosh-Gupta Python

Library for faster pinned CPU <-> GPU transfer in PyTorch

682
AI-Compass tingaicompass Python

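"AI-Compass" guides the community in navigating the ocean of AI technology; whether you are a beginner or an advanced developer, you can find a path here to every major area of AI. It aims to help developers systematically understand AI...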

679
Rankify DataScienceUIBK Python

🔥 Rankify: A Comprehensive Python Toolkit for Retrieval, Re-Ranking, and Retrieval-Augmented Generation 🔥. Our toolkit integrates 40 pre-retrieved b...

676
langchain_dart davidmigloz Dart

Build LLM-powered Dart/Flutter applications.

675
ekphrasis cbaziotis Python

Ekphrasis is a text processing tool, geared towards text from social networks, such as Twitter or Facebook. Ekphrasis performs tokenization, word norm...

675
nboost koursaros-ai Python

NBoost is a scalable, search-api-boosting platform for deploying transformer models to improve the relevance of search results on different platforms...

674
Chinese_models_for_SpaCy howl-anderson Jupyter Notebook

Models for SpaCy that support Chinese

674
ai-study leerumor

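A comprehensive collection of AI learning materials, covering machine learning (ML) basics, deep learning (DL) basics, computer vision (CV), natural language processing (NLP), recommender systems, speech recognition, graph neural networks, and algorithm engineer interview questions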

671