Topic

nlp

Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published an article proposing a measure of machine intelligence, now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.

Repositories (1462)

Indic-BERT-v1
Indic-BERT-v1 AI4Bharat Python

Indic-BERT-v1: BERT-based Multilingual Model for 11 Indic Languages and Indian-English. For latest Indic-BERT v2, check: https://github.com/AI4Bharat/...

296
prosodic
prosodic quadrismegistus Python

Prosodic: a metrical-phonological parser, written in Python. For English and Finnish, with flexible language support.

296
Kevinpro-NLP-demo
Kevinpro-NLP-demo Ricardokevins Python

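All NLP you Need Here. Currently includes PyTorch implementations of 15 NLP demos (much of the code is borrowed from other open-source projects; it started as a personal playground and was later open-sourced).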

295
nlp-hanzi-similar
nlp-hanzi-similar houbb Java

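The hanzi similar tool. (A tool for computing the similarity of Chinese characters, with an algorithm for visually similar characters. Can be used for correcting handwritten character recognition, text obfuscation, and more.)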

295
komputation
komputation sekwiatkowski Kotlin

Komputation is a neural network framework for the Java Virtual Machine written in Kotlin and CUDA C.

294
transfer-nlp
transfer-nlp feedly Python

NLP library designed for reproducible experimentation management

294
lda
lda primaryobjects JavaScript

LDA topic modeling for node.js

294
nlp-data-augmentation
nlp-data-augmentation quincyliang

Data Augmentation for NLP.

294
ml-projects
ml-projects 30lm32

ML based projects such as Spam Classification, Time Series Analysis, Text Classification using Random Forest, Deep Learning, Bayesian, Xgboost in Pyth...

294
text-classification
text-classification javedsha Jupyter Notebook

Machine Learning and NLP: Text Classification using python, scikit-learn and NLTK

293
textnets
textnets jboynyc Python

Text analysis with networks.

293
retvec
retvec google-research Jupyter Notebook

RETVec is an efficient, multilingual, and adversarially-robust text vectorizer.

293
ML-ProjectYard
ML-ProjectYard ashishsahu1 Jupyter Notebook

This repo consists of multiple machine learning based projects with frontend

292
rc-cnn-dailymail
rc-cnn-dailymail danqi Python

CNN/Daily Mail Reading Comprehension Task

291
DeepResearch
DeepResearch Hsankesara Python

This repository is a collection of research papers in deep learning, computer vision, and NLP.

291
Taisite-Platform
Taisite-Platform amazingTest Vue

The most powerful API testing platform

291
BOND
BOND cliang1453 Python

BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision

291
wn
wn goodmami Python

A modern, interlingual wordnet interface for Python

291
NSC
NSC thunlp Python

Neural Sentiment Classification

288
neologdn
neologdn ikegami-yukino Cython

Japanese text normalizer for mecab-neologd

288
RNNSharp
RNNSharp zhongkaifu C#

RNNSharp is a deep recurrent neural network toolkit that is widely used for many different kinds of tasks, such as sequence labeling, sequence-to-...

287
Web-Database-Analytics
Web-Database-Analytics tirthajyoti Jupyter Notebook

Web scraping and related analytics using Python tools

287
snow-owl
snow-owl b2ihealthcare Java

Snow Owl Terminology Server - a production-ready, scalable, FHIR Terminology Service compliant server that supports SNOMED CT International and...

287
RLHF
RLHF sunzeyeah Python

Implementation of Chinese ChatGPT

287
vnlp
vnlp vngrs-ai Python

State-of-the-art, lightweight NLP tools for Turkish language. Developed by VNGRS.

286
TopMost
TopMost bobxwu Jupyter Notebook

A Topic Modeling System Toolkit (ACL 2024 Demo)

286
languagecrunch
languagecrunch artpar Python

LanguageCrunch NLP server docker image

285
fancy-nlp
fancy-nlp boat-group Python

NLP for humans. A fast and easy-to-use natural language processing (NLP) toolkit, satisfying your imagination about NLP.

285
Multi-Type-TD-TSR
Multi-Type-TD-TSR Psarpei Jupyter Notebook

Extracting Tables from Document Images using a Multi-stage Pipeline for Table Detection and Table Structure Recognition

285
InsTag
InsTag OFA-Sys

InsTag: A Tool for Data Analysis in LLM Supervised Fine-tuning

285
behemoth
behemoth DigitalPebble Java

Behemoth is an open source platform for large scale document analysis based on Apache Hadoop.

284
multifit
multifit n-waves Jupyter Notebook

The code to reproduce results from paper "MultiFiT: Efficient Multi-lingual Language Model Fine-tuning" https://arxiv.org/abs/1909.04761

284
shared_colab_notebooks
shared_colab_notebooks mrm8488 Jupyter Notebook

A Repo to store the Google Colaboratory Notebooks that I have created and shared

284
negspacy
negspacy jenojp Python

spaCy pipeline object for negating concepts in text

283
Data-Science-EBooks
Data-Science-EBooks data-science-projects-and-resources

Data Science E-books, Interview Resources and Cheat-sheets

278
open-semantic-etl
open-semantic-etl opensemanticsearch Python

Python based Open Source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (Entity Extraction & Named Entity R...

277
PyTorch-Batch-Attention-Seq2seq
PyTorch-Batch-Attention-Seq2seq AuCson Python

PyTorch implementation of batched bi-RNN encoder and attention-decoder.

275
awesome-hungarian-nlp
awesome-hungarian-nlp oroszgy

A curated list of NLP resources for Hungarian

275
kairon
kairon digiteinfotech Python

Agentic AI platform that harnesses Visual LLM Chaining to build proactive digital assistants

275
nlp-tutorial
nlp-tutorial bonzanini Jupyter Notebook

Tutorial: Natural Language Processing in Python

274
THUTag
THUTag thunlp Java

A Package of Keyphrase Extraction and Social Tag Suggestion

273
gobbli
gobbli RTIInternational Python

Deep learning with text doesn't have to be scary.

272
pytorch-question-answering
pytorch-question-answering kushalj001 Jupyter Notebook

Important paper implementations for Question Answering using PyTorch

269
scoper
scoper RameshAditya Python

Fuzzy and semantic search for captioned YouTube videos.

268
markup
markup samueldobbie TypeScript

A web-based document annotation tool, powered by GPT-4

268
extreme-bert
extreme-bert extreme-bert Python

ExtremeBERT is a toolkit that accelerates the pretraining of customized language models on customized datasets, described in the paper “ExtremeBERT: A...

268
hmni
hmni Christopher-Thornton Python

📛 Fuzzy Name Matching with Machine Learning

267
KeyphraseVectorizers
KeyphraseVectorizers TimSchopf Python

Set of vectorizers that extract keyphrases with part-of-speech patterns from a collection of text documents and convert them into a document-keyphrase...

267
squirrel-core
squirrel-core merantix-momentum Python

A Python library that enables ML teams to share, load, and transform data in a collaborative, flexible, and efficient way

266
text-segmentation
text-segmentation koomri Python

Implementation of the paper: Text Segmentation as a Supervised Learning Task

265