Natural language processing (NLP) is a field of computer science that studies how computers and humans interact through natural language. In 1950, Alan Turing published an article proposing a measure of machine intelligence, now called the Turing test. More recent techniques, such as deep learning, have produced strong results in language modeling, parsing, and many other natural-language tasks.
This repo contains evaluation code for the paper "BLINK: Multimodal Large Language Models Can See but Not Perceive". https://arxiv.org/abs/2404.12...
A Ruby natural language processor.
Preprocessing Library for Natural Language Processing
Implementation of BERT in R
A list of startups in South Korea that research and develop natural language processing technology
A tool that locates, downloads, and extracts machine translation corpora
[NeurIPS 2025 Spotlight] Scaling Computer-Use Grounding via UI Decomposition and Synthesis
A library to parse natural language in pure Clojure and ClojureScript
Python wrapper for evaluating summarization quality with the ROUGE package
Sentence Embeddings using Siamese ETRI KoBERT
A professional list of Tutorials and Surveys on DL, ML, DM, CV, NLP, Speech in top AI conferences and journals.
Python scripts for preprocessing the Penn Treebank and Chinese Treebank
Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents (ACL 2024 Main Conference)
An overview of the AI-as-a-service landscape
Python tutorials as Jupyter Notebooks for NLP, ML, AI
Confidence- and ByT5-based geotagging model that predicts coordinates from text alone.
A deep learning-based natural language processing library
package lingo provides the data structures and algorithms required for natural language processing
Marathi NLP is a repository dedicated to the development of tools and resources for the Marathi language.
A curated list of awesome resources, tools, research papers, and projects related to the concept of Large Language Model Operating Systems (LLM-OS).
Augmenty is an augmentation library based on spaCy for augmenting texts.
📚 Survey of previous research and related works on machine learning (especially Deep Learning) in Japanese
Code for Sluice networks: Learning what to share between loosely related tasks
Long Document Summarization Papers
[NeurIPS 2023] This is the code for the paper `Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias`.
An open-source no-code tool for teams to collaborate on building, evaluating, and hosting applications leveraging GPT and other large language models....
Combining Automatic Labelers and Expert Annotations for Accurate Radiology Report Labeling Using BERT
[EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"
Lyrics Generator aka Character-level Language Modeling with Multi-layer LSTM Recurrent Neural Network
Japanese sentiment analyzer implemented in Python.
Implementation of the paper "LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens"
[EMNLP 2025] LightThinker: Thinking Step-by-Step Compression
A framework for textual-entailment-based zero-shot text classification
Japanese negative/positive classification. Determines whether a Japanese document is negative or positive.
Machine reading comprehension on clinical case reports
[BEA @ ACL 2023] General-purpose tool for linguistic features extraction; Tested on readability assessment, essay scoring, fake news detection, hate s...
[NeurIPS 2024] Spider2-V: How Far Are Multimodal Agents From Automating Data Science and Engineering Workflows?
Code for the ACL 2018 paper "Neural Document Summarization by Jointly Learning to Score and Select Sentences"
A Flexible Deep Learning Approach to Fuzzy String Matching
A Greek edition of BERT pre-trained language model
Generating paper titles (and more!) with GPT trained on data scraped from arXiv.
Large, curated set of benchmark datasets for evaluating automatic keyphrase extraction algorithms.
Fast and robust date extraction from web pages, with Python or on the command-line
Convert the given plain text to a MySQL query using ChatGPT
[ICLR 2025] Benchmarking Agentic Workflow Generation
Library for unit extraction, a fork of quantulum for Python 3
[EMNLP 2024] OneGen: Efficient One-Pass Unified Generation and Retrieval for LLMs.
Extracting scientific claims from biomedical abstracts (powered by AllenNLP)
Source code for EMNLP 2020 paper: Double Graph Based Reasoning for Document-level Relation Extraction