Transforms complex documents like PDFs and Office docs into LLM-ready markdown/JSON for your agentic workflows.
A markdown parser and compiler. Built for speed.
Rust-based platform for the Web
The fast, flexible, and elegant library for parsing and manipulating HTML and XML.
Transforming styles with JS plugins
An incremental parsing system for programming tools
Python logging made (stupidly) simple
A high-performance observability data pipeline.
⚓ A collection of high-performance JavaScript tools.
A PHP parser written in PHP
Better Markdown Parser in PHP
A high-performance 100% compatible drop-in replacement of "encoding/json"
jsoup: the Java HTML parser, built for HTML editing, cleaning, scraping, and XSS safety.
Repository for the book "Crafting Interpreters"
Rust parser combinator framework
🗜 JavaScript parser, mangler and compressor toolkit for ES6+
Python SQL Parser and Transpiler
The official repo for “Dolphin: Document Image Parsing via Heterogeneous Anchor Prompting”, ACL, 2025.
A shell parser, formatter, and interpreter with bash and zsh support; includes shfmt
Select, put and delete data from JSON, TOML, YAML, XML, INI, HCL and CSV files with a single tool. Also available as a Go module.
An extremely fast CSS parser, transformer, bundler, and minifier written in Rust.
File Parser optimised for LLM Ingestion with no loss 🧠 Parse PDFs, Docx, PPTx in a format that is ideal for LLMs.
Boa is an embeddable JavaScript engine written in Rust.
ECMAScript parsing infrastructure for multipurpose analysis
Community maintained fork of pdfminer - we fathom PDF
A cross-platform .NET library for IMAP, POP3, and SMTP.
A web tool to explore the ASTs generated by various parsers.
Java 1-25 Parser and Abstract Syntax Tree for Java with advanced analysis functionalities.
JSqlParser parses an SQL statement and translates it into a hierarchy of Java classes. The generated hierarchy can be navigated using the Visitor Patte...
Lark is a parsing toolkit for Python, built with a focus on ergonomics, performance and modularity.
Markdown parser, done right. Commonmark support, extensions, syntax plugins, high speed - all in one. Gulp and metalsmith plugins available. Used by F...
📜 Extract meaningful content from the chaos of a web page
One of the fastest alternative JSON parsers for Go; it does not require a schema
A library and language for building parsers, interpreters, compilers, etc.
Node.js body parsing middleware
👼 The ultimate angle brackets parser library parsing HTML5, MathML, SVG and CSS to construct a DOM based on the official W3C specifications.
LIEF - Library to Instrument Executable Formats (C++, Python, Rust)
Picocli is a modern framework for building powerful, user-friendly, GraalVM-enabled command line apps with ease. It supports colors, autocompletion, s...
A doc comment standard for TypeScript
A JavaScript library for internationalization and localization that leverages the official Unicode CLDR JSON data
The fast & forgiving HTML and XML parser
Repair malformed JSON from LLMs, APIs, logs, and user input in Python.
Sweeten your JavaScript.
[Chumsky has moved to Codeberg!] Write expressive, high-performance parsers with ease.
Lightweight and fast library written in C# for reading Microsoft Excel files
Full-featured CSV parser with a simple API, tested against large datasets.
A toy programming language written in TypeScript
Translations, development insights, or study notes
Snoop: a reconnaissance tool based on open data (OSINT world)
HTML parsing/serialization toolset for Node.js. WHATWG HTML Living Standard (aka HTML5)-compliant.