pup
Parsing HTML at the command line
Description
pup is a command-line tool for parsing and extracting data from HTML documents. It queries HTML with CSS selectors, much as jq queries JSON. It reads HTML from stdin and writes the results to stdout as text, JSON, or HTML.
AI Summary
Command-line HTML parser that uses CSS selectors to extract data from HTML documents, like jq but for HTML.
Capabilities
- + Parse HTML using CSS selectors
- + Extract text, attributes, and elements from HTML
- + Format output as text, JSON, or HTML
- + Read from stdin for pipeline integration
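The capabilities above can be sketched as a short shell script. This is a minimal illustration, assuming pup is installed and on the PATH. The sample HTML, the example.com URLs, and the helper names (`link_targets`, `title_text`, `links_json`) are made up for this example. The `attr{}`, `text{}`, and `json{}` display functions come from pup's README.

```shell
#!/bin/sh
# Sample HTML used for the queries below (invented for illustration).
html='<html><body><h1 class="title">Hello</h1><a href="https://example.com/a">First</a><a href="https://example.com/b">Second</a></body></html>'

# Hypothetical helpers: each reads HTML on stdin and writes to stdout.
link_targets() { pup 'a attr{href}'; }     # href attribute of every <a>
title_text()   { pup 'h1.title text{}'; }  # text content of <h1 class="title">
links_json()   { pup 'a json{}'; }         # matched <a> elements as JSON

# Run the demo only when pup is available, so the script fails soft.
if command -v pup >/dev/null 2>&1; then
  printf '%s\n' "$html" | link_targets
  printf '%s\n' "$html" | title_text
  printf '%s\n' "$html" | links_json
else
  echo 'pup is not installed; skipping the demo' >&2
fi
```

In a real pipeline the HTML would typically come from another command, for example `curl -s <url> | pup 'a attr{href}'`, since pup reads its input from stdin.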
Use When
- → User needs to extract data from HTML documents
- → User wants CSS selector-based HTML querying
- → User needs HTML parsing in shell pipelines
Avoid When
- x User prefers XPath over CSS selectors
- x User needs to scrape pages that require JavaScript rendering (pup only parses the HTML it is given and does not execute scripts)