Product Spec PDF Parser
A Claude Code skill that reads product PDF files — price books, fact sheets, and spec sheets — and extracts structured furniture specification data into a standardized FF&E schedule. Claude reasons over extracted text to handle wildly varying layouts.
What it does
Type /product-spec-pdf-parser in Claude Code and point it at PDF files or a folder. The skill extracts text with PyMuPDF, then Claude reads and reasons over the content to identify products, variants, SKUs, dimensions, materials, and pricing — structuring everything into a 24-field FF&E schedule.
Handles price books (Herman Miller Aeron), fact sheets with SKUs (Hem Alphabeta), spec sheets, and multi-product catalogs. No regex or custom parsers — Claude IS the parser.
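The extraction step can be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: `collect_pdfs` and `extract_page_texts` are hypothetical helper names, and the only PyMuPDF calls used are `fitz.open` and `Page.get_text`. The per-page text it returns is the raw material Claude would then read and structure.

```python
from pathlib import Path


def collect_pdfs(target):
    """Expand a file or folder path into a sorted list of PDF paths.

    Hypothetical helper: the skill's real path handling may differ.
    """
    path = Path(target).expanduser()
    if path.is_dir():
        return sorted(p for p in path.rglob("*") if p.suffix.lower() == ".pdf")
    return [path] if path.suffix.lower() == ".pdf" else []


def extract_page_texts(pdf_path):
    """Return one plain-text string per page of the PDF."""
    # PyMuPDF is imported lazily so collect_pdfs works without it installed.
    import fitz

    with fitz.open(str(pdf_path)) as doc:
        return [page.get_text() for page in doc]
```

No parsing happens in this step. Keeping extraction this thin matches the approach described above, where the interpretation of layouts is left entirely to Claude.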
Install
Claude Desktop:
- Open Customize → Browse plugins
- Click + → Add marketplace from GitHub
- Enter AlpacaLabsLLC/skills-for-architects
- Install the Product & Materials Research plugin
Claude Code (terminal):
claude install github:AlpacaLabsLLC/skills-for-architects/05-materials-research
pip install PyMuPDF
Usage
/product-spec-pdf-parser
Then provide PDF paths or a folder:
/product-spec-pdf-parser
~/Documents/specs/alphabeta-floor-lamp.pdf
~/Documents/specs/aeron-price-book.pdf
Then choose the variant depth: expand (one row per variant, the default) or summarize (one row per product).
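The difference between the two depths can be illustrated with a small sketch. The `build_rows` function and the shape of the `product` dict are hypothetical, and joining variants with "; " in summarize mode is an assumption about how options might be collapsed, not the skill's documented behavior.

```python
def build_rows(product, depth="expand"):
    """Turn one parsed product into schedule rows.

    `product` is a hypothetical dict such as:
    {"name": str, "variants": [{"variant": str, "sku": str}, ...]}
    """
    if depth not in ("expand", "summarize"):
        raise ValueError(f"unknown variant depth: {depth!r}")

    name = product["name"]
    variants = product.get("variants", [])

    if depth == "expand" and variants:
        # One row per variant (the default).
        return [
            {"Product Name": name, "Variant": v["variant"], "SKU": v["sku"]}
            for v in variants
        ]

    # One row per product; variants are collapsed into a single cell.
    return [{
        "Product Name": name,
        "Variant": "; ".join(v["variant"] for v in variants),
        "SKU": "; ".join(v["sku"] for v in variants),
    }]
```

With two variants, `build_rows(product)` yields two rows, while `build_rows(product, "summarize")` yields a single row listing both.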
Output schema
24 fields per product, extending the Canoa Clipper FF&E format with PDF-specific fields. Related fields share a row in the table below:
| Field | Example |
|---|---|
| Product Name | Alphabeta Floor Lamp |
| Variant | Diamond, Black |
| SKU | HEM-AF-DB |
| Brand / Designer | Hem / Luca Nichetto |
| Collection / Category | Alphabeta / Lighting |
| Description | Modular floor lamp with interchangeable shades |
| W / D / H / Seat H | — / — / 135.5 / — |
| Unit / Weight | cm / 4.5 kg |
| Materials | Aluminium, Steel |
| Colors/Finishes | Black |
| List Price / Price Adder | 595.00 / — |
| Currency | EUR |
| Warranty / Certifications | — / CE |
| Country of Origin | Sweden |
| Source File | alphabeta-fact-sheet.pdf |
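For readers who want to consume the schedule programmatically, one row could be modeled along these lines. This is a sketch only: `ScheduleRow`, its snake_case field names, and the way grouped table rows are split into separate fields are all assumptions read off the table above, so the field count here may not match the skill's 24 exactly.

```python
from dataclasses import dataclass, fields
from typing import Optional

EMPTY = "\u2014"  # the dash the table above uses for blank values


@dataclass
class ScheduleRow:
    # Hypothetical model; names and types are guesses based on the table.
    product_name: str
    source_file: str
    variant: Optional[str] = None
    sku: Optional[str] = None
    brand: Optional[str] = None
    designer: Optional[str] = None
    collection: Optional[str] = None
    category: Optional[str] = None
    description: Optional[str] = None
    width: Optional[float] = None
    depth: Optional[float] = None
    height: Optional[float] = None
    seat_height: Optional[float] = None
    unit: Optional[str] = None
    weight: Optional[str] = None
    materials: Optional[str] = None
    colors_finishes: Optional[str] = None
    list_price: Optional[float] = None
    price_adder: Optional[float] = None
    currency: Optional[str] = None
    warranty: Optional[str] = None
    certifications: Optional[str] = None
    country_of_origin: Optional[str] = None

    def to_display(self):
        """Render every field as text, using the dash for missing values."""
        return {
            f.name: EMPTY if getattr(self, f.name) is None else str(getattr(self, f.name))
            for f in fields(self)
        }
```

Keeping missing values as `None` internally and converting to the dash only for display avoids confusing "not stated in the PDF" with an actual string value.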
PDF types supported
| Type | Strategy |
|---|---|
| Fact sheet with SKUs | One row per SKU |
| Fact sheet with finishes | One row per upholstery/finish option |
| Price book / configurator | One row per product type, options summarized |
| Product catalog | Rows for each distinct product |
How it handles failures
- Scanned/image PDFs — detected and flagged for OCR
- Password-protected PDFs — caught and reported
- Large PDFs (100+ pages) — processed in 10-page chunks with progress updates
After every batch, the skill reports a summary line: Parsed: X products from Y PDF(s)
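The three failure cases above could be checked roughly like this. It is a hedged sketch rather than the skill's implementation: `page_chunks`, `looks_scanned`, and `check_pdf` are hypothetical names, the 20-character and 80% thresholds are arbitrary assumptions, and the only PyMuPDF features relied on are `fitz.open`, `Document.needs_pass`, and `Page.get_text`.

```python
def page_chunks(page_count, size=10):
    """Yield (start, stop) page index ranges, `size` pages at a time."""
    for start in range(0, page_count, size):
        yield start, min(start + size, page_count)


def looks_scanned(page_texts, min_chars=20, threshold=0.8):
    """Heuristic: most pages yielding almost no text suggests a scanned PDF."""
    if not page_texts:
        return True
    sparse = sum(1 for text in page_texts if len(text.strip()) < min_chars)
    return sparse / len(page_texts) > threshold


def check_pdf(pdf_path):
    """Classify a PDF as 'ok', 'password-protected', 'needs OCR', or unreadable."""
    import fitz  # PyMuPDF, imported lazily so the helpers above work without it

    try:
        doc = fitz.open(str(pdf_path))
    except Exception as exc:  # corrupt or unsupported file
        return f"unreadable: {exc}"
    with doc:
        if doc.needs_pass:
            return "password-protected"
        texts = [page.get_text() for page in doc]
    return "needs OCR" if looks_scanned(texts) else "ok"
```

For a 25-page document, `list(page_chunks(25))` gives `[(0, 10), (10, 20), (20, 25)]`, which is where a progress update could be printed between chunks.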
Pairs with
Use /product-spec-bulk-fetch when the source is a web URL rather than a PDF, and run /product-spec-bulk-cleanup afterwards to normalize the results. Chain: PDF parse → cleanup → spec-ready schedule.