🧭 Introduction
What is Papercuts?
The Papercuts /parse/ API allows you to extract structured data from a wide range of unstructured file formats using a unified endpoint.
It supports OCR, audio transcription, and LLM-based parsing to convert files like PDFs, images, or audio recordings into structured JSON based on a template JSON schema you provide.
This output can be directly consumed by downstream systems such as:
- LLM agents requiring structured context for planning or reasoning
- ETL pipelines and data warehouses for analytics ingestion
- No-code tools or APIs for workflow automation and process integration
- Document understanding systems for classification, extraction, and routing
By aligning the parsed result to your custom schema, the Parse API ensures interoperability with your backend systems and AI workflows.
✅ Supported File Types
- Documents: .pdf, .docx, .pptx
- Images: .png, .jpg, .jpeg, .bmp, .tiff
- Audio: .mp3, .wav, .ogg, .m4a
- Video (Beta): .mp4, .mov, .avi, .mkv
Note: File size must be less than 32MB.
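The file type list and size limit above can be checked on the client before uploading. The following is a minimal sketch, not part of Papercuts itself: the helper names `classify_file` and `is_uploadable` are hypothetical, and it assumes "32MB" means 32 * 1024 * 1024 bytes, which the documentation does not spell out.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the supported file types listed above.
SUPPORTED_EXTENSIONS = {
    "document": {".pdf", ".docx", ".pptx"},
    "image": {".png", ".jpg", ".jpeg", ".bmp", ".tiff"},
    "audio": {".mp3", ".wav", ".ogg", ".m4a"},
    "video": {".mp4", ".mov", ".avi", ".mkv"},  # video support is in beta
}

# Assumption: the 32MB limit is interpreted as binary megabytes.
MAX_FILE_SIZE_BYTES = 32 * 1024 * 1024


def classify_file(filename: str) -> Optional[str]:
    """Return the category for a filename, or None if the extension is unsupported."""
    extension = Path(filename).suffix.lower()
    for category, extensions in SUPPORTED_EXTENSIONS.items():
        if extension in extensions:
            return category
    return None


def is_uploadable(filename: str, size_bytes: int) -> bool:
    """True if the extension is supported and the size is strictly below the limit."""
    return classify_file(filename) is not None and size_bytes < MAX_FILE_SIZE_BYTES
```

A check like this only avoids obviously invalid uploads; the server remains the authority on what it accepts.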
⚡ Quickstart
🔐 Authentication
All requests must include an x-api-key header with a valid API key.
Papercuts is currently in closed beta. To get access, contact us at aditya@papercuts.ai.
🔧 Request
Send a POST request to /parse/ with:
- file: The input file to be parsed (as multipart/form-data)
- template_json: A JSON string describing the expected output schema
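As an illustration of the request described above, here is a sketch that builds the multipart POST using only the Python standard library. Several details are assumptions rather than documented facts: `BASE_URL` is a placeholder (the real host is not given here), `build_parse_request` is a hypothetical helper, and the template keys in the usage note are invented for the example.

```python
import json
import urllib.request
import uuid

# Placeholder host: substitute the base URL you receive with your beta access.
BASE_URL = "https://api.example.com"


def build_parse_request(file_name, file_bytes, template, api_key, base_url=BASE_URL):
    """Build (but do not send) a multipart/form-data POST to /parse/."""
    boundary = uuid.uuid4().hex
    # template_json is sent as a JSON string, as the field description says.
    template_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="template_json"\r\n\r\n'
        f"{json.dumps(template)}\r\n"
    ).encode("utf-8")
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    closing = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = template_part + file_header + file_bytes + closing
    return urllib.request.Request(
        url=f"{base_url}/parse/",
        data=body,
        method="POST",
        headers={
            "x-api-key": api_key,  # authentication header required on all requests
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

To send it, pass the result to `urllib.request.urlopen` and decode the response body with `json.loads`. For example, `build_parse_request("invoice.pdf", data, {"invoice_number": "", "total": ""}, api_key)` would ask for those two hypothetical fields; the exact shape a template must take is defined by the Parse API reference, not by this sketch.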
Next Steps
Parse API
Start using the API