Export Clastify canvases into a PDF, letting the user draw boxes around content to exclude from OCR, with individual box removal, auto-scrolling, a Clastify logo OCR exclusion zone, and dynamic banner removal.
Clastify PDF Exporter with OCR

Export your Clastify documents as high-quality PDFs with an optional selectable text layer extracted via OCR. This userscript lets you capture all canvas pages into a single PDF while keeping non-text elements like images, diagrams, or equations out of the OCR pass to improve its accuracy. It automatically skips the Clastify logo during OCR and removes dynamic banners for a cleaner experience.
What It Does

High-Res PDF Export: Renders each canvas at 1.75x scale into a temporary buffer for sharper images, then compiles them into a PDF (native size or A4-scaled).
Smart OCR Overlay: Uses Tesseract.js to detect and overlay invisible, selectable text on the PDF, making content copy-pasteable without altering visuals.
Exclusion Zones: Draw custom rectangles to block OCR on problematic areas (e.g., math formulas or charts). Auto-excludes the top-right logo zone.
Lazy Loading & Scrolling: Auto-scrolls the page to ensure all canvases load before export.
Debug Visualization: Optionally shows OCR bounding boxes and exclusions in the PDF for verification.
Banner Cleanup: Dynamically removes Clastify banners on scroll.
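
To give a rough idea of how the exclusion zones could work, here is a minimal sketch that drops OCR words whose bounding box touches any excluded rectangle. The function names (`rectsOverlap`, `bboxToRect`, `filterWords`, `logoZone`) and the logo zone size are made up for illustration and are not taken from the script. The `{x0, y0, x1, y1}` box shape follows what Tesseract.js reports for words, but the script's real logic may differ.

```javascript
// Hypothetical helpers: a sketch of exclusion-zone filtering, not the script's actual code.

// True when two axis-aligned rectangles {x, y, w, h} overlap (touching edges do not count).
function rectsOverlap(a, b) {
  return a.x < b.x + b.w && b.x < a.x + a.w &&
         a.y < b.y + b.h && b.y < a.y + a.h;
}

// Convert a Tesseract.js-style bbox {x0, y0, x1, y1} into {x, y, w, h}.
function bboxToRect(bbox) {
  return { x: bbox.x0, y: bbox.y0, w: bbox.x1 - bbox.x0, h: bbox.y1 - bbox.y0 };
}

// Keep only the words that do not overlap any exclusion zone.
function filterWords(words, zones) {
  return words.filter((word) => {
    const rect = bboxToRect(word.bbox);
    return !zones.some((zone) => rectsOverlap(rect, zone));
  });
}

// Assumed logo zone: a box in the top-right corner, sized as a fraction of the page.
// The default fractions are guesses chosen for this example.
function logoZone(pageWidth, pageHeight, widthFrac = 0.2, heightFrac = 0.08) {
  const w = pageWidth * widthFrac;
  return { x: pageWidth - w, y: 0, w: w, h: pageHeight * heightFrac };
}

// One automatic logo zone plus one user-drawn box around a formula.
const zones = [logoZone(1000, 1400), { x: 100, y: 500, w: 300, h: 200 }];
const words = [
  { text: "Introduction", bbox: { x0: 50, y0: 150, x1: 250, y1: 180 } },
  { text: "Clastify", bbox: { x0: 850, y0: 20, x1: 980, y1: 60 } }, // inside the logo zone
  { text: "x^2", bbox: { x0: 150, y0: 550, x1: 200, y1: 580 } },    // inside the user-drawn zone
];
const kept = filterWords(words, zones).map((w) => w.text);
console.log(kept); // [ 'Introduction' ]
```

Dropping a word as soon as it touches a zone is the simplest rule. A stricter variant could require, say, that most of the word's area lies inside the zone before discarding it.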
You can alter the constants at the top of the script to try to get better OCR results on normal text, to scroll faster, or whatever. Don't bother setting the export scale above 1.75: the quality gain gets very small, but the OCR and exporting take way longer.
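
As a back-of-the-envelope illustration of why higher scales get slow: the number of pixels in a rendered canvas grows with the square of the scale. The sketch below assumes that OCR and export time roughly follow pixel count, which is a simplification. `EXPORT_SCALE` is a stand-in name, so check the top of the script for the real constant.

```javascript
// Hypothetical constant name; the script's actual constant may be called something else.
const EXPORT_SCALE = 1.75;

// Pixels in a canvas of the given base size once rendered at `scale`.
function pixelCount(width, height, scale) {
  return Math.round(width * scale) * Math.round(height * scale);
}

// How many times more pixels a given scale produces compared with 1x.
function costFactor(scale) {
  return scale * scale;
}

console.log(costFactor(1.75)); // 3.0625
console.log(costFactor(2.5));  // 6.25
console.log(pixelCount(800, 1000, EXPORT_SCALE)); // 2450000 (1400 x 1750)
```

So going from 1.75 to 2.5 roughly doubles the pixels to process, for text that already looks sharp at 1.75.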
Perfect for students or researchers who need printable, searchable PDFs from Clastify but don't want to pay for another subscription.
I'm not gonna lie, all of the code was made with AI; I don't know JavaScript. Feel free to fork, tweak, and use it however you like. This is FOSS baby!! (If you do build on it, pls message me. I would love to see what ppl do with this, if anything, though I doubt it.)
Enjoy.