metadata
title: PDF to Markdown Converter
emoji: 📄
colorFrom: blue
colorTo: green
sdk: streamlit
sdk_version: 1.29.0
app_file: app.py
pinned: false

PDF to Markdown Converter

This application converts PDF documents to Markdown format. It uses the docling library for document conversion and provides a simple Streamlit interface.
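The conversion step can be sketched as follows. This is a minimal illustration, assuming docling's `DocumentConverter` API (`convert(source)` followed by `result.document.export_to_markdown()`); the helper name `pdf_to_markdown` and its injectable `converter` parameter are hypothetical and not taken from app.py.

```python
from typing import Any, Optional


def pdf_to_markdown(source: str, converter: Optional[Any] = None) -> str:
    """Convert a PDF (local path or URL) to a Markdown string.

    `converter` is any object with a docling-style `convert(source)` method.
    When omitted, a docling DocumentConverter is created lazily.
    """
    if converter is None:
        # Imported here so this module still loads where docling is absent.
        from docling.document_converter import DocumentConverter

        converter = DocumentConverter()
    result = converter.convert(source)
    return result.document.export_to_markdown()
```

Accepting the converter as a parameter lets one instance be reused across requests, and lets the function be exercised with a stub in tests.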

Features

  • Upload PDF files directly
  • Convert PDFs from URLs
  • Batch process multiple images using vLLM
  • Download the resulting Markdown files
  • Clean, user-friendly interface

How to Use

PDF to Markdown

  1. Select the "PDF to Markdown" tab
  2. Upload a PDF file using the file uploader or enter a URL to a PDF document
  3. Click the "Convert to Markdown" button
  4. Once conversion is complete, download the Markdown file
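For the download step, the Markdown file needs a name whether the source was an uploaded file or a URL. The sketch below shows one way to derive it; the function `markdown_filename` and the `document` fallback name are assumptions for illustration, not the app's actual behavior.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def markdown_filename(source: str, default: str = "document") -> str:
    """Derive a .md download name from an uploaded file name or a PDF URL."""
    # For URLs, use only the path component so query strings are ignored.
    path = urlparse(source).path if "://" in source else source
    stem = PurePosixPath(path).stem
    # Fall back to a generic name when the source has no file component.
    return f"{stem or default}.md"
```

For example, `markdown_filename("report.pdf")` yields `report.md`, and a URL with no file name in its path falls back to `document.md`.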

Batch Image Processing

  1. Select the "Batch Image Processing" tab
  2. Upload multiple image files (PNG, JPG, JPEG)
  3. Optionally customize the model path and prompt text
  4. Click the "Process Images" button
  5. Once processing is complete, download the ZIP file containing all results
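The final step, bundling all results into one ZIP file, can be sketched with the standard library alone. The function name `build_results_zip` and the choice to store each result as `<image stem>.md` are assumptions; the app may name or format its output files differently.

```python
import io
import zipfile
from typing import Mapping


def build_results_zip(results: Mapping[str, str]) -> bytes:
    """Pack {image file name: generated text} into an in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for image_name, text in results.items():
            # Store each result as <image stem>.md, e.g. page1.png -> page1.md
            stem = image_name.rsplit(".", 1)[0]
            archive.writestr(f"{stem}.md", text)
    return buffer.getvalue()
```

The returned bytes can be handed to Streamlit's `st.download_button` through its `data` argument, which avoids writing a temporary archive to disk.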

Technical Details

Built with:

  • Streamlit 1.29.0
  • Docling 2.7.0
  • docling_core
  • vLLM (for batch processing)
  • Python 3.12

Deployment

This application is deployed on Hugging Face Spaces.

To deploy this application:

  1. Create a new Space on Hugging Face (https://huggingface.co/spaces)
  2. Choose "Streamlit" as the SDK
  3. Upload the following files to the Space repository:
    • app.py
    • requirements.txt
    • README.md
    • runtime.txt

The application will automatically create any necessary directories when it starts.
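Creating directories at startup is typically done idempotently, so that restarting the Space does not fail when they already exist. A minimal sketch follows; the directory names `uploads` and `outputs` are hypothetical placeholders, since the actual names used by app.py are not listed here.

```python
from pathlib import Path


def ensure_directories(base: Path, names=("uploads", "outputs")) -> list[Path]:
    """Create working directories under `base` if they do not already exist."""
    created = []
    for name in names:
        directory = base / name
        # exist_ok=True makes repeated startups a no-op.
        directory.mkdir(parents=True, exist_ok=True)
        created.append(directory)
    return created
```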

Note: The vLLM batch processing feature requires substantial compute (typically a GPU), so you may need to select a more powerful hardware configuration for your Space.