📑 How to build a PDF summariser

This guide walks through creating a PDF uploader and using an LLM to summarise the content.

Key Components:

  • FileUpload Component: A UI component that allows users to select and upload PDF files.

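  • APIs: We need two backends here. The first, upload_and_extract, exposes the Upload and Extract endpoints; the second is llm_summarizer.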

    • Upload Endpoint: Accepts PDF file uploads, stores them in Databutton's storage, and returns the filename.

    • Extract Endpoint: Fetches the stored PDF file, extracts its content as text, and returns the extracted text.

    • llm_summarizer: Accepts the extracted text, uses OpenAI LLM to summarize it, and returns the summary.

App's Workflow:

  1. User uploads a PDF file via the FileUpload component.

  2. The file is sent to the upload_and_extract API, which stores the file and extracts its text.

  3. The extracted text is displayed on the Home page.

  4. User can then click a button to summarize the extracted text using the llm_summarizer API.

  5. The summary is displayed on the Home page.

Prompting guides to build the app

Databutton is usually good at building the skeleton of such apps. However, it is good practice to give it a clear structure of the app (UI + backends), either in the prompt or via some simple sketches.

In this use case, we prompted Databutton with:

"Hey I would like to build a PDF summariser App."

Databutton came up with a clear outline of the app.

Building the UI

Databutton created the File Upload component successfully.

Here's the code:

Code for the File Upload UI Component
import brain from "brain";
import type React from "react";
import { useState } from "react";

export type Props = {};

export const FileUpload = (props: Props) => {
  const [file, setFile] = useState<File | null>(null);
  const [extractedText, setExtractedText] = useState<string>("");
  const [summary, setSummary] = useState<string>("");

  const handleFileChange = (event: React.ChangeEvent<HTMLInputElement>) => {
    if (event.target.files && event.target.files.length > 0) {
      setFile(event.target.files[0]);
    }
  };

  const handleUpload = async () => {
    if (file) {
      try {
        // Upload the PDF file
        const uploadResponse = await brain.upload_pdf({ file });
        const uploadData = await uploadResponse.json();
        const filename = uploadData.filename;

        // Extract text from the uploaded PDF
        const extractResponse = await brain.extract_text({ filename });
        const extractData = await extractResponse.json();
        setExtractedText(extractData.text);
      } catch (error) {
        console.error("Error during file upload and extraction:", error);
      }
    }
  };

  const handleSummarize = async () => {
    if (extractedText) {
      try {
        const response = await brain.summarize_text({ text: extractedText });
        const data = await response.json();
        setSummary(data.summary);
      } catch (error) {
        console.error("Error during text summarization:", error);
      }
    }
  };

  return (
    <div className="p-4 border rounded">
      <input type="file" accept="application/pdf" onChange={handleFileChange} />
      {file && <p>Selected file: {file.name}</p>}
      <button
        onClick={handleUpload}
        className="mt-2 p-2 bg-blue-500 text-white rounded"
      >
        Upload
      </button>
      {extractedText && (
        <div className="mt-4">
          <h2 className="text-xl font-bold">Extracted Text</h2>
          <p>{extractedText}</p>
          <button
            onClick={handleSummarize}
            className="mt-2 p-2 bg-green-500 text-white rounded"
          >
            Summarize
          </button>
        </div>
      )}
      {summary && (
        <div className="mt-4">
          <h2 className="text-xl font-bold">Summary</h2>
          <p>{summary}</p>
        </div>
      )}
    </div>
  );
};

Note: This is the final UI component code that Databutton generated after integrating the backends / APIs (brain modules). To integrate a backend, it is crucial to use the hashtag (#) in the prompt to refer to the API / backend in question.

Building the Backends

For building the backend, it's important to provide a clear and considered prompt. Instead of having a single API as Databutton suggested, we took the route of breaking it down into two APIs.

Here's how it can be prompted:

How about we break down this in two APIs.
Upload and Extract content from PDF :
Upload the PDF file, store it over databutton's storage with the filename
Fetch the same file from storage and extract the content as text
LLM Summarizer :
Accept the context and use OpenAI LLM to summarise

Databutton revised its plan and went on to build the backends.

Debugging Prompts and Tips

  1. Define clearly how and where to store the file and where to fetch it from.

Hey, I would like to store the uploaded file from the frontend to Databutton Storage. Use a unique naming to store the file. Fetch the file later using the same unique name. And finally read the file and extract the context.

  2. In case of general errors while testing the UI component / APIs, you can prompt Databutton to check the logs. (These errors usually pop up in the console at the bottom. Often the backend symbol shows a "green" / "red" status to indicate success or error. Hover over the specific backend to know more.)

I see error already in console log. Please check and fix.

  3. Testing the API / Backend

It is recommended to click into the specific API and prompt from there; this helps Databutton scope in and perform better.

Can you test this api for me.

While testing the upload_and_extract API, it is easier to upload a PDF file and prompt Databutton to test with that file:

Can you test with <> PDF file in databutton's storage but remember in real scenario, user will upload from frontend

  4. Defining the file type in storage.

Hey you have a Binary file , make sure to use databutton's storage SDK to access the binary file
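
The first tip (store the file under a unique name, then fetch it later with the same name) can be illustrated with a short sketch. `InMemoryBinaryStorage` and `store_with_unique_name` are hypothetical names; the class is a stand-in whose `put` and `get` methods loosely mirror the `db.storage.binary.put` and `db.storage.binary.get` calls in the backend code on this page. It is not the Databutton SDK.

```python
import uuid
from typing import Dict


class InMemoryBinaryStorage:
    """Hypothetical stand-in for a binary key-value store."""

    def __init__(self) -> None:
        self._files: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._files[key] = value

    def get(self, key: str) -> bytes:
        if key not in self._files:
            raise FileNotFoundError(key)
        return self._files[key]


def store_with_unique_name(storage: InMemoryBinaryStorage, content: bytes) -> str:
    """Store content under a random UUID-based name and return that name."""
    filename = f"{uuid.uuid4()}.pdf"
    storage.put(filename, content)
    return filename


storage = InMemoryBinaryStorage()
# Two uploads of identical bytes still receive distinct names.
name_a = store_with_unique_name(storage, b"%PDF-1.7 example")
name_b = store_with_unique_name(storage, b"%PDF-1.7 example")
fetched = storage.get(name_a)
```

Returning the generated name to the caller is what lets the frontend pass the same name to the extraction step later.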

Backend code

API : upload_and_extract
from fastapi import APIRouter, UploadFile, File, HTTPException
import databutton as db
import fitz  # PyMuPDF
import uuid
from pydantic import BaseModel

# Router for endpoints
router = APIRouter()

class UploadResponse(BaseModel):
    filename: str

class ExtractResponse(BaseModel):
    text: str

@router.post("/upload", response_model=UploadResponse)
def upload_pdf(file: UploadFile = File(...)) -> UploadResponse:
    print("Received file upload request")
    if file.content_type != "application/pdf":
        print(f"Invalid file type: {file.content_type}")
        raise HTTPException(status_code=400, detail="Invalid file type. Only PDFs are allowed.")

    # Generate a unique filename
    filename = f"{uuid.uuid4()}.pdf"
    print(f"Generated filename: {filename}")

    # Store the file in Databutton's storage
    file_content = file.file.read()
    db.storage.binary.put(filename, file_content)
    print(f"Stored file in storage with filename: {filename}")

    return UploadResponse(filename=filename)

@router.get("/extract", response_model=ExtractResponse)
def extract_text(filename: str) -> ExtractResponse:
    print(f"Received text extraction request for filename: {filename}")
    # Fetch the file from storage
    try:
        file_content = db.storage.binary.get(filename)
        print(f"Fetched file from storage: {filename}")
    except FileNotFoundError:
        print(f"File not found: {filename}")
        raise HTTPException(status_code=404, detail="File not found.")

    # Extract text from the PDF
    pdf_document = fitz.open(stream=file_content, filetype="pdf")
    text = ""
    for page_num in range(pdf_document.page_count):
        page = pdf_document.load_page(page_num)
        text += page.get_text()
    print(f"Extracted text from file: {filename}")

    return ExtractResponse(text=text)

This API needs the Python package PyMuPDF to read PDF files. Databutton handles the installation.
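
The `content_type` checked in `upload_pdf` is supplied by the client, so an extra check on the bytes themselves may be worthwhile. The sketch below is an optional hardening step, not part of the code Databutton generated, and the helper name `looks_like_pdf` is hypothetical. It relies on the fact that PDF files begin with the marker `%PDF-`.

```python
# Every PDF file starts with this header marker, followed by the version.
PDF_MAGIC = b"%PDF-"


def looks_like_pdf(content: bytes) -> bool:
    """Return True if the content starts with the PDF header marker."""
    return content.startswith(PDF_MAGIC)


is_pdf = looks_like_pdf(b"%PDF-1.7\nrest of file")
is_html = looks_like_pdf(b"<html>not a pdf</html>")
```

In `upload_pdf`, one option would be to call this on `file_content` before storing it and raise the same 400 error when it returns `False`.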

API : llm_summarizer
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel
import databutton as db
from openai import OpenAI

# Router for endpoints
router = APIRouter()

# Pydantic model for request
class SummarizeRequest(BaseModel):
    text: str

# Pydantic model for response
class SummarizeResponse(BaseModel):
    summary: str

# Retrieve OpenAI API key from secrets
OPENAI_API_KEY = db.secrets.get("OPENAI_API_KEY")

# Initialize OpenAI client
client = OpenAI(api_key=OPENAI_API_KEY)

@router.post("/summarize", response_model=SummarizeResponse)
def summarize_text(request: SummarizeRequest) -> SummarizeResponse:
    try:
        # Use OpenAI LLM to summarize the text
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": "You are a summarization assistant."},
                {"role": "user", "content": f"Summarize the following text: {request.text}"}
            ]
        )
        summary = response.choices[0].message.content.strip()
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

    return SummarizeResponse(summary=summary)
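
The endpoint above sends the whole extracted text in a single request, which may fail for long PDFs that exceed the model's context window. One common workaround, sketched here under assumptions, is to split the text into chunks, summarize each chunk, and then summarize the combined partial summaries. The function names, the `max_chars` default, and the `summarize` callable are all hypothetical; in the app, `summarize` would wrap the OpenAI call.

```python
from typing import Callable, List


def split_into_chunks(text: str, max_chars: int) -> List[str]:
    """Split text on whitespace into chunks of roughly max_chars characters.

    A single word longer than max_chars becomes its own oversized chunk.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def summarize_long_text(
    text: str,
    summarize: Callable[[str], str],
    max_chars: int = 8000,  # assumed budget; tune for the model in use
) -> str:
    """Summarize each chunk, then summarize the joined partial summaries."""
    chunks = split_into_chunks(text, max_chars)
    if len(chunks) <= 1:
        return summarize(text)
    partial_summaries = [summarize(chunk) for chunk in chunks]
    return summarize("\n".join(partial_summaries))
```

This sketch performs only one combining pass, so very long documents may need the combining step repeated until the joined summaries fit within the budget.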
