How to build a PDF summariser

Creating a PDF uploader and using LLM to summarise the content.

Key Components:

  • FileUpload Component: An UI component that allows users to select and upload PDF files..

  • APIs: We need two backends here

    • Upload Endpoint: Accepts PDF file uploads, stores them in Databutton's storage, and returns the filename.

    • Extract Endpoint: Fetches the stored PDF file, extracts its content as text, and returns the extracted text.

    • llm_summarizer: Accepts the extracted text, uses OpenAI LLM to summarize it, and returns the summary.

App's Workflow:

  1. User uploads a PDF file via the FileUpload component.

  2. The file is sent to the upload_and_extract API, which stores the file and extracts its text.

  3. The extracted text is displayed on the Home page.

  4. User can then click a button to summarize the extracted text using the llm_summarizer API.

  5. The summary is displayed on the Home page

Prompting guides to build the app

Databutton is usually good in building in the skeleton of such apps. However, it is a good practice to pass a clear strcuture of the app ( UI + Backends ) in form of prompting or via some simple sketches.

In this use case, mentioning Databutton :

"Hey I would like to build a PDF summariser App."

Databutton came up with a clear outline of the app;

Example prompt used : Hey I would like to build a PDF summariser App.

Building the UI

Databutton created the File Upload component sucessfully.

Here's the code -

Code for the File Upload UI Component

Note : This was the final UI component code that Databutton generated after integrating the backends / APIs ( brain modules). For integrating it is crucial to use the hastag (#) with a prompt refering to the API / backend to integrate.

Building the Backends

For building the backend, it's important to provide a clear and consideration prompt . Instead of having a single API as Databutton suggested, we took the route of breaking it down into two APIs .

Here's how it can be prompted,

Databutton, reinitialized it's plan and went on building the backends.

Debugging Prompts and Tips

  1. Define a clear role on how / where to store the file and fetch from

  1. In case general errors while testing the UI component / APIs, you can prompt to check the logs. ( These errors usually pop over the consloe at the bottom. Often the backend symbol shows "green" / "red" status to indicate error or sucess. Hover the specific backend to know more. )

  1. Testing the API / Backend

It is recommended to click into specific API and prompt, this helps Databutton to scope in and perform better.

While tesing the upload_and_extract API , it would be easier to upload a PDF file and prompt Databutton to test the given PDF file,

  1. Defining File type in Storage.

Backend code

API : upload_and_extract

This API needs Python package : PyMuPDF to read the PDF files. Databutton handles the installation.

API : llm_summarizer

Last updated

Was this helpful?