Creating an Audio to Text Converter with Databutton and OpenAI Whisper

A simple step-by-step walkthrough on creating an audio file uploader using Databutton, storing the audio files, and converting them to text using the OpenAI Whisper model.

Last updated 11 months ago

  1. Create an Audio File Uploader

  2. Create APIs (Python Backends)

  3. Add the API to the UI Component

Create an Audio File Uploader

✍🏼 Prompt : Can you create an audio file uploader where I can upload .mp3 files

Databutton creates a UI component called AudioFileUploader.

The 'Upload' button does not do anything yet; it needs backend functionality.

Create APIs (Python backends)

Functionalities

  • Store the audio file from the frontend in the database.

  • Process and translate the audio file.

Storing the Audio file from the frontend to database

✍🏼 Prompt : You will have an audio file in your frontend. Store that audio file in Databutton's storage and return a unique key that can later be used to fetch it back from storage. Store the audio file in binary format using Databutton's SDK.

Note : We're currently using Databutton's default storage to store the audio files. However, other storage services like Firebase can also be used. It's recommended to use Firebase for storing audio files due to its scalability.

API : Store Uploaded Files ( Code )
from fastapi import APIRouter, UploadFile, File
from pydantic import BaseModel
import databutton as db
import uuid

# Router for endpoints
router = APIRouter()

class UploadAudioResponse(BaseModel):
    file_key: str

@router.post('/upload-audio', response_model=UploadAudioResponse)
def upload_audio(file: UploadFile = File(...)) -> UploadAudioResponse:
    # Generate a unique key for the file with .mp3 extension
    file_key = f"{uuid.uuid4()}.mp3"
    
    # Read the file content
    file_content = file.file.read()
    
    # Store the file in Databutton's storage in binary format
    db.storage.binary.put(file_key, file_content)
    
    # Return the unique key
    return UploadAudioResponse(file_key=file_key)
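
The endpoint above stores whatever it receives and always appends .mp3 to the key. As a minimal sketch, assuming you want to reject non-.mp3 uploads before they reach storage, the key generation could be moved into a small helper. The names `make_file_key` and `ALLOWED_EXTENSIONS` are hypothetical and not part of Databutton's SDK.

```python
import os
import uuid
from typing import Optional

# Hypothetical allow-list; the uploader in this tutorial only targets .mp3 files.
ALLOWED_EXTENSIONS = {".mp3"}


def make_file_key(original_filename: Optional[str]) -> str:
    """Return a unique storage key that keeps the upload's validated extension."""
    # UploadFile.filename may be None, so fall back to an empty string.
    _, ext = os.path.splitext(original_filename or "")
    ext = ext.lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {ext or 'no extension'}")
    return f"{uuid.uuid4()}{ext}"
```

Inside upload_audio, this could be called as make_file_key(file.filename), with a ValueError turned into an HTTP 400 response instead of storing the file.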

Process and Translate the Audio File

✍🏼 Prompt : I would like you to use the OpenAI Whisper model to perform transcription of an audio file. The audio file is stored in Databutton's storage. The file can be accessed via a unique key, which will be the input.

The output needs to be the transcription text produced by the OpenAI model.

Next, Databutton defines the Pydantic models (input/output parameters) and asks for the OpenAI API key.

Once the API key is provided, Databutton proceeds to generate a functional API endpoint.

Error Handling and Debugging Prompts:

Databutton might need additional guidance on how to read the stored file and pass it to OpenAI in a supported format. Here's a suggested prompt:

Prompt : I would like you to fetch the data from storage using the Databutton SDK. Then write it to a temporary file and use that temp path for the OpenAI call. Also remember that the OpenAI SDK requires the temporary file to be opened in binary mode.

Prompt breakdown and expected code generation

  • Retrieve the audio file from storage

audio_file = db.storage.binary.get(request.storage_key)
  • Save the audio file to a temporary file with a recognized format

with tempfile.NamedTemporaryFile(suffix=".mp3", delete=False) as temp_audio_file:
    temp_audio_file.write(audio_file)
    temp_audio_file_path = temp_audio_file.name
  • Open the temporary file in binary mode and pass the file object to the Whisper model

with open(temp_audio_file_path, "rb") as file:
    transcription = client.audio.transcriptions.create(model="whisper-1", file=file)
print(transcription)
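
The three steps above can be sketched as one helper that also removes the temporary file if the transcription call fails. This is a sketch under assumptions: `transcribe_bytes` is a hypothetical name, and the `transcribe` argument is a stand-in callable so the example runs without Databutton or an OpenAI key.

```python
import os
import tempfile
from typing import BinaryIO, Callable


def transcribe_bytes(
    audio_bytes: bytes,
    transcribe: Callable[[BinaryIO], str],
    suffix: str = ".mp3",
) -> str:
    """Write bytes to a temp file, hand the open file to `transcribe`, then clean up."""
    # delete=False lets us close the file and reopen it by path afterwards.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(audio_bytes)
        tmp_path = tmp.name
    try:
        # Reopen in binary mode so the callee receives a readable file object
        # whose name carries a recognized audio extension.
        with open(tmp_path, "rb") as f:
            return transcribe(f)
    finally:
        # Runs on success and on failure, so no temp files are left behind.
        os.remove(tmp_path)
```

In the real endpoint, the callable would wrap the call shown above, for example lambda f: client.audio.transcriptions.create(model="whisper-1", file=f).text, and audio_bytes would come from db.storage.binary.get.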
API : Process and Translate Audio Files ( Code )
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel
import databutton as db
from openai import OpenAI
import os
import tempfile

# Router for endpoints
router = APIRouter()

# Initialize OpenAI client
client = OpenAI(api_key=db.secrets.get("OPENAI_API_KEY"))


class TranscriptionRequest(BaseModel):
    storage_key: str


class TranscriptionResponse(BaseModel):
    transcription_text: str


@router.post("/transcribe-audio", response_model=TranscriptionResponse)
def transcribe_audio(request: TranscriptionRequest) -> TranscriptionResponse:
    try:
        # Retrieve the audio file from storage
        audio_file = db.storage.binary.get(request.storage_key)

        # Save the audio file to a temporary file with a recognized format
        with tempfile.NamedTemporaryFile(suffix=".mp3", delete=False) as temp_audio_file:
            temp_audio_file.write(audio_file)
            temp_audio_file_path = temp_audio_file.name

        # Print the temporary audio file path for debugging
        print(f"Temporary audio file path: {temp_audio_file_path}")

        try:
            # Open the temporary file in binary mode and pass the file object to the Whisper model
            with open(temp_audio_file_path, "rb") as file:
                transcription = client.audio.transcriptions.create(model="whisper-1", file=file)
            print(transcription)
        finally:
            # Clean up the temporary file even if the transcription call fails
            os.remove(temp_audio_file_path)

        # Extract the transcription text
        transcription_text = transcription.text
        print(f"Transcription: {transcription_text}")

        # Return the transcription text
        return TranscriptionResponse(transcription_text=transcription_text)

    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e)) from e

Add the API to the UI component

  • Integrating the "Store Uploaded Files" API

✍🏼 Prompt :
Can you integrate the #store_audio_file API?
This API gets triggered when the Upload button is pressed.
  • Adding a new button and integrating the process and translate API

✍🏼 Prompt :
Can you also add a button called Translate?

This Translate button will trigger #process_audio_file API
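
One detail to watch when the two APIs are chained: the upload endpoint returns the key in a field named file_key, while the transcription endpoint expects it in a field named storage_key. A minimal sketch of that mapping, written in Python for consistency with the backend code (the generated frontend would do the equivalent in its own language); `build_transcribe_request` is a hypothetical helper name.

```python
import json


def build_transcribe_request(upload_response_body: str) -> dict:
    """Turn the JSON body returned by /upload-audio into the body for /transcribe-audio."""
    payload = json.loads(upload_response_body)
    # UploadAudioResponse uses `file_key`; TranscriptionRequest uses `storage_key`.
    return {"storage_key": payload["file_key"]}
```

If the Translate button fails with a validation error, checking that the frontend sends storage_key rather than file_key is a reasonable first step.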

Import the AudioFileUploader UI Component to Home Page of the App

✍🏼 Prompt : Import the #AudioFileUploader component here in this main page

The main home page of the app can then be polished further. Here's what the main UI code looks like.

Final UI ( Code )
import { Box, Container, Heading, VStack } from "@chakra-ui/react";

import { AudioFileUploader } from "../components/AudioFileUploader";

export default function App() {
  return (
    <VStack
      spacing={4}
      justify="center"
      align="center"
      height="100vh"
      bg="black"
    >
      <Heading as="h1" size="2xl" my={2} textAlign="center" color="#FFFFFF">
        Audio Converter
      </Heading>
      <Container maxW="container.md" centerContent overflowY="auto">
        <Box
          p={8}
          borderRadius="lg"
          boxShadow="lg"
          bgImage="url('https://images.unsplash.com/reserve/LJIZlzHgQ7WPSh5KVTCB_Typewriter.jpg?q=80&w=3096&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D')"
          bgSize="cover"
          bgPosition="center"
          bgColor="rgba(255, 255, 255, 0.8)"
          bgBlendMode="lighten"
        >
          <Heading
            as="h3"
            size="md"
            my={4}
            textAlign="center"
            color="black"
            fontWeight="normal"
          ></Heading>
          <AudioFileUploader />
        </Box>
      </Container>
    </VStack>
  );
}

[Figure: Databutton creates a simple UI component to upload audio files.]
[Figure: Databutton providing the next steps.]