DIY: Summarise Your Speech

I like discussing ideas out loud. I record my thoughts using Voice Memos on my iPhone. I talk for hours but never listen to my records, because it gets boring.

We will make an app, that transforms audio into text. And then, we will make a structured summary. And even a Business Model Canvas of an idea.

Plan

  1. Install Openai Library
  2. Audio to Text
  3. Audio to Summary
  4. Audio to Business Model Canvas
  5. Draw a Business Model Canvas Automatically

Requirements

You need Python and Openai account.

if you don’t have Python or don’t know how to work with libraries on Python, take a look at my previous tutorial about object detection in images, where we installed it on Mac and Windows.

If you have some errors or don’t understand something, ask chatGPT.

Install OpenAI Library

Open “Terminal” on Mac or in Visual Studio Code. type the following:

pip install openai

The library is used for 3 purposes: ask ChatGPT, translate speech to text, and generate an image. We use ChatGPT and Speech-to-Text functions.

Get the API key

Openai is not free to use. We should pay 0.006$ per each minute of audio. Therefore, we need to register on the Openai website and get the API key, that we insert later in the code. Openai also gives 5$ to the new users.

get the API key after registration.

Record your voice

Use the “Voice Memos” app on Mac or iPhone or any other recording app. Most of the audio file formats are supported by Whisper AI from OpenAI.

Speak for up to 10 minutes in English. Next, move the recording file to the same directory where you will store the Python file. You may also copy the path to the audio file instead of moving it.

Audio to Text

import openai

# insert your API key
openai.api_key = "sk-tIBD5UTbJ9RdqrwjP2dNT3BlfkFJzex4WGqvNsi3uAdMdC1V"

# Define the path to the file. if the file is in the same folder as the Python code, just write the name of a file
path = "audio.m4a"
audio_file= open(path, "rb")

# Select the AI to use (whisper-1 is the newest in April 2023)
transcript = openai.Audio.transcribe("whisper-1", audio_file)

print(transcript.text)

We have translated our speech into text.

Audio to Summary

We use ChatGPT-3 to make a summary. You may use gpt-3.5 that costs 10 times less or gpt-4 for a more precise summary of more than 4000 words.

The maximum amount of words is 4096. The token is a way how the AI stores the data. for English, each token represents nearly one word (700 words are nearly 1000 tokens). For Russian, German, and French – each token is nearly one letter (700 words are nearly 5000 tokens). Translate the text to English and only then use chatGPT. You will save 5x dollars.

In Prompt we add the translated text and the task that we want an AI to do.

# save the translated text as text
text = transcript.text

prompt = (
        f"Make the summary of the text: \n\n{text}\n\n"
    )

response = openai.Completion.create(
    model="text-davinci-003", 
    prompt=prompt,
    max_tokens=3900 # 4096 minus the constant prompt message
    n=1,
    stop=None,
    temperature=0.5,
)

print(response.choices[0].text.strip())

combine this code with the previous one and run the script. That’s it.

Audio to Business Model Canvas

If I talk about founding a business, why not combine the text in a business model canvas.

prompt = (
    f"Create a business model canvas based on the idea explained below:\n\n{text}\n\n"
    "Use the following structure of 9 blocks:\n"
    "Customer Segments:\n"
    "Value Propositions:\n"
    "Channels:\n"
    "Customer Relationships:\n"
    "Revenue Streams:\n"
    "Key Resources:\n"
    "Key Activities:\n"
    "Key Partners:\n"
    "Cost Structure:"
)

You get the structured answer, and you can insert the data in your business model canvas template.

Draw business model canvas

Let’s make a real business model canvas out of the idea described in the audio. We need to do a simple website with CSS, HTML, JSON, and Python. I prepared it for you:
Download the folder

Install Flask library

pip install flask

Run text_to_bmc.py, audio_to_bmc.py, or bmc_web.py (if you don’t have openai account) within the given folder. Don’t forget to write Openai API key and the audio or text:

Open in the browser: http://127.0.0.1:5000. Or any other page that is written in your terminal

you may also run audio_to_bmc.py or text_to_bmc.py, where the full code is written, just edit the API key and that’s it.

That’s it. You don’t need any longer to think about how to structure the business model canvas.

Challenge

Create the House of Quality model, that tells the difference between you and competitors.

Create the business plan template, which is created automatically based on your recording – add the SWOT, PESTEL analysis, intro about the product, executive summary, team structure, and maybe competitor and market analysis, using the knowledge from my “DIY” articles.

Share the tutorial with your friends and colleagues. If you want to see how to record one hour of a speech or team meeting, and how to record in any language and cheaply, then leave a comment. Meanwhile, read other DIY articles, you can find them, by simply typing “DIY” in the search on the top right. Also, check out the articles about finances and urban planning.

If you want to use a similar app casually, subscribe to my Telegram bot. @ @speech_into_text_bot can listen for up to 10 minutes of speech and transform it into text and summarise everything you said. It costs just 20 cents per 10 minutes of speech. The money is spent on Openai libraries and server maintenance.

Share
Send
Pin
2023   AI   DIY   Programming
Popular