In this blog post, we’ll dive deep into some code that illustrates the convergence of Streamlit, an open-source app framework for Machine Learning and Data Science projects, and Generative AI, powered by OpenAI’s models. This code snippet demonstrates how to create a Streamlit application that processes YouTube videos by downloading them, fetching their transcripts, and summarizing the content using OpenAI’s GPT models. I initially wrote this code to simplify a task I was doing in a regular basis, and to give myself an idea of whether a video would be worth watching or not. Since then, creating the app has been a learning experience helping me to be more acquainted not only with Python, but with using Streamlit to create simple web apps and interacting with OpenAI and GPT. Let’s break down the code, piece by piece, to understand its components and the power of Generative AI it showcases.
Import Statements
In my journey to build this application, I start by gathering my toolkit—think of it as grabbing your ingredients before whipping up a gourmet meal. The import statements are our recipe’s foundation, bringing together various Python libraries each with its unique flavor. First, we have os
, our Swiss Army knife for navigating the filesystem, necessary for saving files where we need them. Then, there’s streamlit
, the library transforms our script into a sleek, interactive web app with minimal effort. Next,openai
connects us to OpenAI’s Large-language models, allowing our app to understand and summarize video content. With dotenv
, we keep our secret ingredients safe, ensuring our API keys stay private. The pytube
library is what we use for downloading YouTube videos, and finally, youtube_transcript_api
and its companion, TextFormatter
, help us fetch and clean video transcripts. Together, these imports set the stage for our app’s functionality in a neat, professional yet approachable manner.
import os
import streamlit as st
import openai
from dotenv import load_dotenv
from pytube import YouTube
from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api.formatters import TextFormatter
This section imports necessary libraries:
os
for interacting with the operating system,streamlit
for building the web app interface,openai
to access OpenAI’s API,dotenv
for managing environment variables,pytube
to download videos from YouTube,youtube_transcript_api
and itsTextFormatter
for fetching and formatting video transcripts.
Environment Variables and OpenAI API Key
In this part of our code, we’re bringing dotenv
into play, where we’re essentially whispering our secrets into a lockbox—our .env
file. This is where we stash away our OpenAI API key. We need this key to communicate with OpenAI, but we hide it so others can’t use it. Just like a skilled magician doesn’t reveal their tricks, we use dotenv
to keep our API key under wraps, away from prying eyes. Then, with os.getenv('OPENAI_API_KEY')
, we fetch this secret key and entrust it to the variable openai.api_key
, enabling our application to communicate with OpenAI’s servers. It’s pretty straight forward if you’ve done something like this before.
load_dotenv()
openai.api_key = os.getenv('OPENAI_API_KEY')
Really, to sum it up in simple terms, these lines load environment variables from a .env
file and set the OpenAI API key, enabling secure API calls to OpenAI’s services without hardcoding sensitive information.
Streamlit Sidebar Setup
In this section, we’re setting up the sidebar for our application using Streamlit, a tool that allows developers to quickly turn data scripts into shareable web apps. Streamlit is all about simplicity and efficiency, making it easier for us to create interactive elements without getting bogged down in web development details. I’m not a web developer, so this is perfect for me.
The sidebar we’re building acts as a navigational panel for users, where I’m simply linking to my blog, GitHub, and social profiles. Really, I just wanted a little more on the page than a text entry box and a button.
To accomplish this we use commands like st.sidebar.title
for the heading and st.sidebar.markdown
for the links. It’s an easy way to organize links and provide easy access to other resources, enhancing the user experience. Through this setup, we’re leveraging Streamlit’s capability to create interactive and aesthetically pleasing UI components with minimal code, making our app a bit more engaging and professionally structured.
st.sidebar.title('My Links')
st.sidebar.markdown('[Blog](https://brandonjcarroll.com)')
...
Directories Creation for Videos and Transcripts
Next, our code ensures that directories for storing downloaded videos and their transcripts exist, preventing errors during the download process.
os.makedirs('videos', exist_ok=True)
os.makedirs('transcripts', exist_ok=True)
Now we move on to the functions that are doing most of the work.
The Functions that do the Work
The first function you see in the code is the check_and_download_transcript function.
check_and_download_transcript(video_id)
# Function to check and download the transcript
def check_and_download_transcript(video_id):
try:
# Fetch the transcript
transcript = YouTubeTranscriptApi.get_transcript(video_id)
# Format the transcript into plain text
formatter = TextFormatter()
transcript_text = formatter.format_transcript(transcript)
# Save the transcript to a file
with open(f'transcripts/{video_id}.txt', 'w', encoding='utf-8') as file:
file.write(transcript_text)
st.success("Transcript downloaded successfully.")
return transcript_text
except Exception as e:
st.warning("No transcript available or an error occurred.")
return None
As you can see in the code snippet above, this function, takes one argument: video_id
, which is the unique identifier for a YouTube video. Its main job is to fetch, format, and save the video’s transcript to a file, providing feedback along the way.
Here’s a step-by-step breakdown of the logic:
- The Try-Except Block: The function is wrapped in a try-except block to handle any potential errors gracefully. If anything goes wrong during the process, instead of crashing, it will notify the user that either no transcript is available or some other error occurred.
- Fetching the Transcript: It uses the
YouTubeTranscriptApi.get_transcript
method to retrieve the transcript for the givenvideo_id
. This step assumes the video has a transcript available, which isn’t always the case. Interestingly enough It seemed to be about 50/50 that the videos had a transcript. - Formatting the Transcript: Once the transcript is fetched, it’s format is not necessarily clean and readable. That’s where
TextFormatter
comes in. I used it to format the raw transcript data into plain text, making it more understandable and easier to work with. - Saving the Transcript: The formatted transcript text is then saved to a file within a ‘transcripts’ directory. The filename is constructed using the video ID for easy identification later on. This is done using a context manager (
with open(...) as file:
) to ensure the file is properly opened and closed after writing, minimizing the chance of file corruption or other I/O errors. There’s probably a better way to do it, but this is what I knew how to do. - Feedback to User: If everything goes smoothly, the function informs the user via Streamlit’s
st.success
method that the transcript was downloaded successfully. This feedback is important for a good user experience, letting them know the process worked as expected. - Return Value: Finally, the function returns the formatted transcript text. In case of an error, it returns
None
. This return value can be useful if you want to further process the transcript text within your application.
In essence, this function encapsulates the whole process of dealing with YouTube video transcripts in a user-friendly way, abstracting away the complexities and potential pitfalls of API calls and file handling.
download_video(url)
The next function you see in the code is the, download_video
function.
# Function to download the video
def download_video(url):
try:
yt = YouTube(url)
video = yt.streams.filter(file_extension='mp4', progressive=True).order_by('resolution').desc().first()
if video:
st.info(f"Downloading video: {yt.title}")
safe_filename = ''.join(char for char in yt.title if char.isalnum() or char in " -_").rstrip()
video.download(output_path='videos', filename=f"{safe_filename}.mp4")
st.success("Download complete!")
download_path = os.path.join('videos', f"{safe_filename}.mp4")
return download_path, yt.title # Return the download path and video title
else:
st.error("No downloadable video found.")
return None, None
except Exception as e:
st.error(f"An error occurred while downloading the video: {e}")
return None, None
This function downloads a YouTube video. Why? Well, the idea was that if I couldn’t get a transcript, I could download the video and then just transcribe it myself. That’s why it’s structured to handle the process from extracting the video streams to saving the video file, while also providing user feedback via Streamlit’s interface.
Here’s the details of how it operates:
- Try-Except Block: Again I want to gracefully manage any errors that may arise during the video download process. If an error occurs, it notifies the user instead of letting the application crash.
- Extracting Video Information: I start by creating a
YouTube
object (from thepytube
library) with the provided URL. This object allows access to various details and streams associated with the YouTube video. - Selecting the Video Stream: I then filter the available video streams to only those with an ‘mp4’ file extension and are progressive (meaning the video and audio are combined in a single file). Among these, it chooses the one with the highest resolution by sorting the streams in descending order and picking the first one.
- Downloading the Video: If a suitable video stream is found, the function proceeds to download it. It first generates a “safe” filename by removing any characters from the video title that aren’t alphanumeric or are not one of ” -_”. This is to prevent issues with file systems that may not support certain characters in file names. The video is then downloaded to a specified ‘videos’ directory, and the user is informed about the download start and completion via Streamlit’s
st.info
andst.success
methods, respectively. - Feedback and Return Values: The function also provides immediate feedback to the user:
- If the video is successfully downloaded, it displays a success message and returns the path to the downloaded video file along with the video’s title.
- If no downloadable video stream is found, it displays an error message and returns
None
for both the download path and video title. - In case of any other exceptions during the download process, it shows an error with the exception message and also returns
None
for both values.
- Error Handling: The error handling ensures that any issues encountered during the download process are communicated back to the user, maintaining transparency about the operation’s success or failure.
summarize_transcript(transcript)
Next, the summarize_transcript
function is designed to leverage the capabilities of OpenAI’s GPT models to summarize a given text, in this case, a transcript from a YouTube video. This process involves a few key steps and employs error handling similar to the previous functions (check_and_download_transcript
and download_video
).
Here’s how it works:
- Calling OpenAI’s API: At its core, the function interacts with the OpenAI API by sending a request to the
openai.chat.completions.create
endpoint. This request includes the transcript text along with a specific prompt that guides the AI on how to construct the summary. The choice of the “gpt-3.5-turbo” model is noteworthy for its efficiency and effectiveness in generating human-like text based on the provided instructions. - Formatting the Prompt and Handling the Response: The prompt is carefully crafted to instruct the AI to provide a summary that’s concise and encourages the reader to check out the video, aiming for a balance between informativeness and engagement. Once the response is received, the function extracts the summary from the returned data structure, ensuring it’s neatly trimmed of any excess whitespace.
- Error Handling: Similar to the other functions,
summarize_transcript
includes a try-except block to gracefully handle any exceptions that might occur during the API call. This is is important here because you could have network issues or API limits could cause unexpected errors. If an error occurs, the function uses Streamlit’sst.error
method to display an appropriate message, informing the user of the issue without causing the entire application to crash. - Return Value: If successful, the function returns the generated summary, which can then be displayed to the user or used in further processing within the app. In case of an error, it returns
None
, allowing the calling code to handle the absence of a summary appropriately.
OK, with that all said, lets talk about the UI and the logic for processing.
Streamlit UI and Processing Logic
This section sets up the Streamlit user interface:
- Displays a title which I think is pretty self explanatory in the code.
- Provides an input field for the YouTube video URL which is also self explanatory in the code.
- Includes a button that, when clicked, processes the video by first attempting to download its transcript and then, if available, summarizing it. If the transcript isn’t available, it attempts to download the video instead.
# Streamlit UI setup
st.title("YouTube Video and Transcript Processor")
# Input field for YouTube URL
video_url = st.text_input("Enter the YouTube video URL")
# Button to process the video
if st.button("Process Video"):
if video_url:
try:
video_id = YouTube(video_url).video_id
yt = YouTube(video_url) # Get YouTube object to access the title
transcript_text = check_and_download_transcript(video_id)
if transcript_text:
st.text_area("Transcript", transcript_text, height=300)
summary = summarize_transcript(transcript_text)
if summary:
# Display the video title, URL, and summary in formatted markdown
formatted_text = f"- [{yt.title}]({video_url}) - {summary}"
st.markdown(formatted_text, unsafe_allow_html=True)
# Provide the same text in a text area for easy manual copying
copy_text = f"- {yt.title} ({video_url}) - {summary}"
st.text_area("Copy the text below:", copy_text, height=100)
else:
st.warning("No transcript is available for this video. Starting download...")
download_path, video_title = download_video(video_url) # Get video title from download function
if download_path:
st.info(f"Video has been downloaded to: {download_path}")
# Optionally, you might want to display video info here as well
else:
st.error("Failed to download the video.")
except Exception as e:
st.error(f"An error occurred: {e}")
else:
st.warning("Please enter a YouTube URL.")
This block of code effectively ties together the app’s functionalities, from interacting with YouTube for video and transcript data to leveraging AI for summarization, and finally, presenting the results in an interactive and user-friendly manner.
Overall I think the app demonstrates a practical application of Generative AI and just a bit of possibility.
Working with Streamlit has been a ton of fun. I hope this example helps give you an idea of what you can do with a bit pf python, streamlit, and an LLM.
Got ideas or want to share some feedback? Please do, I welcome the words!