How to Extract Text From Videos using Python

Speech recognition converts spoken audio into text. Applied to the audio track of a video, it lets us extract the spoken text from that video with Python. In this article, I will walk you through how to extract text from videos using Python.

SpeechRecognition is a Python library for performing speech recognition with support for Google’s API, while moviepy lets you cut, read, and write the most common audio and video formats. moviepy supports various video file formats, including .ogv, .mp4, .mpeg, .avi, and .mov.

Extract Text From Videos using Python

In this section, I will take you through how to extract text from a video using Python. The first step is to download a video. After downloading the video, you need to install two Python libraries:

  1. SpeechRecognition: pip install SpeechRecognition 
  2. moviepy: pip install moviepy

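After installing these two libraries, you can start coding. Here is the complete Python program to convert a video into text. It assumes the video is saved as videorl.mp4, splits it into one-minute chunks, extracts the audio of each chunk as a WAV file, and sends that audio to Google's speech recognition: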

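import os

import speech_recognition as sr
import moviepy.editor as mp  # moviepy 1.x import path
from moviepy.video.io.ffmpeg_tools import ffmpeg_extract_subclip

# The output folders must exist before ffmpeg and moviepy write into them.
os.makedirs("chunks", exist_ok=True)
os.makedirs("converted", exist_ok=True)

# Length of the source video in seconds (here, 52 minutes).
num_seconds_video = 52 * 60
print("The video is {} seconds".format(num_seconds_video))
# Chunk boundaries, one every 60 seconds.
l = list(range(0, num_seconds_video + 1, 60))

r = sr.Recognizer()
diz = {}
for i in range(len(l) - 1):
    # Every chunk except the first starts 2 seconds early, so words at a cut are not lost.
    start = l[i] - 2 * (l[i] != 0)
    ffmpeg_extract_subclip("videorl.mp4", start, l[i + 1], targetname="chunks/cut{}.mp4".format(i + 1))
    clip = mp.VideoFileClip("chunks/cut{}.mp4".format(i + 1))
    clip.audio.write_audiofile("converted/converted{}.wav".format(i + 1))
    clip.close()
    with sr.AudioFile("converted/converted{}.wav".format(i + 1)) as source:
        # Calibrate on the start of the audio, then record the rest.
        r.adjust_for_ambient_noise(source)
        audio_file = r.record(source)
    try:
        result = r.recognize_google(audio_file)
    except sr.UnknownValueError:
        # No intelligible speech in this chunk; keep an empty entry.
        result = ""
    diz['chunk{}'.format(i + 1)] = result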
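
The chunk boundaries in the program above are easy to get wrong, so here is a small sketch of the same idea as a standalone function. The name chunk_bounds and its parameters are my own, not part of moviepy or SpeechRecognition. Unlike the range-based list above, this version also keeps a final chunk shorter than one minute when the video length is not a multiple of 60 seconds.

```python
def chunk_bounds(total_seconds, chunk_seconds=60, overlap_seconds=2):
    """Return (start, end) pairs in seconds that cover the whole video.

    Hypothetical helper for illustration: every chunk except the first
    starts overlap_seconds early, mirroring the l[i]-2*(l[i]!=0) trick.
    """
    bounds = []
    start = 0
    while start < total_seconds:
        # The last chunk may be shorter than chunk_seconds.
        end = min(start + chunk_seconds, total_seconds)
        bounds.append((max(start - overlap_seconds, 0), end))
        start = end
    return bounds
```

Each pair could then be passed as the start and end times to ffmpeg_extract_subclip in place of the values computed from the list l.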

After running the code above, write all the text extracted from the video to a text file:

# Collect the chunk transcripts in order and join them, one chunk per line.
l_chunks = [diz['chunk{}'.format(i + 1)] for i in range(len(diz))]
text = '\n'.join(l_chunks)

with open('recognized.txt', mode='w') as file:
    file.write("Recognized Speech:")
    file.write("\n")
    file.write(text)
print("Finally ready!")
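
If you want to know where in the video each piece of text came from, one possible variant is to label every chunk with its start time before writing the file. This is only a sketch: format_transcript is a hypothetical helper, and it assumes the dictionary keys follow the chunk1, chunk2, ... pattern used above with one-minute chunks.

```python
def format_transcript(chunks, chunk_seconds=60):
    """Build the transcript text, labelling each chunk with its start time."""
    lines = ["Recognized Speech:"]
    prefix_len = len("chunk")
    # Sort by the number after "chunk" so that chunk10 comes after chunk2.
    for key in sorted(chunks, key=lambda k: int(k[prefix_len:])):
        index = int(key[prefix_len:])
        minutes, seconds = divmod((index - 1) * chunk_seconds, 60)
        lines.append("[{:02d}:{:02d}] {}".format(minutes, seconds, chunks[key]))
    return "\n".join(lines)
```

The returned string can be written to recognized.txt with a single file.write call, for example file.write(format_transcript(diz)).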

Amarendra Singh
Stock Trader, SEO, Music Producer
