Rough Video Subtitle Recognition with Python

I. Preparing the packages

1. Install tesseract-ocr

2. Add the language files

tesseract-ocr

For installation instructions, see: Tesseract OCR V5.0 Installation Tutorial (Windows) - Jianshu

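3. The pytesseract recognition library

Install it by running pip install pytesseract in cmd.

The pytesseract package depends on the Tesseract executable, so Tesseract must be installed first. Tesseract reads clean printed text well (including Chinese, once the chi_sim language file has been added), but heavily distorted images such as complex CAPTCHAs cannot be read reliably with pytesseract.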
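Since everything later depends on the Tesseract executable and the chi_sim language file, it can help to check both before running the recognition script. The following is a minimal sketch using only the standard library; the helper name tesseract_languages is hypothetical, and it assumes the executable is reachable on PATH under the name tesseract.

```python
import shutil
import subprocess


def tesseract_languages(executable="tesseract"):
    """Return the language codes reported by Tesseract, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    # --list-langs prints a header line followed by one language code per line.
    # Depending on the version the listing goes to stdout or stderr, so read both.
    result = subprocess.run([path, "--list-langs"], capture_output=True, text=True)
    lines = (result.stdout + result.stderr).splitlines()
    return [ln.strip() for ln in lines if ln.strip() and not ln.startswith("List of")]


if __name__ == "__main__":
    langs = tesseract_languages()
    if langs is None:
        print("Tesseract was not found on PATH")
    elif "chi_sim" not in langs:
        print("Tesseract found, but chi_sim is missing:", langs)
    else:
        print("Tesseract and chi_sim are available")
```

If the executable is installed but not on PATH, pytesseract can be pointed at it by setting pytesseract.pytesseract.tesseract_cmd to the full path.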

II. Subtitle recognition steps

1. Import the required libraries

import pytesseract
import cv2
import numpy as np
from scipy import stats
import os
import matplotlib.pyplot as plt

2. Split the video into frames

if __name__ == '__main__':
    path="zbslb.mp4"
    cap=cv2.VideoCapture(path)
    frame_count=int(cap.get(cv2.CAP_PROP_FRAME_COUNT))  # total number of frames in the video
    print(frame_count)

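3. Define where recognition starts

(continued)
    i=100  # start recognizing at frame 100
    while i < frame_count:
        cap.set(cv2.CAP_PROP_POS_FRAMES, i)  # seek to frame i (assumed; the original line was truncated here)
        _, frame = cap.read()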
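The loop starts at frame 100 and, further down, advances i by 24*2 frames per pass, which presumably means one sample every two seconds at an assumed 24 fps. As a rough sketch of which frames that visits and where they fall in the video, the hypothetical helpers sampled_frames and frame_to_timestamp below do the arithmetic in plain Python.

```python
def sampled_frames(frame_count, start=100, fps=24, seconds=2):
    """Frame indices visited when starting at `start` and stepping fps * seconds frames."""
    step = int(fps * seconds)
    return list(range(start, frame_count, step))


def frame_to_timestamp(index, fps=24):
    """Convert a frame index to HH:MM:SS.mmm, assuming a constant frame rate."""
    total_ms = round(index * 1000 / fps)
    seconds, ms = divmod(total_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"


if __name__ == "__main__":
    for idx in sampled_frames(300):
        print(idx, frame_to_timestamp(idx))
```

The 24 fps figure is an assumption taken from the step size; the actual rate of a given file can be read with cap.get(cv2.CAP_PROP_FPS).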
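4. Locate and crop the subtitle region
(continued)
        shape = frame.shape  # (height, width, channels)
        img=frame[ 680:760,0:540]  # rows 680-759, columns 0-539
        plt.imshow(img)  # preview the cropped region
        plt.axis("off")
        plt.show()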
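[680:760,0:540] is the pixel range of the subtitle region: rows 680 to 760 (vertical) and columns 0 to 540 (horizontal). How do you find this position precisely? I took a screenshot of the video and read off the pixel coordinates in Paint, and it worked fairly well.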
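One thing to watch with this kind of slice is that NumPy indexes rows (y) first and columns (x) second, and silently truncates a slice that runs past the edge of the frame. The sketch below wraps the crop in a hypothetical helper, crop_region, that raises instead; the 540x960 frame size is an assumption for illustration only.

```python
import numpy as np


def crop_region(frame, top=680, bottom=760, left=0, right=540):
    """Crop frame[top:bottom, left:right], raising if the box does not fit the frame."""
    height, width = frame.shape[:2]
    if not (0 <= top < bottom <= height and 0 <= left < right <= width):
        raise ValueError(
            f"crop box rows {top}:{bottom}, cols {left}:{right} "
            f"does not fit a {width}x{height} frame"
        )
    return frame[top:bottom, left:right]


if __name__ == "__main__":
    # Stand-in for a decoded video frame: 960 rows, 540 columns, 3 colour channels.
    frame = np.zeros((960, 540, 3), dtype=np.uint8)
    print(crop_region(frame).shape)
```

For a video with a different resolution, the coordinates have to be measured again from a screenshot of that video.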
5. Preprocess the cropped subtitle image (grayscale conversion, binarization, etc.) and run OCR
(continued)
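        img=cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # convert to grayscale
        _,img=cv2.threshold(img, 220, 255, cv2.THRESH_BINARY)  # keep only very bright pixels
        # Alternative config that also sets the tessdata directory (raw string, so the backslashes stay literal)
        tessdata_dir_config = r'--tessdata-dir "C:\Program Files\Tesseract-OCR\tessdata"  --psm 7 -c preserve_interword_spaces=1'
        # --psm 7 treats the image as a single line of text
        word = pytesseract.image_to_string(img,
                                           lang='chi_sim',
                                           config=' --psm 7 -c preserve_interword_spaces=1')
                                           #config=tessdata_dir_config)
        print(word)
        i=i+24*2  # skip ahead 48 frames before the next recognition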

        if cv2.waitKey(10) & 0xff == ord("q"):
            break
                
    cap.release()
    cv2.destroyAllWindows()
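To make the binarization step in the loop concrete: with cv2.THRESH_BINARY, every pixel strictly greater than the threshold becomes the maximum value and everything else becomes 0. The NumPy-only sketch below reproduces that rule on a tiny array; binarize is a hypothetical helper written for illustration, not an OpenCV function.

```python
import numpy as np


def binarize(gray, thresh=220, maxval=255):
    """Pixels strictly above `thresh` become `maxval`; all others become 0."""
    return np.where(gray > thresh, maxval, 0).astype(np.uint8)


if __name__ == "__main__":
    # One row of grayscale values around the threshold of 220.
    gray = np.array([[0, 220, 221, 255]], dtype=np.uint8)
    print(binarize(gray))
```

A threshold of 220 suits bright subtitles on a darker background; for other videos the value probably needs tuning. If the binarized image comes out speckled, applying a median filter such as cv2.medianBlur(img, 3) before thresholding is one option worth trying.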
 
6. Result:

