How I use OCR + AI + inpainting + automation to build an automatic hard-subtitle translation product
niumo
Hi, everyone. I'm niumo, and I'm developing a fully automated AI product that translates hard subtitles in videos into another language. The product is being released to the product community today, and I hope you like it.
https://www.producthunt.com/products/ghostcut
Here I will introduce the background and technical principles of this product. The concept I most want to share with everyone comes later in the post.
# Background of Hard Subtitle Translation
As we all know, hard subtitles are burned into the video: the text is encoded as part of the picture itself. Most video translation software extracts the audio, transcribes it with ASR, and then translates the transcript. But many videos have only background music and subtitles, so how can those be translated? This is a real gap in the market: there is currently no good software for extracting and editing hard subtitles.
# Difficulties in hard subtitle extraction and translation
Because hard subtitles are part of the picture, OCR is usually used to read them. But annotating text in a video the way we annotate text in an image gets complicated. An image has no time dimension, so drawing a box is enough. Video is continuous in both time and space, so annotation has to handle several situations:
- The text is not continuously displayed
- The text position is variable
- The text may even be moving
- The text has its own style
Hard subtitle extraction therefore needs more automation, higher recognition accuracy, and text grouping, since text in the same group shares a consistent style.
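To make the grouping idea concrete, here is a minimal sketch of linking per-frame OCR hits into subtitle "tracks". It is not GhostCut's actual implementation: the `Detection` and `Track` structures, the `max_gap` and `min_iou` thresholds, and the exact-text match are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Detection:
    """One OCR hit in one frame (hypothetical structure)."""
    frame: int
    box: tuple  # (x1, y1, x2, y2) in pixels
    text: str


@dataclass
class Track:
    """Detections believed to be the same subtitle line over time."""
    detections: list = field(default_factory=list)

    @property
    def last(self):
        return self.detections[-1]


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


def group_detections(detections, max_gap=2, min_iou=0.5):
    """Greedily link detections into tracks by text, position, and time."""
    tracks = []
    for det in sorted(detections, key=lambda d: d.frame):
        for track in tracks:
            prev = track.last
            if (
                0 < det.frame - prev.frame <= max_gap  # tolerates short OCR dropouts
                and det.text == prev.text              # same content (assumed exact)
                and iou(det.box, prev.box) >= min_iou  # roughly the same place
            ):
                track.detections.append(det)
                break
        else:
            # No existing track matched, so this starts a new subtitle.
            tracks.append(Track([det]))
    return tracks
```

Comparing each box only with the previous box in its track lets slowly moving text stay in one track. Real OCR output is noisy, so a production version would likely need fuzzy text matching instead of `==`.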
After translation, layout and alignment also need attention, because the translated text may be longer or shorter than the original.
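One simple way to handle the length difference is to wrap the translated text and shrink the font until it fits the original subtitle's box. The sketch below assumes rough glyph metrics (`char_ratio`, `line_ratio`) instead of measuring a real font, and the function names are hypothetical.

```python
import textwrap


def wrap_at(text, box_w, size, char_ratio):
    """Wrap text to the number of characters that fit in box_w at this size."""
    chars_per_line = max(1, int(box_w / (size * char_ratio)))
    return textwrap.wrap(text, width=chars_per_line)


def fit_text(text, box_w, box_h, start_size, min_size=10,
             char_ratio=0.55, line_ratio=1.2):
    """Return (font_size, lines) so the wrapped text fits inside the box.

    char_ratio and line_ratio are rough assumptions: average glyph width and
    line height as multiples of the font size. A real renderer would measure
    the actual font instead.
    """
    for size in range(start_size, min_size - 1, -1):
        lines = wrap_at(text, box_w, size, char_ratio)
        if len(lines) * size * line_ratio <= box_h:
            return size, lines
    # Nothing fit: fall back to the smallest size and accept the overflow.
    return min_size, wrap_at(text, box_w, min_size, char_ratio)
```

Starting from the original font size and stepping down keeps the translated subtitle as close as possible to the source style, and only shrinks it when the translation is too long.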
Inpainting has to be combined with hard subtitle extraction. A common requirement is to put the new text in exactly the same location as the original subtitle, which means the original text must first be inpainted out of the video. So both the speed and the visual quality of inpainting need to be considered.
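As an illustration of how extraction can drive inpainting, the sketch below turns one subtitle's OCR boxes into a per-frame plan of regions to erase. It covers only the mask planning, not the inpainting model itself, and `pad_box`, `mask_plan`, and the `pad` default are hypothetical names and values.

```python
def pad_box(box, pad, frame_w, frame_h):
    """Grow an (x1, y1, x2, y2) box by pad pixels, clamped to the frame."""
    x1, y1, x2, y2 = box
    return (max(0, x1 - pad), max(0, y1 - pad),
            min(frame_w, x2 + pad), min(frame_h, y2 + pad))


def mask_plan(boxes_by_frame, frame_w, frame_h, pad=4):
    """Map every frame in a subtitle's time span to the region to erase.

    boxes_by_frame: {frame_index: (x1, y1, x2, y2)} from OCR for one subtitle.
    Frames the OCR missed inside the span reuse the most recent box, so the
    original text does not flicker back in on those frames.
    """
    if not boxes_by_frame:
        return {}
    first, last = min(boxes_by_frame), max(boxes_by_frame)
    plan = {}
    current = boxes_by_frame[first]
    for frame in range(first, last + 1):
        current = boxes_by_frame.get(frame, current)
        plan[frame] = pad_box(current, pad, frame_w, frame_h)
    return plan
```

Frames that do not appear in any plan can be copied through untouched, which is one straightforward way to keep inpainting cost proportional to how much of the video actually carries text.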
Let's summarize the difficulties of hard subtitle translation:
1. OCR and automation
2. Style extraction and layout fitting
3. Automation and efficiency of Inpainting
Our GhostCut hard subtitle translation uses some of the technologies mentioned above to translate hard subtitles fully automatically. Fully automatic translation can still go wrong, with extraction errors, translation errors, and so on, so an online editor is needed for adjustments and corrections.
Okay, here is the most important part. What we really want to share is the product concept. Abstracted, it is:
✅Understanding-Extraction-Processing-Delivery
This pattern actually applies to many products. For each step of a product's workflow, I think we can ask which AI technology could speed it up or improve the result. Beyond AI, we can also ask ourselves whether the step can be automated. Looking at products from this perspective, we find that many of them have new room to grow.
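Expressed as code, the four steps can be treated as swappable stages. The sketch below is a toy illustration only: every stage is a stand-in with hard-coded data, and none of the function names come from GhostCut.

```python
def understand(job):
    # Stand-in for OCR: pretend these lines were detected in the video.
    return {**job, "detected": ["你好", "你好", "谢谢"]}


def extract(job):
    # Stand-in for grouping: drop repeated detections, keep first-seen order.
    return {**job, "subtitles": list(dict.fromkeys(job["detected"]))}


def process(job):
    # Stand-in for machine translation: a toy lookup table.
    table = {"你好": "Hello", "谢谢": "Thank you"}
    return {**job, "translated": [table.get(s, s) for s in job["subtitles"]]}


def deliver(job):
    # Stand-in for inpainting and rendering: return what would be drawn.
    return job["translated"]


STAGES = [understand, extract, process, deliver]


def run(job, stages=STAGES):
    """Pass the job through each stage in order."""
    for stage in stages:
        job = stage(job)
    return job
```

Because each stage only takes a job and returns a job, any one of them can be replaced by a better model, or by a manual step in an editor, without touching the others.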
Several other GhostCut products were developed in this way:
1. Intelligent text removal, which is a combination of OCR and Inpainting;
2. Automatic translation and dubbing, which is a combination of download+ASR+sound separation+translation+TTS+alignment algorithm, etc.
All you need to do is upload, confirm the options, and let GhostCut take care of the rest.
Okay, that's all for today. Our product is about to launch, and I am very much looking forward to sharing it with you.
AMA, and I'm also curious about how you design your products.
🤔