English audio and subtitles are actually in Spanish

Seattle's KUNS is a Spanish-language channel that I like to watch sometimes. Many of their programs have a splash message at the beginning saying that English subtitles are on CC3. That was true many years ago, but it no longer appears to be.

There's a telenovela that just began re-airing last night called "Los ricos también lloran" that I'd like to watch with English subtitles. However, when I look at the details of the recorded program, the primary audio track is labeled English but is in Spanish, and the only subtitle track I could find was also in Spanish.

I'm willing to watch these with another viewer such as VLC or MX Player, but I'd like to have English subtitles. Is there a simple way to do this? Or will I have to save the Spanish subtitle to a text file and manually translate it?

You can use ccextractor which should pull both CC1 and CC3 and then you could place it as a sidecar srt.

Currently Channels only pulls out CC1, as CC3 usage is quite rare.
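
For anyone trying this route, here is a minimal sketch of pulling the second caption field (where CC3 is carried) into a sidecar file. It assumes a classic CCExtractor build whose help text lists `-2` (output field 2 data) and `-o` (output file name); newer releases may spell these options differently, so check `ccextractor --help` first. The function name and the example file name are made up for illustration.

```shell
# Sketch: pull field 2 captions (CC3) into a sidecar .srt next to the recording.
# Assumes a CCExtractor build that accepts -2 and -o; verify with `ccextractor --help`.
extract_cc3_sidecar() {
  recording=$1
  sidecar="${recording%.*}.srt"   # same base name, so players can pick it up
  ccextractor -2 "$recording" -o "$sidecar"
}

# Example (hypothetical file name):
#   extract_cc3_sidecar "recording.mpg"
```

If the resulting .srt is empty, that suggests field 2 carries no caption text in that recording, even if a player or MediaInfo lists a CC3 entry.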

CCExtractor came up with one .srt file but it was empty. I know the program file has at least one closed caption stream in it because I see it on all the players I've played it with, but it's in Spanish.

MediaInfo shows 8 Text streams that are muxed in the Video, the first one is labeled -CC1 and the second one is labeled -CC3, while the others are labeled -1 through -6.

VLC shows 4 subtitle streams, simply labeled Closed captions 1 through Closed captions 4. The only one that displays anything in the player is Closed captions 1 and it is in Spanish.

At this point, I've been unable to extract even the one subtitle track that plays, so maybe this is a no-go.

I tried using ffmpeg from the command line.
First, I ran:
ffmpeg -i input.mp4 -map 0:s:0 output.srt
and I got, "Output file #0 does not contain any stream"
Next, I ran:
ffmpeg -i input.mp4 output.srt
and got the same error.

Are there other methods I should try?

Can you try CCExtractor? It's much better than ffmpeg for this.

CCExtractor's home page | CCExtractor

If you converted to mp4 then the closed caption might have been lost already.

I didn't convert it - just renamed it to a temporary name. Also, it throws me off to work with .mpg filenames that aren't using MPEG-1 or MPEG-2 codecs, as these are AVC.

If ccextractor doesn't show anything then there's nothing there. It is the most comprehensive closed caption implementation out there.

Hmmm - then could it be something that acts like a subtitle but really isn't?
I just opened the aforementioned input.mp4 in VLC and played it. Then when I opened the Subtitle menu and selected Sub Track -> Closed captions 1 the subtitles appeared and when I selected Subtitle->Sub Track -> Disable they disappeared.

Check ffmpeg -i <file.ts> to see if the output shows "Closed Captions" in the video track.

To extract CC with ffmpeg, you have to use a different syntax. See "Can ffmpeg extract closed caption data" on Stack Overflow.
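
The check described above can be wrapped in a small helper. This is only a sketch: the function name is hypothetical, and it simply looks for the words "Closed Captions" in the stream summary that ffmpeg prints for the input.

```shell
# Sketch: report whether ffmpeg's stream summary mentions embedded captions.
has_embedded_cc() {
  # ffmpeg prints the summary on stderr and exits non-zero when no output file is given,
  # so capture stderr and ignore the exit status.
  info=$(ffmpeg -hide_banner -i "$1" 2>&1 || true)
  case $info in
    *"Closed Captions"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example:
#   has_embedded_cc input.ts && echo "captions are carried in the video stream"
```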

What does MediaInfo show for the recorded file?
Using the Text view it should show any closed captions like this

Text #1
ID                                       : 2060 (0x80C)-CC1
Menu ID                                  : 5 (0x5)
Format                                   : EIA-608
Muxing mode                              : SCTE 128 / DTVCC Transport
Muxing mode, more info                   : Muxed in Video #1
Duration                                 : 59 min 59 s
Bit rate mode                            : Constant
Stream size                              : 0.00 Byte (0%)
Language                                 : English
CaptionServiceName                       : CC1

Text #2
ID                                       : 2060 (0x80C)-CC3
Menu ID                                  : 5 (0x5)
Format                                   : EIA-608
Muxing mode                              : SCTE 128 / DTVCC Transport
Muxing mode, more info                   : Muxed in Video #1
Duration                                 : 59 min 59 s
Bit rate mode                            : Constant
Stream size                              : 0.00 Byte (0%)
CaptionServiceName                       : CC3

Text #3
ID                                       : 2060 (0x80C)-1
Menu ID                                  : 5 (0x5)
Format                                   : EIA-708
Muxing mode                              : SCTE 128 / DTVCC Transport
Muxing mode, more info                   : Muxed in Video #1
Duration                                 : 59 min 59 s
Bit rate mode                            : Constant
Stream size                              : 0.00 Byte (0%)
Language                                 : English

Text #4
ID                                       : 2060 (0x80C)-2
Menu ID                                  : 5 (0x5)
Format                                   : EIA-708
Muxing mode                              : SCTE 128 / DTVCC Transport
Muxing mode, more info                   : Muxed in Video #1
Duration                                 : 59 min 59 s
Bit rate mode                            : Constant
Stream size                              : 0.00 Byte (0%)

Also, if you ran comskip on the recording it shows the closed captions in its log file.
Looks like this

Closed caption transcript
--------------------
0) S:     1 E:   925 L:   0 
1) S:  1030 E:  1215 L:   2   
2) S:  1330 E:  1497 L:  63 BENSON: Dropped into one of the wildest places on the planet...
3) S:  1498 E:  1515 L:  13  [ Chirping ]
4) S:  1568 E:  1617 L:  41  ...I'm going to run a unique experiment.
5) S:  1776 E:  1783 L:   2   

Here's the MediaInfo report:
What it says English for is actually Spanish (language and text).

General
ID                                       : 0 (0x0)
Complete name                            : D:\zz_\_MyDocs\Recorded TV\input.mp4
Format                                   : MPEG-TS
File size                                : 1.84 GiB
Duration                                 : 1 h 0 min
Overall bit rate mode                    : Variable
Overall bit rate                         : 4 317 kb/s
Movie name                               : La Herencia Un Legado de Amor
Law rating                               : TV-14
FileExtension_Invalid                    : ts m2t m2s m4t m4s tmf ts tp trp ty

Video
ID                                       : 5393 (0x1511)
Menu ID                                  : 718 (0x2CE)
Format                                   : AVC
Format/Info                              : Advanced Video Codec
Format profile                           : [email protected]
Format settings                          : CABAC / 4 Ref Frames
Format settings, CABAC                   : Yes
Format settings, Reference frames        : 4 frames
Codec ID                                 : 27
Duration                                 : 1 h 0 min
Bit rate mode                            : Constant
Bit rate                                 : 3 510 kb/s
Maximum bit rate                         : 3 718 kb/s
Width                                    : 1 280 pixels
Height                                   : 720 pixels
Display aspect ratio                     : 16:9
Frame rate                               : 59.940 (60000/1001) FPS
Standard                                 : Component
Color space                              : YUV
Chroma subsampling                       : 4:2:0
Bit depth                                : 8 bits
Scan type                                : Progressive
Bits/(Pixel*Frame)                       : 0.064
Stream size                              : 1.50 GiB (82%)
Color range                              : Limited
EBP_AcquisitionTime                      : UTC 2022-09-08 04:59:29.055999756
EBP_Distance                             : 2.000
EBP_Mode                                 : Explicit

Audio #1
ID                                       : 5395 (0x1513)
Menu ID                                  : 718 (0x2CE)
Format                                   : AC-3
Format/Info                              : Audio Coding 3
Commercial name                          : Dolby Digital
Codec ID                                 : 129
Duration                                 : 1 h 0 min
Bit rate mode                            : Constant
Bit rate                                 : 384 kb/s
Maximum bit rate                         : 411 kb/s
Channel(s)                               : 6 channels
Channel layout                           : L R C LFE Ls Rs
Sampling rate                            : 48.0 kHz
Frame rate                               : 31.250 FPS (1536 SPF)
Compression mode                         : Lossy
Delay relative to video                  : -2 s 65 ms
Stream size                              : 168 MiB (9%)
Language                                 : English
Service kind                             : Complete Main

Audio #2
ID                                       : 5396 (0x1514)
Menu ID                                  : 718 (0x2CE)
Format                                   : AC-3
Format/Info                              : Audio Coding 3
Commercial name                          : Dolby Digital
Codec ID                                 : 129
Duration                                 : 1 h 0 min
Bit rate mode                            : Constant
Bit rate                                 : 192 kb/s
Maximum bit rate                         : 214 kb/s
Channel(s)                               : 2 channels
Channel layout                           : L R
Sampling rate                            : 48.0 kHz
Frame rate                               : 31.250 FPS (1536 SPF)
Compression mode                         : Lossy
Delay relative to video                  : -2 s 65 ms
Stream size                              : 83.8 MiB (4%)
Language                                 : Spanish
Service kind                             : Complete Main

Text #1
ID                                       : 5393 (0x1511)-CC1
Menu ID                                  : 718 (0x2CE)
Format                                   : EIA-608
Muxing mode                              : SCTE 128 / DTVCC Transport
Muxing mode, more info                   : Muxed in Video #1
Duration                                 : 1 h 0 min
Duration of the visible content          : 1 h 0 min
Start time (commands)                    : 23 h 57 min
Start time                               : 23 h 57 min
End time                                 : 24 h 58 min
Bit rate mode                            : Constant
Stream size                              : 0.00 Byte (0%)
Count of frames before first event       : 952
Type of the first event                  : PopOn

Text #2
ID                                       : 5393 (0x1511)-CC3
Menu ID                                  : 718 (0x2CE)
Format                                   : EIA-608
Muxing mode                              : SCTE 128 / DTVCC Transport
Muxing mode, more info                   : Muxed in Video #1
Duration                                 : 1 h 0 min
Start time (commands)                    : 23 h 57 min
Bit rate mode                            : Constant
Stream size                              : 0.00 Byte (0%)

Text #3
ID                                       : 5393 (0x1511)-1
Menu ID                                  : 718 (0x2CE)
Format                                   : EIA-708
Muxing mode                              : SCTE 128 / DTVCC Transport
Muxing mode, more info                   : Muxed in Video #1
Duration                                 : 1 h 0 min
Bit rate mode                            : Constant
Stream size                              : 0.00 Byte (0%)

Text #4
ID                                       : 5393 (0x1511)-2
Menu ID                                  : 718 (0x2CE)
Format                                   : EIA-708
Muxing mode                              : SCTE 128 / DTVCC Transport
Muxing mode, more info                   : Muxed in Video #1
Duration                                 : 1 h 0 min
Bit rate mode                            : Constant
Stream size                              : 0.00 Byte (0%)

Text #5
ID                                       : 5393 (0x1511)-3
Menu ID                                  : 718 (0x2CE)
Format                                   : EIA-708
Muxing mode                              : SCTE 128 / DTVCC Transport
Muxing mode, more info                   : Muxed in Video #1
Duration                                 : 1 h 0 min
Bit rate mode                            : Constant
Stream size                              : 0.00 Byte (0%)

Text #6
ID                                       : 5393 (0x1511)-4
Menu ID                                  : 718 (0x2CE)
Format                                   : EIA-708
Muxing mode                              : SCTE 128 / DTVCC Transport
Muxing mode, more info                   : Muxed in Video #1
Duration                                 : 1 h 0 min
Bit rate mode                            : Constant
Stream size                              : 0.00 Byte (0%)

Text #7
ID                                       : 5393 (0x1511)-5
Menu ID                                  : 718 (0x2CE)
Format                                   : EIA-708
Muxing mode                              : SCTE 128 / DTVCC Transport
Muxing mode, more info                   : Muxed in Video #1
Duration                                 : 1 h 0 min
Bit rate mode                            : Constant
Stream size                              : 0.00 Byte (0%)

Text #8
ID                                       : 5393 (0x1511)-6
Menu ID                                  : 718 (0x2CE)
Format                                   : EIA-708
Muxing mode                              : SCTE 128 / DTVCC Transport
Muxing mode, more info                   : Muxed in Video #1
Duration                                 : 1 h 0 min
Bit rate mode                            : Constant
Stream size                              : 0.00 Byte (0%)

Menu
ID                                       : 5392 (0x1510)
Menu ID                                  : 718 (0x2CE)
Format                                   : AVC / AC-3 / AC-3
Duration                                 : 1 h 0 min
List                                     : 5393 (0x1511) (AVC) / 5395 (0x1513) (AC-3, English) / 5396 (0x1514) (AC-3, Spanish)
Title                                    : La Herencia Un Legado de Amor
Language                                 :  / English / Spanish
Law rating                               : TV-14

I guess I've always used VideoRedo for commercial detection, so I didn't realize Comskip grabbed that kind of information.

That one I showed MediaInfo on says it has CC1 and CC3, but all the players I tried only let me select CC1 as if CC3 doesn't exist.

The comskip log for that recording shows the English closed captions (I assume CC1).
I would check the link @tmm1 gave to see if you can extract them with ffmpeg if ccextractor doesn't work.

I use MediaInfo a lot, but I've often found that much of its report relies on header information. It wasn't always this way. A number of years ago, one of its developers (Jerome Martinez) said that MediaInfo by default parses the first few hundred frames - typically up to a maximum of 64 MB - to compose its report. It certainly doesn't seem to have done that for the files mentioned in this thread.

To help me use ffmpeg to extract the subtitle stream, I had been working from a different thread because it was heavy with explanation. I'll take another look at the one linked above to compare and contrast.

There is a difference between subtitle extraction and closed caption extraction in ffmpeg.

It worked! I renamed the file to input.ts and ran:

ffmpeg -f lavfi -i movie=input.ts[out+subcc] -map 0:1 output.srt

and the file was parsed to a 92 KB .srt file in about 9 minutes. Because of the MediaInfo report, I also tried other stream numbers (-map 0:0, -map 0:2, -map 0:3), but nothing was found for those streams.

I'm disappointed to find that there was only one stream and that it was not in English - but now I at least have one. Now I wonder if there's a simple way to convert the Spanish to English.

I clearly didn't understand the difference between subtitles and closed captions - probably because from the viewer's standpoint they seem interchangeable. Now I know first hand that closed captions are not handled the same as subtitles because they're embedded in individual frames and must be extracted one by one in order to make an independent stream.
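
On the question of getting from the Spanish .srt to English: one low-tech option is to strip the file down to just the dialogue so it can be pasted into whatever translator you prefer. The sketch below assumes a plain .srt as ffmpeg writes it (cue number, timing line, text, blank line); the function name is made up, and a dialogue line consisting only of digits would be dropped along with the cue numbers.

```shell
# Sketch: print only the dialogue lines of an .srt file.
srt_text_only() {
  awk '
    { sub(/\r$/, "") }        # tolerate Windows line endings
    /^[0-9]+$/   { next }     # cue numbers
    / --> /      { next }     # timing lines
    /^[ \t]*$/   { next }     # blank separators
    { print }
  ' "$1"
}

# Example:
#   srt_text_only output.srt > output_spanish.txt
```

Note that this discards the timing, so the translated text would still have to be matched back to the original cues by hand.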

Subtitles are a separate track.

Closed captions are embedded inside video frames.

So only -map 0:1 will work, because only the video track has caption info.

To extract CC3, you can add -data_field second before the -i

Oops - I misunderstood the application of the map option in this case.

So I tried:

ffmpeg -f lavfi -data_field second -i input.ts[out+subcc] -map 0:1 output3.srt

and got back:

Unrecognized option 'data_field'
Error splitting the argument list: Option not found

Your ffmpeg might be too old, then.

I got the latest ffmpeg 5.1.1 and now I get:

No such filter: 'input.ts'
input.ts[out+subcc]: Invalid argument
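
That last error is consistent with how the lavfi input device works: the -i argument is parsed as a filter graph rather than a file name, so a bare input.ts[out+subcc] is read as a filter called 'input.ts'. The usual form names the movie source filter explicitly. Below is a sketch under a few assumptions: an ffmpeg build with lavfi and the movie filter, the `out0+subcc` label form from ffmpeg's lavfi device documentation, and the -data_field option suggested earlier in the thread. The function name is hypothetical, and paths containing colons or other special characters (for example D:\...) need filter-graph escaping that is not handled here.

```shell
# Sketch: extract embedded EIA-608 captions to .srt through the lavfi "movie" source.
#   $1 = recording, $2 = output .srt, $3 = optional caption field: first or second
extract_cc() {
  spec="movie=$1[out0+subcc]"   # stream 0 = video, stream 1 = caption packets
  if [ -n "${3:-}" ]; then
    ffmpeg -f lavfi -data_field "$3" -i "$spec" -map 0:1 "$2"
  else
    ffmpeg -f lavfi -i "$spec" -map 0:1 "$2"
  fi
}

# Examples (run from the folder that holds the recording):
#   extract_cc input.ts output.srt            # decoder's default field selection
#   extract_cc input.ts output3.srt second    # field 2, where CC3 is carried
```

Quoting the filter graph, as the function does, also keeps the shell from interpreting the square brackets.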