
Create a dictionary from the output of the program

This is a Python program that gets all the captions from a YouTube link:

from pytube import YouTube
yt = YouTube('https://youtu.be/5MgBikgcWnY')
captions = yt.captions.all()
for caption in captions:
    print(caption)

The output of the above program is:

<Caption lang="Arabic" code="ar">
<Caption lang="Chinese (China)" code="zh-CN">
<Caption lang="English" code="en">
<Caption lang="English (auto-generated)" code="a.en">
<Caption lang="French" code="fr">
<Caption lang="German" code="de">
<Caption lang="Hungarian" code="hu">
<Caption lang="Italian" code="it">

But I want only the lang and code values from the above output, stored as key-value pairs in a dictionary:

{"Arabic": "ar", "Chinese": "zh-CN", "English": "en",
 "French": "fr", "German": "de", "Hungarian": "hu", "Italian": "it"}

Thanks in advance.


Answer

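It's pretty simple: each caption object has name and code attributes, so you can build the dictionary directly from them.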

from pytube import YouTube

yt = YouTube('https://youtu.be/5MgBikgcWnY')
captions = yt.captions.all()
captions_dict = {}

for caption in captions:
    # Mapping the caption name to the caption code
    captions_dict[caption.name] = caption.code

If you want a one-liner, use a dict comprehension:

captions_dict = {caption.name: caption.code for caption in captions}

Output:

{'Arabic': 'ar', 'Bangla': 'bn', 'Burmese': 'my', 'Chinese (China)': 'zh-CN',
'Chinese (Taiwan)': 'zh-TW', 'Croatian': 'hr', 'English': 'en', 
'English (auto-generated)': 'a.en', 'French': 'fr', 'German': 'de',
'Hebrew': 'iw', 'Hungarian': 'hu', 'Italian': 'it', 'Japanese': 'ja',
'Persian': 'fa', 'Polish': 'pl', 'Portuguese (Brazil)': 'pt-BR', 
'Russian': 'ru', 'Serbian': 'sr', 'Slovak': 'sk', 'Spanish': 'es', 
'Spanish (Spain)': 'es-ES', 'Thai': 'th', 'Turkish': 'tr', 
'Ukrainian': 'uk', 'Vietnamese': 'vi'}
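
The dictionary requested in the question leaves out the auto-generated track, while the output above still includes it. Below is a minimal sketch of one way to filter it out. It assumes that auto-generated tracks are the ones whose code starts with "a." (as with 'a.en' above), which may not hold for every video or pytube version. FakeCaption is a hypothetical stand-in for the caption objects, used only so the example runs without a network request; with real data you would iterate over the captions list from the answer instead.

```python
from collections import namedtuple

# Hypothetical stand-in for pytube caption objects: only .name and .code are used.
FakeCaption = namedtuple("FakeCaption", ["name", "code"])

# Sample data copied from the output shown above.
captions = [
    FakeCaption("Arabic", "ar"),
    FakeCaption("Chinese (China)", "zh-CN"),
    FakeCaption("English", "en"),
    FakeCaption("English (auto-generated)", "a.en"),
    FakeCaption("French", "fr"),
]

# Assumption: auto-generated tracks have a code starting with "a."
manual_captions = {
    caption.name: caption.code
    for caption in captions
    if not caption.code.startswith("a.")
}

print(manual_captions)
```

The condition inside the comprehension is the only change from the one-liner, so it can be swapped for any other test on caption.name or caption.code.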