DO NOT DELETE THIS! Please take the time to fill this out properly. I am not able to help you if I do not know what you are executing and what error messages you are getting. If you are having problems with a specific video make sure to include the video id.
Which version of youtube-transcript-api are you using?
youtube-transcript-api 0.6.3
Expected behavior
Get the transcripts of some videos.
For example: I expected to receive the English transcript
Actual behaviour
Locally, everything works perfectly.
I use the YouTube API to search for videos and get the IDs to pass to the library, and that works.
But when I deploy a Docker image to an AWS Lambda function, it stops working. Every video that works locally now shows:
Could not retrieve a transcript for the video https://www.youtube.com/watch?v=ym30IDwQ5LI! This is most likely caused by:
Subtitles are disabled for this video
If you are sure that the described cause is not responsible for this error and that a transcript should be retrievable, please create an issue at https://github.com/jdepoix/youtube-transcript-api/issues. Please add which version of youtube_transcript_api you are using and provide the information needed to replicate the error. Also make sure that there are no open issues which already describe your problem!
This happens for every video. I tried proxies (public and private) and even a VPN, but the result is the same.
I don't understand it: I can use the YouTube API for search from AWS, but transcript requests get blocked when they come from AWS?
Please help!
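For reference, a minimal sketch of how a proxy could be wired into the transcript call, assuming the proxy URL lives in an environment variable. The variable name `TRANSCRIPT_PROXY_URL` and the helper `build_proxies` are hypothetical. The `proxies` keyword of `get_transcript` takes a requests-style dictionary in the 0.6.x releases. This only shows the wiring and does not guarantee that a given proxy avoids the block.

```python
import os


def build_proxies(env=os.environ):
    """Return a requests-style proxies dict, or None if no proxy is configured.

    TRANSCRIPT_PROXY_URL is a hypothetical variable name used for illustration.
    """
    url = env.get("TRANSCRIPT_PROXY_URL")
    if not url:
        return None
    # Route both plain and TLS traffic through the same proxy endpoint.
    return {"http": url, "https": url}


# Intended use inside the Lambda (needs youtube-transcript-api installed):
#   YouTubeTranscriptApi.get_transcript(video_id, proxies=build_proxies())
```

Setting the variable in the Lambda configuration instead of hard-coding the URL keeps proxy credentials out of the image.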
This is the code I'm using.
import json

from googleapiclient.discovery import build
from tqdm import tqdm
from youtube_transcript_api import YouTubeTranscriptApi
from youtube_transcript_api._errors import TranscriptsDisabled, VideoUnavailable, NoTranscriptFound

YOUTUBE_API_KEY = 'YT_API_KEY'


# Lambda function
def lambda_handler(event, context):
    search_results = search_videos("TED Talks", max_results=10)
    transcripts = []
    for video_id, video_title, published_at, channel_title in tqdm(search_results, desc="Processing videos"):
        transcript = get_transcript(video_id)
        if transcript is None:
            # The error was already printed in get_transcript; skip this video.
            continue
        transcripts.append(process_transcript(transcript))
    # Guard against an empty list so the handler does not raise IndexError.
    sample = transcripts[-1][:30] + '...' if transcripts else ''
    return {
        "statusCode": 200,
        "body": json.dumps({
            "transcripts": len(transcripts),
            "sample": sample
        })
    }


def search_videos(query, max_results=5):
    youtube = build("youtube", "v3", developerKey=YOUTUBE_API_KEY)
    request = youtube.search().list(
        part="snippet",
        q=query,
        type="video",
        order="date",
        maxResults=max_results,
        videoCaption="closedCaption"  # Only videos with captions
    )
    response = request.execute()
    videos = []
    for item in response['items']:
        video_id = item['id']['videoId']
        video_title = item['snippet']['title']
        published_at = item['snippet']['publishedAt']
        channel_title = item['snippet']['channelTitle']
        videos.append((video_id, video_title, published_at, channel_title))
    return videos


def get_transcript(video_id):
    # Returns the list of transcript entries, or None if retrieval failed.
    try:
        return YouTubeTranscriptApi.get_transcript(video_id)
    except (TranscriptsDisabled, VideoUnavailable, NoTranscriptFound) as e:
        print("Error: no subtitles", e)
        return None
    except Exception as e:
        print(f"Unexpected error with proxy: {e}")
        return None


def process_transcript(transcript):
    # Join the text of all transcript entries into one string.
    return " ".join(item['text'] for item in transcript)


if __name__ == '__main__':
    print(lambda_handler('', ''))
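Because the code above only prints a short message on failure, it can be hard to tell from the Lambda logs which exception was actually raised. A rough sketch of a wrapper that records the exception class alongside the message, assuming the fetch function is passed in. The name `fetch_with_diagnostics` is hypothetical, and `fetch` would be `YouTubeTranscriptApi.get_transcript` in real use.

```python
import json


def fetch_with_diagnostics(video_id, fetch):
    """Call fetch(video_id) and return (result, diagnostics).

    diagnostics is None on success; on failure it is a dict describing
    the exception, which is also printed as one JSON log line.
    """
    try:
        return fetch(video_id), None
    except Exception as exc:
        diagnostics = {
            "video_id": video_id,
            "error_type": type(exc).__name__,
            "error_module": type(exc).__module__,
            "message": str(exc),
        }
        # print() output ends up in the function's CloudWatch log stream.
        print(json.dumps(diagnostics))
        return None, diagnostics
```

Comparing the logged `error_type` between the local run and the Lambda run would show whether the library raises the same exception class in both environments.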
Thanks for your help!
To Reproduce
Steps to reproduce the behavior:
Base image: public.ecr.aws/lambda/python:3.11.2025.01.13.14
What code / cli command are you executing?
Which Python version are you using?
Python 3.11.11
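To confirm that the Lambda image runs the same versions as the local machine, a small sketch like the following could be logged from inside the handler. The helper name `environment_report` is hypothetical; it only uses the standard library.

```python
import platform
from importlib import metadata


def environment_report(packages=("youtube-transcript-api",)):
    """Return the Python version and the installed version of each package."""
    report = {"python": platform.python_version()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    print(environment_report())
```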