Feature/add file support for messenger #185
base: develop
Conversation
Pull request overview
This PR adds file and image support (multimodal capabilities) to Telegram and Facebook Messenger integrations, enabling AI agents to analyze images and documents sent by users.
Key Changes:
- Implemented file/image download and base64 encoding for both Telegram and Messenger integrations
- Added multimodal request handling via `service.run_multi()` to process text, images, and files together
- Enhanced session management to preserve multimodal conversation context in the OpenAI framework
- Updated documentation with multimodal feature descriptions, limitations, and usage examples
- Added Google ADK integration examples alongside existing OpenAI examples
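To make the "process text, images, and files together" bullet concrete, here is a minimal sketch of assembling a multimodal request batch. The `AgentRequestText`/`AgentRequestImage` dataclasses below are illustrative stand-ins for the agentkernel request types named in this PR, and `service.run_multi()` is only referenced in a comment; the real field names and signatures may differ.

```python
from dataclasses import dataclass
import base64


@dataclass
class AgentRequestText:
    text: str


@dataclass
class AgentRequestImage:
    image_data: str  # base64-encoded bytes
    name: str
    mime_type: str


def build_requests(prompt: str, image_bytes: bytes) -> list:
    """Combine a text prompt and an image into one multimodal request batch."""
    return [
        AgentRequestText(text=prompt),
        AgentRequestImage(
            image_data=base64.b64encode(image_bytes).decode("utf-8"),
            name="photo.jpg",
            mime_type="image/jpeg",
        ),
    ]


requests = build_requests("What is in this picture?", b"\x89PNG fake bytes")
# A bot would then pass this batch to something like service.run_multi(requests).
```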
Reviewed changes
Copilot reviewed 13 out of 29 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| `examples/containerized/openai/uv.lock` | Downgraded openai-agents from 0.6.4 to 0.6.3; updated revision |
| `examples/cli/openai/uv.lock` | Updated revision number |
| `examples/cli/openai-dynamic/uv.lock` | Downgraded openai-agents; updated revision |
| `examples/cli/multi/uv.lock` | Downgraded openai-agents; updated revision |
| `examples/aws-serverless/openai/uv.lock` | Downgraded openai-agents; updated revision |
| `examples/aws-containerized/openai-dynamodb/uv.lock` | Downgraded openai-agents; updated revision |
| `examples/api/whatsapp/uv.lock` | Changed agentkernel source from local path to PyPI registry |
| `examples/api/telegram/uv.lock` | Downgraded openai-agents; updated revision |
| `examples/api/telegram/server_adk.py` | New Google ADK-based Telegram bot example with multimodal support |
| `examples/api/telegram/build.sh` | Added 'adk' to the local build dependencies |
| `examples/api/telegram/README.md` | Comprehensive multimodal documentation with usage examples and file type tables |
| `examples/api/slack/uv.lock` | Downgraded openai-agents; updated revision |
| `examples/api/openai/uv.lock` | Downgraded openai-agents; updated revision |
| `examples/api/messenger/uv.lock` | Changed agentkernel to local source; downgraded multiple packages |
| `examples/api/messenger/server_adk.py` | New Google ADK-based Messenger bot example with multimodal support |
| `examples/api/messenger/build.sh` | Added 'adk' to the local build dependencies |
| `examples/api/messenger/README.md` | Added multimodal feature documentation with file types and size limits |
| `examples/api/mcp/multi/uv.lock` | Downgraded openai-agents; updated revision |
| `examples/api/instagram/uv.lock` | Downgraded openai-agents; updated revision |
| `examples/api/hooks/uv.lock` | Downgraded openai-agents; updated revision |
| `examples/api/gmail/uv.lock` | Downgraded openai-agents; updated revision |
| `examples/api/a2a/multi/uv.lock` | Downgraded openai-agents; updated revision |
| `docs/docs/integrations/telegram.md` | Enhanced documentation with multimodal features, usage examples, and limitations |
| `docs/docs/integrations/messenger.md` | Added comprehensive multimodal documentation with implementation details |
| `ak-py/src/agentkernel/integration/telegram/telegram_chat.py` | Core implementation for file/image download, processing, and multimodal request handling |
| `ak-py/src/agentkernel/integration/telegram/README.md` | Technical documentation for multimodal support with configuration details |
| `ak-py/src/agentkernel/integration/messenger/messenger_chat.py` | Core implementation for attachment processing with file size validation |
| `ak-py/src/agentkernel/integration/messenger/README.md` | Technical documentation for multimodal support with security considerations |
| `ak-py/src/agentkernel/framework/openai/openai.py` | Added session persistence for multimodal conversations |
```python
file_info = await self._get_file_info(file_id)
if not file_info:
    self._log.warning("Failed to get photo file info")
    failed_files.append("photo")
    return failed_files
```

**Copilot AI** commented on Dec 26, 2025:

The function processes files and images but returns immediately when a failure occurs, so if one file fails, subsequent files in the message are never processed. Consider collecting all failures instead of returning early, so that users still receive responses for the files that were processed successfully.
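A minimal sketch of the collect-and-continue pattern the review suggests, with illustrative names (`process_all`, `fetch`) rather than the PR's actual helpers: each failure is recorded and the loop moves on, instead of abandoning the remaining files.

```python
def process_all(file_ids, fetch):
    """Process every file id; collect failures instead of returning on the first one."""
    failed, processed = [], []
    for file_id in file_ids:
        info = fetch(file_id)  # returns None on failure
        if info is None:
            failed.append(file_id)
            continue  # keep going; don't abandon the remaining files
        processed.append(info)
    return processed, failed


fetch = {"a": "A", "c": "C"}.get  # "b" is missing, simulating a failed lookup
processed, failed = process_all(["a", "b", "c"], fetch)
# processed == ["A", "C"], failed == ["b"]
```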
```python
file_info = await self._get_file_info(file_id)
if not file_info:
    self._log.warning(f"Failed to get file info for {file_name}")
    failed_files.append(file_name)
    return failed_files
```

**Copilot AI** commented on Dec 26, 2025:

The function returns early when file info retrieval fails, preventing subsequent files from being processed. Consider continuing with the other files instead of returning immediately, and only adding the failed file to the `failed_files` list.
```python
import base64
import logging
import mimetypes
```

**Copilot AI** commented on Dec 26, 2025:

The `mimetypes` import is added but never used. The MIME types are determined from Telegram's API response or hardcoded values, not via the `mimetypes` module. Consider removing this unused import.

Suggested change: delete the line `import mimetypes`.
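For contrast, this is what the import would look like if it were actually used: `mimetypes.guess_type` can derive a MIME type from a file name as a fallback when Telegram's metadata omits one. The `guess_mime` helper is illustrative, not part of the PR.

```python
import mimetypes


def guess_mime(file_name: str, default: str = "application/octet-stream") -> str:
    """Guess a MIME type from the file extension, falling back to a default."""
    mime, _encoding = mimetypes.guess_type(file_name)
    return mime or default


print(guess_mime("report.pdf"))   # application/pdf
print(guess_mime("unknownfile"))  # application/octet-stream
```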
```python
reply = (await Runner.run(agent.agent, message_content, session=None)).final_output

# Manually save the multimodal conversation to session for future reference
if session:
    openai_session = session.get("openai") or session.set("openai", OpenAISession())
    # Add user message
    await openai_session.add_items([{"role": "user", "content": prompt}])
    # Add assistant response
    await openai_session.add_items([{"role": "assistant", "content": reply}])
```

**Copilot AI** commented on Dec 26, 2025:

The multimodal conversation is saved to the session only after the agent run completes, so if an exception occurs during `Runner.run()` on line 170, the conversation won't be saved. Consider a try/except (or try/finally) so the session save still executes when an error occurs, or handle this case explicitly.

Suggested change:

```python
try:
    reply = (await Runner.run(agent.agent, message_content, session=None)).final_output
except Exception:
    # Even if the agent run fails, persist the user message for context
    if session:
        openai_session = session.get("openai") or session.set("openai", OpenAISession())
        await openai_session.add_items([{"role": "user", "content": prompt}])
    # Re-raise so the outer handler can return an error reply
    raise
else:
    # Manually save the multimodal conversation to session for future reference
    if session:
        openai_session = session.get("openai") or session.set("openai", OpenAISession())
        # Add user message
        await openai_session.add_items([{"role": "user", "content": prompt}])
        # Add assistant response
        await openai_session.add_items([{"role": "assistant", "content": reply}])
```
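The suggested fix hinges on Python's `try`/`except`/`else` ordering: the `else` block runs only when no exception was raised, so the assistant reply is saved only after a successful run, while the user message is still persisted on failure. A small self-contained demo of that control flow (names here are illustrative, not agentkernel APIs):

```python
saved = []


def run_and_save(run):
    try:
        reply = run()
    except Exception:
        saved.append(("user", "prompt"))  # persist the user turn even on failure
        raise
    else:
        # Runs only when run() succeeded
        saved.append(("user", "prompt"))
        saved.append(("assistant", reply))


run_and_save(lambda: "ok")          # success path: user + assistant saved
try:
    run_and_save(lambda: 1 / 0)     # failure path: only user saved, error re-raised
except ZeroDivisionError:
    pass

# saved == [("user", "prompt"), ("assistant", "ok"), ("user", "prompt")]
```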
```python
    return failed_files

file_path = file_info.get("file_path")
file_size = file_info.get("file_size", 0)

# Download file
file_content = await self._download_telegram_file(file_path)
if file_content is None:
    self._log.warning(f"Failed to download file: {file_name}")
    failed_files.append(file_name)
    return failed_files

# Base64 encode
file_data_base64 = base64.b64encode(file_content).decode("utf-8")

# Add as file request
requests.append(
    AgentRequestFile(
        file_data=file_data_base64,
        name=file_name,
        mime_type=mime_type,
    )
)
```

**Copilot AI** commented on Dec 26, 2025:

The function returns early when the file download fails, preventing subsequent files from being processed. Consider continuing to process the other files instead of returning immediately, and only adding the failed file to the `failed_files` list.

Suggested change:

```python
file_path = file_info.get("file_path") if file_info else None
file_size = file_info.get("file_size", 0) if file_info else 0

# Download file
if file_path:
    file_content = await self._download_telegram_file(file_path)
else:
    file_content = None
if file_content is None:
    self._log.warning(f"Failed to download file: {file_name}")
    failed_files.append(file_name)
else:
    # Base64 encode
    file_data_base64 = base64.b64encode(file_content).decode("utf-8")
    # Add as file request
    requests.append(
        AgentRequestFile(
            file_data=file_data_base64,
            name=file_name,
            mime_type=mime_type,
        )
    )
```
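The encode step in the snippet above is plain stdlib `base64`; a quick round-trip shows the transformation applied before a file is attached to the request (the sample bytes are made up):

```python
import base64

file_content = b"%PDF-1.4 example bytes"
file_data_base64 = base64.b64encode(file_content).decode("utf-8")

# The encoded form is an ASCII string safe to embed in a JSON request body,
# and decoding it recovers the original bytes exactly.
assert base64.b64decode(file_data_base64) == file_content
```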
```python
import hashlib
import hmac
import logging
import mimetypes
```

**Copilot AI** commented on Dec 26, 2025:

The `mimetypes` import is added but never used. The MIME types are obtained from the HTTP response headers, not via the `mimetypes` module. Consider removing this unused import.

Suggested change: delete the line `import mimetypes`.
```python
async def _process_files(self, message: dict, requests: list) -> list[str]:
    """
    Process files and images in a Telegram message.
    :param message: Message object from Telegram
    :param requests: List to append AgentRequestFile/AgentRequestImage objects
    :return: List of failed file names
    """
    failed_files = []

    # Process photos (images)
    if "photo" in message:
        photos = message.get("photo", [])
        if photos:
            # Get the largest photo
            largest_photo = photos[-1]
            file_id = largest_photo.get("file_id")

            try:
                self._log.debug(f"Processing photo: {file_id}")

                # Get file info
                file_info = await self._get_file_info(file_id)
                if not file_info:
                    self._log.warning("Failed to get photo file info")
                    failed_files.append("photo")
                    return failed_files

                file_path = file_info.get("file_path")
                file_size = file_info.get("file_size", 0)

                # Download file
                file_content = await self._download_telegram_file(file_path)
                if file_content is None:
                    self._log.warning("Failed to download photo")
                    failed_files.append("photo")
                    return failed_files

                # Base64 encode
                image_data_base64 = base64.b64encode(file_content).decode("utf-8")

                # Add as image request
                requests.append(
                    AgentRequestImage(
                        image_data=image_data_base64,
                        name="photo.jpg",
                        mime_type="image/jpeg",
                    )
                )
                self._log.debug(f"Added photo to request (size: {file_size} bytes)")

            except Exception as e:
                self._log.error(f"Error processing photo: {e}\n{traceback.format_exc()}")
                failed_files.append("photo")

    # Process documents (files)
    if "document" in message:
        document = message.get("document", {})
        file_id = document.get("file_id")
        file_name = document.get("file_name", "document")
        mime_type = document.get("mime_type", "application/octet-stream")

        try:
            self._log.debug(f"Processing document: {file_id} ({file_name})")

            # Get file info
            file_info = await self._get_file_info(file_id)
            if not file_info:
                self._log.warning(f"Failed to get file info for {file_name}")
                failed_files.append(file_name)
                return failed_files

            file_path = file_info.get("file_path")
            file_size = file_info.get("file_size", 0)

            # Download file
            file_content = await self._download_telegram_file(file_path)
            if file_content is None:
                self._log.warning(f"Failed to download file: {file_name}")
                failed_files.append(file_name)
                return failed_files

            # Base64 encode
            file_data_base64 = base64.b64encode(file_content).decode("utf-8")

            # Add as file request
            requests.append(
                AgentRequestFile(
                    file_data=file_data_base64,
                    name=file_name,
                    mime_type=mime_type,
                )
            )
            self._log.debug(f"Added file to request: {file_name} (size: {file_size} bytes)")

        except Exception as e:
            self._log.error(f"Error processing document: {e}\n{traceback.format_exc()}")
            failed_files.append(file_name)

    return failed_files
```

**Copilot AI** commented on Dec 26, 2025:

File size validation is missing for Telegram file downloads. Unlike the Messenger integration, which checks file size against the `max_file_size` configuration (line 276 in messenger_chat.py), the Telegram integration downloads files without validating their size. This could lead to memory issues or timeouts when processing very large files. Consider adding size validation similar to the Messenger implementation.
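A hedged sketch of the size check the review asks for. The 2 MB figure and the `max_file_size` name mirror what this PR describes for the Messenger integration; the actual configuration plumbing in agentkernel may differ.

```python
MAX_FILE_SIZE = 2 * 1024 * 1024  # 2,097,152 bytes, the documented Messenger default


def should_download(file_size: int, max_file_size: int = MAX_FILE_SIZE) -> bool:
    """Reject oversized files before downloading to avoid memory/timeout issues.

    A file_size of 0 (Telegram may omit the size) is allowed through here;
    callers could instead enforce the limit again after download.
    """
    return file_size <= max_file_size


print(should_download(1024))             # True
print(should_download(5 * 1024 * 1024))  # False
```

In `_process_files`, this check would slot in right after `file_size = file_info.get("file_size", 0)`, appending to `failed_files` instead of downloading when it fails.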
```python
if command == "/status":
    await self._send_message(chat_id, "✅ Bot is running!")
elif command == "/about":
    await self._send_message(chat_id, "I'm powered by Agent Kernel and OpenAI")
else:
    await super()._handle_command(chat_id, command)
```

**Copilot AI** commented on Dec 26, 2025:

Inconsistent whitespace detected. This block uses tabs or mixed indentation instead of spaces, while the rest of the codebase uses spaces consistently. Please fix the indentation to use 4 spaces.

Suggested change: re-indent the block with 4 spaces per level instead of tabs (the content is otherwise unchanged).
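A quick way to spot the kind of mixed indentation flagged here is to scan source lines for tabs in their leading whitespace (this helper is illustrative, not part of the PR):

```python
def lines_with_tabs(source: str) -> list[int]:
    """Return 1-based line numbers whose indentation contains a tab character."""
    bad = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad.append(lineno)
    return bad


snippet = "def f():\n\tx = 1\n    y = 2\n"
print(lines_with_tabs(snippet))  # [2]
```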
```markdown
**Limitations:**

- Maximum file size: ~2MB
```

**Copilot AI** commented on Dec 26, 2025:

The documentation states a maximum file size of "~2MB" but doesn't reference the actual configuration option or explain how to change it. Consider referencing the configuration option (`api.max_file_size`) and stating the exact default, matching the Messenger documentation's "2 MB (2,097,152 bytes)".

Suggested change:

```markdown
- Maximum file size: 2 MB (2,097,152 bytes) by default, controlled by the `api.max_file_size` configuration option (adjust this setting to change the limit)
```
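To tie the documented default to a number: 2 MB is 2 × 1024 × 1024 = 2,097,152 bytes. A small sketch of reading such a limit from configuration with that default; the `api.max_file_size` key follows the review comment, but the real agentkernel config API may look different.

```python
DEFAULT_MAX_FILE_SIZE = 2_097_152  # 2 MB, matching the Messenger docs


def get_max_file_size(config: dict) -> int:
    """Read api.max_file_size from a nested config dict, falling back to 2 MB."""
    return int(config.get("api", {}).get("max_file_size", DEFAULT_MAX_FILE_SIZE))


print(get_max_file_size({}))                                  # 2097152
print(get_max_file_size({"api": {"max_file_size": 1048576}})) # 1048576
```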
Description
This change adds file and image support to the Messenger and Telegram integrations.
Type of Change
Testing
Checklist
Screenshots (if applicable)
Additional Notes