Conversation

@pulinduvidmal
Contributor

Description

This change adds file and image support to the Telegram and Facebook Messenger integrations.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test update
  • CI/CD update
  • Other (please describe):

Testing

  • Unit tests pass locally
  • Integration tests pass locally
  • Manual testing completed
  • New tests added for changes

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published

Screenshots (if applicable)

Additional Notes


Copilot AI left a comment


Pull request overview

This PR adds file and image support (multimodal capabilities) to Telegram and Facebook Messenger integrations, enabling AI agents to analyze images and documents sent by users.

Key Changes:

  • Implemented file/image download and base64 encoding for both Telegram and Messenger integrations
  • Added multimodal request handling via service.run_multi() to process text, images, and files together (a hedged sketch of the call shape follows this list)
  • Enhanced session management to preserve multimodal conversation context in OpenAI framework
  • Updated documentation with multimodal feature descriptions, limitations, and usage examples
  • Added Google ADK integration examples alongside existing OpenAI examples
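
A minimal sketch of what the multimodal request path looks like from the integration side. The AgentRequestImage/AgentRequestFile constructor arguments below match the snippets reviewed further down; the import path, the `service`/`session` objects, and the exact `run_multi()` signature are assumptions, since the diff shown here does not include that call site.

```python
import base64

# Import path is an assumption; only the class names appear in the reviewed diff.
from agentkernel import AgentRequestFile, AgentRequestImage


async def handle_attachments(service, session, photo_bytes: bytes, pdf_bytes: bytes) -> str:
    """Hypothetical handler: the bytes would normally come from the Telegram/Messenger file APIs."""
    requests = [
        AgentRequestImage(
            image_data=base64.b64encode(photo_bytes).decode("utf-8"),
            name="photo.jpg",
            mime_type="image/jpeg",
        ),
        AgentRequestFile(
            file_data=base64.b64encode(pdf_bytes).decode("utf-8"),
            name="report.pdf",
            mime_type="application/pdf",
        ),
    ]
    # Assumed call shape: a text prompt plus the attachment parts, run in the user's session.
    return await service.run_multi("Summarize these attachments.", requests, session=session)
```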

Reviewed changes

Copilot reviewed 13 out of 29 changed files in this pull request and generated 9 comments.

Summary per file:

| File | Description |
| --- | --- |
| examples/containerized/openai/uv.lock | Downgraded openai-agents from 0.6.4 to 0.6.3; updated revision |
| examples/cli/openai/uv.lock | Updated revision number |
| examples/cli/openai-dynamic/uv.lock | Downgraded openai-agents; updated revision |
| examples/cli/multi/uv.lock | Downgraded openai-agents; updated revision |
| examples/aws-serverless/openai/uv.lock | Downgraded openai-agents; updated revision |
| examples/aws-containerized/openai-dynamodb/uv.lock | Downgraded openai-agents; updated revision |
| examples/api/whatsapp/uv.lock | Changed agentkernel source from local path to PyPI registry |
| examples/api/telegram/uv.lock | Downgraded openai-agents; updated revision |
| examples/api/telegram/server_adk.py | New Google ADK-based Telegram bot example with multimodal support |
| examples/api/telegram/build.sh | Added 'adk' to the local build dependencies |
| examples/api/telegram/README.md | Comprehensive multimodal documentation with usage examples and file type tables |
| examples/api/slack/uv.lock | Downgraded openai-agents; updated revision |
| examples/api/openai/uv.lock | Downgraded openai-agents; updated revision |
| examples/api/messenger/uv.lock | Changed agentkernel to local source; downgraded multiple packages |
| examples/api/messenger/server_adk.py | New Google ADK-based Messenger bot example with multimodal support |
| examples/api/messenger/build.sh | Added 'adk' to the local build dependencies |
| examples/api/messenger/README.md | Added multimodal feature documentation with file types and size limits |
| examples/api/mcp/multi/uv.lock | Downgraded openai-agents; updated revision |
| examples/api/instagram/uv.lock | Downgraded openai-agents; updated revision |
| examples/api/hooks/uv.lock | Downgraded openai-agents; updated revision |
| examples/api/gmail/uv.lock | Downgraded openai-agents; updated revision |
| examples/api/a2a/multi/uv.lock | Downgraded openai-agents; updated revision |
| docs/docs/integrations/telegram.md | Enhanced documentation with multimodal features, usage examples, and limitations |
| docs/docs/integrations/messenger.md | Added comprehensive multimodal documentation with implementation details |
| ak-py/src/agentkernel/integration/telegram/telegram_chat.py | Core implementation for file/image download, processing, and multimodal request handling |
| ak-py/src/agentkernel/integration/telegram/README.md | Technical documentation for multimodal support with configuration details |
| ak-py/src/agentkernel/integration/messenger/messenger_chat.py | Core implementation for attachment processing with file size validation |
| ak-py/src/agentkernel/integration/messenger/README.md | Technical documentation for multimodal support with security considerations |
| ak-py/src/agentkernel/framework/openai/openai.py | Added session persistence for multimodal conversations |


Comment on lines +392 to +396
file_info = await self._get_file_info(file_id)
if not file_info:
    self._log.warning("Failed to get photo file info")
    failed_files.append("photo")
    return failed_files

Copilot AI Dec 26, 2025


The function processes files and images but immediately returns when a failure occurs, which means if processing one file fails, subsequent files in the message won't be processed. Consider collecting all failures instead of returning early, so that users can still receive responses for successfully processed files.
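
A rough sketch of the pattern this comment asks for, using the names from the snippet above; the loop over attachments is hypothetical, since the actual function handles one photo field and one document field per message:

```python
# Sketch only: record each failure and keep going instead of returning early.
for file_id, display_name in attachments:  # 'attachments' is a hypothetical list of (file_id, name) pairs
    file_info = await self._get_file_info(file_id)
    if not file_info:
        self._log.warning(f"Failed to get file info for {display_name}")
        failed_files.append(display_name)
        continue  # move on to the next attachment
    # ... download, base64-encode, and append the AgentRequest* object as in the PR ...
return failed_files
```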

Comment on lines +436 to +440
file_info = await self._get_file_info(file_id)
if not file_info:
    self._log.warning(f"Failed to get file info for {file_name}")
    failed_files.append(file_name)
    return failed_files

Copilot AI Dec 26, 2025


The function returns early when file info retrieval fails, preventing subsequent files from being processed. Consider continuing to process other files instead of returning immediately, and only add the failed file to the failed_files list.

@@ -1,11 +1,14 @@
import base64
import logging
import mimetypes

Copilot AI Dec 26, 2025


The import of 'mimetypes' is added but never used in the code. The MIME types are being determined from Telegram's API response or hardcoded values, not using the mimetypes module. Consider removing this unused import.

Suggested change
import mimetypes
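
If the import was meant to be used rather than dropped, a one-line fallback with the standard-library mimetypes module would look roughly like the following; `document` and `file_name` are the variables from the document-processing snippet further down, so this is a sketch of an alternative, not the PR's code:

```python
import mimetypes

# Prefer Telegram's MIME type, then guess from the file name, then fall back to a generic type.
mime_type = document.get("mime_type") or mimetypes.guess_type(file_name)[0] or "application/octet-stream"
```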

Comment on lines 170 to +179
reply = (await Runner.run(agent.agent, message_content, session=None)).final_output

# Manually save the multimodal conversation to session for future reference
if session:
    openai_session = session.get("openai") or session.set("openai", OpenAISession())
    # Add user message
    await openai_session.add_items([{"role": "user", "content": prompt}])
    # Add assistant response
    await openai_session.add_items([{"role": "assistant", "content": reply}])


Copilot AI Dec 26, 2025


The multimodal conversation is being manually saved to the session after the agent run completes. However, if an exception occurs during Runner.run() execution on line 170, the conversation won't be saved to the session. Consider using a try-finally block or moving the session save logic to ensure it executes even if an error occurs, or handle this case explicitly.

Suggested change
reply = (await Runner.run(agent.agent, message_content, session=None)).final_output
# Manually save the multimodal conversation to session for future reference
if session:
    openai_session = session.get("openai") or session.set("openai", OpenAISession())
    # Add user message
    await openai_session.add_items([{"role": "user", "content": prompt}])
    # Add assistant response
    await openai_session.add_items([{"role": "assistant", "content": reply}])

try:
    reply = (await Runner.run(agent.agent, message_content, session=None)).final_output
except Exception:
    # Even if the agent run fails, persist the user message for context
    if session:
        openai_session = session.get("openai") or session.set("openai", OpenAISession())
        await openai_session.add_items([{"role": "user", "content": prompt}])
    # Re-raise so the outer handler can return an error reply
    raise
else:
    # Manually save the multimodal conversation to session for future reference
    if session:
        openai_session = session.get("openai") or session.set("openai", OpenAISession())
        # Add user message
        await openai_session.add_items([{"role": "user", "content": prompt}])
        # Add assistant response
        await openai_session.add_items([{"role": "assistant", "content": reply}])

Comment on lines +440 to +462
    return failed_files

file_path = file_info.get("file_path")
file_size = file_info.get("file_size", 0)

# Download file
file_content = await self._download_telegram_file(file_path)
if file_content is None:
    self._log.warning(f"Failed to download file: {file_name}")
    failed_files.append(file_name)
    return failed_files

# Base64 encode
file_data_base64 = base64.b64encode(file_content).decode("utf-8")

# Add as file request
requests.append(
    AgentRequestFile(
        file_data=file_data_base64,
        name=file_name,
        mime_type=mime_type,
    )
)

Copilot AI Dec 26, 2025


The function returns early when file download fails, preventing subsequent files from being processed. Consider continuing to process other files instead of returning immediately, and only add the failed file to the failed_files list.

Suggested change
    return failed_files
file_path = file_info.get("file_path")
file_size = file_info.get("file_size", 0)
# Download file
file_content = await self._download_telegram_file(file_path)
if file_content is None:
    self._log.warning(f"Failed to download file: {file_name}")
    failed_files.append(file_name)
    return failed_files
# Base64 encode
file_data_base64 = base64.b64encode(file_content).decode("utf-8")
# Add as file request
requests.append(
    AgentRequestFile(
        file_data=file_data_base64,
        name=file_name,
        mime_type=mime_type,
    )
)

file_path = file_info.get("file_path") if file_info else None
file_size = file_info.get("file_size", 0) if file_info else 0
# Download file
if file_path:
    file_content = await self._download_telegram_file(file_path)
else:
    file_content = None
if file_content is None:
    self._log.warning(f"Failed to download file: {file_name}")
    failed_files.append(file_name)
else:
    # Base64 encode
    file_data_base64 = base64.b64encode(file_content).decode("utf-8")
    # Add as file request
    requests.append(
        AgentRequestFile(
            file_data=file_data_base64,
            name=file_name,
            mime_type=mime_type,
        )
    )

import hashlib
import hmac
import logging
import mimetypes

Copilot AI Dec 26, 2025


The import of 'mimetypes' is added but never used in the code. The MIME types are obtained from the HTTP response headers, not using the mimetypes module. Consider removing this unused import.

Suggested change
import mimetypes

Comment on lines +370 to +469
async def _process_files(self, message: dict, requests: list) -> list[str]:
    """
    Process files and images in a Telegram message.
    :param message: Message object from Telegram
    :param requests: List to append AgentRequestFile/AgentRequestImage objects
    :return: List of failed file names
    """
    failed_files = []

    # Process photos (images)
    if "photo" in message:
        photos = message.get("photo", [])
        if photos:
            # Get the largest photo
            largest_photo = photos[-1]
            file_id = largest_photo.get("file_id")

            try:
                self._log.debug(f"Processing photo: {file_id}")

                # Get file info
                file_info = await self._get_file_info(file_id)
                if not file_info:
                    self._log.warning("Failed to get photo file info")
                    failed_files.append("photo")
                    return failed_files

                file_path = file_info.get("file_path")
                file_size = file_info.get("file_size", 0)

                # Download file
                file_content = await self._download_telegram_file(file_path)
                if file_content is None:
                    self._log.warning(f"Failed to download photo")
                    failed_files.append("photo")
                    return failed_files

                # Base64 encode
                image_data_base64 = base64.b64encode(file_content).decode("utf-8")

                # Add as image request
                requests.append(
                    AgentRequestImage(
                        image_data=image_data_base64,
                        name="photo.jpg",
                        mime_type="image/jpeg",
                    )
                )
                self._log.debug(f"Added photo to request (size: {file_size} bytes)")

            except Exception as e:
                self._log.error(f"Error processing photo: {e}\n{traceback.format_exc()}")
                failed_files.append("photo")

    # Process documents (files)
    if "document" in message:
        document = message.get("document", {})
        file_id = document.get("file_id")
        file_name = document.get("file_name", "document")
        mime_type = document.get("mime_type", "application/octet-stream")

        try:
            self._log.debug(f"Processing document: {file_id} ({file_name})")

            # Get file info
            file_info = await self._get_file_info(file_id)
            if not file_info:
                self._log.warning(f"Failed to get file info for {file_name}")
                failed_files.append(file_name)
                return failed_files

            file_path = file_info.get("file_path")
            file_size = file_info.get("file_size", 0)

            # Download file
            file_content = await self._download_telegram_file(file_path)
            if file_content is None:
                self._log.warning(f"Failed to download file: {file_name}")
                failed_files.append(file_name)
                return failed_files

            # Base64 encode
            file_data_base64 = base64.b64encode(file_content).decode("utf-8")

            # Add as file request
            requests.append(
                AgentRequestFile(
                    file_data=file_data_base64,
                    name=file_name,
                    mime_type=mime_type,
                )
            )
            self._log.debug(f"Added file to request: {file_name} (size: {file_size} bytes)")

        except Exception as e:
            self._log.error(f"Error processing document: {e}\n{traceback.format_exc()}")
            failed_files.append(file_name)

    return failed_files

Copilot AI Dec 26, 2025


The file size validation logic is missing for Telegram file downloads. Unlike the Messenger integration which checks file size against max_file_size configuration (line 276 in messenger_chat.py), the Telegram integration downloads files without validating their size. This could lead to memory issues or timeouts when processing very large files. Consider adding size validation similar to the Messenger implementation.
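
A minimal sketch of the kind of check being suggested, validating the size reported by Telegram's getFile response before downloading; the `self._max_file_size` attribute is an assumption here, mirroring the `api.max_file_size` option the Messenger integration uses:

```python
# Sketch only: skip the download when the reported size exceeds the configured limit.
file_size = file_info.get("file_size", 0)
if file_size > self._max_file_size:  # assumed to be populated from the api.max_file_size config
    self._log.warning(f"File too large ({file_size} bytes), skipping: {file_name}")
    failed_files.append(file_name)
else:
    file_content = await self._download_telegram_file(file_info.get("file_path"))
    # ... base64-encode and append the request as in the existing code ...
```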

Comment on lines +163 to +168
if command == "/status":
    await self._send_message(chat_id, "✅ Bot is running!")
elif command == "/about":
    await self._send_message(chat_id, "I'm powered by Agent Kernel and OpenAI")
else:
    await super()._handle_command(chat_id, command)

Copilot AI Dec 26, 2025


Inconsistent whitespace detected. This line uses tabs or mixed indentation instead of spaces. The rest of the codebase uses spaces consistently. Please fix the indentation to use 4 spaces.

Suggested change
if command == "/status":
	await self._send_message(chat_id, "✅ Bot is running!")
elif command == "/about":
	await self._send_message(chat_id, "I'm powered by Agent Kernel and OpenAI")
else:
	await super()._handle_command(chat_id, command)

if command == "/status":
    await self._send_message(chat_id, "✅ Bot is running!")
elif command == "/about":
    await self._send_message(chat_id, "I'm powered by Agent Kernel and OpenAI")
else:
    await super()._handle_command(chat_id, command)


**Limitations:**

- Maximum file size: ~2MB

Copilot AI Dec 26, 2025


The documentation states a maximum file size of "~2MB" but doesn't reference the actual configuration option or explain how to change it. Consider adding a reference to the configuration option (api.max_file_size) and clarifying the exact default value to match the Messenger documentation which specifies "2 MB (2,097,152 bytes)".

Suggested change
- Maximum file size: ~2MB
- Maximum file size: 2 MB (2,097,152 bytes) by default, controlled by the `api.max_file_size` configuration option (adjust this setting to change the limit)
