@pulinduvidmal
Contributor

Description

This change adds file and image support for Instagram integrations.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (no functional changes)
  • Performance improvement
  • Test update
  • CI/CD update
  • Other (please describe):

Testing

  • Unit tests pass locally
  • Integration tests pass locally
  • Manual testing completed
  • New tests added for changes

Checklist

  • My code follows the project's style guidelines
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published

Screenshots (if applicable)

Additional Notes

Contributor

Copilot AI left a comment

Pull request overview

This pull request adds multimodal file and image support for Instagram, Telegram, and Messenger integrations, enabling agents to analyze photos, documents, and other file types alongside text messages.

Key Changes:

  • Implementation of file download and base64 encoding for Telegram, Messenger, and Instagram
  • Addition of multimodal request handling using AgentRequestFile and AgentRequestImage
  • Session context preservation for multimodal conversations in OpenAI framework
  • Documentation updates explaining multimodal features and usage examples

Reviewed changes

Copilot reviewed 19 out of 35 changed files in this pull request and generated 14 comments.

| File | Description |
| --- | --- |
| examples/containerized/openai/uv.lock | Downgraded openai-agents from 0.6.4 to 0.6.3 and updated revision number |
| examples/cli/*/uv.lock | Updated revision numbers across multiple CLI examples |
| examples/api/whatsapp/uv.lock | Changed agentkernel source from local to PyPI |
| examples/api/telegram/uv.lock | Updated revision and downgraded openai-agents |
| examples/api/telegram/server_adk.py | Added new Google ADK server example with multimodal support |
| examples/api/telegram/build.sh | Added ADK extras to build dependencies |
| examples/api/telegram/README.md | Comprehensive documentation of multimodal features, file types, and usage |
| examples/api/messenger/uv.lock | Changed agentkernel source to local, downgraded multiple packages |
| examples/api/messenger/server_adk.py | Added Google ADK server example for Messenger |
| examples/api/messenger/build.sh | Added ADK extras to build script |
| examples/api/messenger/README.md | Added multimodal documentation with file support details |
| examples/api/instagram/uv.lock | Updated revision and downgraded openai-agents |
| examples/api/instagram/server_adk.py | Added Google ADK server example for Instagram |
| examples/api/instagram/build.sh | Added ADK extras to build dependencies |
| examples/api/instagram/README.md | Added multimodal features documentation |
| docs/docs/integrations/telegram.md | Updated integration docs with multimodal features and examples |
| docs/docs/integrations/messenger.md | Added comprehensive multimodal documentation |
| docs/docs/integrations/instagram.md | Added multimodal features and file handling documentation |
| ak-py/src/agentkernel/integration/telegram/telegram_chat.py | Implemented file/image download, processing, and multimodal request handling |
| ak-py/src/agentkernel/integration/telegram/README.md | Added multimodal support documentation for integration |
| ak-py/src/agentkernel/integration/messenger/messenger_chat.py | Added attachment processing with file size validation |
| ak-py/src/agentkernel/integration/messenger/README.md | Updated with multimodal features and file handling details |
| ak-py/src/agentkernel/integration/instagram/instagram_chat.py | Implemented attachment processing for images and files |
| ak-py/src/agentkernel/integration/instagram/README.md | Added multimodal support documentation |
| ak-py/src/agentkernel/framework/openai/openai.py | Added manual session context preservation for multimodal requests |


### Multimodal Limitations

- File size: Max 2MB

Copilot AI Dec 26, 2025

The documentation states "File size: Max 2MB" but doesn't mention that this limit is configurable. This could confuse users who want to adjust the limit. Consider updating this to reflect that it's configurable via the api.max_file_size configuration option, as mentioned elsewhere in the documentation.

Suggested change
- File size: Max 2MB
- File size: Max 2MB by default (configurable via `api.max_file_size`)
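
As a rough illustration of what a configurable limit could look like, the sketch below resolves a byte limit from a settings mapping and falls back to 2 MB. Only the `api.max_file_size` key comes from the comment above. The nested-dict config shape, the helper names, and the assumption that the value is expressed in bytes are hypothetical and not taken from Agent Kernel.

```python
DEFAULT_MAX_FILE_SIZE = 2 * 1024 * 1024  # 2 MB, the documented default


def resolve_max_file_size(config: dict) -> int:
    """Return the configured limit in bytes, or the 2 MB default.

    Assumption: config is a plain nested dict such as
    {"api": {"max_file_size": 5242880}}; the real config object may differ.
    """
    value = config.get("api", {}).get("max_file_size")
    if value is None:
        return DEFAULT_MAX_FILE_SIZE
    limit = int(value)
    if limit <= 0:
        raise ValueError("api.max_file_size must be a positive number of bytes")
    return limit


def is_within_limit(data: bytes, config: dict) -> bool:
    """True when the payload fits under the resolved limit."""
    return len(data) <= resolve_max_file_size(config)
```

With no `api` section present, `resolve_max_file_size({})` falls back to the default, which matches the "Max 2MB by default" wording in the suggestion.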

Copilot uses AI. Check for mistakes.
Comment on lines +163 to +168
if command == "/status":
await self._send_message(chat_id, "✅ Bot is running!")
elif command == "/about":
await self._send_message(chat_id, "I'm powered by Agent Kernel and OpenAI")
else:
await super()._handle_command(chat_id, command)

Copilot AI Dec 26, 2025

The indentation is inconsistent here - lines 163-168 use incorrect indentation that doesn't align with the surrounding code structure. These lines appear to be orphaned code fragments that should be part of a code example but are instead rendered as plain text in the markdown.

Comment on lines +393 to +396
if not file_info:
    self._log.warning("Failed to get photo file info")
    failed_files.append("photo")
    return failed_files

Copilot AI Dec 26, 2025

The _process_files method returns early on any file processing failure, preventing subsequent files from being processed. If a user sends multiple files and the first one fails, the remaining files will be ignored. Consider continuing to process other files even if one fails, and only returning the list of failed files at the end of the method after attempting to process all files.
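
A minimal sketch of the "collect failures and keep going" pattern described above. It is not the PR's `_process_files` implementation: the `process_files` function, the dict-based file entries, and the injected `download` callable are hypothetical stand-ins, so the example runs without Telegram.

```python
import asyncio
import base64
from typing import Awaitable, Callable, Optional


async def process_files(
    files: list[dict],
    download: Callable[[str], Awaitable[Optional[bytes]]],
) -> tuple[list[dict], list[str]]:
    """Try every file; return (encoded_requests, failed_file_names)."""
    requests: list[dict] = []
    failed_files: list[str] = []
    for item in files:
        content = await download(item["file_path"])
        if content is None:
            # Record the failure and move on instead of returning early.
            failed_files.append(item["name"])
            continue
        requests.append(
            {
                "name": item["name"],
                "file_data": base64.b64encode(content).decode("utf-8"),
            }
        )
    # Failures are reported once, after every file has been attempted.
    return requests, failed_files


async def _demo() -> tuple[list[dict], list[str]]:
    async def fake_download(path: str) -> Optional[bytes]:
        # Stand-in downloader: paths starting with "bad" simulate a failure.
        return None if path.startswith("bad") else b"hello"

    files = [
        {"name": "a.txt", "file_path": "bad/a.txt"},
        {"name": "b.txt", "file_path": "ok/b.txt"},
    ]
    return await process_files(files, fake_download)


requests, failed = asyncio.run(_demo())
```

In the demo the first download fails, yet the second file is still encoded and returned, and the failed name is reported alongside it.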

Comment on lines +440 to +464
    return failed_files

file_path = file_info.get("file_path")
file_size = file_info.get("file_size", 0)

# Download file
file_content = await self._download_telegram_file(file_path)
if file_content is None:
    self._log.warning(f"Failed to download file: {file_name}")
    failed_files.append(file_name)
    return failed_files

# Base64 encode
file_data_base64 = base64.b64encode(file_content).decode("utf-8")

# Add as file request
requests.append(
    AgentRequestFile(
        file_data=file_data_base64,
        name=file_name,
        mime_type=mime_type,
    )
)
self._log.debug(f"Added file to request: {file_name} (size: {file_size} bytes)")

Copilot AI Dec 26, 2025

The file download from Telegram returns early on failure at line 450, preventing subsequent files from being processed. This creates an inconsistent user experience where some files might be analyzed while others are silently ignored if one fails in the middle.

Suggested change
-     return failed_files
- file_path = file_info.get("file_path")
- file_size = file_info.get("file_size", 0)
- # Download file
- file_content = await self._download_telegram_file(file_path)
- if file_content is None:
-     self._log.warning(f"Failed to download file: {file_name}")
-     failed_files.append(file_name)
-     return failed_files
- # Base64 encode
- file_data_base64 = base64.b64encode(file_content).decode("utf-8")
- # Add as file request
- requests.append(
-     AgentRequestFile(
-         file_data=file_data_base64,
-         name=file_name,
-         mime_type=mime_type,
-     )
- )
- self._log.debug(f"Added file to request: {file_name} (size: {file_size} bytes)")
+ else:
+     file_path = file_info.get("file_path")
+     file_size = file_info.get("file_size", 0)
+     # Download file
+     file_content = await self._download_telegram_file(file_path)
+     if file_content is None:
+         self._log.warning(f"Failed to download file: {file_name}")
+         failed_files.append(file_name)
+     else:
+         # Base64 encode
+         file_data_base64 = base64.b64encode(file_content).decode("utf-8")
+         # Add as file request
+         requests.append(
+             AgentRequestFile(
+                 file_data=file_data_base64,
+                 name=file_name,
+                 mime_type=mime_type,
+             )
+         )
+         self._log.debug(
+             f"Added file to request: {file_name} (size: {file_size} bytes)"
+         )

@@ -1,11 +1,14 @@
import base64
import logging
import mimetypes

Copilot AI Dec 26, 2025

The mimetypes module is imported but never used in this file. Consider removing this unused import to keep the code clean.

Suggested change
import mimetypes

Comment on lines +251 to +311
async def _process_attachment(self, attachment: dict, requests: list):
    """
    Process a Messenger attachment (image or file).
    :param attachment: Attachment object from message
    :param requests: List to append the processed request to
    """
    attachment_type = attachment.get("type")
    payload = attachment.get("payload", {})
    url = payload.get("url")

    if not url:
        self._log.warning(f"Attachment has no URL: {attachment}")
        return

    try:
        # Download the attachment
        async with httpx.AsyncClient() as client:
            response = await client.get(url, timeout=10.0)
            response.raise_for_status()
            file_data = response.content

        # Check file size
        if len(file_data) > self._max_file_size:
            self._log.warning(f"Attachment size ({len(file_data) / (1024 * 1024):.2f} MB) exceeds maximum allowed size of {self._max_file_size / (1024 * 1024):.2f} MB")
            return

        # Encode to base64
        file_data_base64 = base64.b64encode(file_data).decode("utf-8")

        # Get MIME type
        mime_type = response.headers.get("content-type", "application/octet-stream")

        # Extract filename from URL if available
        filename = url.split("/")[-1].split("?")[0] or f"attachment_{len(requests)}"

        self._log.debug(f"Downloaded {attachment_type} attachment: {filename} (size: {len(file_data)} bytes, type: {mime_type})")

        # Classify based on attachment type and MIME type
        if attachment_type == "image" or (mime_type and mime_type.startswith("image/")):
            self._log.debug(f"Adding image: {filename}")
            requests.append(
                AgentRequestImage(
                    image_data=file_data_base64,
                    name=filename,
                    mime_type=mime_type,
                )
            )
        else:
            self._log.debug(f"Adding file: {filename}")
            requests.append(
                AgentRequestFile(
                    file_data=file_data_base64,
                    name=filename,
                    mime_type=mime_type,
                )
            )

    except Exception as e:
        self._log.error(f"Error processing attachment: {e}\n{traceback.format_exc()}")

Copilot AI Dec 26, 2025

The code has significant duplication across Telegram, Messenger, and Instagram integrations. The _process_attachment method in both Messenger and Instagram handlers is nearly identical (lines 251-311 in messenger_chat.py and lines 272-336 in instagram_chat.py). Consider extracting this common file-processing logic into a shared utility function or base class method to improve maintainability and reduce the risk of inconsistent behavior.
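
One possible shape for such a shared helper, sketched under assumptions: the size check, MIME fallback, filename extraction, and image/file classification mirror the Messenger snippet quoted above, while the `EncodedAttachment` dataclass and the `encode_attachment` function are hypothetical names that do not exist in the PR. The download step is left to the caller so the helper stays free of network code.

```python
import base64
from dataclasses import dataclass
from typing import Optional


@dataclass
class EncodedAttachment:
    kind: str  # "image" or "file"
    data_base64: str
    name: str
    mime_type: str


def encode_attachment(
    url: str,
    content: bytes,
    content_type: Optional[str],
    attachment_type: Optional[str],
    max_file_size: int,
    index: int = 0,
) -> Optional[EncodedAttachment]:
    """Validate, name, classify, and base64-encode a downloaded attachment.

    Returns None when the payload exceeds max_file_size (in bytes).
    """
    if len(content) > max_file_size:
        return None

    mime_type = content_type or "application/octet-stream"
    # Same filename rule as the quoted snippet: last path segment, query stripped.
    name = url.split("/")[-1].split("?")[0] or f"attachment_{index}"
    is_image = attachment_type == "image" or mime_type.startswith("image/")

    return EncodedAttachment(
        kind="image" if is_image else "file",
        data_base64=base64.b64encode(content).decode("utf-8"),
        name=name,
        mime_type=mime_type,
    )
```

Each integration could then map `kind` to `AgentRequestImage` or `AgentRequestFile` in one place, so a change to the size or classification rules would apply to Messenger and Instagram alike.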

### Supported File Types

| Type | Format | Use Case |
| ------------------- | --------------- | ---------------------- |

Copilot AI Dec 26, 2025

The markdown table has inconsistent spacing in the header. The pipe characters and dashes should be aligned properly for better readability. Line 156 uses | ------------------- | --------------- | ---------------------- | while other tables in the file use consistent spacing.

Suggested change
| ------------------- | --------------- | ---------------------- |
|---------------------|----------------|------------------------|

Comment on lines +246 to +249
if len(requests) > 1 or any(isinstance(r, (AgentRequestFile, AgentRequestImage)) for r in requests):
    result = await service.run_multi(requests=requests)
else:
    result = await service.run(message_text) if message_text else None

Copilot AI Dec 26, 2025

The logic on lines 246-249 has the same issue as in the Messenger integration. When there's only one request and it's an AgentRequestFile or AgentRequestImage (no text), the code will try to call service.run(message_text) where message_text is an empty string. Consider using run_multi for all cases where attachments are present.
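
A small sketch of a dispatch rule that makes the attachment case explicit: any attachment (or more than one request) goes to `run_multi`, plain text goes to `run`, and an empty message sends nothing. The `Attachment` and `FakeService` classes and the `dispatch` function are hypothetical stand-ins so the example runs on its own; only the `run` and `run_multi` call shapes are taken from the quoted snippet.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attachment:
    """Stand-in for AgentRequestFile / AgentRequestImage."""

    name: str


class FakeService:
    """Hypothetical service exposing the two calls used in the snippet."""

    async def run(self, text: str) -> str:
        return f"run:{text}"

    async def run_multi(self, requests: list) -> str:
        return f"run_multi:{len(requests)}"


async def dispatch(service, requests: list, message_text: str) -> Optional[str]:
    has_attachment = any(isinstance(r, Attachment) for r in requests)
    if has_attachment or len(requests) > 1:
        # Attachments always take the multimodal path, even without text.
        return await service.run_multi(requests=requests)
    if message_text:
        return await service.run(message_text)
    return None  # nothing to send
```

Separating the three branches makes it easier to see at a glance that an attachment-only message never reaches `run` with an empty string.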

Comment on lines +436 to +440
file_info = await self._get_file_info(file_id)
if not file_info:
    self._log.warning(f"Failed to get file info for {file_name}")
    failed_files.append(file_name)
    return failed_files

Copilot AI Dec 26, 2025

Similar to the photo handling, the document processing returns early on failure at line 440, preventing any subsequent files from being processed. This means if processing the document fails, the method immediately returns and doesn't attempt to process any additional attachments that might be in the message.

import hashlib
import hmac
import logging
import mimetypes

Copilot AI Dec 26, 2025

The mimetypes module is imported but never used in this file. This is duplicated across all three integration files (Telegram, Messenger, Instagram). Consider removing this unused import.

Suggested change
import mimetypes
