feat(serialization): introduce dedicated serialization module#118
feat(serialization): introduce dedicated serialization module#118
Conversation
Separates serialization logic from API clients (OpenAI, GoogleGenAI) and makes it more customizable per model or provider. - Introduced a unified serialization pipeline through the BaseSerializer and its implementations: OpenAICompletionSerializer, ModelProxyOpenAISerializer, and GenAiSerializer. - Removed inline message serialization loops (like _get_raw_messages) from LLM classes, delegating it directly to the corresponding serializer instances.
351b090 to
51feb95
Compare
dolaameng
left a comment
There was a problem hiding this comment.
Thanks for the improvement! Left some comments regarding differences from regression tests.
There was a problem hiding this comment.
nit: Unfortunately I have a "serialization" module that does different things: src/kaggle_benchmarks/kaggle/serialization.py.
Shall we rename one or the other to make it less confusing? It's probably easier to just rename "src/kaggle_benchmarks/kaggle/serialization.py".
There was a problem hiding this comment.
I'm fine with keeping this as is for now. We should discuss a better overall structure, as the current one causes frequent cyclic import issues. One idea:
src/kb/
- openai/
- models.py / or client.py
- serializers.py
- google/
...
- kaggle
- model_proxy
| yield { | ||
| "role": self.get_role(message.sender), | ||
| "content": [ | ||
| {"type": "video_url", "video_url": {"url": video.url}}, |
There was a problem hiding this comment.
All test_video_url golden tests failed. I don't think "video_url" will work for MP. It should still be "content": [{"type": "image_url", "image_url": {"url": video.url}}] .
@mohami2000 can you confirm?
There was a problem hiding this comment.
You are right, as per VideoContent.get_payload
My key didn't let me test it.
| yield { | ||
| "role": self.get_role(message.sender), | ||
| "content": [ | ||
| {"type": "text", "text": video.url}, |
There was a problem hiding this comment.
I am not familiar with this, can you confirm "text" is the right type here?
| }, | ||
| ], | ||
| } | ||
| ] |
There was a problem hiding this comment.
Some regression tests to show the difference; not all are needed, but we should fix the failing ones.
| ] | |
| ] | |
| # --------------------------------------------------------------------------- | |
| # Regression tests for old-vs-new serialization differences | |
| # --------------------------------------------------------------------------- | |
| import dataclasses | |
| import pydantic | |
| class _SamplePydanticModel(pydantic.BaseModel): | |
| name: str | |
| value: int | |
| @dataclasses.dataclass | |
| class _SampleDataclass: | |
| name: str | |
| value: int | |
| class TestModelProxyImageWithCaption: | |
| """Images with captions should include both the caption text and image data.""" | |
| def test_image_without_caption_includes_image_data(self): | |
| serializer = openai_serializer.ModelProxyOpenAISerializer(roles_mapping={}) | |
| image = ImageBase64(b64_string="abc123", mime_type="image/png") | |
| msg = messages.Message(content=image, sender=actors.user) | |
| result = list(serializer.dump_message(msg)) | |
| content = result[0]["content"] | |
| types_in_content = [item["type"] for item in content] | |
| assert "image_url" in types_in_content | |
| # FIXME: Operator precedence bug in dump_image — the `else [] + [image_part]` | |
| # binds as `else ([] + [image_part])`, so when a caption IS present the image | |
| # data is silently dropped and only the caption text is emitted. | |
| def test_image_with_caption_includes_both_caption_and_image_data(self): | |
| serializer = openai_serializer.ModelProxyOpenAISerializer(roles_mapping={}) | |
| image = ImageBase64(b64_string="abc123", mime_type="image/png", caption="A cat") | |
| msg = messages.Message(content=image, sender=actors.user) | |
| result = list(serializer.dump_message(msg)) | |
| content = result[0]["content"] | |
| types_in_content = [item["type"] for item in content] | |
| # Both caption text and image data must be present | |
| assert "text" in types_in_content | |
| assert "image_url" in types_in_content | |
| class TestModelProxyVideoSerialization: | |
| """Videos should use image_url as a generic file URL carrier for Model Proxy. | |
| Model Proxy doesn't support the video_url content type and rejects it | |
| with a 400 error. The old code used image_url as a carrier, which worked. | |
| """ | |
| # FIXME: dump_video uses {"type": "video_url", ...} but Model Proxy only | |
| # accepts {"type": "image_url", ...} as a generic file URL carrier. | |
| # The old code used VideoContent.get_payload() which produced image_url. | |
| def test_video_uses_image_url_carrier(self): | |
| serializer = openai_serializer.ModelProxyOpenAISerializer(roles_mapping={}) | |
| video = videos.VideoURL(url="https://youtube.com/watch?v=dummy") | |
| msg = messages.Message(content=video, sender=actors.user) | |
| result = list(serializer.dump_message(msg)) | |
| content = result[0]["content"] | |
| assert content == [ | |
| { | |
| "type": "image_url", | |
| "image_url": {"url": "https://youtube.com/watch?v=dummy"}, | |
| } | |
| ] | |
| class TestModelProxyDictContent: | |
| """Dict content should be serialized as a JSON string.""" | |
| # FIXME: dump_json_message calls message.copy() which doesn't exist on | |
| # the Message dataclass. Should use dataclasses.replace() or similar. | |
| def test_dict_content_serialized_as_json(self): | |
| serializer = openai_serializer.ModelProxyOpenAISerializer(roles_mapping={}) | |
| msg = messages.Message(content={"key": "value"}, sender=actors.user) | |
| result = list(serializer.dump_message(msg)) | |
| assert result == [{"role": "user", "content": '{"key": "value"}'}] | |
| class TestModelProxyPydanticContent: | |
| """Pydantic model content should be serialized as a JSON string.""" | |
| # FIXME: Pydantic models go through base.dump_message → copy.copy(msg) + | |
| # model_dump() → dump_json_message → message.copy() crash (same root cause | |
| # as dict content above). | |
| def test_pydantic_model_serialized_as_json(self): | |
| import json | |
| serializer = openai_serializer.ModelProxyOpenAISerializer(roles_mapping={}) | |
| model = _SamplePydanticModel(name="test", value=42) | |
| msg = messages.Message(content=model, sender=actors.user) | |
| result = list(serializer.dump_message(msg)) | |
| assert result[0]["role"] == "user" | |
| # Content should be a valid JSON string matching the model | |
| parsed = json.loads(result[0]["content"]) | |
| assert parsed == {"name": "test", "value": 42} | |
| class TestModelProxyDataclassContent: | |
| """Dataclass content should be serialized as a JSON string, not Python repr.""" | |
| # FIXME: Dataclass content is not matched by any isinstance check in | |
| # dump_message, so it falls through to _dump_message which uses str(content) | |
| # producing Python repr instead of JSON. | |
| def test_dataclass_serialized_as_json(self): | |
| import json | |
| serializer = openai_serializer.ModelProxyOpenAISerializer(roles_mapping={}) | |
| dc = _SampleDataclass(name="test", value=42) | |
| msg = messages.Message(content=dc, sender=actors.user) | |
| result = list(serializer.dump_message(msg)) | |
| assert result[0]["role"] == "user" | |
| # Content should be a valid JSON string, not Python repr | |
| parsed = json.loads(result[0]["content"]) | |
| assert parsed == {"name": "test", "value": 42} | |
| class TestModelProxyToolInvocationResult: | |
| """Standalone ToolInvocationResult should be serialized as text content. | |
| Model Proxy doesn't support native tool calls, so the result should be | |
| rendered as a human-readable text message rather than silently dropped. | |
| """ | |
| # FIXME: ModelProxyOpenAISerializer._dump_invocation is a noop (`if False: yield`), | |
| # so standalone ToolInvocationResult messages are silently dropped. | |
| def test_standalone_tool_result_serialized_as_text(self): | |
| serializer = openai_serializer.ModelProxyOpenAISerializer(roles_mapping={}) | |
| tool_result = ToolInvocationResult( | |
| name="calc", arguments={"a": 1}, call_id="c1", output="42" | |
| ) | |
| msg = messages.Message(content=tool_result, sender=actors.user) | |
| result = list(serializer.dump_message(msg)) | |
| # Should produce at least one message, not silently drop the content | |
| assert len(result) > 0 | |
| assert result[0]["role"] == "user" | |
| class TestModelProxyRoleMapping: | |
| """Verifies that the production role mapping {"tool": "system"} works correctly.""" | |
| def test_tool_role_maps_to_system(self): | |
| serializer = openai_serializer.ModelProxyOpenAISerializer( | |
| roles_mapping={"tool": "system"} | |
| ) | |
| tool_actor = actors.Actor(name="tool_actor", role="tool") | |
| msg = messages.Message(content="tool output", sender=tool_actor) | |
| result = list(serializer.dump_message(msg)) | |
| assert result[0]["role"] == "system" | |
| def test_user_role_passes_through(self): | |
| serializer = openai_serializer.ModelProxyOpenAISerializer( | |
| roles_mapping={"tool": "system"} | |
| ) | |
| msg = messages.Message(content="hello", sender=actors.user) | |
| result = list(serializer.dump_message(msg)) | |
| assert result[0]["role"] == "user" | |
| def test_assistant_role_passes_through(self): | |
| serializer = openai_serializer.ModelProxyOpenAISerializer( | |
| roles_mapping={"tool": "system"} | |
| ) | |
| assistant = actors.Actor(name="assistant", role="assistant") | |
| msg = messages.Message(content="response", sender=assistant) | |
| result = list(serializer.dump_message(msg)) | |
| assert result[0]["role"] == "assistant" | |
| def dump_json_message(self, message: msg.Message[dict]): | ||
| """Serializes a JSON dictionary message by stringifying it as text by default.""" | ||
| yield from self.dump_text_message( | ||
| message.copy(new_content=json.dumps(message.content)) |
There was a problem hiding this comment.
Leftover from an intermediate state. I opted not to implement Message.copy.
| ], | ||
| } | ||
|
|
||
| def _dump_invocation(self, tool): |
There was a problem hiding this comment.
@s-alexey I think we can just remove this because MP should support tool calling. wdyt?
dolaameng
left a comment
There was a problem hiding this comment.
LGTM. Thanks for the improvement!
|
Golden test and new unit tests passed. Commit it now. |
Separates serialization logic from API clients (OpenAI, GoogleGenAI) and makes it more customizable per model or provider.