Conversation


@codeflash-ai codeflash-ai bot commented Nov 12, 2025

📄 26% (0.26x) speedup for MarkupLMEmbeddings.create_position_ids_from_input_ids in src/transformers/models/markuplm/modeling_markuplm.py

⏱️ Runtime : 1.88 milliseconds → 1.49 milliseconds (best of 191 runs)

📝 Explanation and details

The optimized code achieves a 26% speedup by eliminating unnecessary tensor operations and type conversions in the create_position_ids_from_input_ids method.

Key optimizations applied (a before/after sketch follows this list):

  1. Eliminated redundant .int() cast: The original code converted the boolean mask to int unnecessarily. The optimized version keeps the mask as a boolean tensor, which PyTorch can work with directly in torch.cumsum().

  2. Removed .type_as() operation: The original code used .type_as(mask) to match tensor types, but this is redundant since torch.cumsum() on boolean tensors already returns the appropriate integer type (long).

  3. Simplified conditional addition: Instead of always adding past_key_values_length in the expression, the optimized code only performs the addition when past_key_values_length != 0, avoiding unnecessary operations in the common case where it's zero.

  4. Eliminated final .long() cast: The cumsum operation already produces long tensors, making the explicit cast redundant.
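
For reference, here is a minimal before/after sketch reconstructed from the description above. It is not the literal PR diff, and the actual method in modeling_markuplm.py may differ in naming and signature.

import torch

def create_position_ids_original(input_ids, padding_idx, past_key_values_length=0):
    # Original pattern: cast the boolean mask to int, then type_as() and a final .long()
    mask = input_ids.ne(padding_idx).int()
    incremental_indices = (torch.cumsum(mask, dim=1).type_as(mask) + past_key_values_length) * mask
    return incremental_indices.long() + padding_idx

def create_position_ids_optimized(input_ids, padding_idx, past_key_values_length=0):
    # Optimized pattern: keep the mask boolean; cumsum on a bool tensor already yields int64
    mask = input_ids.ne(padding_idx)
    incremental_indices = torch.cumsum(mask, dim=1)
    if past_key_values_length != 0:  # skip the addition in the common zero case
        incremental_indices = incremental_indices + past_key_values_length
    return incremental_indices * mask + padding_idx

Both variants produce the same long-dtype position ids: padded positions stay at padding_idx, and non-padded positions count up from padding_idx + 1.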

Why this leads to speedup (a quick dtype check follows these bullets):

  • Fewer tensor allocations: Each avoided type conversion (.int(), .type_as(), .long()) eliminates temporary tensor creation
  • Reduced memory bandwidth: Boolean tensors are more memory-efficient than integer tensors for masks
  • Conditional optimization: The if check for past_key_values_length != 0 avoids arithmetic operations in ~87.5% of test cases (35 out of 40 calls in the profiler)
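
A quick, illustrative check of the dtype behavior described above (not part of the PR):

import torch

ids = torch.tensor([[5, 0, 6, 7]])
mask = ids.ne(0)                       # torch.bool mask, 1 byte per element
positions = torch.cumsum(mask, dim=1)  # automatically promoted to torch.int64
print(mask.dtype, positions.dtype)     # torch.bool torch.int64
print(positions * mask)                # tensor([[1, 0, 2, 3]])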

Performance characteristics by test case:

  • Best gains (30-40% faster): Cases with all padding or simple patterns where the boolean mask operations shine
  • Consistent gains (15-28% faster): All other test cases benefit from reduced allocations
  • Larger sequences: The optimization scales well with sequence length, maintaining ~20-28% improvements

The optimization is particularly valuable since this function is likely called frequently in transformer model forward passes, making even small per-call improvements significant for overall model performance.
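
As a usage sketch mirroring how the generated tests below call the method (the token ids here are illustrative):

import torch
from transformers.models.markuplm.modeling_markuplm import MarkupLMEmbeddings

input_ids = torch.tensor([[7, 0, 8, 9]])  # 0 acts as the padding id in this example
padding_idx = 0
position_ids = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx)
print(position_ids)  # tensor([[1, 0, 2, 3]]): padding stays at padding_idx, other tokens count up from it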

Correctness verification report:

Test                           Status
⚙️ Existing Unit Tests         🔘 None Found
🌀 Generated Regression Tests  40 Passed
⏪ Replay Tests                🔘 None Found
🔎 Concolic Coverage Tests     🔘 None Found
📊 Tests Coverage              100.0%
🌀 Generated Regression Tests and Runtime

import pytest # used for our unit tests
import torch # used for tensor operations
from transformers.models.markuplm.modeling_markuplm import MarkupLMEmbeddings

# unit tests

# ----------- Basic Test Cases -----------

def test_basic_no_padding():
    # All tokens are non-padding, padding_idx=0
    input_ids = torch.tensor([[1, 2, 3, 4]])
    padding_idx = 0
    # Positions should be [1,2,3,4] + padding_idx = [1,2,3,4]
    expected = torch.tensor([[1,2,3,4]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 48.2μs -> 38.2μs (26.2% faster)

def test_basic_with_padding_middle():
    # Padding in the middle
    input_ids = torch.tensor([[5, 0, 6, 7]])
    padding_idx = 0
    # Positions: [1,0,2,3] + padding_idx = [1,0,2,3]
    expected = torch.tensor([[1,0,2,3]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 46.5μs -> 35.0μs (32.8% faster)

def test_basic_all_padding():
    # All tokens are padding
    input_ids = torch.tensor([[0,0,0,0]])
    padding_idx = 0
    # All positions should be 0
    expected = torch.tensor([[0,0,0,0]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 45.0μs -> 32.0μs (40.3% faster)

def test_basic_batch_multiple_rows():
    # Batch of 2 sentences
    input_ids = torch.tensor([[1,2,0,3], [0,4,5,0]])
    padding_idx = 0
    # First row: [1,2,0,3] -> positions [1,2,0,3]
    # Second row: [0,4,5,0] -> positions [0,1,2,0]
    expected = torch.tensor([[1,2,0,3],[0,1,2,0]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 43.9μs -> 33.8μs (30.0% faster)

def test_basic_nonzero_padding_idx():
    # Padding index is not zero
    input_ids = torch.tensor([[3,3,1,2]])
    padding_idx = 3
    # Only [1,2] are non-padding: cumsum*mask = [0,0,1,2], + padding_idx = [3,3,4,5]
    expected = torch.tensor([[3,3,4,5]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 44.2μs -> 34.1μs (29.8% faster)

def test_basic_past_key_values_length():
    # Test with past_key_values_length
    input_ids = torch.tensor([[1,0,2,3]])
    padding_idx = 0
    past_key_values_length = 5
    # Positions: [6,0,7,8]
    expected = torch.tensor([[6,0,7,8]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx, past_key_values_length); result = codeflash_output # 42.2μs -> 35.8μs (17.9% faster)

# ----------- Edge Test Cases -----------

def test_edge_empty_tensor():
    # Empty tensor
    input_ids = torch.empty((0,0), dtype=torch.long)
    padding_idx = 0
    expected = torch.empty((0,0), dtype=torch.long)
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 40.7μs -> 33.0μs (23.5% faster)

def test_edge_single_token_non_padding():
    # Single token, non-padding
    input_ids = torch.tensor([[42]])
    padding_idx = 0
    expected = torch.tensor([[1]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 43.0μs -> 33.5μs (28.4% faster)

def test_edge_single_token_padding():
    # Single token, padding
    input_ids = torch.tensor([[0]])
    padding_idx = 0
    expected = torch.tensor([[0]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 42.5μs -> 33.1μs (28.5% faster)

def test_edge_all_padding_nonzero_idx():
    # All padding, nonzero padding idx
    input_ids = torch.tensor([[3,3,3]])
    padding_idx = 3
    expected = torch.tensor([[3,3,3]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 40.5μs -> 35.0μs (15.7% faster)

def test_edge_alternating_padding_nonpadding():
    # Alternating padding and non-padding tokens
    input_ids = torch.tensor([[0,1,0,2,0,3]])
    padding_idx = 0
    # Positions: [0,1,0,2,0,3]
    expected = torch.tensor([[0,1,0,2,0,3]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 42.9μs -> 32.0μs (34.1% faster)

def test_edge_large_padding_idx():
    # Large padding index
    input_ids = torch.tensor([[100,101,100,102]])
    padding_idx = 100
    # cumsum*mask = [0,1,0,2], + padding_idx = [100,101,100,102]
    expected = torch.tensor([[100,101,100,102]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 46.1μs -> 36.7μs (25.6% faster)

def test_edge_negative_padding_idx():
    # Negative padding index (should work for negative values)
    input_ids = torch.tensor([[-1, 2, -1, 3]])
    padding_idx = -1
    # mask: [0,1,0,1]
    # cumsum: [0,1,1,2], after masking: [0,1,0,2]
    # positions: [0,1,0,2] + padding_idx = [-1,0,-1,1]
    expected = torch.tensor([[-1,0,-1,1]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 46.0μs -> 32.9μs (39.7% faster)

# ----------- Large Scale Test Cases -----------

def test_large_scale_long_sequence():
    # Sequence of length 1000, no padding
    input_ids = torch.arange(1, 1001).unsqueeze(0) # shape (1,1000)
    padding_idx = 0
    # Positions should be [1,2,3,...,1000]
    expected = torch.arange(1, 1001).unsqueeze(0)
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 42.7μs -> 35.1μs (21.7% faster)

def test_large_scale_long_sequence_with_padding():
    # Sequence of length 1000, padding every 10th token
    input_ids = torch.arange(1, 1001)
    input_ids[::10] = 0 # set every 10th token to padding
    input_ids = input_ids.unsqueeze(0)
    padding_idx = 0
    # Positions: positions increment for non-padding, 0 for padding
    expected = torch.zeros_like(input_ids)
    counter = 0
    for i in range(input_ids.shape[1]):
        if input_ids[0,i] != padding_idx:
            counter += 1
            expected[0,i] = counter
        else:
            expected[0,i] = 0
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 82.6μs -> 71.4μs (15.7% faster)

def test_large_scale_batch():
    # Batch of 10 sequences, each length 100
    batch_size = 10
    seq_len = 100
    input_ids = torch.arange(1, seq_len+1).repeat(batch_size,1)
    # Set padding_idx at index 0 for each row
    input_ids[:,0] = 0
    padding_idx = 0
    expected = torch.zeros_like(input_ids)
    for b in range(batch_size):
        counter = 0
        for i in range(seq_len):
            if input_ids[b,i] != padding_idx:
                counter += 1
                expected[b,i] = counter
            else:
                expected[b,i] = 0
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 82.1μs -> 71.8μs (14.5% faster)

def test_large_scale_past_key_values_length():
    # Test with large past_key_values_length
    input_ids = torch.tensor([[1]*100])
    padding_idx = 0
    past_key_values_length = 500
    # Positions: [501,502,...,600]
    expected = torch.arange(501,601).unsqueeze(0)
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx, past_key_values_length); result = codeflash_output # 43.9μs -> 36.0μs (22.0% faster)

def test_large_scale_all_padding():
    # All padding, large sequence
    input_ids = torch.zeros((1,1000), dtype=torch.long)
    padding_idx = 0
    expected = torch.zeros((1,1000), dtype=torch.long)
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 50.6μs -> 39.9μs (26.8% faster)

# ----------- Determinism Test -----------

def test_determinism():
    # Running the same input twice should yield the same output
    input_ids = torch.tensor([[1,0,2,3]])
    padding_idx = 0
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result1 = codeflash_output # 47.3μs -> 35.9μs (31.8% faster)
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result2 = codeflash_output # 14.6μs -> 8.75μs (66.4% faster)

# ----------- Type and Shape Test -----------

def test_type_and_shape():
    # Output should be long dtype, same shape as input
    input_ids = torch.tensor([[1,2,0,3]])
    padding_idx = 0
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 43.0μs -> 31.4μs (37.0% faster)

codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

#------------------------------------------------
import pytest # used for our unit tests
import torch
from transformers.models.markuplm.modeling_markuplm import MarkupLMEmbeddings

class XPathEmbeddings:
    # Dummy stub for XPathEmbeddings, not used in the tests
    def __init__(self, config):
        pass

from transformers.models.markuplm.modeling_markuplm import MarkupLMEmbeddings

# unit tests

# 1. Basic Test Cases

def test_basic_single_row_no_padding():
    # Simple case: no padding, single row
    input_ids = torch.tensor([[1, 2, 3, 4]])
    padding_idx = 0
    # Positions: [1,2,3,4] + padding_idx = [1,2,3,4]
    expected = torch.tensor([[1,2,3,4]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 48.6μs -> 38.7μs (25.5% faster)

def test_basic_single_row_with_padding():
    # Padding at start and end
    input_ids = torch.tensor([[0, 1, 2, 0, 3, 0]])
    padding_idx = 0
    # Mask: [0,1,1,0,1,0]
    # Cumsum: [0,1,2,2,3,3]
    # Positions: cumsum*mask + padding_idx = [0,1,2,0,3,0]
    expected = torch.tensor([[0,1,2,0,3,0]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 47.5μs -> 37.5μs (26.4% faster)

def test_basic_batch_rows():
    # Batch of two rows, mixed padding
    input_ids = torch.tensor([
        [0, 1, 2, 0, 3],
        [1, 2, 0, 0, 0]
    ])
    padding_idx = 0
    expected = torch.tensor([
        [0,1,2,0,3],
        [1,2,0,0,0]
    ])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 41.5μs -> 32.4μs (28.2% faster)

def test_basic_nonzero_padding_idx():
    # Padding index is not zero
    input_ids = torch.tensor([[5, 1, 5, 2, 3]])
    padding_idx = 5
    # Mask: [0,1,0,1,1]
    # Cumsum: [0,1,1,2,3]
    # Positions: [0,1,0,2,3] + padding_idx = [5,6,5,7,8]
    expected = torch.tensor([[5,6,5,7,8]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 46.3μs -> 37.0μs (25.0% faster)

def test_basic_past_key_values_length():
    # Test with past_key_values_length
    input_ids = torch.tensor([[0, 1, 2, 0, 3]])
    padding_idx = 0
    past_key_values_length = 2
    # Mask: [0,1,1,0,1]
    # Cumsum: [0,1,2,2,3]
    # Positions: (cumsum + 2) * mask = [0,3,4,0,5]
    # Positions: [0,3,4,0,5] + padding_idx = [0,3,4,0,5]
    expected = torch.tensor([[0,3,4,0,5]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx, past_key_values_length); result = codeflash_output # 42.1μs -> 36.4μs (15.8% faster)

# 2. Edge Test Cases

def test_edge_all_padding():
    # All tokens are padding
    input_ids = torch.tensor([[0,0,0,0]])
    padding_idx = 0
    expected = torch.tensor([[0,0,0,0]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 43.3μs -> 31.8μs (36.1% faster)

def test_edge_no_tokens():
    # Empty input
    input_ids = torch.empty((1,0), dtype=torch.long)
    padding_idx = 0
    expected = torch.empty((1,0), dtype=torch.long)
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 40.7μs -> 32.3μs (26.1% faster)

def test_edge_single_token_padding():
    # Single token, padding
    input_ids = torch.tensor([[0]])
    padding_idx = 0
    expected = torch.tensor([[0]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 44.3μs -> 34.5μs (28.5% faster)

def test_edge_single_token_non_padding():
    # Single token, non-padding
    input_ids = torch.tensor([[2]])
    padding_idx = 0
    expected = torch.tensor([[1]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 46.4μs -> 33.6μs (38.2% faster)

def test_edge_high_padding_idx():
    # High padding_idx value
    input_ids = torch.tensor([[99, 1, 99, 2]])
    padding_idx = 99
    # Mask: [0,1,0,1]
    # Cumsum: [0,1,1,2]
    # Positions: [99,100,99,101]
    expected = torch.tensor([[99,100,99,101]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 43.2μs -> 37.3μs (15.7% faster)

def test_edge_past_key_values_length_zero():
    # Explicitly set past_key_values_length=0, should be same as default
    input_ids = torch.tensor([[0, 1, 2]])
    padding_idx = 0
    expected = torch.tensor([[0,1,2]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx, past_key_values_length=0); result = codeflash_output # 42.5μs -> 31.9μs (33.0% faster)

def test_edge_past_key_values_length_large():
    # Large past_key_values_length
    input_ids = torch.tensor([[0, 1, 2, 0, 3]])
    padding_idx = 0
    past_key_values_length = 100
    # Mask: [0,1,1,0,1]
    # Cumsum: [0,1,2,2,3]
    # Positions: [0,101,102,0,103]
    expected = torch.tensor([[0,101,102,0,103]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx, past_key_values_length); result = codeflash_output # 46.3μs -> 34.1μs (35.8% faster)

def test_edge_2d_batch_padding():
    # 2D batch, mixed padding
    input_ids = torch.tensor([
        [0, 1, 2],
        [3, 0, 4]
    ])
    padding_idx = 0
    # First row: [0,1,2] -> [0,1,2]
    # Second row: [3,0,4] -> [1,0,2]
    expected = torch.tensor([
        [0,1,2],
        [1,0,2]
    ])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 46.3μs -> 33.9μs (36.7% faster)

def test_edge_different_types():
    # Input IDs as int32
    input_ids = torch.tensor([[0, 1, 2, 0, 3]], dtype=torch.int32)
    padding_idx = 0
    expected = torch.tensor([[0,1,2,0,3]])
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 46.2μs -> 35.6μs (29.7% faster)

# 3. Large Scale Test Cases

def test_large_batch_size():
    # Large batch size, small sequence length
    batch_size = 500
    seq_len = 5
    input_ids = torch.zeros((batch_size, seq_len), dtype=torch.long)
    # Set first token in each row to non-padding
    for i in range(batch_size):
        input_ids[i,0] = 1
    padding_idx = 0
    # Should be [1,0,0,0,0] for each row
    expected = torch.zeros((batch_size, seq_len), dtype=torch.long)
    expected[:,0] = 1
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 47.1μs -> 37.1μs (26.9% faster)

def test_large_seq_len():
    # Large sequence length, single batch
    seq_len = 900
    input_ids = torch.ones((1, seq_len), dtype=torch.long) # no padding
    padding_idx = 0
    # Should be [1,2,...,900]
    expected = torch.arange(1, seq_len+1).unsqueeze(0)
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 45.6μs -> 35.6μs (28.0% faster)

def test_large_batch_and_seq_len_some_padding():
    # Large batch and sequence length, with padding in random places
    batch_size = 50
    seq_len = 200
    input_ids = torch.ones((batch_size, seq_len), dtype=torch.long)
    padding_idx = 0
    # Set every 10th token to padding
    for i in range(batch_size):
        input_ids[i,::10] = 0
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 63.0μs -> 52.9μs (19.1% faster)
    # Check that every 10th token is padding_idx
    for i in range(batch_size):
        pass
    # Check that the positions increase between paddings
    for i in range(batch_size):
        pos = 1
        for j in range(seq_len):
            if j % 10 == 0:
                pass
            else:
                pos += 1

def test_large_past_key_values_length():
    # Large past_key_values_length, large sequence
    seq_len = 500
    input_ids = torch.ones((1, seq_len), dtype=torch.long)
    padding_idx = 0
    past_key_values_length = 100
    expected = torch.arange(1+past_key_values_length, seq_len+1+past_key_values_length).unsqueeze(0)
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx, past_key_values_length); result = codeflash_output # 45.6μs -> 37.6μs (21.2% faster)

def test_large_all_padding():
    # Large input, all padding
    batch_size = 100
    seq_len = 100
    input_ids = torch.zeros((batch_size, seq_len), dtype=torch.long)
    padding_idx = 0
    expected = torch.zeros((batch_size, seq_len), dtype=torch.long)
    codeflash_output = MarkupLMEmbeddings.create_position_ids_from_input_ids(input_ids, padding_idx); result = codeflash_output # 71.2μs -> 61.2μs (16.3% faster)

codeflash_output is used to check that the output of the original code is the same as that of the optimized code.

To edit these changes, run `git checkout codeflash/optimize-MarkupLMEmbeddings.create_position_ids_from_input_ids-mhvlsner` and push.


@codeflash-ai codeflash-ai bot requested a review from mashraf-222 November 12, 2025 06:12
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash (Optimization PR opened by Codeflash AI) and 🎯 Quality: High (Optimization Quality according to Codeflash) labels Nov 12, 2025
