The model can be used for Information Retrieval: given a query, encode the query with all possible passages (e.g. retrieved with ElasticSearch), then sort the passages in decreasing order of score.
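A minimal re-ranking sketch, assuming the Transformers sequence-classification API. The model id follows this repository's listing (its version suffix appears truncated there), and the query/passages are the illustrative examples used below:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Model id as listed in this README; the version suffix may be truncated there.
model_name = "jeffwan/mmarco-mMiniLMv2-L12-H384-v"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

query = "How many people live in Berlin?"
passages = [
    "Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers.",
    "New York City is famous for the Metropolitan Museum of Art.",
]

# Score every (query, passage) pair in one batch; a higher logit means more relevant.
with torch.no_grad():
    features = tokenizer([query] * len(passages), passages,
                         padding=True, truncation=True, return_tensors="pt")
    scores = model(**features).logits.squeeze(-1)

# Sort the passages in decreasing order of score.
for score, passage in sorted(zip(scores.tolist(), passages), reverse=True):
    print(f"{score:+.4f}  {passage}")
```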
Pass the Query (`-q`) and the Paragraph (`-p`) as arguments; the script prints the logits.
$ python3 cross_encoder_mmarco.py -q "How many people live in Berlin?" -p "Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers."
$ python3 cross_encoder_mmarco.py -q "How many people live in Berlin?" -p "New York City is famous for the Metropolitan Museum of Art."
$ python3 cross_encoder_mmarco.py -q "ベルリンには何人が住んでいますか?" -p "ベルリンの人口は891.82平方キロメートルの地域に登録された住民が3,520,031人います。"
$ python3 cross_encoder_mmarco.py -q "ベルリンには何人が住んでいますか?" -p "ニューヨーク市はメトロポリタン美術館で有名です。"
Output : [array([[10.761541]], dtype=float32)]
Output : [array([[-8.127746]], dtype=float32)]
Output : [array([[9.374646]], dtype=float32)]
Output : [array([[-6.408309]], dtype=float32)]
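The script itself is not shown here; a hypothetical sketch of `cross_encoder_mmarco.py` consistent with the output format above (onnxruntime's `session.run` returns a list of numpy arrays) might look like the following. The ONNX file name and input names are assumptions:

```python
import argparse

import onnxruntime as ort
from transformers import XLMRobertaTokenizer

parser = argparse.ArgumentParser()
parser.add_argument("-q", "--query", required=True)
parser.add_argument("-p", "--paragraph", required=True)
args = parser.parse_args()

# Model id as listed below; the version suffix may be truncated there.
tokenizer = XLMRobertaTokenizer.from_pretrained("jeffwan/mmarco-mMiniLMv2-L12-H384-v")
session = ort.InferenceSession("model.onnx")  # assumed file name

# Encode the (query, paragraph) pair; XLMRobertaTokenizer yields
# input_ids and attention_mask.
encoded = tokenizer(args.query, args.paragraph, return_tensors="np")
logits = session.run(None, {
    "input_ids": encoded["input_ids"],
    "attention_mask": encoded["attention_mask"],
})
print("Output :", logits)  # e.g. [array([[10.761541]], dtype=float32)]
```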
- Model: jeffwan/mmarco-mMiniLMv2-L12-H384-v
- PyTorch 2.2.1
- Transformers 4.33.3
- ONNX opset = 11
- Tokenizer: XLMRobertaTokenizer (same as SentenceTransformer and E5)
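For reference, a sketch of how an ONNX export with opset 11 could be produced via `torch.onnx.export`. The source checkpoint, output file name, and input/output names here are assumptions, not the exact conversion used for this repository:

```python
import torch
from transformers import AutoModelForSequenceClassification, XLMRobertaTokenizer

# Assumed upstream PyTorch checkpoint to export; substitute the actual source.
model_name = "cross-encoder/mmarco-mMiniLMv2-L12-H384-v1"
tokenizer = XLMRobertaTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

# Trace with a dummy (query, paragraph) pair and export with opset 11,
# matching the opset listed above.
dummy = tokenizer("query", "paragraph", return_tensors="pt")
torch.onnx.export(
    model,
    (dummy["input_ids"], dummy["attention_mask"]),
    "model.onnx",
    opset_version=11,
    input_names=["input_ids", "attention_mask"],
    output_names=["logits"],
    dynamic_axes={
        "input_ids": {0: "batch", 1: "sequence"},
        "attention_mask": {0: "batch", 1: "sequence"},
        "logits": {0: "batch"},
    },
)
```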