[CVPR 2022] Code release for "Multimodal Token Fusion for Vision Transformers"
A PyTorch implementation of the paper Multimodal Transformer with Multiview Visual Representation for Image Captioning
PyTorch Implementation of Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
A text-to-image and reverse image search engine built on vector similarity search, using the CLIP vision-language transformer for semantic embeddings and Qdrant as the vector store
Source code for COMP90042 Project 2021
This project implements a Generalist Robotics Policy (GRP) using a Vision Transformer (ViT) architecture. The model is designed to process multiple input types, including images, text goals, and goal images, to generate continuous action outputs for robotic control.
Image classification and text assignment using convolutional neural networks and multimodal transformers