Application of Vision Transformer in CIFAR-10 dataset

first-coding/VIT

Vision Transformer (ViT) for CIFAR-10 Dataset

Related paper: "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale", https://arxiv.org/abs/2010.11929

Dataset

The CIFAR-10 data can be downloaded manually from the official website or a mirror, or automatically by setting download to True when the code loads the dataset.

Overview

This project implements the Vision Transformer (ViT) model and trains it on the CIFAR-10 dataset. ViT is an image classification architecture that splits each image into patches and applies self-attention over the resulting sequence of tokens.
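For reference, the self-attention operation at the core of a transformer is usually written as scaled dot-product attention. The symbols below follow the standard notation from the transformer literature and are not defined in this repository: $Q$, $K$, and $V$ are the query, key, and value matrices computed from the tokens, and $d_k$ is the key dimension.

$$
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V
$$

Each row of the softmax output weights every token in the sequence, which is how a patch can draw on information from any other patch in the image.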

Features

- Applies a transformer architecture to image classification on CIFAR-10.
- Brings transformer-style self-attention to a computer vision task.
- Aims to provide a robust and efficient way to handle image data.

Key Components

- Patch Embedding: Extracts image patches and converts them into token embeddings.
- Attention Mechanism: Captures global dependencies and relationships between tokens.
- MLP Layers: Employ multi-layer perceptrons for non-linear transformations.
- Transformer Blocks: Comprise attention layers followed by feed-forward neural networks.
- Vision Transformer (ViT) Model: Combines these components into a cohesive architecture.
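The components above can be sketched in PyTorch roughly as follows. This is an illustrative sketch, not the repository's actual code: the class names, the patch size of 4, the embedding width of 64, the depth, and the head count are all assumptions chosen to suit 32x32 CIFAR-10 images.

```python
import torch
from torch import nn


class PatchEmbedding(nn.Module):
    """Split an image into non-overlapping patches and project each to a token."""

    def __init__(self, img_size=32, patch_size=4, in_chans=3, dim=64):
        super().__init__()
        self.num_patches = (img_size // patch_size) ** 2
        # A strided convolution is equivalent to flattening each patch and
        # applying one shared linear projection.
        self.proj = nn.Conv2d(in_chans, dim, kernel_size=patch_size, stride=patch_size)

    def forward(self, x):                    # x: (B, C, H, W)
        x = self.proj(x)                     # (B, dim, H/P, W/P)
        return x.flatten(2).transpose(1, 2)  # (B, num_patches, dim)


class TransformerBlock(nn.Module):
    """Self-attention followed by an MLP, each with LayerNorm and a residual."""

    def __init__(self, dim=64, heads=4, mlp_ratio=2):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio),
            nn.GELU(),
            nn.Linear(dim * mlp_ratio, dim),
        )

    def forward(self, x):
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out                     # residual around attention
        return x + self.mlp(self.norm2(x))   # residual around the MLP


class ViT(nn.Module):
    """Patch embedding, class token, position embedding, blocks, linear head."""

    def __init__(self, img_size=32, patch_size=4, num_classes=10,
                 dim=64, depth=4, heads=4):
        super().__init__()
        self.patch_embed = PatchEmbedding(img_size, patch_size, 3, dim)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(
            torch.zeros(1, self.patch_embed.num_patches + 1, dim))
        nn.init.trunc_normal_(self.pos_embed, std=0.02)
        self.blocks = nn.Sequential(
            *[TransformerBlock(dim, heads) for _ in range(depth)])
        self.norm = nn.LayerNorm(dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        x = self.patch_embed(x)                          # (B, N, dim)
        cls = self.cls_token.expand(x.shape[0], -1, -1)  # (B, 1, dim)
        x = torch.cat([cls, x], dim=1) + self.pos_embed  # (B, N + 1, dim)
        x = self.norm(self.blocks(x))
        return self.head(x[:, 0])                        # classify from the class token
```

With these assumed settings, a 32x32 image yields 64 patch tokens plus one class token, and `ViT()(torch.randn(2, 3, 32, 32))` returns logits of shape `(2, 10)`.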

Contributing

Contributions are welcome! Feel free to fork the repository and submit pull requests for improvements or bug fixes.

Training reached a final test-set accuracy of 74%. Better hyperparameter choices and a better model definition would likely improve ViT's performance on CIFAR-10, but I stopped at this accuracy because of limited resources and time. If you achieve a higher accuracy, I would appreciate your advice. Thank you.
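For anyone comparing results, test-set accuracy is typically measured along the lines of the sketch below. The `evaluate` function is a hypothetical helper, not the repository's evaluation code; it assumes a PyTorch model that returns class logits and a loader that yields `(images, labels)` batches.

```python
import torch


@torch.no_grad()
def evaluate(model, loader, device="cpu"):
    """Return top-1 accuracy as a fraction in [0, 1]."""
    model.eval()  # disable dropout and similar training-only behavior
    correct = total = 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        preds = model(images).argmax(dim=1)  # index of the largest logit
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total
```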
