Encoder And Decoder In Computer Vision

Continuous flash suppression of neural responses and population orientation coding in macaque V1

Continuous flash suppression reduces V1 orientation responses in an ocular-dominance-dependent manner, which may still allow low-level coarse orientation discrimination but provide insufficient ...

IEEE

ECM: Enhancing Compressibility of Quantized Vision Encoder and LLM for Large Vision-Language Models

Abstract: Quantizing the large language model (LLM) in vision-language models (VLMs) is an effective approach to reducing memory size. However, quantizing only the LLM shifts the memory bottleneck to ...

18d

Clarifying HEVC licensing fees, royalties, and why vendors kill HEVC support

In addition to the financial burdens of HEVC licensing, the risk of lawsuits from patent holders can deter companies from seeking HEVC support. The space is crowded with pending and settled lawsuits, ...

IEEE

GeoViTMamba: A Hybrid Vision Transformer and Transformer Decoder for Semantic Remote Sensing Image Captioning

Abstract: Remote sensing image captioning (RSIC) links high-resolution aerial imagery with natural language descriptions for urban analysis, environmental monitoring, and autonomous planning. We ...

GitHub

5_cifar100.py

Runs MAE on CIFAR-100 (100 classes, same 32×32 images). Trains its own encoder from scratch — does NOT load mae_encoder_improved.pth. Shows scalability of the same approach on a harder problem.
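
The script itself is not shown in this listing, so the following is only a minimal sketch of the input side of a masked autoencoder (MAE) on CIFAR-sized images, not the repository's code. It splits 32×32 images into patches and hides a random subset, so that an encoder would see only the visible patches and a decoder would reconstruct the rest. The function names `patchify` and `random_masking`, the 4×4 patch size, and the 0.75 mask ratio are assumptions chosen for illustration.

```python
import numpy as np


def patchify(images, patch_size=4):
    """Split (N, H, W, C) images into (N, num_patches, patch_size*patch_size*C)."""
    n, h, w, c = images.shape
    assert h % patch_size == 0 and w % patch_size == 0
    gh, gw = h // patch_size, w // patch_size
    # Separate the patch grid axes from the within-patch pixel axes.
    x = images.reshape(n, gh, patch_size, gw, patch_size, c)
    x = x.transpose(0, 1, 3, 2, 4, 5)  # (N, gh, gw, ps, ps, C)
    return x.reshape(n, gh * gw, patch_size * patch_size * c)


def random_masking(patches, mask_ratio=0.75, rng=None):
    """Keep a random subset of patches per image.

    Returns (visible, mask, keep_idx), where mask is True for hidden patches.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, num_patches, _ = patches.shape
    num_keep = int(num_patches * (1 - mask_ratio))
    # Sorting per-image random noise yields an independent shuffle per image.
    noise = rng.random((n, num_patches))
    keep_idx = np.argsort(noise, axis=1)[:, :num_keep]
    visible = np.take_along_axis(patches, keep_idx[:, :, None], axis=1)
    mask = np.ones((n, num_patches), dtype=bool)
    np.put_along_axis(mask, keep_idx, False, axis=1)
    return visible, mask, keep_idx


# Stand-in for a CIFAR batch: 2 random images of shape 32x32x3 (not real data).
images = np.random.default_rng(0).random((2, 32, 32, 3))
patches = patchify(images)
visible, mask, keep_idx = random_masking(patches, rng=np.random.default_rng(1))
print(patches.shape, visible.shape, mask.sum(axis=1))
```

With these assumed settings each image yields 64 patches of 48 values, of which 16 stay visible. Only `visible` would be passed to the encoder; `mask` marks the positions whose pixels the decoder is trained to reconstruct.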
