Hello, I’m Sk4Dl

A postgraduate student at Harbin Institute of Technology, Shenzhen. A deep learning amateur.

Latest Posts

Masked Autoencoders Are Scalable Vision Learners

Contribution The authors develop an asymmetric encoder-decoder architecture, with an encoder that operates only on the visible subset of patches (without mask tokens), along with ...

Swin Transformer: Hierarchical Vision Transformer using Shifted Windows

Contribution The authors propose a general-purpose Transformer backbone, called Swin Transformer, which constructs hierarchical feature maps and has linear computational complexit...

Prithvi WxC: Foundation Model for Weather and Climate

Contribution Despite the mirroring successes of large AI models in both computer science and natural language processing, applications of the foundation model principle to atmosph...

Taming Transformers for High-Resolution Image Synthesis

Contribution Using CNNs to learn a context-rich vocabulary of image constituents. Utilizing transformers to efficiently model their composition within high-resolution image...

Neural Discrete Representation Learning

Contributions Introducing the VQ-VAE model combining the variational autoencoder (VAE) framework with discrete latent representations. The model is simple, does not suffer fr...