Abstract: Although Vision Transformer (ViT) has achieved significant success in computer vision, it does not perform well in dense prediction tasks due to the lack of inner-patch information ...