Models:
- Name: vit_vit-b16_mln_upernet_8xb2-80k_ade20k-512x512
  In Collection: UPerNet
  Results:
    Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 47.71
      mIoU(ms+flip): 49.51
  Config: configs/vit/vit_vit-b16_mln_upernet_8xb2-80k_ade20k-512x512.py
  Metadata:
    Training Data: ADE20K
    Batch Size: 16
    Architecture:
    - ViT-B
    - UPerNet
    Training Resources: 8x V100 GPUS
    Memory (GB): 9.2
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_mln_512x512_80k_ade20k/upernet_vit-b16_mln_512x512_80k_ade20k_20210624_130547-0403cee1.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_mln_512x512_80k_ade20k/20210624_130547.log.json
  Paper:
    Title: 'An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale'
    URL: https://arxiv.org/pdf/2010.11929.pdf
  Code: https://github.com/open-mmlab/mmsegmentation/blob/v0.17.0/mmseg/models/backbones/vit.py#L98
  Framework: PyTorch
- Name: vit_vit-b16_mln_upernet_8xb2-160k_ade20k-512x512
  In Collection: UPerNet
  Results:
    Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 46.75
      mIoU(ms+flip): 48.46
  Config: configs/vit/vit_vit-b16_mln_upernet_8xb2-160k_ade20k-512x512.py
  Metadata:
    Training Data: ADE20K
    Batch Size: 16
    Architecture:
    - ViT-B
    - UPerNet
    Training Resources: 8x V100 GPUS
    Memory (GB): 9.2
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_mln_512x512_160k_ade20k/upernet_vit-b16_mln_512x512_160k_ade20k_20210624_130547-852fa768.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_mln_512x512_160k_ade20k/20210623_192432.log.json
  Paper:
    Title: 'An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale'
    URL: https://arxiv.org/pdf/2010.11929.pdf
  Code: https://github.com/open-mmlab/mmsegmentation/blob/v0.17.0/mmseg/models/backbones/vit.py#L98
  Framework: PyTorch
- Name: vit_vit-b16-ln_mln_upernet_8xb2-160k_ade20k-512x512
  In Collection: UPerNet
  Results:
    Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 47.73
      mIoU(ms+flip): 49.95
  Config: configs/vit/vit_vit-b16-ln_mln_upernet_8xb2-160k_ade20k-512x512.py
  Metadata:
    Training Data: ADE20K
    Batch Size: 16
    Architecture:
    - ViT-B
    - UPerNet
    Training Resources: 8x V100 GPUS
    Memory (GB): 9.21
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_ln_mln_512x512_160k_ade20k/upernet_vit-b16_ln_mln_512x512_160k_ade20k_20210621_172828-f444c077.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_vit-b16_ln_mln_512x512_160k_ade20k/20210621_172828.log.json
  Paper:
    Title: 'An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale'
    URL: https://arxiv.org/pdf/2010.11929.pdf
  Code: https://github.com/open-mmlab/mmsegmentation/blob/v0.17.0/mmseg/models/backbones/vit.py#L98
  Framework: PyTorch
- Name: vit_deit-s16_upernet_8xb2-80k_ade20k-512x512
  In Collection: UPerNet
  Results:
    Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 42.96
      mIoU(ms+flip): 43.79
  Config: configs/vit/vit_deit-s16_upernet_8xb2-80k_ade20k-512x512.py
  Metadata:
    Training Data: ADE20K
    Batch Size: 16
    Architecture:
    - DeiT-S
    - UPerNet
    Training Resources: 8x V100 GPUS
    Memory (GB): 4.68
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-s16_512x512_80k_ade20k/upernet_deit-s16_512x512_80k_ade20k_20210624_095228-afc93ec2.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-s16_512x512_80k_ade20k/20210624_095228.log.json
  Paper:
    Title: 'An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale'
    URL: https://arxiv.org/pdf/2010.11929.pdf
  Code: https://github.com/open-mmlab/mmsegmentation/blob/v0.17.0/mmseg/models/backbones/vit.py#L98
  Framework: PyTorch
- Name: vit_deit-s16_upernet_8xb2-160k_ade20k-512x512
  In Collection: UPerNet
  Results:
    Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 42.87
      mIoU(ms+flip): 43.79
  Config: configs/vit/vit_deit-s16_upernet_8xb2-160k_ade20k-512x512.py
  Metadata:
    Training Data: ADE20K
    Batch Size: 16
    Architecture:
    - DeiT-S
    - UPerNet
    Training Resources: 8x V100 GPUS
    Memory (GB): 4.68
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-s16_512x512_160k_ade20k/upernet_deit-s16_512x512_160k_ade20k_20210621_160903-5110d916.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-s16_512x512_160k_ade20k/20210621_160903.log.json
  Paper:
    Title: 'An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale'
    URL: https://arxiv.org/pdf/2010.11929.pdf
  Code: https://github.com/open-mmlab/mmsegmentation/blob/v0.17.0/mmseg/models/backbones/vit.py#L98
  Framework: PyTorch
- Name: vit_deit-s16_mln_upernet_8xb2-160k_ade20k-512x512
  In Collection: UPerNet
  Results:
    Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 43.82
      mIoU(ms+flip): 45.07
  Config: configs/vit/vit_deit-s16_mln_upernet_8xb2-160k_ade20k-512x512.py
  Metadata:
    Training Data: ADE20K
    Batch Size: 16
    Architecture:
    - DeiT-S
    - UPerNet
    Training Resources: 8x V100 GPUS
    Memory (GB): 5.69
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-s16_mln_512x512_160k_ade20k/upernet_deit-s16_mln_512x512_160k_ade20k_20210621_161021-fb9a5dfb.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-s16_mln_512x512_160k_ade20k/20210621_161021.log.json
  Paper:
    Title: 'An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale'
    URL: https://arxiv.org/pdf/2010.11929.pdf
  Code: https://github.com/open-mmlab/mmsegmentation/blob/v0.17.0/mmseg/models/backbones/vit.py#L98
  Framework: PyTorch
- Name: vit_deit-s16-ln_mln_upernet_8xb2-160k_ade20k-512x512
  In Collection: UPerNet
  Results:
    Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 43.52
      mIoU(ms+flip): 45.01
  Config: configs/vit/vit_deit-s16-ln_mln_upernet_8xb2-160k_ade20k-512x512.py
  Metadata:
    Training Data: ADE20K
    Batch Size: 16
    Architecture:
    - DeiT-S
    - UPerNet
    Training Resources: 8x V100 GPUS
    Memory (GB): 5.69
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-s16_ln_mln_512x512_160k_ade20k/upernet_deit-s16_ln_mln_512x512_160k_ade20k_20210621_161021-c0cd652f.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-s16_ln_mln_512x512_160k_ade20k/20210621_161021.log.json
  Paper:
    Title: 'An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale'
    URL: https://arxiv.org/pdf/2010.11929.pdf
  Code: https://github.com/open-mmlab/mmsegmentation/blob/v0.17.0/mmseg/models/backbones/vit.py#L98
  Framework: PyTorch
- Name: vit_deit-b16_upernet_8xb2-80k_ade20k-512x512
  In Collection: UPerNet
  Results:
    Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 45.24
      mIoU(ms+flip): 46.73
  Config: configs/vit/vit_deit-b16_upernet_8xb2-80k_ade20k-512x512.py
  Metadata:
    Training Data: ADE20K
    Batch Size: 16
    Architecture:
    - DeiT-B
    - UPerNet
    Training Resources: 8x V100 GPUS
    Memory (GB): 7.75
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-b16_512x512_80k_ade20k/upernet_deit-b16_512x512_80k_ade20k_20210624_130529-1e090789.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-b16_512x512_80k_ade20k/20210624_130529.log.json
  Paper:
    Title: 'An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale'
    URL: https://arxiv.org/pdf/2010.11929.pdf
  Code: https://github.com/open-mmlab/mmsegmentation/blob/v0.17.0/mmseg/models/backbones/vit.py#L98
  Framework: PyTorch
- Name: vit_deit-b16_upernet_8xb2-160k_ade20k-512x512
  In Collection: UPerNet
  Results:
    Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 45.36
      mIoU(ms+flip): 47.16
  Config: configs/vit/vit_deit-b16_upernet_8xb2-160k_ade20k-512x512.py
  Metadata:
    Training Data: ADE20K
    Batch Size: 16
    Architecture:
    - DeiT-B
    - UPerNet
    Training Resources: 8x V100 GPUS
    Memory (GB): 7.75
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-b16_512x512_160k_ade20k/upernet_deit-b16_512x512_160k_ade20k_20210621_180100-828705d7.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-b16_512x512_160k_ade20k/20210621_180100.log.json
  Paper:
    Title: 'An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale'
    URL: https://arxiv.org/pdf/2010.11929.pdf
  Code: https://github.com/open-mmlab/mmsegmentation/blob/v0.17.0/mmseg/models/backbones/vit.py#L98
  Framework: PyTorch
- Name: vit_deit-b16_mln_upernet_8xb2-160k_ade20k-512x512
  In Collection: UPerNet
  Results:
    Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 45.46
      mIoU(ms+flip): 47.16
  Config: configs/vit/vit_deit-b16_mln_upernet_8xb2-160k_ade20k-512x512.py
  Metadata:
    Training Data: ADE20K
    Batch Size: 16
    Architecture:
    - DeiT-B
    - UPerNet
    Training Resources: 8x V100 GPUS
    Memory (GB): 9.21
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-b16_mln_512x512_160k_ade20k/upernet_deit-b16_mln_512x512_160k_ade20k_20210621_191949-4e1450f3.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-b16_mln_512x512_160k_ade20k/20210621_191949.log.json
  Paper:
    Title: 'An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale'
    URL: https://arxiv.org/pdf/2010.11929.pdf
  Code: https://github.com/open-mmlab/mmsegmentation/blob/v0.17.0/mmseg/models/backbones/vit.py#L98
  Framework: PyTorch
- Name: vit_deit-b16-ln_mln_upernet_8xb2-160k_ade20k-512x512
  In Collection: UPerNet
  Results:
    Task: Semantic Segmentation
    Dataset: ADE20K
    Metrics:
      mIoU: 45.37
      mIoU(ms+flip): 47.23
  Config: configs/vit/vit_deit-b16-ln_mln_upernet_8xb2-160k_ade20k-512x512.py
  Metadata:
    Training Data: ADE20K
    Batch Size: 16
    Architecture:
    - DeiT-B
    - UPerNet
    Training Resources: 8x V100 GPUS
    Memory (GB): 9.21
  Weights: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-b16_ln_mln_512x512_160k_ade20k/upernet_deit-b16_ln_mln_512x512_160k_ade20k_20210623_153535-8a959c14.pth
  Training log: https://download.openmmlab.com/mmsegmentation/v0.5/vit/upernet_deit-b16_ln_mln_512x512_160k_ade20k/20210623_153535.log.json
  Paper:
    Title: 'An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale'
    URL: https://arxiv.org/pdf/2010.11929.pdf
  Code: https://github.com/open-mmlab/mmsegmentation/blob/v0.17.0/mmseg/models/backbones/vit.py#L98
  Framework: PyTorch