INPUTMIX: A STRATEGY TO REGULARIZE AND BALANCE MULTI-MODALITY AND MULTI-VIEW MODEL LEARNING
Umeå University, Sweden.
RISE Research Institutes of Sweden, Digitala system, Mobilitet och system. Umeå University, Sweden.
2024 (English). In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, ISSN 1520-6149, pp. 5455-5459. Article in journal (Refereed). Published
Abstract [en]

Real-world perception tasks often involve multiple modalities or views of input. While joint training of multi-modality classification models has been explored previously, it has not consistently outperformed the best single-modality model. This paper aims to address one of the reasons for this: the difficulty in balancing the contributions of each input in the end-to-end training of multi-input models. Additionally, the increased capacity of multi-input networks can lead to overfitting. To solve these issues, we propose InputMix, a simple yet effective method for optimally mixing different inputs. Our method mixes a certain proportion p of input pairs to relieve the increased capacity problems and assigns a weighting factor λ for each input to generate a mixed target, allowing us to specify the contributions of each input. Experimental results on three multi-input classification tasks demonstrate that our method significantly improves the generalization performance of multi-input neural networks. Code is available at https://github.com/JesseWong333/inputmix/.
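
The abstract gives only a high-level description, so the following is a minimal sketch of one possible reading of it, not the authors' implementation (see the repository linked above for that). It assumes a two-input model, one-hot targets, and that "mixing" a pair means keeping input A of a sample while taking input B from a randomly chosen partner sample. The function name `inputmix_sketch` and the parameters `lam_a` and `lam_b` are hypothetical names introduced here for illustration.

```python
import numpy as np

def inputmix_sketch(x_a, x_b, y, p=0.5, lam_a=0.5, lam_b=0.5, rng=None):
    """Illustrative only: mix a proportion p of the batch.

    x_a, x_b : arrays of shape (batch, ...), the two inputs (assumed).
    y        : one-hot targets of shape (batch, num_classes) (assumed).
    p        : proportion of samples whose inputs are mixed.
    lam_a/b  : weighting factors for the targets tied to each input.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = y.shape[0]
    partner = rng.permutation(n)            # partner sample for each index
    n_mix = int(round(p * n))               # how many samples to mix
    mix_idx = rng.choice(n, size=n_mix, replace=False)

    x_b_out = x_b.copy()
    y_out = y.astype(float)                 # astype returns a new array

    # Assumption: input A stays, input B comes from the partner sample.
    x_b_out[mix_idx] = x_b[partner[mix_idx]]
    # Mixed target: each input contributes its own label, weighted by its lambda.
    y_out[mix_idx] = lam_a * y[mix_idx] + lam_b * y[partner[mix_idx]]
    return x_a, x_b_out, y_out, mix_idx
```

Under these assumptions, choosing `lam_a + lam_b = 1` keeps every mixed target a valid probability vector, and raising `lam_a` relative to `lam_b` tells the model to rely more on input A for that sample.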

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2024. pp. 5455-5459
Identifiers
URN: urn:nbn:se:ri:diva-74868
DOI: 10.1109/ICASSP48485.2024.10446664
Scopus ID: 2-s2.0-85195364390
OAI: oai:DiVA.org:ri-74868
DiVA, id: diva2:1895059
Conference
49th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024. Seoul, South Korea. 14 April 2024 through 19 April 2024
Note

The computations were enabled by resources in project NAISS 2023/22-19 provided by the National Academic Infrastructure for Supercomputing in Sweden (NAISS) at Alvis, funded by the Swedish Research Council through grant agreement no. 2022-06725.

Available from: 2024-09-04. Created: 2024-09-04. Last updated: 2025-09-23. Bibliographically approved.

Open Access in DiVA

No full text in DiVA
