INPUTMIX: A STRATEGY TO REGULARIZE AND BALANCE MULTI-MODALITY AND MULTI-VIEW MODEL LEARNING
Umeå University, Sweden.
RISE Research Institutes of Sweden, Digital Systems, Mobility and Systems. Umeå University, Sweden.
2024 (English). In: Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, ISSN 1520-6149, pp. 5455-5459. Article in journal (Refereed). Published
Abstract [en]

Real-world perception tasks often involve multiple modalities or views of input. While joint training of multi-modality classification models has been explored previously, it has not consistently outperformed the best single-modality model. This paper aims to address one of the reasons for this: the difficulty of balancing the contributions of each input in the end-to-end training of multi-input models. Additionally, the increased capacity of multi-input networks can lead to overfitting. To solve these issues, we propose InputMix, a simple yet effective method for mixing different inputs. Our method mixes a certain proportion p of input pairs to relieve the increased-capacity problem and assigns a weighting factor λ to each input to generate a mixed target, allowing us to specify the contribution of each input. Experimental results on three multi-input classification tasks demonstrate that our method significantly improves the generalization performance of multi-input neural networks. Code is available at https://github.com/JesseWong333/inputmix/.
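
The abstract does not spell out the exact procedure, so the following is only a minimal sketch of one possible reading for a two-input classifier. It assumes that "mixing an input pair" means pairing one sample's first input with another sample's second input, and that the mixed target is a λ-weighted combination of the two labels. The function name `input_mix`, the parameter names, and the partner-sampling scheme are all hypothetical and are not taken from the paper or its repository.

```python
import numpy as np

def input_mix(x_a, x_b, y_onehot, p=0.5, lam=0.5, rng=None):
    """Hypothetical InputMix-style batch augmentation for a two-input model.

    x_a, x_b : arrays of shape (batch, ...) holding the two inputs.
    y_onehot : array of shape (batch, num_classes) with one-hot labels.
    p        : proportion of samples whose input pair is mixed (assumed meaning).
    lam      : weight of input A's label in the mixed target;
               input B's label receives weight 1 - lam (assumed meaning).
    """
    rng = np.random.default_rng() if rng is None else rng
    batch = x_a.shape[0]

    # Select roughly a proportion p of the batch for mixing.
    mix_mask = rng.random(batch) < p

    # For mixed samples, input B is taken from a randomly chosen partner;
    # unmixed samples keep their own input B.
    partner = rng.permutation(batch)
    src_b = np.where(mix_mask, partner, np.arange(batch))
    x_b_mixed = x_b[src_b]

    # Mixed target: weighted sum of the labels that belong to each input.
    # For unmixed samples both labels coincide, so the target is unchanged.
    y_mixed = lam * y_onehot + (1.0 - lam) * y_onehot[src_b]
    return x_a, x_b_mixed, y_mixed, mix_mask
```

Under these assumptions, setting `lam` closer to 1 tells the network to rely more on the first input, and `p = 0` reduces to ordinary training on unmodified pairs.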

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2024. pp. 5455-5459
National subject category
Electrical and Electronic Engineering
Identifiers
URN: urn:nbn:se:ri:diva-74868; DOI: 10.1109/ICASSP48485.2024.10446664; Scopus ID: 2-s2.0-85195364390; OAI: oai:DiVA.org:ri-74868; DiVA, id: diva2:1895059
Conference
49th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024. Seoul, South Korea. 14 April 2024 through 19 April 2024
Note

The computations were enabled by resources in project NAISS 2023/22-19 provided by the National Academic Infrastructure for Supercomputing in Sweden (NAISS) at Alvis, funded by the Swedish Research Council through grant agreement no. 2022-06725.

Available from: 2024-09-04 Created: 2024-09-04 Last updated: 2025-09-23 Bibliographically approved

Open Access in DiVA

No full text in DiVA
