The Inhibitor: ReLU and Addition-Based Attention for Efficient Transformers
Brännvall, Rickard (RISE Research Institutes of Sweden, Digital Systems, Data Science; Luleå University of Technology, Sweden). ORCID iD: 0000-0003-4293-6408
2024 (English). In: Proceedings of the AAAI Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence, 2024, Vol. 38, no. 21, p. 23445-23446. Conference paper, Published paper (Refereed).
Abstract [en]

To enhance the computational efficiency of quantized Transformers, we replace the dot-product and Softmax-based attention with an alternative mechanism involving addition and ReLU activation only. This side-steps the expansion to double precision often required by matrix multiplication and avoids costly Softmax evaluations but maintains much of the core functionality of conventional dot-product attention. It can enable more efficient execution and support larger quantized Transformer models on resource-constrained hardware or alternative arithmetic systems like homomorphic encryption. Training experiments on four common benchmark tasks show test set prediction scores comparable to those of conventional Transformers with dot-product attention. Our scaling experiments also suggest significant computational savings, both in plaintext and under encryption. The ReLU and addition-based attention mechanism introduced in this paper may enable privacy-preserving AI applications operating under homomorphic encryption by avoiding the costly multiplication of encrypted variables.
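The abstract describes attention computed with addition and ReLU activation only, but does not spell out the mechanism; the exact formulation is given in the paper itself. As a rough, hypothetical sketch of the general idea, the NumPy snippet below replaces dot-product similarity with pairwise absolute-difference (Manhattan) scores and replaces the Softmax-weighted sum with a ReLU-gated sum. The specific score and aggregation rule used here are assumptions for illustration, not the authors' published equations.

import numpy as np

def addition_relu_attention(Q, K, V):
    """Illustrative attention built from additions/subtractions and ReLU only.

    NOTE: a hypothetical sketch of the idea described in the abstract,
    not the exact mechanism published in the paper.
    """
    # Z[i, j] = sum_k |Q[i, k] - K[j, k]|: a Manhattan similarity score with no
    # multiplications between activations; larger Z means less similar.
    Z = np.abs(Q[:, None, :] - K[None, :, :]).sum(axis=-1)   # shape (n_q, n_k)
    # Each value vector is "inhibited" by its score and clipped at zero by ReLU,
    # so dissimilar key/value positions contribute little or nothing.
    H = np.maximum(V[None, :, :] - Z[:, :, None], 0.0)       # shape (n_q, n_k, d)
    # Aggregate the surviving contributions over the key positions.
    return H.sum(axis=1)                                      # shape (n_q, d)

# Toy usage with random inputs.
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(4, 8)), rng.normal(size=(6, 8)), rng.normal(size=(6, 8))
print(addition_relu_attention(Q, K, V).shape)  # (4, 8)

Because the score and the gating use only subtraction, absolute value, and ReLU, a mechanism of this kind avoids multiplying activations together, which is the property the abstract highlights for quantized hardware and homomorphic encryption.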

Place, publisher, year, edition, pages
Association for the Advancement of Artificial Intelligence, 2024. Vol. 38, no. 21, p. 23445-23446
Keywords [en]
Artificial intelligence; Computational efficiency; Computational savings; Conventional transformer; Core functionality; Double precision; Homomorphic encryption; Matrix multiplication; Scaling experiments; Test sets; Transformer modeling; Cryptography
National Category
Computer and Information Sciences
Identifiers
URN: urn:nbn:se:ri:diva-72840
DOI: 10.1609/aaai.v38i21.30422
Scopus ID: 2-s2.0-85189627116
OAI: oai:DiVA.org:ri-72840
DiVA, id: diva2:1854899
Conference
38th AAAI Conference on Artificial Intelligence (AAAI 2024), Vancouver, Canada, 20-27 February 2024
Available from: 2024-04-29. Created: 2024-04-29. Last updated: 2024-04-29. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Brännvall, Rickard
