Spike-SLR: An Energy-efficient Parallel Spiking Transformer for Event-based Sign Language Recognition


Xinxu Lin (Sichuan University), Mingxuan Liu (Tsinghua University), Kezhuo Liu (Tsinghua University), Hong Chen (Tsinghua University)
The 35th British Machine Vision Conference

Abstract

Event-based cameras are well suited to sign language recognition (SLR): they capture motion with high dynamic range, high temporal resolution, high power efficiency and low latency. Spiking Neural Networks (SNNs) are naturally suited to the asynchronous, sparse data produced by event cameras because of their spike-based, event-driven paradigm, and they consume less power than artificial neural networks. In this paper, we introduce the spiking transformer into event-based SLR by proposing a model named Spike-SLR, which includes two novel blocks: a spike soft-attention block, which enables the model to focus on regions with high spike rates and reduces the impact of noise to improve accuracy, and a parallel spike transformer block with a simplified spiking self-attention mechanism, which increases computational efficiency. On SL-Animals-DVS-4sets and SL-Animals-DVS-3sets, Spike-SLR achieves accuracies of 89.47% and 90.06%, outperforming the state-of-the-art (SOTA) model by 1.35% and 2.61%, respectively. Moreover, Spike-SLR needs only 0.03 mJ to process a sequence of event frames, a 99.27% reduction in energy consumption compared to the SOTA model. Code is available at https://github.com/Arktis2022/Spike-SLR.
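
The headline numbers above imply baseline figures that the abstract does not state directly. The short sketch below back-computes them under two assumptions that are not confirmed by the abstract: that the accuracy margins are absolute percentage points, and that the reduction is defined as 1 - (Spike-SLR energy / baseline energy). The variable names are illustrative only, and because 0.03 mJ is a rounded value, the implied baseline energy is a rough estimate rather than a reported result.

```python
# Figures quoted in the abstract (rounded as published).
spike_slr_acc = {"SL-Animals-DVS-4sets": 89.47, "SL-Animals-DVS-3sets": 90.06}
margin_pts = {"SL-Animals-DVS-4sets": 1.35, "SL-Animals-DVS-3sets": 2.61}
energy_mj = 0.03    # Spike-SLR energy per sequence of event frames, in mJ
reduction = 0.9927  # fractional reduction relative to the prior SOTA model

# Assumption: the margins are absolute percentage points, not relative gains.
implied_baseline_acc = {
    name: round(spike_slr_acc[name] - margin_pts[name], 2)
    for name in spike_slr_acc
}

# Assumption: reduction = 1 - energy_spike_slr / energy_baseline.
implied_baseline_energy_mj = energy_mj / (1.0 - reduction)

for name, acc in implied_baseline_acc.items():
    print(f"{name}: implied prior SOTA accuracy {acc:.2f}%")
print(f"implied prior SOTA energy: about {implied_baseline_energy_mj:.1f} mJ")
```

Under these assumptions the prior SOTA accuracies come out at 88.12% and 87.45%, and the prior SOTA energy at roughly 4.1 mJ per sequence. The paper itself remains the authoritative source for the baseline values.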

Citation

@inproceedings{Lin_2024_BMVC,
author    = {Xinxu Lin and Mingxuan Liu and Kezhuo Liu and Hong Chen},
title     = {Spike-SLR: An Energy-efficient Parallel Spiking Transformer for Event-based Sign Language Recognition},
booktitle = {35th British Machine Vision Conference 2024, {BMVC} 2024, Glasgow, UK, November 25-28, 2024},
publisher = {BMVA},
year      = {2024},
url       = {https://papers.bmvc2024.org/0493.pdf}
}


Copyright © 2024 The British Machine Vision Association and Society for Pattern Recognition
The British Machine Vision Conference is organised by The British Machine Vision Association and Society for Pattern Recognition. The Association is a Company limited by guarantee, No.2543446, and a non-profit-making body, registered in England and Wales as Charity No.1002307 (Registered Office: Dept. of Computer Science, Durham University, South Road, Durham, DH1 3LE, UK).
