posted on 2025-10-09, 14:47 authored by Jia Zheng, Wanjin Hou, Hua Zhang, Ming Lv, Huiyu Zhou
<p dir="ltr">Generating gray-box adversarial examples for speaker recognition systems requires an excessive number of queries to the targeted model, which makes attacks costly. In this paper, a fast algorithm for generating gray-box adversarial examples, named F-FakeBob, is proposed based on FakeBob. The algorithm introduces a threshold mechanism into the gradient optimization strategy. The gradient is recalculated for the next iteration only when the increase in the confidence score of the adversarial example before and after an optimization step is less than the threshold. Reducing the frequency of gradient calculations decreases the number of queries to the targeted system. Adversarial examples are generated in experiments on three public speech datasets: TIMIT, Common Voice, and VoxCeleb2. The targeted speaker recognition models are based on the ECAPA-TDNN and TitaNet architectures. The experimental results show that F-FakeBob can achieve a targeted attack success rate of 99.2% while reducing the number of queries needed to generate adversarial examples by 25.71% on average compared with FakeBob.</p>
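The threshold mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the NES-style gradient estimator, the sign-step update, the perturbation bound, and every name and default value (`nes_gradient`, `threshold_attack`, `step`, `eps`, `n_samples`) are assumptions made for illustration, and `score_fn` stands in for a black-box call that returns the target model's confidence score.

```python
import numpy as np


def nes_gradient(score_fn, x, rng, sigma=1e-3, n_samples=20):
    """Antithetic NES-style gradient estimate; costs n_samples queries (assumed estimator)."""
    grad = np.zeros_like(x)
    for _ in range(n_samples // 2):
        u = rng.standard_normal(x.shape)
        grad += u * (score_fn(x + sigma * u) - score_fn(x - sigma * u))
    return grad / (n_samples * sigma)


def threshold_attack(score_fn, x0, threshold, target_score,
                     step=0.002, eps=0.05, max_iter=100, n_samples=20, seed=0):
    """Return (perturbed input, final score, number of queries made to score_fn)."""
    rng = np.random.default_rng(seed)
    queries = 0

    def query(x):
        # Every call to the model is counted, including those made by the estimator.
        nonlocal queries
        queries += 1
        return score_fn(x)

    x = x0.copy()
    score = query(x)
    grad = None  # None means "estimate a fresh gradient before the next step"
    for _ in range(max_iter):
        if score >= target_score:
            break
        if grad is None:
            grad = nes_gradient(query, x, rng, n_samples=n_samples)
        # Sign step, clipped to an assumed L-infinity budget around the original input.
        x_new = np.clip(x + step * np.sign(grad), x0 - eps, x0 + eps)
        new_score = query(x_new)
        # Threshold rule: keep reusing the gradient while the score gain is large enough.
        if new_score - score < threshold:
            grad = None
        x, score = x_new, new_score
    return x, score, queries
```

In this sketch, setting `threshold=float("inf")` forces a fresh gradient estimate at every iteration, so each iteration costs `n_samples + 1` queries. A finite threshold lets one estimate serve several consecutive steps, each costing a single query, which is where the reduction in queries comes from.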
Funding
National Natural Science Foundation of China (Grant Nos. 62472047, 62072051)
Author affiliation
College of Science & Engineering
Comp' & Math' Sciences