University of Leicester

F2Attack: Two-Factors Scoring Method For Query-Efficient Hard-Label Black-Box Textual Adversarial Attacks

journal contribution
posted on 2025-10-03, 13:36 authored by J Wang, H Zhang, Y Wang, H Gao, Q Li, Huiyu Zhou
<p dir="ltr">In the hard-label black-box setting, existing attack methods select words for perturbation at random, which produces many invalid word replacements and a low attack success rate. Recent work alleviates this problem by evaluating the impact of each word on the model's predictions, but it does not evaluate the impact of the word on the attack itself. If the attacker replaces too many words that have a significant impact on the text semantics, the adversarial example drifts semantically from the original and the attack is easily detected. To address these issues, this paper proposes a two-factor word scoring method, which uses the attention score output by the pre-attack model and the semantic similarity after word replacement to evaluate the impact of each word on the attack. Based on this scoring method, the paper proposes a query-efficient hard-label black-box adversarial attack method called F2Attack. F2Attack scores words with the two-factor method and then, guided by the scores, replaces words that have a large impact on the model prediction but a small impact on text semantics to generate the initial adversarial example. F2Attack then applies simulated annealing to optimize the semantic similarity of the adversarial example. We conduct experiments on four representative natural language models, seven text classification datasets, two natural language inference datasets, and four commercial APIs, and compare F2Attack with baseline methods. When the number of queries is limited to 100, F2Attack increases the attack success rate by an average of 15.165% and the semantic similarity by 0.067, which is significantly better than the baseline methods.</p>
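The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the scoring formula, so combining the two factors by multiplication is an assumption, and the names `two_factor_scores`, `rank_words` and `accept_candidate` are hypothetical. The acceptance test is the standard Metropolis rule used in generic simulated annealing, and the abstract does not confirm that F2Attack uses this exact form.

```python
import math
import random
from typing import List, Sequence, Tuple


def two_factor_scores(
    attention: Sequence[float],
    similarity_after_replacement: Sequence[float],
) -> List[float]:
    """Score each word from two factors.

    attention: per-word attention score (impact on the model prediction).
    similarity_after_replacement: semantic similarity between the original
        text and the text with that word replaced (high = small semantic cost).
    ASSUMPTION: the factors are combined by a simple product.
    """
    if len(attention) != len(similarity_after_replacement):
        raise ValueError("both factors need exactly one value per word")
    return [a * s for a, s in zip(attention, similarity_after_replacement)]


def rank_words(
    words: Sequence[str],
    attention: Sequence[float],
    similarity_after_replacement: Sequence[float],
) -> List[Tuple[str, float]]:
    """Return (word, score) pairs, best replacement candidates first."""
    if len(words) != len(attention):
        raise ValueError("need exactly one attention value per word")
    scores = two_factor_scores(attention, similarity_after_replacement)
    order = sorted(range(len(words)), key=lambda i: scores[i], reverse=True)
    return [(words[i], scores[i]) for i in order]


def accept_candidate(
    current_similarity: float,
    candidate_similarity: float,
    temperature: float,
    rng: random.Random,
) -> bool:
    """Metropolis acceptance rule for maximizing semantic similarity.

    Improvements are always accepted; a worse candidate is accepted with
    probability exp(delta / temperature), which shrinks as the search cools.
    """
    delta = candidate_similarity - current_similarity
    if delta >= 0:
        return True
    return rng.random() < math.exp(delta / temperature)


if __name__ == "__main__":
    # Toy values chosen for illustration only; they are not from the paper.
    words = ["the", "movie", "was", "terrible"]
    attention = [0.05, 0.30, 0.05, 0.60]
    similarity = [0.99, 0.80, 0.98, 0.90]
    for word, score in rank_words(words, attention, similarity):
        print(f"{word}: {score:.3f}")
```

In a hard-label attack, a candidate would also have to remain adversarial (the victim model's predicted label must still differ from the original) before the acceptance rule is applied. That check consumes a model query and is left out of the sketch.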

Funding

National Natural Science Foundation of China (funder DOI: 10.13039/501100001809; Grant Numbers: 62172055, 62472047, 62502042, 62572068 and 62572073)

BUPT Excellent Ph.D. Students Foundation (Grant Number: CX2023219)

History

Author affiliation

College of Science & Engineering Comp' & Math' Sciences

Version

  • AM (Accepted Manuscript)

Published in

IEEE Transactions on Information Forensics and Security

Publisher

Institute of Electrical and Electronics Engineers

issn

1556-6013

eissn

1556-6021

Copyright date

2025

Available date

2025-10-03

Language

en

Deposited by

Professor Huiyu Zhou

Deposit date

2025-09-24
