Generate-Text (Index of Posts):


Combinatorial Masking based Adversarial Text Generation

Title: MCTSBug: Generating Adversarial Text Sequences via Monte Carlo Tree Search and Homoglyph Attack

Paper ArxivOnline

GitHub: Coming

Preliminary Abstract

Crafting adversarial examples on discrete inputs like text sequences is fundamentally different from generating such examples for continuous inputs like images. This paper tries to answer the question: under a black-box setting, can we create adversarial examples automatically to effectively fool deep learning classifiers on texts by making imperceptible changes? Our answer is a firm yes. Previous efforts mostly replied on using gradient evidence, and they are less effective either due to finding the nearest neighbor word (wrt meaning) automatically is difficult or relying heavily on hand-crafted linguistic rules. We, instead, use Monte Carlo tree search (MCTS) for finding the most important few words to perturb and perform homoglyph attack by replacing one character in each selected word with a symbol of identical shape. Our novel algorithm, we call MCTSBug, is black-box and extremely effective at the same time. Our experimental results indicate that MCTSBug can fool deep learning classifiers at the success rates of 95% on seven large-scale benchmark datasets, by perturbing only a few characters. Surprisingly, MCTSBug, without relying on gradient information at all, is more effective than the gradient-based white-box baseline. Thanks to the nature of homoglyph attack, the generated adversarial perturbations are almost imperceptible to human eyes.

Citations

@article{gao2018mctsbug,
  title={MCTSBug: Generating Adversarial Text Sequences via Monte Carlo Tree Search and Homoglyph Attack},
  author={Gao, Ji and Lanchantin, Jack and Qi, Yanjun},
  year={2018}
}

Support or Contact

Having trouble with our tools? Please contact Ji Gao and we’ll help you sort it out.


Masking based Adversarial Text Generation

Title: Black-box Generation of Adversarial Text Sequences to Fool Deep Learning Classifiers

evadePDF

Paper Arxiv / Github

Its shorter version was Published @ 2018 IEEE Security and Privacy Workshops (SPW), co-located with the 39th IEEE Symposium on Security and Privacy.

GitHub: https://github.com/QData/deepWordBug

TalkSlide: URL

Abstract

Although various techniques have been proposed to generate adversarial samples for white-box attacks on text, little attention has been paid to a black-box attack, which is a more realistic scenario. In this paper, we present a novel algorithm, DeepWordBug, to effectively generate small text perturbations in a black-box setting that forces a deep-learning classifier to misclassify a text input. We develop novel scoring strategies to find the most important words to modify such that the deep classifier makes a wrong prediction. Simple character-level transformations are applied to the highest-ranked words in order to minimize the edit distance of the perturbation. We evaluated DeepWordBug on two real-world text datasets: Enron spam emails and IMDB movie reviews. Our experimental results indicate that DeepWordBug can reduce the classification accuracy from 99% to around 40% on Enron data and from 87% to about 26% on IMDB. Also, our experimental results strongly demonstrate that the generated adversarial sequences from a deep-learning model can similarly evade other deep models.

Citations

@INPROCEEDINGS{JiDeepWordBug18, 
author={J. Gao and J. Lanchantin and M. L. Soffa and Y. Qi}, 
booktitle={2018 IEEE Security and Privacy Workshops (SPW)}, 
title={Black-Box Generation of Adversarial Text Sequences to Evade Deep Learning Classifiers}, 
year={2018}, 
pages={50-56}, 
keywords={learning (artificial intelligence);pattern classification;program debugging;text analysis;deep learning classifiers;character-level transformations;IMDB movie reviews;Enron spam emails;real-world text datasets;scoring strategies;text input;text perturbations;DeepWordBug;black-box attack;adversarial text sequences;black-box generation;Perturbation methods;Machine learning;Task analysis;Recurrent neural networks;Prediction algorithms;Sentiment analysis;adversarial samples;black box attack;text classification;misclassification;word embedding;deep learning}, 
doi={10.1109/SPW.2018.00016}, 
month={May},}

Support or Contact

Having trouble with our tools? Please contact Ji Gao and we’ll help you sort it out.