SegmentPerturb: effective black-box hidden voice attack on commercial ASR systems via selective deletion
Voice control systems are becoming increasingly pervasive as they are deployed in mobile phones, smart home devices, automobiles, and other products. These systems commonly hold high privileges on the device, such as making a call or placing an order. At the same time, they are vulnerable to voice attacks, which may lead to serious consequences. This thesis proposes SegmentPerturb, a method that crafts hidden voice commands by querying the target models. The basic idea of SegmentPerturb is to split the original command audio into multiple segments and to apply a degree of perturbation to each segment, chosen by probing the target speech recognition system. Experiments on four popular commercial speech recognition APIs and one smart home device demonstrate the practicality of SegmentPerturb. The results suggest that SegmentPerturb can generate voice commands that are recognized by the machine but are hard for a human to understand.
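The segment-and-probe idea can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract does not specify the perturbation, so this sketch assumes that "selective deletion" means zeroing out samples within a segment, and that the strongest deletion level is tried first. The names `delete_samples`, `segment_perturb`, `recognize`, `segment_len`, and `max_level` are hypothetical, and `recognize` stands in for a query to any black-box speech recognition service.

```python
from typing import Callable, List, Sequence


def delete_samples(segment: Sequence[float], level: int) -> List[float]:
    """Zero out samples in a segment (assumed form of 'deletion').

    Level 0 leaves the segment unchanged; level k keeps every (k+1)-th
    sample and sets the rest to 0.0, so higher levels delete more.
    """
    if level == 0:
        return list(segment)
    return [s if i % (level + 1) == 0 else 0.0 for i, s in enumerate(segment)]


def segment_perturb(
    audio: Sequence[float],
    recognize: Callable[[List[float]], str],  # hypothetical black-box ASR query
    target: str,
    segment_len: int,
    max_level: int,
) -> List[float]:
    """Perturb each segment as strongly as the recognizer tolerates."""
    result = list(audio)
    for start in range(0, len(audio), segment_len):
        end = min(start + segment_len, len(audio))
        original = result[start:end]
        # Try the strongest deletion first, then progressively weaker ones.
        for level in range(max_level, 0, -1):
            result[start:end] = delete_samples(original, level)
            if recognize(result) == target:
                break  # keep this level for the current segment
        else:
            # No deletion level preserved the transcription; restore the segment.
            result[start:end] = original
    return result
```

In this sketch every candidate costs one query to the target, so the number of queries is at most the number of segments times `max_level`. A real attack would also need to handle audio I/O, sampling rates, and the query limits of the service, none of which are modelled here.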