
Is your NLP model able to prevent adversarial attacks?

Adversarial Attack

Edward Ma
3 min read · Jun 30, 2019


An adversarial attack is a way to fool a model with abnormal input. Szegedy et al. (2013) introduced the idea in the computer vision field: given a set of normal pictures, a strong image classification model classifies them correctly, yet the same model misclassifies the very same pictures once a small amount of carefully crafted (not random) noise is added.

Left: original input. Middle: the difference between left and right. Right: adversarial input. The image classification model classifies the three left inputs correctly but classifies all of the right inputs as “ostrich”. (Szegedy et al., 2013)
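
To make the idea concrete, here is a minimal sketch of crafting such a perturbation with the fast gradient sign method (FGSM). Szegedy et al. actually used an L-BFGS-based optimization; FGSM is a later, simpler technique that illustrates the same effect. The model, image, label and epsilon below are placeholder assumptions, not values from the paper.

import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, epsilon=0.007):
    # Craft an adversarial copy of `image` that looks almost identical to a
    # human but is built to raise the classifier's loss. `model` is assumed
    # to be any pretrained torch.nn.Module classifier, `image` a tensor with
    # pixel values in [0, 1], and `label` the true class index tensor.
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step every pixel slightly in the direction that increases the loss.
    perturbed = image + epsilon * image.grad.sign()
    return perturbed.clamp(0, 1).detach()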

In the natural language processing (NLP) field, we can also generate adversarial examples to see how resistant an NLP model is to adversarial attack. Pruthi et al. use character-level errors to simulate such attacks, and their word recognition model achieves a 32% relative (and 3.3% absolute) error reduction.
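
As an illustration, the snippet below simulates the kind of character-level errors Pruthi et al. study: swapping, dropping, inserting or substituting a character inside a word. The function name and the exact editing rules are my own simplification, not the authors' code.

import random
import string

def perturb_word(word):
    # Apply one random character-level edit to an internal character,
    # leaving the first and last characters untouched.
    if len(word) <= 3:
        return word
    i = random.randint(1, len(word) - 3)
    edit = random.choice(["swap", "drop", "add", "substitute"])
    if edit == "swap":
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    if edit == "drop":
        return word[:i] + word[i + 1:]
    if edit == "add":
        return word[:i] + random.choice(string.ascii_lowercase) + word[i:]
    return word[:i] + random.choice(string.ascii_lowercase) + word[i + 1:]

print(perturb_word("adversarial"))  # e.g. "advesrarial"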

Architecture

Pruthi et al. use a semi-character based RNN (ScRNN) architecture to build a word recognition model. A sequence of words is fed into the RNN, but each word is not consumed as a whole; it is split into a prefix, a body and a suffix (a small encoding sketch follows the list below):

  • Prefix: the first character
  • Body: the second character through the second-to-last character
  • Suffix: the last character
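
Here is a minimal sketch of this semi-character encoding, assuming a lowercase alphabet and plain NumPy vectors; the authors' actual ScRNN implementation may differ in its vocabulary and feature details.

import numpy as np

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
CHAR_INDEX = {c: i for i, c in enumerate(ALPHABET)}

def semi_character_vector(word):
    # Encode one word as [prefix one-hot | body character counts | suffix one-hot].
    prefix = np.zeros(len(ALPHABET))
    body = np.zeros(len(ALPHABET))
    suffix = np.zeros(len(ALPHABET))
    chars = [c for c in word.lower() if c in CHAR_INDEX]
    if not chars:
        return np.concatenate([prefix, body, suffix])
    prefix[CHAR_INDEX[chars[0]]] = 1
    suffix[CHAR_INDEX[chars[-1]]] = 1
    for c in chars[1:-1]:  # everything between the first and last character
        body[CHAR_INDEX[c]] += 1
    return np.concatenate([prefix, body, suffix])

# "cmoputer" and "computer" share the same prefix, suffix and bag of middle
# characters, so their encodings are identical and the word recognizer can
# still recover the intended word.
print(np.array_equal(semi_character_vector("cmoputer"),
                     semi_character_vector("computer")))  # True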


Published in HackerNoon.com


Written by Edward Ma

Focused on Natural Language Processing and Data Science Platform Architecture. https://makcedward.github.io/
