Enabling Autoregressive Models to Fill In Masked Tokens