Action Chunking with Transformers (ACT), RSS 2023

ACT explained.

Motivation

To alleviate temporally compounding errors, ACT predicts a chunk of $k$ future actions from the current observation instead of a single action.

Explanation

A Conditional Variational Autoencoder (CVAE), with Transformers serving as both the encoder and the decoder.
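A minimal sketch of that structure, under my own assumptions (the class name `ActCVAESketch`, the layer counts, and the dimensions are illustrative, not the paper's implementation; positional embeddings and the image backbone are omitted): a Transformer encoder compresses the ground-truth action chunk into a latent $z$, and a Transformer decoder, conditioned on the observation features and $z$, predicts the $k$-step action chunk.

```python
import torch
import torch.nn as nn

class ActCVAESketch(nn.Module):
    """Rough CVAE skeleton: Transformer encoder -> z, Transformer decoder -> k actions.
    Sub-modules and sizes are illustrative, not the paper's exact architecture."""

    def __init__(self, obs_dim=512, act_dim=14, k=100, d_model=256, z_dim=32):
        super().__init__()
        self.k, self.z_dim = k, z_dim
        self.act_in = nn.Linear(act_dim, d_model)
        self.cls = nn.Parameter(torch.zeros(1, 1, d_model))          # [CLS]-style token
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers=2)
        self.to_mu_logvar = nn.Linear(d_model, 2 * z_dim)

        self.obs_in = nn.Linear(obs_dim, d_model)                    # pre-extracted obs features
        self.z_in = nn.Linear(z_dim, d_model)
        self.query = nn.Parameter(torch.zeros(1, k, d_model))        # learned action queries
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers=2)
        self.act_out = nn.Linear(d_model, act_dim)

    def encode(self, actions):
        # actions: (B, k, act_dim) ground-truth chunk -> posterior over z
        tokens = torch.cat([self.cls.expand(actions.shape[0], -1, -1),
                            self.act_in(actions)], dim=1)
        h = self.encoder(tokens)[:, 0]                               # read out the [CLS] position
        mu, logvar = self.to_mu_logvar(h).chunk(2, dim=-1)
        return mu, logvar

    def decode(self, obs_feat, z):
        # obs_feat: (B, obs_dim), z: (B, z_dim) -> predicted chunk (B, k, act_dim)
        memory = torch.stack([self.obs_in(obs_feat), self.z_in(z)], dim=1)
        out = self.decoder(self.query.expand(obs_feat.shape[0], -1, -1), memory)
        return self.act_out(out)

    def forward(self, obs_feat, actions):
        mu, logvar = self.encode(actions)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()         # reparameterization trick
        return self.decode(obs_feat, z), mu, logvar
```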

Settings

Observation $= \{\text{imgs}, \text{joints}\}$; the action is a chunk of target joint positions.
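Concretely, a single training sample might look like the following; the camera count, image resolution, and 14-dimensional joint space are my assumptions (roughly the bimanual ALOHA setup), not prescriptive.

```python
import numpy as np

k = 100  # chunk size
sample = {
    "images": np.zeros((4, 480, 640, 3), dtype=np.uint8),  # observation: multi-view RGB
    "joints": np.zeros(14, dtype=np.float32),              # observation: current joint positions
    "actions": np.zeros((k, 14), dtype=np.float32),        # target: next k joint-position commands
}
```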

Training
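As a reference for this otherwise empty section, here is a hedged sketch of the standard CVAE training step: an L1 reconstruction loss on the predicted action chunk plus a KL term pulling the posterior toward a unit Gaussian prior. The `ActCVAESketch` model comes from the sketch above, and the KL weight value is illustrative.

```python
import torch
import torch.nn.functional as F

def cvae_training_step(model, optimizer, obs_feat, gt_actions, kl_weight=10.0):
    """One training step: L1 reconstruction of the action chunk + KL(q(z|.) || N(0, I))."""
    pred_actions, mu, logvar = model(obs_feat, gt_actions)
    recon = F.l1_loss(pred_actions, gt_actions)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    loss = recon + kl_weight * kl
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```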

Inference

Manually set $K$ (the chunk size); at each timestep, the $K$ overlapping predictions for the current action are ensembled with an exponentially weighted running average (to tackle compounding errors).
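A minimal numpy sketch of that temporal ensemble; the `chunk_history` buffer layout and the function name are my assumptions, while the weighting $w_i = \exp(-m \cdot i)$ with $i = 0$ for the oldest prediction follows the scheme described in the paper.

```python
import numpy as np

def temporal_ensemble(chunk_history, t, k, m=0.01):
    """Ensemble every action prediction that covers timestep t.

    chunk_history: dict {s: chunk}, where chunk has shape (k, act_dim) and was
                   predicted at timestep s, covering timesteps s .. s+k-1.
    Weights are w_i = exp(-m * i) with i = 0 for the oldest prediction; m
    controls how quickly newer observations are incorporated.
    """
    covering = sorted(s for s in chunk_history if s <= t < s + k)     # oldest prediction first
    candidates = np.stack([chunk_history[s][t - s] for s in covering])
    weights = np.exp(-m * np.arange(len(covering)))
    return np.average(candidates, axis=0, weights=weights / weights.sum())
```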

Questions

We see that the condition is not the same across the encoder and the decoder, even though the paper claims to use a "Conditional VAE". Strictly speaking, that is a mathematically inconsistent setup in the first place, and that is before even discussing the [CLS] and [POS_EMD] auxiliary inputs.

Thoughts & Comments

New Ideas

  1. Transformer as CVAE encoder and decoder
  2. Temporal ensembling over $K$ overlapping predictions

Comments

I assume the authors would have preferred to keep the condition the same; perhaps, as the ablation experiments went on, the variant with the mismatched condition simply performed better than the consistent one?