A team of researchers at UC Berkeley has developed an AI that can copy one person's movements onto another. As the title of their paper, "Everybody Dance Now", suggests, the obvious thing to do is make videos of people dancing.
Their AI takes a 'source' video of a person, in this case professional dancers, and transfers their movements onto a 'target' video. Both people are reduced to 2D stick figures, which can be rescaled and repositioned to match the target's place in the frame. Then the music starts and the AI generates new frames of the target dancing.
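The repositioning step can be pictured as a simple scale-and-shift of the source stick figure's keypoints so that its height and ground position match the target's. This is only an illustrative sketch, not the researchers' actual implementation; the function name and parameters are hypothetical:

```python
def normalize_pose(source_pose, src_ankle_y, src_height, tgt_ankle_y, tgt_height):
    """Illustrative sketch (hypothetical, not the paper's exact method):
    rescale a source stick figure's (x, y) keypoints so its height matches
    the target's, and shift it so its feet land where the target stands."""
    scale = tgt_height / src_height
    return [
        # Scale x; for y, measure each joint's offset above the source's
        # ankles, rescale it, then hang it from the target's ankle line.
        (x * scale, tgt_ankle_y - (src_ankle_y - y) * scale)
        for (x, y) in source_pose
    ]

# A source figure 200px tall with ankles at y=400, mapped onto a target
# 100px tall with ankles at y=300 (y grows downward, as in image coords):
keypoints = [(100, 400), (100, 200)]  # ankle, top of head
print(normalize_pose(keypoints, src_ankle_y=400, src_height=200,
                     tgt_ankle_y=300, tgt_height=100))
# → [(50.0, 300.0), (50.0, 200.0)]
```

With the stick figure aligned like this, the generator only ever has to draw the target at a plausible size and position, which is part of why the results look as convincing as they do.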
This is akin to "deepfakes", the practice of superimposing existing images or video onto source footage, which so far has mostly been showcased as manipulations of facial expressions, some more impressive than others.
While you can tell in most of their brief YouTube demonstrations that the targets aren't really dancing, this is a real advance in AI technology, because the AI never needs to see the target perform the dance beforehand. This means that, at least in theory, even a video shot on a smartphone could be enough for the AI to make the person in it do whatever it wants.
I can’t wait for the Snapchat filter – but it does raise some serious concerns.
With a few years of improvement, this AI could turn a video of a mundane shopping trip into one of a robbery, and that's a scary thought. Reassuringly, the AI can't yet cope with substantially different camera angles or limb lengths. In the meantime, let's just enjoy all the groovy videos that'll inevitably swamp social media as soon as this goes mainstream.