Past Yonder

The Path is Past Yonder


AI image generators could use a hand

In an episode of the TV sitcom Seinfeld, George Costanza almost becomes a hand model after an agent notices his “beautiful” hands. This exciting career possibility is abruptly cut short after his hands meet the heat of a clothes iron, causing his modeling dreams to slip from his (charred) fingertips.

Ironically, the fictional character George Costanza is perhaps the most famous (almost) hand model. While fashion models like Cindy Crawford, Kate Moss, and Tyson Beckford rose to the level of household names, the more specialized field of hand modeling has yet to produce an equivalent star.

And that may be part of the reason why AI image generators aren’t so good at drawing human hands.

To be fair, image generators such as OpenAI’s DALL·E 3 have gotten pretty good at drawing humans after training on countless snapshots of Homo sapiens.

But hands are a small part of the human body, and a rather complicated one at that. They don’t tend to be the focus of photo shoots or selfies, so in training sets full of people, they play only a small role. In fact, many AI models have focused on faces, as that tends to be the first thing people look at. (At least the non-introverted ones.) Modern digital cameras also prioritize focusing on a subject’s face and eyes, paying little attention to extremities such as hands or feet. As the saying often attributed to Shakespeare goes, the eyes are the window to the soul.

Even when hands are featured prominently in a photo, they can be much more dynamic than, say, a nose. Each human hand consists of 27 bones, 34 muscles, and over 100 ligaments and tendons, making it a rather complex part of our anatomy. This enables hands and fingers to move and bend in countless ways. And that trips up AI systems. Consider the vastly different appearances of a hand clenched in a fist, a thumb rising up from the palm of a hitchhiker, or two people interlacing their fingers as they walk down a street. And hands may disappear entirely, slipped into pockets or the comfort of warm gloves.

AI systems don’t have an inherent understanding of human anatomy. They don’t know what directions fingers can (or can’t) bend in, nor do they have insight into the different emotions represented by various hand gestures. They focus on identifying patterns in large amounts of data, and those patterns can be elusive when it comes to hands.

Humans are also really good at noticing when something is off about a hand, making even minor errors easy to spot. Because hands are such an important part of our bodies, we’re more likely to notice an unusually long or bent finger than shoelaces that aren’t quite laced correctly.

But with each update to AI image generators, their hand-drawing skills continue to improve. This is likely due to a focus on curating training sets that include clear photos of human hands, while eliminating images in which hands are partially obscured. More accurate hands are within AI’s grasp.