Facebook AI Gets Better at Describing Photos for Visually Impaired Users

The social network rolled out an update to its automatic alternative text (AAT) technology.

By Stephanie Mlot | Jan 20, 2021

Rafael Henrique | SOPA Images | LightRocket | Getty Images via PCMag

This story originally appeared on PCMag

In an effort to better accommodate users who are blind or visually impaired, Facebook this week updated its automatic alternative text (AAT) technology.

The feature, introduced in 2016 (and granted the Helen Keller Achievement Award from the American Foundation for the Blind in 2018), relies on object recognition to generate descriptions of photos on demand.

Blind and visually impaired (BVI) users have long relied on other people to tag images with alternative text, or on screen readers to mechanically describe pictures in their News Feed. The next generation of Facebook’s AAT, however, aims to make scrolling through social media more enjoyable by describing more photos in greater detail.

“The latest iteration … represents multiple technological advances that improve the photo experience for our users,” according to a Facebook AI blog post. The team expanded tenfold the number of concepts AAT can reliably detect and identify, promising more photos with more detailed descriptions, including activities, landmarks, types of animals, and more.

If someone navigating their feed, for instance, stops at a photo of friends posing in front of a famous Italian tourist attraction, the audio caption might say something like “May be a selfie of two people, outdoors, the Leaning Tower of Pisa.”

Image Credit: Facebook via PCMag

In an apparent industry first, Facebook even makes it possible to include details of positional location and relative size of elements in a picture. So instead of describing the contents as “May be an image of five people,” the site can specify that there are two people in the center and three on the sides. Or, rather than describing a landscape with “May be a house and a mountain,” it can determine that the mountain is the primary object based on its relative size.
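To make the idea concrete, here is a minimal Python sketch of how position and relative size could be derived from object detections. It is not Facebook's implementation: the `Detection` class, the bounding-box format, the left/center/right split into thirds, and the "largest area wins" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Detection:
    """Hypothetical detection; box edges are fractions of image width/height."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    @property
    def area(self) -> float:
        # Share of the whole image covered by this element.
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)

    @property
    def center_x(self) -> float:
        return (self.x_min + self.x_max) / 2


def horizontal_region(det: Detection) -> str:
    """Bucket a detection into thirds of the image (an assumed convention)."""
    if det.center_x < 1 / 3:
        return "left"
    if det.center_x > 2 / 3:
        return "right"
    return "center"


def count_by_region(detections: List[Detection], label: str) -> Dict[str, int]:
    """Count how many elements with the given label fall in each region."""
    counts = {"left": 0, "center": 0, "right": 0}
    for det in detections:
        if det.label == label:
            counts[horizontal_region(det)] += 1
    return counts


def primary_object(detections: List[Detection]) -> Detection:
    """Assume the element covering the largest area is the primary object."""
    return max(detections, key=lambda d: d.area)


# Made-up boxes for a landscape with a small house and a large mountain.
landscape = [
    Detection("house", 0.05, 0.60, 0.30, 0.90),
    Detection("mountain", 0.10, 0.05, 0.95, 0.70),
]

# Made-up boxes for a group photo of five people.
group = [
    Detection("person", 0.05, 0.30, 0.15, 0.90),
    Detection("person", 0.40, 0.30, 0.50, 0.90),
    Detection("person", 0.50, 0.30, 0.60, 0.90),
    Detection("person", 0.80, 0.30, 0.90, 0.90),
    Detection("person", 0.85, 0.30, 0.95, 0.90),
]

print(primary_object(landscape).label)
print(count_by_region(group, "person"))
```

With these made-up boxes, the mountain covers far more of the frame than the house, so it is picked as the primary object, and the group photo splits into two people in the center and three on the sides. A production system would presumably rely on model outputs and far more careful rules than this.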

“Taken together, these advancements help users who are blind or visually impaired better understand what’s in photos posted by their family and friends—and in their own photos—by providing more (and more detailed) information,” the blog said.

When it launched nearly five years ago, the first version of AAT used human-labeled data to train a neural network; the completed model could recognize 100 common concepts like “tree,” “mountain,” and “outdoors,” and identify faces (with opt-in consent). “But we knew there was more that AAT could do,” Facebook said, “and the next logical step was to expand the number of recognizable objects and how we describe them.”

Now trained on weakly supervised data in the form of billions of public Instagram images and their hashtags, automatic alternative text is more accurate and culturally and demographically inclusive, able to perceive more than 1,200 concepts. “We want to give our users who are blind or visually impaired as much information as possible about a photo’s contents—but only correct information,” the company added.

Facebook subsidiary Instagram in 2018 took steps to become more accessible, embracing object recognition technology that automatically identifies items in a photo and creates an audible description. Users are also encouraged to write up to 100 characters of alt text detailing what’s in their images.
