SPOT: The Future of Accessibility

Katie Silverman
8 min read · Oct 17, 2020

When I was eight years old, I went into my vision checkup hoping that I would get glasses. It was unlikely, as my vision had always been good. However, when I covered my left eye, suddenly I couldn’t see… anything. I don’t know how I hadn’t previously realized it, but through my right, the world was a soupy blur of gray with a few scattered bright colors.

Being semi-cycloptic has posed its challenges, to be sure. Even with my super-powerful contact lens, I have no depth perception, making walking into objects a legitimate (though often hilarious for bystanders) risk, sports a joke, and driving a death sentence. However, as I navigate my two-dimensional world, I cannot even imagine the difficulties faced by those with little or no vision at all.

Across the world, 285 million people are visually impaired, of whom 39 million are blind. 18% of legally blind participants in a survey conducted by UC Santa Cruz experience head-level accidents (knocking their head into an unexpected object) more than once a month, and 23% of these accidents have medical consequences, requiring treatments including stitches, staples, plastic surgery, and dental treatment for broken teeth. 10% of legally blind participants trip and fall more than once a month, with 36% of falls having medical consequences. Participants said that they had required treatments ranging from stitches to orthopedic surgery to rehabilitation.

Overall, the world is a dangerous and even scary place, especially for someone who is visually impaired. This is where the challenge lies: how can we make it as easy as possible for people with visual impairments to navigate the world around them, so that the risks are close to or even the same as if they had no impairment at all?


This is where SPOT comes in. Our goal is to use artificial intelligence to improve accessibility for people with visual impairments, and to help every individual reach their maximum potential regardless of ability. While technologies such as the Envision smart glasses (built on Google Glass hardware) are already available for preorder, we want to develop more complete software that will improve the experiences of visually impaired people. More traditional aids, such as guide dogs and canes, are not infallible either, and are used by only 2–8% of the blind community.


SPOT glasses offer:

  • Object recognition
  • Face recognition
  • Fast decision making
  • GPS Navigation
  • Text-to-speech (including handwriting)
  • Haptic notifications (vibration)
  • A discreet camera
  • Bone conduction
  • Transition lenses to protect the user from harmful UV rays

But how does it work?


A visualization of how YOLO detects objects in real-time

YOLO (You Only Look Once) is a real-time object detection algorithm, and one of the most effective available. Many image-analysis systems use classification algorithms, which can only determine what objects are present; YOLO can also tell where those objects are. Classification algorithms also detect only one object at a time, while YOLO runs predictions on every object in its field of view in a single pass through its neural network (hence "only look once"), making it extremely fast.
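The single-pass idea can be sketched with a toy decoder. Assume a network has already produced a grid of per-cell predictions (confidence, class, box center and size); one loop over the grid then yields every detection, with its location, at once. The grid shape and field layout below are illustrative only, not YOLO's actual output format.

```python
# Toy illustration of YOLO-style single-pass decoding (not the real format).
# Each grid cell predicts one box: (confidence, class_name, cx, cy, w, h),
# with coordinates given as fractions of the whole image in [0, 1].

def decode_grid(grid, conf_threshold=0.5):
    """One pass over the grid yields every confident detection at once."""
    detections = []
    for row in grid:
        for conf, cls, cx, cy, w, h in row:
            if conf >= conf_threshold:
                # Convert center/size to corner coordinates (x1, y1, x2, y2).
                box = (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
                detections.append((cls, conf, box))
    return detections

# A 2x2 "image": one confident crosswalk, one confident car, two empty cells.
grid = [
    [(0.9, "crosswalk", 0.25, 0.75, 0.4, 0.2), (0.1, "none", 0, 0, 0, 0)],
    [(0.05, "none", 0, 0, 0, 0), (0.8, "car", 0.7, 0.3, 0.3, 0.3)],
]
print(decode_grid(grid))  # two detections, with locations as well as labels
```

The point of the sketch is that every object falls out of one traversal of the network's output, which is why YOLO can run in real time while a classifier would need a separate pass per object.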

YOLO allows SPOT users to get a complete idea of their surroundings very quickly, including signs, obstacles, an object they’re looking for (such as a door or their mug), or even a friend’s face. This is vital to performing certain tasks, such as crossing the street, safely — SPOT is not only useful in determining whether a light is red or green, but in locating a crosswalk, or helping the user to “look both ways.”

This will improve all areas of life for a user with a visual impairment, from being able to locate things in their home to being able to “look” at people when talking to them, an important social and professional skill.


A simple example of reinforcement learning

Part of what makes SPOT different from existing technologies is that it is trained through a type of machine learning called reinforcement learning. SPOT stands for Simulation-Powered Optic Technology, and as the name suggests, SPOT learns in a simulated environment, where it is rewarded for making good choices (e.g., correctly identifying an object or crossing the street safely). This means that in addition to recognizing its environment, it knows the optimal reaction. This is similar to how self-driving cars are being trained.

A Voyage Deepdrive Simulation

Voyage Deepdrive, an open-source deep-reinforcement-learning simulator for self-driving cars, uses a similar approach. In Deepdrive, the AI is trained on simulated streets and learns how to behave under given conditions through a system of rewards and punishments, so that when it is paired with image-recognition software it already knows what to do.

SPOT relies on this synthesis of reinforcement learning and object recognition. SPOT is trained to recognize city streets for increased pedestrian safety, and can even be mapped to a specific user’s home.

One simple event that would require reinforcement learning is walking up and down a staircase. Ordinarily, there might be a risk of tripping, but by coupling image-recognition software with reinforcement learning, SPOT can recognize not only how many stairs there are, but also when the best time to step is.
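The staircase scenario can be reduced to a toy reinforcement-learning problem: a tabular Q-learning agent on a one-dimensional staircase, rewarded for advancing and rewarded heavily for reaching the bottom. Everything here (state count, rewards, hyperparameters) is an illustrative sketch of the technique, not SPOT's actual training setup.

```python
import random

# Toy tabular Q-learning: an agent learns to descend a 4-step staircase.
# States are stair indices; actions are "step" (advance) or "wait".
# Stepping earns a small reward and reaching the bottom a large one,
# so the learned policy is to keep stepping.

random.seed(0)
N_STEPS, ACTIONS = 4, ("step", "wait")
Q = {(s, a): 0.0 for s in range(N_STEPS + 1) for a in ACTIONS}
alpha, gamma, epsilon = 0.5, 0.9, 0.1  # learning rate, discount, exploration

def env(state, action):
    """Return (next_state, reward) for the toy staircase."""
    if action == "step":
        nxt = state + 1
        return nxt, (10.0 if nxt == N_STEPS else 1.0)
    return state, 0.0  # waiting makes no progress

for _ in range(500):  # training episodes
    s = 0
    while s < N_STEPS:
        # Epsilon-greedy: usually exploit the best-known action, sometimes explore.
        a = random.choice(ACTIONS) if random.random() < epsilon else \
            max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r = env(s, a)
        # Q-learning update: nudge Q toward reward + discounted best future value.
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])
        s = s2

policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STEPS)]
print(policy)  # the agent learns to "step" at every stair
```

The reward signal does the teaching: nothing in the code says "stairs are for stepping," yet the agent converges on that behavior, which is the core idea behind training SPOT in simulation.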

Optical Character Recognition (OCR)

SPOT glasses also offer Optical Character Recognition, which can interpret and read out characters and words, including handwritten ones. This feature allows users to "read," as the SPOT glasses can perform a text-to-speech function. This is especially important because fewer than 10% of legally blind individuals in the U.S. can read Braille, and books in Braille are in short supply — not to mention the everyday text that simply must be read, such as street signs.
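Real OCR engines run neural networks over scanned images, but the underlying principle — compare an unknown glyph against known character shapes and pick the closest match — can be sketched with tiny bitmaps and a pixel-mismatch count. The 5×3 templates below are purely illustrative.

```python
# Toy sketch of the OCR matching principle: classify an unknown glyph by
# finding the stored character template with the fewest mismatched pixels.
# '#' means ink and '.' means blank in each 5x3 bitmap.

TEMPLATES = {
    "T": ["###",
          ".#.",
          ".#.",
          ".#.",
          ".#."],
    "L": ["#..",
          "#..",
          "#..",
          "#..",
          "###"],
    "O": ["###",
          "#.#",
          "#.#",
          "#.#",
          "###"],
}

def hamming(a, b):
    """Count mismatched pixels between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def recognize(glyph):
    """Return the template character closest to the glyph."""
    return min(TEMPLATES, key=lambda ch: hamming(TEMPLATES[ch], glyph))

# A slightly smudged "L" (one stray pixel) is still recognized correctly.
smudged_L = ["#..",
             "#..",
             "##.",
             "#..",
             "###"]
print(recognize(smudged_L))  # → "L"
```

Tolerance to the stray pixel is the key property: handwriting never matches a template exactly, so recognition has to be a nearest-match problem rather than an exact lookup.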

This will also have major impacts on both work and education, allowing people with visual impairments to hold jobs that require reading handwritten or printed text and allowing blind students to be better integrated into a traditional classroom setting.

Faster Reaction Times

A comparison of human and AI braking speeds

A well-trained AI can respond to danger faster than a human (or a guide dog), but the human user still needs time to process the alert. On average, it takes humans 0.17 seconds to respond to an auditory stimulus, such as the SPOT glasses saying "Stop" when a car is nearby. This reaction time can be cut further by features that use the user's other senses: a vibration, for example, has a slightly shorter reaction time of 0.15 seconds, a small but crucial difference in a potentially life-threatening situation.
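The stakes of those two hundredths of a second become concrete with a little arithmetic. Using the 0.17 s and 0.15 s figures above and an assumed city-driving speed of roughly 50 km/h (the speeds here are illustrative, not from the article):

```python
# Rough arithmetic on the reaction times quoted above.
AUDITORY_S = 0.17  # seconds to react to a spoken "Stop"
HAPTIC_S = 0.15    # seconds to react to a vibration

def distance_m(speed_mps, reaction_s):
    """Distance covered before the user can begin to react."""
    return speed_mps * reaction_s

car = 13.9  # m/s, roughly 50 km/h — an assumed city-driving speed
saved = distance_m(car, AUDITORY_S) - distance_m(car, HAPTIC_S)
print(f"{saved * 100:.0f} cm")  # ~28 cm less closing distance with haptics
```

Roughly a quarter of a meter of extra margin from a vibration instead of a spoken cue — small on paper, but meaningful at a curb's edge.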


Snapchat’s “Spectacles” look trendy and fashionable, yet still pack 3D cameras and onboard electronics!

Another important feature of SPOT glasses is that they are discreet. Many existing devices have large, bulky cameras; you might as well strap a GoPro to your head. SPOT glasses have better-concealed cameras so that users can feel comfortable using the product without announcing their disability to the world. This is not only a matter of aesthetics but of safety: a person who is visibly visually impaired may be targeted for theft or exploitation, and in the United States, people with disabilities are three times as likely to be victims of serious violence. Glasses that draw less attention to themselves may also spare young wearers from bullying.

Rather than an external speaker, SPOT glasses use bone conduction, which bypasses the eardrum and sends sound waves directly through the skin and skull, making instructions sound like a voice inside the user’s head. This also makes the device accessible to many hearing-impaired users, since hearing loss frequently stems from eardrum damage, which does not interfere with bone-conducted sound.

Affordability and Access

One of the biggest reasons SPOT is necessary is that current technologies are simply not affordable for those who need them most. Envision glasses cost two thousand dollars, and the Envision app is priced at fourteen dollars per month; products such as the OrCam MyEye, meanwhile, are not available in the United States unless you are an eligible veteran. Sight is a human right, which is why one of our most important goals is to make SPOT as accessible as possible.

Because SPOT glasses are designed to help people with visual impairments, not to be a cool gadget, they do not need features such as Bluetooth pairing, app ecosystems, or a general-purpose operating system. While more expensive models could offer those options, the base model includes only a frame, a small camera, a bone-conduction headset, and a small onboard computer. This drastically reduces the cost of both making and purchasing the glasses.


For people with disabilities, the world can be difficult to navigate in ways that many of us take for granted. However, we are living in an age where technology can be used to close the accessibility gap between those living with and without disabilities. SPOT aims to make use of emerging AI technologies to benefit real people in a tangible way. Here’s a quick reminder of what makes SPOT different:

  • YOLO image recognition
  • Simulation-powered reinforcement learning
  • Optical Character Recognition
  • Faster reaction times
  • A sleek, subtle design
  • Bone conduction
  • An affordable model that’s accessible to those who need it the most

Thank you for reading! We hope that this technology can soon be a reality. For more information, please check out our website.



Katie Silverman

17-year-old human-longevity researcher, actress, songwriter, TKS Innovator, and marshmallow enthusiast