Bing Visual Search
Search what you see with the camera on your phone.
Powered by artificial intelligence, Visual Search in the Bing app takes full advantage of Bing's vast knowledge to identify elements in a photo. Take a photo of your surroundings to explore landmarks, or to identify plants and animals. Take photos of outfits and furniture that catch your eye and be inspired by similar images and products online. You can even take action on text in your images, and scan QR codes and barcodes.
Design lead — Product and motion design
The experience was broken down into three main parts: first run, pre-capture, and post-capture.
First run
The first run experience did two things:
Asked permission for the app to access and take photos with the phone's camera.
Walked through the basic instructions on how to use the camera in the app to visually search.
The experience used a composed looping video as the background. Support copy about camera access and an action button that triggered the system permission modal were overlaid on the video.
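For context, a minimal sketch of how an action button like this might surface the system modal on iOS, assuming an AVFoundation-based build; the function name and flow are illustrative, not Bing's actual code:

```swift
import AVFoundation

// Illustrative handler for the overlay's action button: it surfaces the
// system camera-permission modal on first run. Names are hypothetical.
func requestCameraAccess(onDecision: @escaping (Bool) -> Void) {
    switch AVCaptureDevice.authorizationStatus(for: .video) {
    case .authorized:
        onDecision(true)
    case .notDetermined:
        // First run: this call presents the system permission dialog.
        AVCaptureDevice.requestAccess(for: .video) { granted in
            DispatchQueue.main.async { onDecision(granted) }
        }
    default:
        // Denied or restricted earlier; the app can only point to Settings.
        onDecision(false)
    }
}
```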
After flighting the experience, data showed that people using the feature were taking photos, but were not tapping on search results.
Based on that, the video was refocused to emphasize the search results and tapping through to a result.
Pre-capture
The pre-capture experience uses edge detection to visually represent what visual search is doing.
Edge detection applies a series of image-processing steps to find the edges of objects in the scene. The goal was a visual representation of what the camera sees.
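As a rough illustration, an edge pass like this can be approximated on iOS with Core Image's built-in CIEdges filter; the production pipeline almost certainly differs, and edgeMap here is a hypothetical helper:

```swift
import CoreImage

// A hypothetical helper approximating the pre-capture edge pass with
// Core Image's built-in CIEdges filter.
func edgeMap(for frame: CIImage, context: CIContext) -> CGImage? {
    guard let filter = CIFilter(name: "CIEdges") else { return nil }
    filter.setValue(frame, forKey: kCIInputImageKey)
    // Higher intensity exaggerates edges so the scan effect reads clearly.
    filter.setValue(4.0, forKey: kCIInputIntensityKey)
    guard let output = filter.outputImage else { return nil }
    return context.createCGImage(output, from: frame.extent)
}
```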
The design team worked with our developers to create a proof of concept.
While the developers were still coding the experience, the team quickly mocked up the proof of concept using two phones showing different feeds.
The first phone showed the regular camera feed, and the second showed the same feed with an edge detection filter applied.
After working with the devs, the design team was able to refine the visuals.
The edge detection filter was used as a matte for a pulse-ring loop animation. The effect acted like radar, making it feel like the app was scanning through the image.
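A sketch of how that matte effect could be built with Core Animation: a looping pulse ring sits inside a container layer whose mask is the edge map, so the pulse is only visible along detected edges. The layer setup and names are assumptions, not the shipped implementation.

```swift
import UIKit

// Hypothetical sketch of the radar-style scan: the edge map mattes a
// container layer, and a pulse ring loops inside it, so the pulse only
// shows through where the scene has edges.
func addScanEffect(to view: UIView, edgeMatte: CGImage) {
    // Container masked by the edge map; only edge pixels show through.
    let container = CALayer()
    container.frame = view.bounds
    let matte = CALayer()
    matte.frame = container.bounds
    matte.contents = edgeMatte
    container.mask = matte

    // A thin ring centered in the view, expanded and faded on a loop.
    let ring = CAShapeLayer()
    ring.frame = container.bounds
    let diameter = min(view.bounds.width, view.bounds.height) * 0.1
    let ringRect = CGRect(x: view.bounds.midX - diameter / 2,
                          y: view.bounds.midY - diameter / 2,
                          width: diameter, height: diameter)
    ring.path = UIBezierPath(ovalIn: ringRect).cgPath
    ring.fillColor = nil
    ring.strokeColor = UIColor.white.cgColor
    ring.lineWidth = 3

    let scale = CABasicAnimation(keyPath: "transform.scale")
    scale.fromValue = 0.2
    scale.toValue = 3.0
    let fade = CABasicAnimation(keyPath: "opacity")
    fade.fromValue = 1.0
    fade.toValue = 0.0
    let group = CAAnimationGroup()
    group.animations = [scale, fade]
    group.duration = 1.6
    group.repeatCount = .infinity
    ring.add(group, forKey: "pulse")

    container.addSublayer(ring)
    view.layer.addSublayer(container)
}
```

Because the matte masks the container rather than the ring itself, the edge map stays fixed in place while the ring animates inside it, which is what gives the scanning-through-the-scene feel.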