Project Diversita: When Machine Learning Meets User Experience

 

In this two-quarter graduate school project, I explored how to harness Machine Learning to fulfill requirements that were tricky for previous generations of technology, with Human-Centered Design and Engineering in mind.

Background

Project Diversita was a Microsoft-sponsored “Launch Project” at the University of Washington GIX in 2018 under the AI for Earth initiative. It aimed to empower biodiversity research by bringing Machine Learning (ML) to edge devices. The ML model had been trained by Microsoft Research on the iNaturalist dataset.

I had received training in Machine Learning and sensors; however, my role in this project was mainly product design. The challenge was to explore the potential market space and find the niche that best fit our technology backbone.

Time Span

Jun. 27, 2018–Dec. 07, 2018

About the Team

A ten-person team consisting of UW GIX students (Phelps Xia, Hal Zhang, Ben Keller, and me) and Microsoft researchers (Dan Morris, Lucas Joppa) and engineers (Wee-Hyong Tok, Siyu Yang, Erika Menezes, Xiaoyong Zhu).

Role in the Project

Product Designer (UX, ID), Product Manager, User Researcher

General Objectives
  • Study the market to identify a niche camera-trap scenario to disrupt with Microsoft’s AI capabilities.
  • Ideate concepts that are desirable to customers, viable for the business, and feasible for engineering.
  • Develop working prototypes to prove the concept.

Keywords

Machine Learning, Edge Computing, Raspberry Pi, IoT, computer vision


1. Definition of the Product

Original Input from Microsoft

In the kick-off meeting on the Microsoft campus, the project team from the Microsoft side presented the following requirements as input:

  • Consumer-facing: The final product should be for consumers, able to disrupt the camera trap market;
  • Mobility: The final product should be easy to carry;
  • Connectivity: The final product should have connectivity to other devices or the cloud.

To understand the subject matter better, our team studied the case in three sessions:

  1. Competitive Analysis;
  2. User Research: Interview;
  3. User Research: Survey.

1.1. Competitive Analysis

This study aimed to answer three research questions and extract insights to inspire and guide the conceptual design. The insights were derived from findings gathered across the competitor products.

The research questions were:

  • How do the competitor products fit with the specific use case?
  • What are the proper interaction models for an imaging product?
  • What are good approaches for developing our product?

Identified Competitors

The “competitors” include both direct and indirect competitors. Direct competitors are products likely to compete with our prospective product, while indirect competitors are products that share similarities with it along specific dimensions. In this study, an “indirect competitor” need not be able to substitute for this project’s prospective solution. Indirect competitors selected for this study include products or services that share similar functionality (for instance, image capturing) or could coexist in the same scenario (such as binoculars or wildlife tracking systems).

Metrics for Observation

When going through the competitor products, team members took notes on their findings across three aspects of each product: 1. the business side; 2. the interaction experience; 3. the development approach.

Insights Gained from the Competitive Analysis

Four team members contributed 84 findings. Of these, 44 concerned the use case/business model or the user interaction experience, while the rest concerned engineering solutions. Based on the findings, we extracted the following insights:

  1. Platform as a Service. — The product can be a general-purpose platform for different services. When the devices are networked, their synergy can generate extra value, which can be packaged and sold as a service.
  2. Taking differentiators as the advantage. — Low cost, mobility, no need for connectivity, privacy. Machine Learning on edge devices can be designed solely for specific use cases, which makes them low cost compared with smartphones. Compared with ML in the cloud, it saves on subscription fees and connectivity expenses.
  3. Using a tiering strategy to boost the market size. — Offer low-cost MVPs to the market to stimulate demand by allowing more trial purchases.
  4. Seeking cooperation with communities to disrupt specific niche markets. — Whether it is a professional or a residential community, working closely with it to solve its unique problems is an effective way to gain a foothold in the market.
  5. Enabling others. — Aside from being a standalone product, it is also viable for the product to be a module that empowers other products to fulfill similar functionality.
  6. Proactively serving and listening to the users. — Users’ feedback is valuable input for iteration and improvement. Predict users’ frustrations and prepare well-structured resources to help them.
  7. Made for sharing. — Sharing is human nature. Enabling users to express themselves freely can both fulfill their desire and create impact beyond the individual user.
  8. Delivering what users care about. — Users care about the result; they do not care about the algorithms or engineering effort behind the enclosure. Avoid jargon and unnecessary configuration dialogs.
  9. Form follows function. — Logical design is both functional and aesthetically appealing. By adopting a logical industrial design, the same functionality can be achieved at a lower cost, leaving more room to create value for customers.
  10. Using the phone as the interface. — Smartphone screens offer rich interactivity. Moving specific operations to the smartphone can lower the device cost and enable a better interaction experience.
  11. Redundancy brings reliability. — Using multiple sets of components for the same purpose can enhance the system’s performance and make it more robust.
  12. Ready for expansion. — Leaving open ends for specific modules makes the product capable of adapting to different scenarios with minor effort.

The Conclusion of the Competitive Analysis

Through the Competitive Analysis, we explored the competitive space for Project Diversita. Based on the insights we extracted, the prospective product should leverage the differentiators of ML on edge devices to compete with smartphones and other competitors. By cooperating with communities, the product can grow its market by disrupting niche markets. In the long term, it should be ready for expansion into more scenarios; the device nodes could form a platform as a whole to offer services. When designing the product, we should keep user-centric principles in mind: keep listening to users’ voices and deliver what motivates them to engage in biodiversity conservation.

1.2.1. User Research: Interview

The research questions for User Research were:

  • In what scenarios do people want to identify wildlife animals near them?
  • What perspectives do different stakeholder groups hold about the natural world, and how do these viewpoints influence their behavior and purchases?
  • What are the technical capabilities and limitations of the current ML-on-edge setup?

Interview

In one of the user interview sessions, the participant was an active hiker and world traveller. She showed us her tools for species recognition and shared her opinion on design’s impact on the environment.

We interviewed both Subject Matter Experts (SMEs) and potential users (people who have a backyard and are willing to interact with animals). We processed the interview data with an affinity map. Key notes were aggregated into five clusters — Scenario, Feature, User Property, Motivation, Concern. Based on these five affinity groups, we identified Barriers, Motivations, User Requests, and Other Directions for this project.

Interview data processing.
Barriers:
  1. Lack of variation in encountered species.– Fewer than ten species were mentioned in the user interviews, and fewer than 20 in the 100-sample survey.
  2. The accuracy of the recognition result.– SME3*: “Researchers require high accuracy (98%) when publishing in peer-reviewed papers.”
  3. Maintenance.– SME3: “10-20% of cameras break or are stolen every year.”
  4. Ethical question.– Is it good to increase interaction between humans and animals?
Motivations:
  1. Curiosity + Engagement.– U1*: ​“I want to know what has come to the bird feeder.” “I need some actionable piece of information.”
  2. Safety + Security.– U2: ​“When a coyote comes, alert each other to aware.”
  3. Social network.– U2: ​“I’ll text neighbors when elks come to the wetland in front of our house.”
User Requests:
  1. Voice-based recognition.– U2: ​“Bird photos are hard to capture, but it’s easy to hear.”
  2. Flora recognition.– U1: ​“I do mushroom. Some people feel it intimidating because it’s hard to identify whether it’s edible.”
  3. Footprint recognition.– U2: ​“animal prints in a snowy environment just beside the trail, easy to recognize.”
  4. More than recognition.– U1: ​“I prefer to know how to improve, instead of just identification of species.”
Other Directions:
  • Snorkeling.– When snorkeling, people have a great chance of encountering multiple species in a short time. Moreover, no connectivity is available underwater, so ML on edge can leverage its advantages to differentiate itself from other computing platforms.
  • Conservation Research and Preservation
    • Invasive species removal;
    • Radio telemetry collaring;
    • Genetic sampling.
  • Protected Area Management
    • Monitoring enclosures;
    • Urban development ecological impact assessment.

Note: SMEx = Subject matter expert No. x; Ux = User No. x.

1.2.2. User Research: Survey

The team collected quantitative data from participants who own a house — 120 samples in total — and analyzed the results to distill insights about interest in animal-identification capability across different scenarios. The results reflect the common characteristics of the respondents who showed the most interest in the application: homeowners with a backyard who own pets, especially cats and dogs. These insights guided the team’s conceptual design and direction selection in the next phase.

The research questions were as follows:

  • In what scenarios do people want to identify wildlife animals near them?
  • For what reasons do people want to buy a device to monitor their backyard?
  • From the user’s perspective, how could the edge computing device benefit them?

Survey Tools

Google Form, Amazon Mechanical Turk, Tableau.

Analysis Methodology

We analyzed the data from two angles. First, we used the quantitative data to narrow our scope and support the persona-making process, and to estimate the potential user base for different scenarios as part of our market research into the biggest possible market. Second, we analyzed all the open-ended questions to draw inspiration and ideas from the respondents, which helped us better understand the possible scenarios and find evidence for or against our hypotheses.

Insights from the survey:

  1. The potential market for backyard monitoring could be massive.
  2. Consumers would consider the device more of an entertainment device.
  3. People monitor their backyards mainly for security purposes.
  4. People who have pets are more likely to purchase the device.
  5. People care about engagement, not technology.

1.3. Concept Development

I facilitated a brainstorming session for concept creation. There were two sets of constraints for the ideation: technical limitations and non-technical constraints. The technical limitations: 1. the Machine Learning model can only recognize land animals; 2. the device requires mobility, so we needed to balance performance against power consumption; 3. as a project input, it must have connectivity. The non-technical constraints: 1. the device should be consumer-facing; 2. we would have only seven weeks to work on the project.

Ideation

With these constraints in mind, the team came up with the following ideas:

  1. Neighborhood Curiosity: A device to share the animals it captures with the community, giving people a space to share the experience of witnessing wildlife in their area.
  2. Little Friends’ Guard: Use the device to capture moments when a kid is playing with home pets. When dangerous animals are captured, warn the parents about the safety threat to their kids or pets.
  3. Camping Discovery: A device to place outside the tent or near the campsite to capture animals passing by.
  4. Window to Nature: A disabled person wants to experience nature. They can set up the device far away in the forest and view photos of the animals passing by from home.
  5. Predator Watch: As a neighborhood watch system, many devices work together to monitor the yards, roads, and green belts of the neighborhood and watch out for dangerous animals. It alerts residents when predators appear so that they can avoid them, stay home, and stay safe.
  6. Camping Guard: Set up on the tent or by the food storage, it watches for predators while you sleep and records exciting animal encounters during your camping trip.
  7. Plant and Animal Mobile Wiki: Most of the time when you travel to an area with exotic species, you don’t have cell coverage. It would be great to get a species’ name and info while taking a stunning picture of it.
  8. Smart GoPro: Help users capture moments with animals around them that they don’t recognize.
  9. Pet Tracker: Help users track their pets’ activities in the backyard.
  10. Predator Alarm: Warn users of approaching predators in the neighborhood.
  11. Conservation Platform: Networked camera traps that enable a new way of studying and conserving nature.
  12. AI Bird Feeder: Count and classify birds that arrive at your feeder to learn how best to encourage local biodiversity.
  13. AI Trail Cam for Hunters: Help hunters better understand deer populations to know which animals to remove selectively.

We synthesized these ideas and figured out three concepts:

  • Concept I: Curiosity — Utilize ML on edge to satisfy people’s curiosity and serve educational purposes. Example scenario: go camping with the device, set it up, and go enjoy nature; come back and check the captured images.
  • Concept II: Security — Identify species that can threaten kids’ or pets’ safety and issue warnings when necessary. Example scenario: deploy the device where needed and configure the species to monitor; if a target species is detected, a notification is sent to the user.
  • Concept III: Platform — Deploy a network of devices to enable new methods for conservation research and management. Example scenarios: protected-area management, invasive-species response, enclosure security, climate monitoring.

Solution Description/Work Breakdown Structure


There should be three pillars in the system:

  • Device (IoT kit): The kit is deployed wherever the user needs it. It scans the field; when an animal passes by, it wakes the camera and takes a photo. The photo is processed locally on the device. If the on-edge ML model determines there is an animal in the image, the device sends the image to the cloud service over its wireless connection.
  • Azure services: When the cloud service receives the image, the on-cloud ML model analyzes it to extract species information. As image data from the kits accumulates, it can be used to evolve the ML model. After the image receives its species-recognition result as part of its metadata, the notification service on the cloud determines whether the image should trigger a notification based on the user’s settings.
  • App console: The app is the portal through which the user interacts with the IoT kit and cloud services. The user can configure settings for both; for example, the user can have the system send a notification when the IoT kit detects a coyote. The app is a content-distribution channel: users can view the positive images captured by the device. It is also a social platform: multiple users can share the data from the same device or device network and discuss the content. Moreover, the app is a feedback channel for recognition quality — users can rate the results and provide correct species names. Due to time and resource constraints, the app was developed for a mobile web browser instead of as a native app.
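
The device-side flow can be sketched in a few lines of Python. This is only an illustration of the gating logic described above; the function names, the injected hardware callables, and the 0.6 confidence threshold are all my assumptions, not the project's actual code.

```python
# Hypothetical sketch of the edge-side gating logic: a motion event wakes
# the camera, the on-edge model scores the frame, and only likely-animal
# frames are handed off for upload to the cloud.

ANIMAL_THRESHOLD = 0.6  # assumed confidence cutoff; would be tuned per deployment


def should_upload(confidence: float, threshold: float = ANIMAL_THRESHOLD) -> bool:
    """Upload only frames the on-edge model believes contain an animal."""
    return confidence >= threshold


def handle_motion_event(capture_frame, run_local_model, upload) -> bool:
    """One pass of the capture loop; hardware calls are injected as callables."""
    frame = capture_frame()              # e.g. a still capture from the camera
    confidence = run_local_model(frame)  # on-edge ML inference
    if should_upload(confidence):
        upload(frame)                    # hand off to the cloud service
        return True
    return False
```

Injecting the camera, model, and uploader as callables keeps the gating logic testable on a laptop, without a Raspberry Pi or sensor attached.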

In the long term, the system should allow multiple kits to form an IoT network. More capabilities, such as workflow simplification, remote triggering, behavior mapping, and age/gender recognition, can be added to bring more value to customers.


2. Prototype Development

2.1. User Journey Mapping

 

I drafted a user journey mapping to generate the feature list:

Hardware:

  1. switch button to turn on/off the camera;
  2. reset button;
  3. manual capture button;
  4. LED indicator for statuses;
  5. industrial design: deployable on typical scenarios—pole/wall/tree;
  6. sensors to detect moving objects in the field of view.

Software:

  1. login;
  2. device management;
  3. task management;
  4. local ML processing;
  5. Azure ML API;
  6. push notification service.
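
To show how the trigger and notification items in this list could fit together, here is a minimal sketch. The function name, data shapes, and example users are hypothetical illustrations, not the code we shipped.

```python
# Hypothetical sketch of the trigger-matching step behind the push
# notification service: each user configures which species should alert
# them, and every cloud recognition result is checked against those triggers.


def matching_triggers(detected_species, triggers):
    """Return the triggers satisfied by the recognized species."""
    return [t for t in triggers if detected_species in t["species"]]


# Example configuration: u2 wants coyote/bear alerts, u1 wants elk alerts.
triggers = [
    {"user": "u2", "species": {"coyote", "bear"}},
    {"user": "u1", "species": {"elk"}},
]

# A coyote detection should notify only u2.
notified = [t["user"] for t in matching_triggers("coyote", triggers)]  # → ["u2"]
```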

2.2 User Experience

In this project, I was responsible for the product design, which included the product definition, the digital experience, and the physical prototype.

2.2.1 Digital Experience

The digital touch point of the service was the app. When we initially planned the project, no one on our team had hands-on experience developing native iOS/Android apps, so we decided to build a web app. When designing the product, I therefore used only basic web interaction patterns like scrolling and clicking. Given the time constraint (four people, two months), the app could provide only basic functionality — register/log in, add a device, set up a trigger, and view captured images. Even this minimum viable product was a challenge for the team — we also needed to test and develop the hardware, configure and test the ML on edge and on Azure, design the mechanical structure, and fabricate the prototypes.

2.2.2 Physical Experience

After more than 60 days of experiments, our team identified the final set of components to use for the final prototype. The physical dimensions of these parts and the way they connected with each other became the constraints for the form factor of the prototype.

I used CAD tools to arrange the spatial relationship of the circuit system with careful consideration of how a user will use the product. The V2 prototype was fabricated to evaluate my design and the performance of the 3D printer. Based on feedback from our faculty and my observation, I further iterated the model to V3.

Compared with V2, which used 360 mL of ABS, V3 consumed only 268 mL — 25.6% less material — while keeping the same strength (see the following image). What’s more, the new design enabled us to seal all gaps between the parts and the enclosure, making the device waterproof-capable.

Below is an exploded view of the final CAD model.

Parts fresh off the Stratasys F170; the chamber was still very hot.

The back of the prototype, from left to right: University of Washington (UW), Global Innovation eXchange (GIX), Microsoft. Someone joked it was Word, Excel, Windows.
I designed the device with a tripod mounting point.
The viewfinder was my answer to a teammate’s plan to add a video-streaming feature whose only purpose was to let the user check what the camera could see. I’m proud to have solved a seemingly tricky requirement with a stone-age solution.

3. Afterword

3.1. Diversita+

On the stage for our final presentation, I commented on the Diversita cam:

However, our long-term vision is not only a camera.

When we have a matrix of ML-enabled cameras, we can form a platform — a platform for intelligent edge devices, which can be used to collect data and gain insights from the world; for example, to monitor water purity, changes in the environment, and so on.

The essence of Project Diversita was using Machine Learning on edge to recognize specific patterns and react accordingly to different results. While Diversita was built for land animals, I independently developed Diversita+ — an extension of the technology framework — for underwater use. The dataset to train the model could be optical camera images of individual species such as sharks, rays, and barracudas; it could also be sonar reflection patterns. In general, Diversita+ could assist with beach-safety monitoring, marine fauna research, and water-purity monitoring.

Three example concepts:

  1. Huntaway: a fencing system deployed at beaches or similar places to monitor and fend off marine species that threaten human life.
  2. Remora: an ML on edge device installed on the side of a ship below the waterline. It combines images from an optical camera and a sonar array, recognizes marine life species, and geo-tags the location where the data was collected.
  3. Seahorse: a device that integrates an electronic microscope with an edge computing module.

If you are interested, please check out the following document for detailed descriptions: http://bit.ly/2EFxdgj.

UW_GIX_Diversita_Plus_Rev.1221

3.2. Media Coverage

group photo of the Diversita Team at UW GIX
photo credit: Scott Eklund

Our team was lucky to be featured in a story posted on Microsoft on the Issues (http://bit.ly/2GxYvYA). When the story was released, Brad Smith, the president of Microsoft, retweeted it. It was amazing to see our group photo appear in the Twitter timeline of the president of the most valuable public company.

From monitoring clucking chickens with #AI to identifying wildlife with machine learning – the inaugural @GIX_edu projects show that collaborative and global education will promote technology that is a powerful force for good. Bravo! https://t.co/iarKbk28XJ 

3.3. Credits

Finally, I’d like to give great applause to the following people:

  • Blake Hannaford, Linda Wagner, John Raiti: for their guidance and help;
  • Nicholas Ames: for providing us with reliable fabrication machinery;
  • Kent Foster: for making the connection between Microsoft and UW to make the project possible;
  • Microsoft project team: for providing us this opportunity to work on this project, special thanks to Siyu Yang, for making a timely connection to fabrication capabilities on Microsoft campus when our machine was down;
  • Maksim Surguy: for being our 5th member, providing us soldering tools, screws, and glue.
