Human-computer interactions in Machine Learning applications #1

littlereddotdata
5 min read · Jul 24, 2020

The more I work on building machine learning applications, the more attention I pay to intentionally designing the interface that stands between a model’s final predictions and the user who sees them.

In the Google Play App Store, explanations are included for why an app is being recommended. Source: https://www.makeuseof.com/tag/downloading-apps-android-everything-need-know/

Presentation drives behaviour

Presentation affects perceptions, and hence how users respond and behave. How we present a model’s predictions therefore carries a responsibility worth thinking about. For example, when I receive a product recommendation, a developer could choose to include a snippet explaining that I got this recommendation because my friends have also liked the product. By tweaking how the product recommendation is presented (in this case, including a peer element in the presentation design), the developer is able to nudge me towards also liking the product.

Gmail Smart Reply : https://research.google/pubs/pub45189/

Presentation affects how users feel

Another example where interfaces matter is Gmail’s Smart Reply function. When composing an email, Smart Reply offers a selection of possible replies to choose from. This gives the user a diversity of options and a feeling of control over how they compose their emails. It’s an effective way of giving autonomy back to the user. This design theme becomes clearer when we compare this format to an alternative: an autocomplete function where a possible reply is suggested in-line. Autocomplete offers fewer choices; there is usually only one recommended reply to choose from.

Guides for designing ML-human interfaces

To help me design human-computer interfaces for machine learning, I’ve looked for frameworks to use as a guide. One of these is a presentation given by Apple, which I digest in this post.

Apple’s presentation Designing Great ML Experiences at WWDC 2019 lists four design criteria for enhancing ML interfaces.

  • Multiple Options
  • Attributions
  • Confidence
  • Limitations

Multiple Options

Multiple options allow users to choose from a range of results that a machine learning feature generates. Apple uses the example of Maps, where the routing engine gives the user a few routes to choose from. However, the number of options is not the only thing to maximise! Giving a diversity of options is important too. Here, the recommended route is the fastest one, and another route involves paying a toll. Over time, the app can collect metrics on what a user prefers and learn to personalize its rankings.

More than one route is suggested to the user
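The route-picking idea above can be sketched in a few lines of Python. This is a hypothetical illustration, not Apple’s actual routing logic, and all names (`Route`, `diverse_options`) are my own: instead of returning the k fastest candidates, return the fastest plus an alternative that differs on an attribute the user might care about, such as tolls.

```python
from dataclasses import dataclass

@dataclass
class Route:
    name: str
    minutes: int
    has_toll: bool

def diverse_options(candidates, k=2):
    """Pick up to k routes: the fastest overall, then the fastest
    route that differs on an attribute (here, toll vs. toll-free)."""
    by_speed = sorted(candidates, key=lambda r: r.minutes)
    chosen = [by_speed[0]]
    for route in by_speed[1:]:
        if len(chosen) >= k:
            break
        if route.has_toll != chosen[0].has_toll:
            chosen.append(route)
    return chosen

routes = [
    Route("Highway", 35, has_toll=True),
    Route("Coastal", 50, has_toll=False),
    Route("Express", 38, has_toll=True),
]
options = diverse_options(routes)
# Yields the fastest route plus a toll-free alternative,
# rather than two near-identical toll routes.
```

The design choice here is that the second slot is reserved for a qualitatively different option, even if a slightly faster but similar route exists.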

Attributions

Attributions explain why an app makes certain decisions. They’re my favorite design criterion because they handle a pressing need in machine learning: the need to make models understandable. Some examples of attributions at play come from the Apple App Store and Netflix, which include small explanations for why certain apps or shows are suggested for you. These explanations make models less of a black box, and go a long way towards demystifying machine learning applications.

Attributions in the App Store explain why a recommendation was made.
Attributions on Netflix explain why a recommendation was made.

There is a caveat to using Attributions in an interface, however. Attributions can become problematic when they are based on subjective criteria, or criteria that are commonly thought of as out of bounds. Making recommendations based on someone’s ethnicity, for example, is not acceptable by today’s norms. To avoid these pitfalls, it’s better to base attributions on more objective criteria such as viewing and browsing history.
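A minimal sketch of this principle, with hypothetical names and data throughout (`attribution_for` is not a real Netflix or App Store API): the explanation is generated only from objective, user-owned signals like viewing history, with a neutral fallback when no such signal exists.

```python
def attribution_for(item, user_history):
    """Return a short, objective explanation for a recommendation,
    based only on the user's own viewing history."""
    overlap = set(item["tags"]) & set(user_history["watched_tags"])
    if overlap:
        return f"Because you watched {sorted(overlap)[0]} titles"
    # Neutral fallback: no demographic or other sensitive criteria.
    return "Popular on the platform"

rec = {"title": "Chef's Table", "tags": ["documentary", "food"]}
history = {"watched_tags": ["documentary", "drama"]}
print(attribution_for(rec, history))
# → Because you watched documentary titles
```

Keeping the attribution logic separate from the ranking model also makes it easy to audit which criteria are ever surfaced to users.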

Confidence

Confidence relates to how sure a model is about its result. It could relate to how sure it is that a dog is detected in a picture, how much of a match a book is for you on the Kindle Store, or how certain your inbox is that an email you’ve received is an advertisement. Like Attributions, Confidence relates to a unique characteristic of machine learning applications, namely that predictions are never deterministic. There’s always an element of probability to a model’s outputs because models are built to recognize statistical patterns. This element of stochasticity is more important to recognize in some situations than others. While having an email wrongly put into the spam folder might result in a small annoyance, taking strong action based on a model predicting that an employee might be leaving the company could lead to an awkward situation for both the employer and employee. For the latter case, it’s important to acknowledge the uncertainty involved in the prediction and proceed accordingly.

A dog and a bicycle are detected in the picture. Taken from: You Only Look Once: Unified, Real-Time Object Detection

Communicating confidence levels, when done correctly, is great for setting expectations on what a model can or can’t do, which is the first step to building trust between a user and a model. When taking a photo with dim light, a phone’s photo classifier may be more likely to tag a photo incorrectly, and this can be represented in the final classification scores. Similarly, when planning a route in a rural place, there may not be enough geo-information to optimise travel time. The route information that the app gives should convey this decreased certainty accordingly.
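One way to make the stakes-dependent handling above concrete is a simple thresholding sketch (my own illustration; the thresholds and function name are assumptions, not from any real product): low-stakes predictions can act directly, while high-stakes ones demand a stricter threshold and a hedged message.

```python
def act_on_prediction(label, confidence, high_stakes=False):
    """Decide how to surface a prediction given its confidence.
    High-stakes decisions (e.g. attrition alerts) demand a stricter
    threshold and a hedged message; low-stakes ones can act directly."""
    threshold = 0.95 if high_stakes else 0.7
    if confidence >= threshold:
        return f"{label} (confidence {confidence:.0%})"
    if high_stakes:
        return f"Possible {label}: needs human review"
    return "No action"

# Routing an email to spam is low stakes...
print(act_on_prediction("spam", 0.82))
# ...flagging an employee as likely to resign is not.
print(act_on_prediction("attrition risk", 0.82, high_stakes=True))
```

The same 82% confidence produces an automatic action in one case and a request for human review in the other, which is exactly the asymmetry the spam-versus-attrition example calls for.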

Limitations

Confidence scores remind us that a model is never 100% sure about its results, and it will never get things right 100% of the time. Here, being upfront about what the model can and can’t do can smooth the user experience, because the user is less likely to be thrown off by a result he or she didn’t expect.

When a lack of data limits the quality of a model’s predictions, it’s important to be upfront about the limitation. source:https://pair.withgoogle.com/chapter/explainability-trust/

Allowing an app to fail gracefully is another way of working around the limitations of ML applications. In fact, the People + AI Guidebook has a whole section on this. One solution is to provide alternatives and suggestions: if a search result is not what the user was looking for, offer a few alternative search terms so the user can continue looking for relevant content.
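The graceful-failure pattern can be sketched with Python’s standard library (the catalog and function names here are hypothetical): when an exact search comes up empty, the app suggests close matches instead of presenting a dead end.

```python
import difflib

CATALOG = ["machine learning", "deep learning", "data engineering"]

def search(query):
    """Exact substring search with a graceful fallback: if nothing
    matches, suggest close alternatives instead of an empty page."""
    hits = [item for item in CATALOG if query in item]
    if hits:
        return {"results": hits}
    # Fall back to fuzzy matching so the user has somewhere to go next.
    suggestions = difflib.get_close_matches(query, CATALOG, n=2, cutoff=0.5)
    return {"results": [], "did_you_mean": suggestions}

print(search("machin learning"))  # misspelled query still gets a suggestion
```

The point is not the fuzzy matching itself but the interface contract: the failure case returns something actionable rather than nothing.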

Multiple options, attributions, confidence and limitations are ways to design how a model presents its results to an end user. By thinking about interface design through these four lenses, we can be more upfront about the strengths and limitations of machine learning. We can avoid situations where failures lead to awkward moments or frustration. We can help the user develop trust in our application. However, designing outputs is only half of the problem. The other half is thinking about how to gather feedback so that we improve both the user experience and our model. That’s the topic of another post!
