Human-Centred Machine Learning

We are currently inviting submissions to a workshop on Human-Centred Machine Learning at CHI 2016. The workshop aims to bring together people working on the HCI of machine learning, an emerging field.

If you are interested in finding out more about Human-Centred Machine Learning, here is an extract from our proposal:

Statistical machine learning is one of the most successful areas of computer science research in recent decades. It has driven advances in domains from medical and scientific research to the arts. It gives people the ability to create new systems from example data, for instance building a face recognition system from a large dataset of face images, rather than by reasoning about what features make something a face and translating that reasoning into program code. This makes it possible to achieve excellent performance on tasks for which it would be very difficult, if not impossible, to describe computational procedures explicitly in code.

In practice, however, machine learning remains a difficult technology to use, requiring an understanding of complex algorithms and working processes, as well as software tools that may have steep learning curves. Patel et al. studied expert programmers working with machine learning and identified a number of difficulties, including treating methods as a “black box” and difficulty interpreting results. Usability challenges inherent in both existing software tools and the learning algorithms themselves (e.g., algorithms may lack a human-understandable means of communicating how decisions are made) restrict who can use machine learning and how. A human-centred approach to machine learning, one that rethinks algorithms and the interfaces to them in terms of human goals, contexts, and ways of working, can make machine learning more useful and usable.

Past work also demonstrates ways in which a human-centred perspective leads to new approaches to evaluating, analysing, and understanding machine learning methods (Amershi 2014). For instance, Fiebrink showed that users building gestural control and analysis systems apply a range of evaluation criteria when testing trained models, such as decision boundary shape and subjective judgements of misclassification cost. Conventional model evaluation metrics focusing on generalisation accuracy may not capture such criteria, which means that computationally comparing alternative models (e.g., using cross-validation) may be insufficient to identify a suitable model. Users may therefore instead rely on tight action–feedback loops in which they modify model behaviour by changing the training data, followed by real-time experimentation with models to evaluate them and inform further modifications. Users may also develop strategies for creating training sets that efficiently guide model behaviour using very few examples (e.g., placing training examples near desired decision boundaries), which results in training sets that may break common theoretical assumptions about data (e.g., that examples are independent and identically distributed).

Summarising related work in a variety of application domains, Amershi et al. enumerate several properties of machine learning systems that can be beneficial to users, such as enabling users to critique learner output, providing information beyond mere example labels, and receiving information about the learner that helps them understand it as more than a “black box.” These criteria are not typically considered when formulating or evaluating learning algorithms in machine learning research.
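To make the cross-validation point concrete, here is a minimal sketch (assuming scikit-learn, and using a synthetic dataset for illustration) of the conventional computational comparison described above. It ranks candidate models purely by mean generalisation accuracy; nothing in this loop reflects decision boundary shape or a user's subjective misclassification costs:

```python
# Minimal sketch: the conventional way of comparing candidate models by
# cross-validated accuracy. Dataset and model choices here are illustrative
# assumptions, not from the post itself.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for real example data (e.g., gesture features).
X, y = make_classification(n_samples=200, random_state=0)

for model in (KNeighborsClassifier(), SVC()):
    # 5-fold cross-validation: each score is held-out accuracy on one fold.
    scores = cross_val_score(model, X, y, cv=5)
    print(type(model).__name__, scores.mean())
```

A user following the interactive workflow described above would instead iterate: add or move a few training examples, retrain, and probe the model's behaviour in real time, using criteria this accuracy loop never sees.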

I’ve also included a full reference list at the bottom of the post. If you are interested, here is the Call for Papers, and you can find the full proposal here.

Saleema Amershi, Maya Cakmak, W. Bradley Knox, and Todd Kulesza. 2014. Power to the people: The role of humans in interactive machine learning. AI Magazine 35, 4 (2014), 105–120.

Saleema Amershi, James Fogarty, and Daniel S. Weld. 2012. Regroup: Interactive machine learning for on-demand group creation in social networks. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’12). 21–30.

Bill Buxton. 2007. Sketching user experiences: Getting the design right and the right design. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA.

Steven P. Dow, Alana Glassco, Jonathan Kass, Melissa Schwarz, Daniel L. Schwartz, and Scott R. Klemmer. 2010. Parallel prototyping leads to better design results, more divergence, and increased self-efficacy. ACM Transactions on Computer-Human Interaction 17, 4 (Dec. 2010), 1–24.

Jerry Alan Fails and Dan R. Olsen Jr. 2003. Interactive machine learning. In Proceedings of the International Conference on Intelligent User Interfaces (IUI ’03). 39–45.

Rebecca Fiebrink. 2011. Real-time human interaction with supervised learning algorithms for music composition and performance. Ph.D. Dissertation. Princeton University, Princeton, NJ, USA.

Andrea Kleinsmith and Marco Gillies. 2013. Customizing by doing for responsive video game characters. International Journal of Human-Computer Studies 71, 7–8 (2013), 775–784. DOI: ijhcs.2013.03.005

Todd Kulesza, Saleema Amershi, Rich Caruana, Danyel Fisher, and Denis Charles. 2014. Structured labeling for facilitating concept evolution in machine learning. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’14). 3075–3084. DOI: 2557238

Kayur Patel, James Fogarty, James A. Landay, and Beverly Harrison. 2008. Investigating statistical machine learning as a tool for software development. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI ’08). 667–676.