Archive for the ‘user experience’ Category

IA Summit 2010 in review

Thursday, April 15th, 2010

“Graduation is only a few days away and the recruits of Platoon 3092 are salty. They are ready to eat their own guts and ask for seconds. The drill instructors are proud to see that we are growing beyond their control.” – Joker, Full Metal Jacket

“Show me your kill face!” Dan Willis’s UX Deathmatch encouraged us to unleash the beast within, taking sides in the battle between agency and in-house design. The sparring was sharp but good-spirited, with claws largely sheathed after the scars of 2009.

The eleventh IA Summit found strength in reconciliation, but the spirit of free speech still ran strong. Whitney Hess was the recruit unexpectedly promoted to squad leader, to the muttered chagrin of a few veterans. She accepted the role with surprising vulnerability and admirable humility. Extrapolating Jesse James Garrett’s dream of a UX designer-turned-CEO, Whitney proposed that it’s time to graduate and take on the world. Keynote compatriot Richard Saul Wurman, meanwhile, headed straight for a Section 8. Meandering and rude, he demonstrated why hypertext is best delivered on screen, not in speech. The audience killed its idol through backchannel sarcasm and planned for a better world.

This hunger to improve led, unsurprisingly, into continued debates about the format of the Summit. There’s no doubt that it’s grown beyond its original constraints and that it suffers from a lack of vision compared to more recent events. I expect that 2011 will see a notably different Summit – indeed, Lou Rosenfeld has fired the first salvo in the battle for reformation. But the conference’s strength is still its outstanding content. Of particular note this year were Kevin Hoffman’s detailed thoughts on kickoff meetings (sending many agencies scurrying back to their drawing boards), Karl Fast’s tour of the semantic richness of a messy desk and Cindy Blue revealing the face of the supposed enemy in Adventures of an IA in business school. My session The future of wayfinding appeared to be well received, and I’m delighted to have been able to contribute.

It’s a pity that more didn’t share these excellent three days with us, but slipping attendance is understandable in the face of alternatives and the maturation of our field. Although we revel in the company of our passionate yet introverted peers, the field is increasingly eager to take the fight to the outside world. It’s natural therefore that practitioners will look outside the industry for maximum impact – but with some rejuvenation, I’m confident the IA Summit will find a niche of reflection and ‘going deep’. I for one can’t wait for next year’s boot camp in Denver.

Posted in conferences, user experience | 1 Comment »

The perils of persuasion

Wednesday, April 7th, 2010

On graduation, I found the business world laughable. I saw otherwise intelligent people wrapped up in circular rituals of ‘doing business’, oblivious to customer disinterest. My cynicism lasted until I discovered user-centred design and realised there were others who shared my viewpoint. From that point, I saw user experience as a refreshing break from the almost Fordist attitudes I’d witnessed, where business tried to create the market and efficient production appeared more important than demand.

My mindset was naive, but I stand by the principle. One of the things that excites me about UCD is that it isn’t only a mode of design: its values amplify the voice of those previously ignored, who now form part of our network economy.

The success of UCD has sustained demand for user experience design skills, and the land rush has continued in 2010. UX is becoming a cookie cutter add-on for digital agencies and I rarely meet a web designer now who doesn’t claim UX proficiency, although not all can articulate what that means. And it’s not just the designers: I also see back-end developers, SEO professionals and marketers rapidly appending these two magical letters to their CVs.

Many of these people do have genuine user empathy and knowledge of the diverse skills required of UX design. Many do not. I welcome them to the field regardless and hope we can all learn from each other. However, I am concerned at the expansion of the User Experience label to include activities I see as contrary to the values of user empowerment. In particular, I’m worried about persuasion design. Although it’s a powerful and topical approach, I also believe it has the potential to severely damage our industry.

A political model of design

Interaction designers often advocate design as an agent of behaviour change. Jesse James Garrett frames this as an extension of the classic information architecture v interaction design debate, with IA optimising for the way people think, and IxD attempting to drive particular user actions.

When I try to make sense of this struggle, the crude model I keep returning to is a political spectrum. User-centred design, empathetic and inclusive, sits left of centre. Persuasion design, individualised and competitive, sits right of centre.

As with politics, one’s stance is a matter of preference and most mainstream modes are appropriate. The problems lie in the extremes: let’s call them radical UCD and radical persuasion design.

Radical UCD

Under radical UCD, the user’s priorities outweigh others. It’s here that we see Naoto Fukasawa’s notion of design ‘dissolving into behaviour’ realised. Design becomes an ethically neutral activity whose role is to amplify and liberate the end user. The rewards are intangible, long-term and altruistic: we hope to engender loyalty and word of mouth referrals, but the effects are notoriously hard to measure.

However, as with the political equivalent, radical UCD is economically unrealistic and unworkable. At this extreme, design could only cause consensus-building timidity that reinforces current modes – an accusation already pointed at milder contemporary user-centred practice.

Radical persuasion design

Persuasion design doesn’t share UCD’s ethical neutrality. Instead, it makes an implicit but undeniable judgment that certain behaviours are preferable to others. We need only look at the vocabulary of persuasion design to see this. Jon Kolko’s infamous Johnny Holland article talks of design’s contribution “to the behaviour of the masses, [helping to] define the culture of our society.”

While I respect Jon’s intellect, I find this to be dangerous rhetoric from which we can draw uncomfortable parody: Fear not, huddled masses – the design elite will lead you to the promised land. Persuasion design’s assured ethical superiority is unfortunate. Although some of the cases put forward are compelling – guiding people toward better macroscopic decisions about environment, health etc – we must recognise that, for all the good deeds behaviour change can encourage, it is prone to murkier applications.

What privileges the designer to dictate desired behaviour? And since we’re for hire, does that mean we’re ethical relativists, bending people toward whatever agenda lines our pockets?

Whoever the paymaster, the common pattern I observe in digital persuasion design is that its values are uniformly technocratic. Science is better than faith. Action is better than reflection. Progress is better than the status quo. These values strike me as practically Futurist and, at the risk of invoking Godwin’s Law, I’m concerned that radical persuasion design is vulnerable to similar autocratic pitfalls.

Persuasion design is marketing. UX isn’t.

I have struggled for months to unify my understanding of these two political wings, and now conclude that I cannot. I believe that persuasion design is not part of user experience design. It is marketing. Persuasion design prioritises business goals above those of the user, and its values are irreconcilable with empathy, the central value of UX.

That’s not to say that persuasion design isn’t highly valuable and attractive to business. After all, it matches the recognised business patterns of marketing, making its effects felt in tangible measures that UCD’s intangible altruism cannot: conversion rates, signups, and so on.

I subscribe to Peter Drucker’s view that business has only two functions: innovation and marketing. Under this model, user experience design is innovation. It uncovers people’s needs and gives makers the knowledge to develop new products and services that meet those needs.

This, finally, is why I disagree with Josh Porter’s assertion that UX is really just good marketing – however, my disagreement isn’t with his framing of marketing, but of user experience. As far as persuasion design is concerned, he is right – but the equation does not apply to UCD and UX.

Opinions and unwinnable arguments

I am of course straying close to two notoriously unwinnable arguments: semantics and politics. I have neither time nor inclination to enter into political debate or vanish down the rabbithole of Defining The Damn Thing, and I am all too aware that, like any model, the one I give is simplistic. It overlooks the complexities of authoritarianism and liberalism, which are not necessarily tied to economic left or right, and belies the greys that lie between black and white. I raise it instead as a way to highlight the risky territory I believe we are heading toward. All I ask is that the community considers these issues carefully and reaches its own conclusions. I’m happy if those differ from mine.

Even if my thoughts turn out to be at odds with those of the broader UX community, I’ll take heart from the words of Dieter Rams, who also took a stance against the involvement of persuasive techniques:

Braun categorically rejects the idea of motivating people to buy its products by adding features that toy with the psychological sub-terrain of the consumer’s consciousness. Braun refuses to swell sales by exploiting human frailties: neither its products nor its advertising use such seduction techniques.

Those who wish to employ persuasive techniques are welcome to do so. But my focus continues to be on striving to make better products by listening, not driving behaviour change. At times I will use tactics from the persuasion design toolkit, as I do with other tools of marketing, but I will do so only when I have fully considered the ethical implications. I hope that others will do the same.

Posted in politics, user experience | 8 Comments »

Beauty in web design, part 3

Sunday, March 21st, 2010

The final part of a 3-part essay, based on my presentation at SXSW Interactive.

In Part 1 we saw that the web presents an ideal vehicle for beauty, and in Part 2 I argued that beautiful design is reflective, exploring message and meaning. How can we use this knowledge to create beautiful websites?

Making the web beautiful

Meshlike roof of the British Museum

We are certainly making progress, and perhaps I’m being harsh on a field still in its infancy. The web is only 7,000 days old, after all. Technological improvements such as new authoring tools, better screen resolutions, more bandwidth and technical convergence will free us to experiment.

We’re already seeing fresh visceral approaches courtesy of developments such as CSS3, typographic tools like Typekit and Fontdeck, Canvas and SVG. Even the death of web-safe colours freed us to try new visceral design techniques. Better understanding of usability, better design patterns and better web education have also freed us to try new behavioural approaches, such as the horizontal, keyboard-driven navigation on Thinking For A Living. It’s too early to know whether these paradigms will stick, but it’s heartening to see previously locked-in approaches challenged.

However, the key to creating beautiful websites that our users actually love, rather than merely tolerate, is to think at the reflective level.

1. Get emotional

Appealing to emotion is an important way to create reflective design. It means we must understand people, not merely user tasks. What makes them tick? What would they never dream of asking for? How can we improve their life beyond this one visit? The focus is therefore on experience, not just usability. These days I see calling a website ‘easy to use’ as like praising a restaurant for serving edible food. It should be a given, not an exception.

One way to engender emotion is through stories – an area where what we patronisingly call ‘old media’ is streets ahead. Advertisers, writers and film makers have long known the power of narrative and created emotional content to reinforce their message. Content strategists in particular should therefore take centre stage in our quest for emotion, using not just text but other content types. Some of the most emotionally resonant content on the web today is photographic, such as Pictory or the Boston Globe Big Picture.

2. Think bigger

User and business form the classic duality of design. We’re well accustomed to solving for the needs of both, making compromises and tradeoffs where appropriate. I now believe this model overlooks a third piece of the puzzle: the ecosystem. We should design systems that are good for the surrounding web and for society.

Many experienced designers already consider this intuitively through their work, but there’s benefit in explicitly considering these issues in our design process. Are we trying to make a genuine difference, or just churning out more wireframes to keep the client happy?

3. Lead

When did you last see a statue of a committee? The classics of design have typically been created by one person with strong vision and the technical and political skills required to execute upon it. In film, this is known as the auteur theory: the director is regarded as the custodian of the creative vision and the final product is his or her realisation of it. At the least we need to appoint leaders who formulate and communicate a vision for the site.

Assuming leadership can be difficult in real business contexts and can foster problematic attitudes, but without strong leadership, clear vision and faithful execution, we have no hope of creating beauty.

4. Think long term

It’s relatively easy to make something viscerally attractive, but how can we maintain interest after the initial lust wears off? Just as in a romantic relationship, we should consider long-term seduction. The odd surprise can be rewarding, bringing joy in unexpected moments of the experience. By varying things we prevent over-familiarity and the contempt that this can breed.

Possible approaches include rewarding people who explore deep areas of the system – a tactic frequently used by game designers – or something as simple as unannounced free shipping on your tenth order. Google’s holiday logos provide a real example of how the tiniest detail can keep users interested.

5. Notice everyday beauty

My mother, a retired teacher, told me recently of the ‘golden moment’ in education. It’s the point you always remember, when you discovered something and suddenly your worldview was shifted – that “one way valve to a new way of seeing” again. Educational theory suggests that to create golden moments, you must recognise them for yourself. So notice the world. Where’s the beauty around you?

As we previously discussed, there’s beauty all around us: art, writing, architecture, music, products, nature. We should breathe it in and learn from it. It may even be that inspiration lies close to home. Perhaps web standards specialists could take inspiration from developments in the Flash world, and vice versa. Maybe designers can be inspired by developers. We should be aware and scan the horizon to find our own golden moments.

6. Be brave

Finally, since reflective design is about meaning and message, we needn’t fear making statements. We should stand for something and convey ideals through our work: both ours and those of our clients. Surprisingly, the web design community seems reluctant to do this. At last year’s IA Summit, Jesse James Garrett asked why there are no schools of UX thought. Why indeed are there no major schools of web design thought? Our movements and sub-communities are, instead, almost entirely technique-driven. To me, it’s sad that we’re more interested in endlessly debating topics such as HTML5 v Flash than in exploring the important philosophical approaches that drive our work.

Caveats

There are of course some dangers to these approaches. The demands of client work mean we’d be unwise to blindly apply these rules, and there are some difficult questions left unanswered. The most important is whether beauty is always appropriate. I suspect not. When I’m filing a tax return, I don’t want the system to speak about who I am; I just want it to work. When getting the job done is more important than enjoying it, beauty is cruft. Better for designers to let the task and usability have priority.

Reflective design shouldn’t become dogma. Fortunately, when we take time to truly understand users and what they want, it soon becomes clear when it’s appropriate to strive for beauty in design.

Hero design

It would be easy to misinterpret our discussion of leadership and bravery and overestimate our authority. Designers aren’t heroes; instead we must serve our industry, our clients and our users faithfully, discarding ego. Too frequently, I see design that is more about impressing other designers than solving the problem and making the web better. There’s no beauty in hero design, only narcissism.

That said, I think web designers should appreciate that we can play an important role in society. We’re lucky enough to work on the coalface of the most exciting innovation of modern times. We’re on the brink of wonderful things. So yes, we’ve underachieved, but given the evolution of beauty and the tools now available to us, the web is an ideal vehicle for beautiful design. We’re the generation to turn that promise into action.

I hope in five years to look back on this essay and laugh. If we work hard, aim for reflective design, and believe in the power of the web, I’m convinced we can create our own beautiful design landmarks.

Posted in creativity, design, user experience, web | 8 Comments »

Beauty in web design, part 2

Sunday, March 21st, 2010

The second part of a 3-part essay, based on my presentation at SXSW Interactive.

Three types of beauty

In Part 1 we saw that the web presents an ideal vehicle for beauty. But how will we know it when we see it? What is beauty anyway? I consider beauty to be presented in three main modes: universal, sociocultural and subjective.

Universal beauty

Universal beauty is based on timeless, globally accepted principles. It seems to hit at some innate response within us all, as demonstrated by the concept of human ‘averageness’. Here, we see a composite image of dozens of female faces created by Face Research. We might expect to see average attractiveness as a result, but this prototype is certainly more attractive than average. One theory is that prototypicality shows the mate has no defects and thus is likely to produce healthy offspring. Another theory claims that average faces are pleasing because the brain finds them easier to process. (Perhaps the average face is Plato’s ideal Form in the flesh.)

Designing for universal beauty involves careful consideration of the fundamental aesthetic principles of design, such as symmetry, harmony, the rule of thirds and the golden ratio.

Sociocultural beauty

Sociocultural beauty is governed by the preferences of a particular time or place. This is most clearly seen in sexual attitudes.

Here we see Rubens’ Venus and a modern runway model: a clear depiction of changing sociocultural attitudes to beauty.

However, there are more subtle examples: fashion, music trends and even philosophical interpretations of the world all go in and out of style, regardless of their inherent universal beauty.

Subjective beauty

Subjective beauty is the wholly personal encapsulation of one’s likes and dislikes. If you like big butts and cannot lie, you’re merely exercising your right to a subjective opinion on beauty. While Rubens’ work is reflective of the Baroque era, it also reveals his subjective preference for larger models.

These three types of beauty are hierarchical. Subjective beauty can overrule sociocultural beauty: we can individually find beauty in things that society considers out of fashion. Sociocultural beauty can in turn overrule universal beauty: universally beautiful things may simply not be en vogue in a particular time or place.

Three modes of design

So how can we design for these types of beauty? Don Norman’s book Emotional Design gives a deep exploration of the role of emotion & beauty in design. Adapting an established model of cognitive processing, Norman claims design typically falls into one of three dominant modes.

Visceral design

Visceral design - screenshot from Smashing Magazine

Visceral design is aimed at our gut. We experience a visceral reaction when we bite into a sweet apple, see a stunning sunset or hear a harmonious chord – it’s entirely sensory, before the brain has a chance to shape the feeling. A positive visceral response is often called attraction – it’s what draws bees to flowers, or babies to a beautiful face.

To design for visceral response, we should concentrate on immediate properties of a system: shape, colour and form. These can make the instant impact required for a visceral reaction – we know, for instance, that visceral response to a website can occur in fractions of a second.

Visceral design was an early frontier of exploration for the web, once the technology was sufficiently mature. This early landrush of artistic, highly visual sites was helped by the advent of visually-oriented authoring tools such as Dreamweaver, which helped graphic specialists make the leap into the web arena with familiar UIs.

It is easy to belittle visceral design as ‘eye candy’, but without this immediate attraction, sites struggle to succeed in other modes of design. That said, visceral design’s clear failing is that it rewards attraction over usability and real beauty. Command-Shift-3, which describes itself as the HotOrNot of web design, has all the depth of a wet T-shirt contest. Since we can’t use the sites it features, we must judge solely on aesthetics. Visceral sites often win awards (since awards are rarely concerned with use) and appear in those ‘Top 20’ lists we all know and dread.

Behavioural design

Behavioural design - example from Facebook.com

Behavioural design is concerned with use. Does the system work? Is it easy to perform my tasks? Does it sustain flow, or make us suffer constant interruptions by not doing what we expect? To achieve successful behavioural design, we can call on our nearest ergonomist or usability specialist. She will ensure our design has appropriate dimensions, is well mapped to user mental models, is forgiving of improper use, sends clear messages about function, and so on.

No one can deny that the web usability movement has been successful. However, understanding the user’s tasks and crafting a site around them isn’t sufficient to bring us genuine beauty. The reason is that behavioural design doesn’t always trump visceral design. Social psychologists have found, for instance, that women prefer prototypically attractive men (square jaws, broad shoulders etc) for one-night stands and flings, but they choose more feminine, ‘nicer’ men for commitment: the so-called “cads and dads” theory. This pattern is particularly pronounced at certain points of the female ovulation cycle. In short, we don’t always plump for reliability; sometimes we need something more exciting.

Perhaps the usability movement has created too many dads, and too few cads. Critics often claim it has ‘made the web boring’ – and it’s true that, when misapplied, usability approaches can create very mediocre products. For a slightly daft example, look at the work of artists Vitaly Komar & Alexander Melamid, who surveyed the musical preferences of the general public. They asked opinions on instrumentation, tempo, pitch, duration and lyrical subject and assembled these into two musical extremes: the Most Wanted Song and the Most Unwanted Song.

The Most Wanted Song features a soft rock / R&B sound, using well established instruments. To quote the artists, it creates “a musical work that will be unavoidably and uncontrollably liked by 72% of listeners”. Unsurprisingly, this crowdsourced composition, designed for maximum ‘ease of listening’, is anything but beautiful.

Listening to The Most Wanted Song, we can almost understand why some people equate usability with tedium. While it can help our sites to become useful and profitable, it can’t make them beautiful. For that, we should aim at the third, most complex mode of design.

Reflective design

Reflective design reaches beyond visceral and behavioural design to look at message and meaning. It asks difficult questions. What does this system say about who I am? Does it improve my life? Am I glad I did it? These questions are subjective and complex, and our responses will vary with experience, personality, culture and even mood. But there are strong benefits to asking them. Successful reflective design makes us feel good: we show it off, tell others and repeat the experience. It can even change the way we think about things. In short, I believe that successful reflective design and beautiful design are one and the same.

Reflective design - NextTime's Word Clock

Consider the Nextime Word Clock. It’s made from two cylinders that rotate so that the time can be read from the face: “Five minutes to ten” or “It’s about four”. It’s less accurate than a cheap digital watch and hence less usable – and, while it looks good, it’s not as elegant as an analogue clock. But, to me, this clock is an excellent example of reflective design. Its accuracy is appropriate for the living room (do you really need to know the difference between 2:57 and 2:58?) and its unconventional design is a conversation starter. I see beauty in the concept, and the product says something about me. It’s for these reasons, rather than usability or attraction, that I count this clock as one of my favourite possessions.

Where usability focuses on behavioural design, reflective design is more the domain of user experience. It involves truly understanding what makes people tick and what makes them excited. It involves creating something meaningful that changes perceptions. Reflective design is a relatively recent focus on the web, which is perhaps why we’ve not yet created beautiful websites. But with sufficient focus on experience, I believe we will.

Rate of change

"Shearing layers" concept from Stewart Brand's How Buildings Learn

These three modes of design – visceral, behavioural and reflective – move at different speeds, creating shearing layers (familiar from Stewart Brand’s How Buildings Learn).

Visceral trends come and go in a matter of months. Top 20 trends are quickly dated, be they illustration, fat footers or any other pattern du jour. Behavioural innovation is slower. Interaction design patterns and de facto standards (search box in the top right, logo and link to homepage in top left) emerge over the course of years and require more traction and mass support to become established. Reflective design moves the slowest of all. This is best demonstrated by ‘movements’ that define how we interact with the web – social media, the realtime web and so on – which take many years to emerge and stabilise.

Concluded in Beauty in web design, part 3.

Posted in articles, creativity, design, user experience, web | No Comments »

Undercover User Experience

Monday, March 1st, 2010

At last, the big announcement. I’m delighted to confirm that Undercover User Experience, written by myself and fellow Clearleftie James Box, will be published by New Riders this autumn.

Once you catch the user experience bug, the world changes. Doors open the wrong way, websites don’t work, and companies don’t seem to care. Fortunately, anyone can learn the UX remedies – usability testing, personas, prototyping and so on – but, unless your organization ‘gets it’, putting them into practice is trickier.

Undercover User Experience will show you how to do great UX work with tiny budgets, no time, and even without official clearance.

The idea came about in a Utrecht hotel, where James and I got talking about the early stages of our careers, when we didn’t have the luxury of doing things ‘by the book’. Through the IA Institute mentoring scheme I’ve met several people in the same situation. For them, what makes UX work difficult isn’t lack of skill, but not knowing how to make headway in companies that don’t appreciate the need. Pioneering UX and inspiring colleagues who’ve never cared about design takes improvisation, persistence and diplomacy. So we’ll cover guerrilla approaches to the UX techniques we know and love, along with frank advice on how to make the most of them in your business.

On a personal note, I’m thrilled to be partnering with New Riders. They were our first choice publisher due to their outstanding UX portfolio, including the classics Don’t Make Me Think!, Designing for Interaction and Elements of User Experience.

The writing experience is already demanding and rewarding. There’s been much to-ing and fro-ing over titles and much confusion over the US tax system and self-assessment, but we’re well under way and hoping to wrap the writing up by June.

But enough – I’ve no wish to turn this blog into a marketing vehicle. If you want to keep up to date with our progress and be the first to hear when the book’s due out, follow UndercoverUX on Twitter or visit the Undercover User Experience website and sign up for updates.

Posted in book, user experience | 1 Comment »

Oxymoron

Tuesday, November 24th, 2009

The ELSE mobile

This is the ELSE Mobile. It’s a touch screen phone. They’re all the rage, I hear.

I’ve not used the ELSE Mobile, but I know from their website that I needn’t bother. I know because they claim this handset demonstrates a:

“user-experience-centric philosophy designed to enhance man-machine capabilities through pre-integration services.”

With this lone sentence, ELSE instantly destroy any pretence of user-centred design. No user-centred company would let their copywriters produce such unmitigated nonsense. I barely need to mention the splash screen, the breaking of the Back button, the grammatical errors (“Most device are…”) and the autoplaying music on the Flash monstrosity they call a website.

This, dear reader, is the opposite of user experience design.

[Thanks to Lewis for the link.]

Posted in mobile, user experience | 8 Comments »

Latest Clearleftie happenings

Tuesday, November 24th, 2009

Lots to report, much of which I neglected to mention thanks to my brush with our porcine friends.

UX London 2010

First, we’ve announced our programme for UX London 2010, which features (amongst others) Jesse James Garrett, Scott McCloud, Whitney Hess and Bill Moggridge.

Once again we were delighted with how much of our Christmas wishlist came true. The north wall of the office has been awash with post-its of names and topics for several weeks now, and there’s a certain Machiavellian joy in seeing it come together into a coherent programme. I’m particularly happy to see some names I pressed especially strongly for.

Bands always say their difficult second album will surpass their first, but I think it’s true this time. Not a prog-rock bass solo in sight. It’s happening 19–21 May 2010, and tickets go on sale on 1 December.

Spring internship

We’re taking on a User Experience intern early next year. It’s the first time this has been a dedicated UX position – our previous interns have come from across the whole web spectrum. It’s a paid position lasting ten weeks, and would suit anyone with a talent and love for good user experience design. More details are on the Clearleft site – drop us a line or talk to me if you’re interested.

The book

Finally, my big news is that I’m writing a book with James. Several daunting but hopefully inspiring months lie ahead. More details will follow when we confirm them.

Posted in conferences, personal, user experience | No Comments »

Statistical significance & other A/B test pitfalls

Monday, November 16th, 2009

2p coin

Last week I tossed a coin a hundred times. 49 heads. Then I changed into a red t-shirt and tossed the same coin another hundred times. 51 heads. From this, I conclude that wearing a red shirt gives a 4.1% increase in conversion in throwing heads.

A ridiculous experiment (yes, I really did it) with a ridiculous conclusion, yet I sometimes see similarly unreliable analysis in A/B testing.

It’s logical and laudable that designers should seek data in our quest for verifiability and return on investment. But data must be handled with care, and mathematical rigour isn’t a common part of a designer’s repertoire.

Here’s an example from ABTests.com, a worthwhile project that I feel slightly bad to pick on.

[Screenshot of the ABTests.com example]

The two versions are subtly different:

Although minor changes can cause major surprises, I wouldn’t expect these small differences to improve the form’s usability. With the caveat that I don’t know the users or product, I’d even speculate that Version B could perform worse since it reduces the priority of the calls to action and removes the signifier of progression.

The designer claims that version B showed a 30.4% conversion improvement in an A/B test. Here’s why this isn’t quite accurate.

The role of chance

Any A/B test is a trial, so called because we’re observing evidence gained by trying something out. I can never truly know that there’s a 50% chance of a coin landing as a head or a tail – I can only run trials and observe the evidence. Similarly, we can never truly know that a design leads to higher conversion – we can only run trials and observe the evidence. If that empirical evidence is strong enough, we conclude that the design is an improvement. If not, we don’t.

To be valid, trials need to be sufficiently large. By tossing my coin 100 or 1000 times I reduce the influence of chance, but even then I’ll still get slightly different results with each trial. Similarly, a design may have 27.5% conversion on Monday, 31.3% on Tuesday and 26.0% on Wednesday. This random variation should always be the first cause considered of any change in observed results.
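To get a feel for this variation, here is a minimal Python sketch that repeats the hundred-toss trial ten times with a simulated fair coin. The function name `count_heads` and the seed are illustrative choices, not anything taken from the experiment above.

```python
import random


def count_heads(tosses, rng):
    """Toss a simulated fair coin `tosses` times and return the number of heads."""
    return sum(1 for _ in range(tosses) if rng.random() < 0.5)


# Fixed seed (an arbitrary choice) so the run is repeatable.
rng = random.Random(2009)

# Ten independent trials of 100 tosses each.
trials = [count_heads(100, rng) for _ in range(10)]

for number, heads in enumerate(trials, start=1):
    print(f"Trial {number}: {heads} heads")

print(f"Lowest: {min(trials)}, highest: {max(trials)}")
```

Nothing about the simulated coin changes between trials, so any gap between the lowest and highest counts is random variation alone.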

The null hypothesis

Statisticians use something called a null hypothesis to account for this possibility. The null hypothesis for the A/B test above might be something like this:

The difference in conversion between Version A and Version B is caused by random variation.

It’s then the job of the trial to disprove the null hypothesis. If it does, we can adopt the alternative explanation:

The difference in conversion between Version A and Version B is caused by the design differences between the two.

To determine whether we can reject the null hypothesis, we use statistical tests to calculate how likely it is that chance alone could produce a difference as large as the one observed. The tests themselves are beyond the scope of this post but include Student’s t-test, χ-squared and ANOVA (Wikipedia links given for the eager). Here’s a site that does the calculations for you, assuming a standard A/B conversion test with a clear Yes or No outcome.

Statistical significance

If the arithmetic shows that chance alone would rarely produce a result like this (the usual threshold is a probability below 5%), we reject the null hypothesis. In effect we’re saying “it’s very unlikely that this result is down to chance. Instead, it’s probably caused by the change we introduced” – in which case we say the results are statistically significant. Note that we still can’t guarantee that this is the right interpretation – significance offers proof only beyond reasonable doubt.

Running the calculations on the above data shows that the results aren’t statistically significant: the evidence isn’t strong enough to reject the null hypothesis that the difference in conversion is simply down to luck. The main problem is the small sample size (128 and 108 users respectively), so I would advise the designer, Johann, to repeat the test with more users. Assuming the observed conversion rates didn’t change (a big assumption), a sample size of approximately 200 users per variant should be sufficient for significance. He could then either reject the null hypothesis or accept that the results remain inconclusive, in which case there’s no evidence the design has made a difference. In Johann’s defence, he recently posted that he takes the point about significance, and I’m looking forward to seeing more conclusive data for this intriguing test.
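For the curious, the check can be sketched in a few lines of Python. This is an illustration under assumptions, not necessarily the method the linked calculator uses: it applies a pooled two-proportion z-test (a close relative of the χ-squared test for this kind of Yes/No data), and the conversion counts of 40 and 44 are inferred from the quoted sample sizes and rates rather than taken from the original test. The function name `two_proportion_p_value` is made up for this example.

```python
import math


def two_proportion_p_value(rate_a, n_a, rate_b, n_b):
    """Two-sided p-value for the difference between two conversion rates.

    Uses a pooled two-proportion z-test with a normal approximation.
    """
    # Conversion rate we would expect if both versions performed identically.
    pooled = (rate_a * n_a + rate_b * n_b) / (n_a + n_b)
    # Standard error of the difference under the null hypothesis.
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_error
    # Probability of a difference at least this large, in either direction.
    return math.erfc(abs(z) / math.sqrt(2))


# ASSUMPTION: counts inferred from the quoted sample sizes and rates.
rate_a = 40 / 128  # Version A, about 31.3%
rate_b = 44 / 108  # Version B, about 40.7%

p_observed = two_proportion_p_value(rate_a, 128, rate_b, 108)
p_larger = two_proportion_p_value(rate_a, 200, rate_b, 200)

print(f"Observed samples: p = {p_observed:.3f}")
print(f"200 users per variant, same rates: p = {p_larger:.3f}")
print("Significant at 5%?", p_observed < 0.05, p_larger < 0.05)
```

On these assumed counts the observed data gives a p-value of roughly 0.13, well above the 5% threshold, while 200 users per variant at the same rates dips just below it. A different test, or a continuity correction, may put the required sample size somewhat higher.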

Percentage confusion

Significance isn’t the only slippery problem A/B tests face. For starters, quoting conversion improvements is always fraught with difficulty. Since conversion is usually measured in percentages (in this example, 31.3% and 40.7%) there are two ways to quote improvements. We can say that conversions increased by 9.4 percentage points (the absolute difference, 40.7% minus 31.3%) or by 30.4% (the relative difference, which is the figure the designer quoted).

Any percentage improvement quoted in isolation should be challenged: which of these two calculations has been used? It’s dangerously easy to assume the wrong figure without sufficient context.
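The two calculations are easy to tell apart once written down. The Python sketch below works through both, using conversion counts of 40 and 44 that are inferred from the quoted sample sizes and rates (an assumption, not published figures); the function names are illustrative.

```python
def absolute_change(rate_a, rate_b):
    """Difference between two rates, in percentage points."""
    return (rate_b - rate_a) * 100


def relative_change(rate_a, rate_b):
    """Change from rate_a to rate_b, as a percentage of rate_a."""
    return (rate_b - rate_a) / rate_a * 100


# ASSUMPTION: counts inferred from the quoted sample sizes and rates.
rate_a = 40 / 128  # Version A, about 31.3%
rate_b = 44 / 108  # Version B, about 40.7%

points = absolute_change(rate_a, rate_b)
percent = relative_change(rate_a, rate_b)
print(f"Absolute: {points:.1f} percentage points")
print(f"Relative: {percent:.1f}%")

# The coin experiment: 49 heads, then 51 heads, out of 100 tosses each.
coin = relative_change(49 / 100, 51 / 100)
print(f"Red shirt 'improvement': {coin:.1f}%")
```

Working from the unrounded counts gives about 9.5 points for the absolute figure, whereas subtracting the rounded rates quoted above gives 9.4, which is one more reason to ask how any quoted improvement was calculated.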

The A/B death spiral

A/B tests also suffer from a common quantitative problem, in that they tell us what but not why. I’ve written about this previously in What if the design gods forsake us. It’s wise to back up numerical tests with qualitative evaluation (e.g. a guerrilla usability test) so we can make informed decisions if data suggests we need to rethink a design.

Even with backup, sometimes A/B tests are simply the wrong tool for the job. They can provide powerful insight in some cases, but in the wrong place they can be a blind alley or, worse, a weapon of disempowerment. Logical positivism and design don’t mix – not everything we do can be empirically verified – yet some businesses fall back on A/B testing in lieu of genuine design thinking. I call this the “A/B death spiral”, and it plays out something like this:

Designer: Here’s a new design for this screen. You’ll see it has a new navigation style, tweaked colour palette and I’ve moved the main interactions to a tabbed area.

Product owner: Wow, those are pretty big changes for such a high-risk screen. I tell you what: let’s test them individually to see which of these changes works and which doesn’t…

As the proverb suggests, sometimes you can’t jump a twenty-foot chasm in two ten-foot leaps. Cherry-picking only those design elements that are “proven” by an A/B test can be a route to fragmented, incoherent design. It may earn marginally more money in the short term, but it becomes hard to avoid a descent into poor UX and the long-term harm this causes.

Being faithful to data

Given the potential hazards, I’m concerned about the naïveté with which some designers approach quantitative testing. The world of statistics rewards an honest search for the truth, not dilettantism, and I’d advise any designer moving in statistical circles to pick up some basic stats theory, or at least partner with someone knowledgeable.

A flawed A/B test, be it statistically insignificant, misapplied or misquoted, is nothing more than anecdotal evidence. It’s the same crime as making a website red on the feedback of one user. Yet an impatient designer, seeing the example I quoted above, could quickly jump to a false conclusion: “I should remove arrows from continue buttons: it’s 30.4% better.” Perhaps this designer deserves what he gets. It’s likely he’s only really interested in shortcuts to good UX, and linkbait lists of “Twelve ways to make your site more usable.” Since he understands neither the mathematics nor the context of this trial (timescales, userbase, surrounding task) he will inevitably grab the wrong end of the stick. Nonetheless, he is out there.

Don’t let yourself be that designer.

Photo: snellgrove
* subject to rounding.

Posted in design, statistics, user experience | 41 Comments »