Crowdsourcing Usability – Or Not?
Several crowd-sourcing business models have recently made their way onto the Usability Research and User Experience Design scene. The crowd-source value proposition is "High Volume Results, Cheap", with some important variables attached: Quality, Usefulness, Relevance, Focus, Strategy, and more.
How do you decide whether to crowd-source UX research for your project or product, and where will you get the most yield for your time, money and energy? Here's a quick review of Feedback Army and Loop 11, as well as some tips for your back pocket.
What is it?
We first heard of Feedback Army back in January ’09. This site is almost exactly what you think it might be.
- You post up a URL and a list of questions/criteria to evaluate against (3-6 recommended)
- You select the number of responders to your posting (3 tiers – 10 users for $10, 25 users for $23 and 50 users for $40)
- Make a payment, wait and watch the reviews roll in.
What do you get?
Just what the site claims you get: “Simple, Cheap ‘Usability Testing’ for your Website.”
Depending on your questions and the number of responses you select, you have some control over the quality of the responses. The site allows you to reject responses that you feel aren't worth the $1.00 (or less, depending on how many you selected). The site offers some tips on usability testing and guidance on how best to use the service, with a nice little endorsement of Steve Krug's "Don't Make Me Think". For what you pay, you get a fair shake.
As part of my research, I read over the comments in the sample reviews, submitted my own request for review, assessed the responses, and nosed around some discussion forums where Feedback Army was the topic du jour.
Certainly, this service has its benefits (particularly for your bottom line), but there are the typical responses from folks disappointed by their own misguided expectations. Look, you can't use a service like this and then complain when you're not handed a glossy analysis of your user findings, broken down by persona and scenario, that maps 1:1 with your research goals. It just won't happen. So when you get shorthand, "unintelligent" lol-speak responses, you really can't complain. Some users may or may not follow your posting to the letter and may spout off whatever comes to mind… that's the level of expectation you should have going in.
What you don’t get…
User Demographics & Targeted Personas – you're dreaming. The reviewer pool comes from Amazon's Mechanical Turk, a crowd-sourcing work in progress. While there are advantages here, there is limited control over who is actually doing the work. The Mechanical Turk pool is roughly 70% American; combine that with Feedback Army's English-only UI, and you're effectively limited to US domestic testing.
Quantitative Metrics – You won't get time-to-completion, conversion rates, or industry benchmarks. If you outfit your test environment with Google Analytics, you can get at some success metrics around goals, popular content, and bounce rates, but with limited specificity on whose feedback maps to which metrics.
Qualitative Metrics – You can get these if you're explicit about asking for ratings, but you'll have to compile your own report if you want the pretty charts.
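If you do ask reviewers for explicit ratings, rolling them up into report-ready numbers takes only a few lines of scripting. Here's a minimal Python sketch, assuming hypothetical question names and a hand-collected ratings list (neither service exports data in this format for you):

```python
from statistics import mean, median

# Hypothetical raw responses: each reviewer rates tasks on a 1-5 scale
# (the question names here are illustrative, not from either service)
responses = [
    {"navigation": 4, "checkout": 2, "search": 5},
    {"navigation": 3, "checkout": 3, "search": 4},
    {"navigation": 5, "checkout": 1, "search": 4},
]

def summarize(responses):
    """Roll per-reviewer ratings up into per-question summary stats."""
    questions = responses[0].keys()
    return {
        q: {"mean": round(mean(r[q] for r in responses), 2),
            "median": median(r[q] for r in responses)}
        for q in questions
    }

summary = summarize(responses)
# e.g. summary["checkout"] -> {"mean": 2.0, "median": 2}
```

From there, any spreadsheet or charting library can turn the summary into the pretty charts.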
A Usability Report – this one is on you. If you play your cards right, you can get some decent raw feedback to compile into a report, but it requires a lot of planning.
How to make up the difference:
Define your research goals: which candidate features and functions to test, what evaluation criteria to use, and so on. Ideally, you run a series of these tests to arrive at a more comprehensive view of your product's usability, and compile the report at the end. Hiring a consultant, or dedicating an internal resource to own this task, will help ensure value-added direction-setting and iteration planning for your product once the feedback is in.
More recently, we took a look at Loop 11. Currently in private beta, Loop 11 is hooking up some usability testing bells and whistles. I put quotes around "usability testing" in my Feedback Army review because that service is really just a feedback machine. Loop 11, however, has scratched the surface of the tough stuff: Targeted Personas, Quantitative Metrics, Industry Benchmarks, and more.
What will you get?
To be honest, I can’t tell you everything… Loop 11’s closed beta is by invitation only. Here’s what the site claims:
Create a user test. This is a lightweight form, but it takes more thought and detail than simply posting a URL. A three-step setup walks you through adding test details, tasks & questions, and additional test options. The demo suggests you can organize tests into "projects" and save tests as templates (nice touch).
Invite test participants. This looks like a nice set of options:
- Get a link to your user test – presumably you can send it out to a predetermined list of users (the ideal scenario)
- Create a pop-up invitation for your site – this gives you random users, which may or may not be what you're looking for (less ideal)
- Purchase from their panel of users (needs investigation)
The site claims separation of test participants, making data roll-up and drill-down more interesting.
Everyone loves dashboards… so why not? A nice dashboard gives you high-level data on average page views, average time per page, average task completion rate and average industry completion rates… That's right, I said Industry Benchmarks. Now that's a rich claim. Noting their closed-beta partners, they've picked Amazon, Ikea, HSBC, Toyota… these will be your benchmarks, folks! Not a bad competitive pool. Well done, Loop.
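For reference, those dashboard averages are straightforward to compute yourself if you're collecting your own session data. A minimal Python sketch, using made-up session logs (the field names are illustrative, not Loop 11's actual export format, which isn't public during the beta):

```python
from statistics import mean

# Hypothetical per-participant logs for a single task
sessions = [
    {"pages_viewed": 7,  "seconds": 95,  "completed": True},
    {"pages_viewed": 12, "seconds": 180, "completed": False},
    {"pages_viewed": 5,  "seconds": 60,  "completed": True},
    {"pages_viewed": 9,  "seconds": 140, "completed": True},
]

avg_page_views = mean(s["pages_viewed"] for s in sessions)            # 8.25
avg_time_per_page = mean(s["seconds"] / s["pages_viewed"] for s in sessions)
completion_rate = sum(s["completed"] for s in sessions) / len(sessions)  # 0.75
```

The industry benchmark comparison is the part you can't reproduce on your own; that's where a pooled service earns its keep.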
Here is a list of metrics you can get in the dashboard:
What you might not get.
You already know that I can't get a good handle on the truth here, given their current closed-beta status. But here's a list of assumptions you can make based on what they've exposed. I found a posting by Ann Smarty, who somehow got into their beta; she posted a light review here.
Validated qualitative metrics – you may get ratings, but you miss out on non-explicit reactions. The classic "users will say one thing but do another" is always in effect: you'll get their feedback, but miss the facial expressions, eye tracking, mouse hovering, heat mapping and general behavior surrounding their remarks.
That's about it. It looks like you get a good set of data collection and analysis features, but you still have to set up your test(s) properly. That means well-thought-out, targeted test goals and participant recruitment.
Online User Testing Service/Tool Limitations:
If you've found yourself staring down the barrel of a crowd-sourced usability project, you're most likely dealing with tight time frames and/or budgets, and you've ruled out a lengthy and potentially costly full-blown usability study. What tips can you learn from user research professionals to make the most of your crowd-sourced efforts and build a design strategy from your study outputs?
1) You can't meet everyone's needs. Take some time to look over the feedback and group it into "UI themes" or "issue categories". There will always be outliers. If your study was targeted and you know the demographic weight of missing the mark on an outlier, you can factor them in. Or, if an outlier hit the exact note that all the others missed – the note you have been attempting to hit – then factor them in, but be careful not to lose your grasp of mass appeal. You can always run multiple targeted feedback sessions once you know what your issue categories are. Try your hand at feedback and observation analysis; you may find affinity diagramming or mental modeling useful, but don't forget to segment and simplify your feedback: "verb + noun = atomic task".
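That grouping step can start as simple keyword bucketing before you graduate to proper affinity diagramming. A rough Python sketch, with a hypothetical keyword-to-category map and sample comments (your real categories will come from reading the feedback, not from a canned list):

```python
from collections import Counter

# Hypothetical raw feedback lines from a crowd-sourced review
feedback = [
    "couldn't find the search box",
    "checkout button is hidden below the fold",
    "search results load slowly",
    "love the colors lol",
    "the checkout form asks for too much info",
]

# Illustrative keyword -> issue-category map
categories = {"search": "findability", "find": "findability",
              "checkout": "conversion flow", "slow": "performance"}

def categorize(line):
    """Return every issue category whose keyword appears in the comment."""
    hits = {cat for kw, cat in categories.items() if kw in line.lower()}
    return hits or {"uncategorized"}

tally = Counter(cat for line in feedback for cat in categorize(line))
# e.g. tally["findability"] -> 2
```

The "uncategorized" bucket is where your outliers (and the lol-speak) land for a second look.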
2) User segmentation and personas. Getting at the psychographics and demographics of your users takes a little extra time and thought, and has very real user experience implications. While no two users of any given system are the same, you can loosely characterize their behavior and relationship to information, objects and tasks into 3–6 types, e.g. Novice, Intermediate, Advanced, Specialist. The more comprehensive your view of your users going into a study, the more focused your test and test results can be.
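One way to operationalize those 3–6 types is a simple threshold rule on a behavioral measure you already collect. A Python sketch, with made-up tiers keyed to self-reported weekly usage (both the measure and the cutoffs are assumptions you'd calibrate to your own product):

```python
# Hypothetical segmentation: bucket participants into experience tiers
# by self-reported sessions per week (thresholds are illustrative)
def segment(sessions_per_week):
    if sessions_per_week >= 15:
        return "Specialist"
    if sessions_per_week >= 7:
        return "Advanced"
    if sessions_per_week >= 2:
        return "Intermediate"
    return "Novice"

participants = {"alice": 20, "bob": 8, "carol": 3, "dave": 1}
segments = {name: segment(n) for name, n in participants.items()}
```

Even a rough cut like this lets you report findings per segment instead of lumping a specialist's gripes in with a first-time visitor's.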
3) Reconciliation. User requirements and business requirements don't always map 1:1 to each other, and the technical architecture may or may not support all of the requirements. Map your requirements into a functionality matrix where you look at all of the system's functions and features, making sure you account for every business and user requirement (using Excel helps you stay concise and color-coded). Rank each item by business benefit, user benefit and technical complexity (H/M/L). Use the matrix to build an iteration plan based on your rankings.
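The H/M/L rankings lend themselves to a quick score for ordering the iteration plan: benefit minus complexity. A Python sketch of the matrix, with hypothetical features and one of many reasonable scoring schemes:

```python
# Map High/Med/Low rankings to numbers (an illustrative scheme)
SCORE = {"H": 3, "M": 2, "L": 1}

# Hypothetical functionality matrix rows:
# (feature, business benefit, user benefit, technical complexity)
features = [
    ("saved searches", "M", "H", "L"),
    ("single sign-on", "H", "M", "H"),
    ("bulk export",    "L", "M", "M"),
]

def priority(biz, user, complexity):
    """Benefit minus cost: higher score means earlier in the plan."""
    return SCORE[biz] + SCORE[user] - SCORE[complexity]

plan = sorted(features, key=lambda f: priority(*f[1:]), reverse=True)
# plan[0][0] -> "saved searches" (high benefit, low complexity)
```

Excel gets you the same answer with a formula column; the point is making the trade-off explicit rather than arguing features one at a time.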
4) Mapping study results to information and interaction design strategy. You may have a head for this, and if you do, you’ve most likely covered your bases, but it never hurts to get an outside opinion. Great design is rarely achieved without a great deal of planning. Knowing where you are, where you’ve come from and where you’re going at all points of development can keep your tests and iteration plans focused and practical. Understanding how to meet the needs of your users in rapid order with a long range view of feature extensibility will go a long way towards keeping your product on track.
Additional on-line usability testing tools:
Other (unrelated) Product Crowd-Sourcing Sites:
Automobile Design (just because it’s cool)
Happy testing all. Remember: “Test early & test often”. Don’t be afraid to admit you need help, we’re pretty good at what we do.