We Still Need Humans For Those Things Robots Can’t Do
There are still some things only humans can do
I remember visiting Yahoo! back in 1995, just before everything exploded here in Silicon Valley. I, along with a number of cable TV execs, walked into the back of that industrial unit in Mountain View in our tight business suits and ties. (We had come down from Toronto to meet with reps from @Home, Netscape, and Yahoo! to get the lay of the land prior to launching our own cable internet service in Canada – think Xfinity. In the end, we never met with Netscape; this was about a week before they went IPO and they ended up not having enough time for us.)
I distinctly remember a few things from that meeting. One was how stodgy I felt in my suit and tie while everyone else was in ripped jeans and t-shirts (yep, even Jerry Yang, who we met with that day). Another was that when we walked in, the first thing we saw was not a formal business reception desk, but someone in the lobby sitting at a workstation with a huge screen, surfing the internet. She was looking over links to add to Yahoo!, which at the time wasn’t even a search engine, but just a hand-curated directory. She had a big dog lying across her lap and she was surfing away as we walked in. Someone met us in the lobby and escorted the six of us in our stodgy suits and ties into a small conference room to the right. They told us to go to the kitchen if we wanted anything to drink and to help ourselves from the fridge there – I remember opening it and finding it full of Jolt Cola and Twinkies (Jolt used to be the go-to drink for developers pulling all-nighters – I guess you could consider it the first energy drink, pre-Red Bull and Rockstar). Anyways, I had no idea where any of this was going to go, but to us Canadian execs, used to corporate IT, it was a really different work environment from the one we were used to. (In retrospect, I should probably have asked Jerry for a job right then, but who knew, right?) We discussed creating the first non-US version of Yahoo!, Yahoo! Canada. The talks went well.
Before Google, there were plenty of interesting attempts to classify the web, both manually through hand curation, like Yahoo!, and algorithmically, through services like AltaVista and Lycos, which were pretty prominent at the time. Both approaches had their shortcomings, but one of the reasons Yahoo! did as well as it did (it practically invented banner advertising, and did well enough to purchase the inventors of text advertising – Overture, previously GoTo.com) was that there was nothing like human curation of the web. The links in the original Yahoo! directory were of top quality, as each and every one of them was contributed by a human being who had actually reviewed the page, not by an algorithm that could be gamed.
Now back in those days, it may have been easy to keep a fairly up-to-date listing of the best stuff on the internet just by hand curation, but of course, the size and scope of the internet exploded and there was no way to meet the need simply by continuing to hand curate everything.
The problem is that when you lose that hand curation, you lose a lot of the quality of the directory. The links the algorithm hands you are just not as good, or can be gamed, and it’s a constant struggle for developers at search engines like Google to tweak and tweak their algorithms in order to keep the most relevant content on top (oh, and of course, don’t forget the best-paying ads). We still need humans to curate.
There is no doubt that hand curation brings you the best, most relevant stuff. But how do you hand curate when the web is growing by 500% a year (and that’s just an average – some regions, like Africa and the Middle East, are growing at 2000–3000% per year)? Crowdsourcing. But crowdsourcing done the right way. Crowdsourcing, as it is used today, is basically an extension of the original interactivity and community that was available on the internet, and even before it.
Here is the issue with that concept of crowdsourcing – it’s mostly unstructured and basically a crapshoot. You could throw out your question one day and get a ton of very useful answers in a very timely manner, or you could throw out your question and get nothing. (If you want an example of this kind of variability, just try posting a question on a public Q&A service – I posted a question on startups and got an immediate response; another day I posted a question on kettlebell training and didn’t get a response for months.) It really depends on the quality of the process, the question going out to the right crowd, and so on – there are a lot of parameters that need to be met in order for crowdsourcing to work.
That’s just one example of it – services like Amazon’s Mechanical Turk follow a similar model but assign tasks as well. For example, let’s say that you wanted to research a particular field (like marketing a small business on the web) and list the top websites in a specific category. A discrete task like this could conceivably be delegated to a crowdsourced service, where the service would test and determine who the best humans for the task are and have them complete it. If the service were sophisticated enough, it would be able to split the task and collate the results – using the above example, the system could conceivably send the task of selecting the top small business marketing site to 100 people and simply collate and sort the responses. A lot of crowdsourcing today isn’t true crowdsourcing like that – it typically assigns a task to one specific person and doesn’t do any sorting of the responses.
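Just to make that concrete, here’s a rough sketch of what that kind of “true” crowdsourcing could look like – one task fanned out to many people, with the answers collated and sorted. The worker pool and the ask_worker callback are hypothetical stand-ins for however you would actually reach real humans, not any particular service’s API:

```python
from collections import Counter

def crowdsource(task, workers, ask_worker):
    """Fan a single task out to many workers and collate their answers.

    ask_worker(worker, task) is a hypothetical stand-in for however you reach
    a real person (an app notification, an email, a Mechanical Turk-style HIT).
    It returns that person's answer as a string.
    """
    answers = [ask_worker(w, task) for w in workers]
    # Collate: count how many workers gave each answer, most popular first.
    tally = Counter(a.strip().lower() for a in answers if a)
    return tally.most_common()

# Toy usage: 100 simulated workers pick the "top small business marketing site".
if __name__ == "__main__":
    import random
    sites = ["siteA.com", "siteB.com", "siteC.com"]
    workers = list(range(100))
    pick = lambda w, task: random.choice(sites)  # simulated human answer
    print(crowdsource("Pick the top small-business marketing site", workers, pick))
```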
While human curation on its own, original Yahoo! style, with individual surfers reading and reviewing every site, is not scalable, a new, structured way to crowdsource findings, allowing for human input from everyone who touches a site, is completely doable. My sense is that in the drive for algorithmic, automated relevance, we’ve lost or radically de-emphasized the need for humans to be involved in the relevance process.
Some say – wait – why do we need crowdsourcing at all? Can’t we eventually get computers to do the work for us? Well, yes to some of it and no to the rest: there are plenty of tasks we have yet to figure out how to get computers to do, and there are a lot of things that need to be done which simply require a human touch. So the question is – how do we get crowdsourcing to work properly?
Think of it this way: if we can’t even get computers to have a proper conversation with us in order to glean our intent, how can we possibly get them to perform these tasks that take a human almost no time to complete? Leveraging the crowd will be key simply to perform those tasks which require a human to clarify or research a point.
I firmly believe that the next wave of extremely powerful internet and web-based applications must not only grab specific intent but also leverage the crowd to provide that human insight. If we invent structured ways in which to process the work for and from the crowd and use techniques in order to train and improve the crowd’s response, then we should have an incredibly powerful force of ability and knowledge that can truly create the next web.
How do we harness the crowd? We can either harness it in real time, using some type of instant answering mechanism, or we can utilize the data the crowd already generates, by reviewing and adjusting things like individual reviews and ratings.
Here’s an example of how we might leverage crowdsourcing to improve a future restaurant-going experience:
Suppose it’s 3 PM and I’m driving up to San Francisco for a meeting. The system knows from my calendar and history that I usually have dinner around 6 PM. Because it has harvested the location of my San Francisco meeting from my calendar, it knows where I’m going to be for the meeting, which runs from, say, 4:00 to 5:30.
As I’m driving up to the city – and it knows I’m driving, by the way, because it has information from my GPS – it can tell that I’m traveling at 65 mph up Highway 101 on my way to San Francisco (which, in retrospect, is a little bit unlikely at 3 PM on a weekday). Since it knows I’m behind the wheel, it knows not to text me with information about possible places to eat after the meeting.
The system calculates, from my distance, my speed, and the traffic, that I will be able to get to my meeting on time, so it isn’t warning me to get off here and take 280 the rest of the way because 101 is busy. It knows all these things, it knows exactly when to talk to me, and it knows when not to bother me. It knows that I’m in the car right now, and it may even know that I’m listening to the radio, or to Rhapsody or some other music app on my phone. Since it knows not to bother me during the meeting, it knows that if it is going to ask me about dinner, it has to ask before I get there. So it waits until, say, 15 minutes before I’m supposed to pull into the parking garage, and it says, “Sorry to bother you, but I realize you might be hungry after your meeting, so I’ve taken the liberty of looking at some restaurants in the area where you’re meeting. Are you interested in hearing about them?” That’s where I can say yes or no.
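To make that timing logic concrete, here’s a rough sketch of the “when to speak up” decision. All the inputs (ETA at the garage, meeting window, whether I’m driving) are assumed to come from the GPS and calendar; the names are made up for illustration:

```python
from datetime import datetime, timedelta

def should_offer_dinner(now, eta_parking, meeting_start, meeting_end,
                        is_driving, lead_time=timedelta(minutes=15)):
    """Only interrupt in the window shortly before I park, never during the meeting."""
    if meeting_start <= now <= meeting_end:
        return False  # never interrupt the meeting itself
    if not is_driving:
        return False  # in this scenario the prompt is spoken in the car
    # Speak up roughly `lead_time` before I'm due to pull into the garage.
    return eta_parking - lead_time <= now < eta_parking

# Example: it's 3:40, I park at 3:50, the meeting runs 4:00-5:30 -> time to ask about dinner.
now = datetime(2012, 5, 1, 15, 40)
print(should_offer_dinner(now,
                          eta_parking=datetime(2012, 5, 1, 15, 50),
                          meeting_start=datetime(2012, 5, 1, 16, 0),
                          meeting_end=datetime(2012, 5, 1, 17, 30),
                          is_driving=True))  # True
```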
It knows what I like. It knows my dietary restrictions. And if I’ve already been to any restaurants in that vicinity and given them a good review on Yelp, it knows the kind of restaurant I like in that area. I say, “Yes, I’d like to know where to go for dinner after the meeting.” It might ask, “Do you want to go somewhere new?” I say yes, and the system automatically goes through the list of restaurants and de-prioritizes the ones I’ve already been to. It also knows that I like steak, so the first thing it says is, “There’s a great steakhouse just around the corner from your meeting. It’s called XXX Grill. Do you want me to check for reservations?” I say, “Sure.” It then asks, “Shall I make it for two? Your wife is at another meeting not too far away and might be able to join you.”
It checks for reservations online – it checks Yelp, it checks OpenTable. If the restaurant doesn’t have a table available on OpenTable, it will actually find the restaurant’s website, find the number, pick up the phone, call the restaurant, and say, “I’m calling for Mr. Kalaboukis – table for two at 6 PM, do you have any available? Press 1 for yes, 2 for no, or leave a message that I can pass on to him.” It can actually make a reservation for me, like a real human concierge.
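Here’s a rough sketch of that fallback chain – try the online booking route first, then have the system place the call itself. Every integration in it (the OpenTable-style lookup, the automated phone call) is a hypothetical placeholder, not a real API:

```python
def reserve(restaurant, party_size, time,
            find_online_listing, book_online,
            lookup_phone_number, place_ivr_call):
    """Try to book online; fall back to an automated call the host can answer by keypad."""
    listing = find_online_listing(restaurant)
    if listing is not None and book_online(listing, party_size, time):
        return "booked online"
    # No online availability: find the number and have the system make the call.
    number = lookup_phone_number(restaurant)
    prompt = (f"I'm calling for Mr. Kalaboukis: table for {party_size} at {time}. "
              "Press 1 for yes, 2 for no, or leave a message I can pass on.")
    answer = place_ivr_call(number, prompt)
    return "booked by phone" if answer == "1" else "no table"

# Toy usage with stubbed-out integrations:
print(reserve("XXX Grill", 2, "6 PM",
              find_online_listing=lambda r: None,            # not bookable online
              book_online=lambda listing, n, t: False,
              lookup_phone_number=lambda r: "+1-415-555-0100",
              place_ivr_call=lambda num, prompt: "1"))        # host presses 1
# -> "booked by phone"
```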
Don’t tell me that we do not have the capability to do that today.
How does the system know that these restaurants are good? Because it harvested the reviews from Yelp and every other source it could find in order to determine which of them is the best. Remember when it asked me if I wanted to go somewhere new? That’s when it went out and pulled the data. Now let’s say there was no data online about restaurants in that vicinity. It would then go out to a network of experts – people who had decided they wanted to be part of an expert network that responds to questions in real time. The system would send out a question, something like, “My boss is going to be in this area at that time and he likes steak – can you recommend anything?” That gets blasted out to a bunch of people who live and/or work in that area. Those people respond; the data is categorized, sorted, and presented back to me – and not only that, we could conceivably feed the data back to Yelp so that they can add it to their database. Again, these systems are all in existence today; they just need to be pulled together and implemented.
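A rough sketch of that flow – harvest the reviews first, and only when the data is thin fan the question out to the expert network and tally the answers. Both data sources here are hypothetical stand-ins, not real Yelp or expert-network APIs:

```python
def recommend(area, cuisine, harvested_reviews, ask_experts):
    """Return restaurant names ranked best-first, falling back to the crowd when reviews are thin.

    harvested_reviews: dict keyed by (area, cuisine) mapping restaurant name -> average rating,
    standing in for whatever could be pulled from Yelp and other sources.
    ask_experts(question): stand-in for blasting a question to opted-in locals; returns their picks.
    """
    candidates = harvested_reviews.get((area, cuisine), {})
    if candidates:
        # Enough review data online: rank by average rating.
        return sorted(candidates, key=lambda name: -candidates[name])
    # No data: fan the question out to local experts in real time and tally the answers.
    question = f"Someone will be in {area} tonight and likes {cuisine}. What do you recommend?"
    tally = {}
    for name in ask_experts(question):
        tally[name] = tally.get(name, 0) + 1
    # The same answers could conceivably be fed back into Yelp's database here.
    return sorted(tally, key=lambda name: -tally[name])

# Toy usage: no harvested data, three experts answer.
print(recommend("SoMa", "steak", {},
                lambda q: ["XXX Grill", "XXX Grill", "Another Steakhouse"]))
```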
Personally, I think we have too much faith in algorithms to deliver the exact results that we want. We absolutely must have the input of humans in order to arrive at a useful result. We may need to sort and revise this input programmatically, but it is essential that we leverage human input somewhere along the way in order to get the proper results. That human input could come at the beginning of the process, where actual human curators pull together relevant content in a particular content area and the results are then algorithmically revised and augmented based on the curated data set; or we could start with an algorithmic set of results and then apply human judgment to them, coming up with a blended result that provides the best of both worlds.
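As a sketch of what that blending might look like – algorithmic scores re-weighted by human ratings where they exist – with the weighting scheme and field names purely illustrative, not a description of how any real search engine does it:

```python
def blend(results, human_weight=0.5):
    """Each result has an algorithmic `score` (0-1) and an optional human `rating` (0-1)
    from curators or crowd reviews. Rank by a weighted mix of the two."""
    def blended_score(r):
        if r.get("rating") is None:
            return r["score"]  # no human signal yet: fall back to the algorithm alone
        return (1 - human_weight) * r["score"] + human_weight * r["rating"]
    return sorted(results, key=blended_score, reverse=True)

results = [
    {"url": "a.example", "score": 0.92, "rating": 0.2},   # the algorithm loves it, humans don't
    {"url": "b.example", "score": 0.75, "rating": 0.9},   # humans rate it highly
    {"url": "c.example", "score": 0.80, "rating": None},  # no human input yet
]
print([r["url"] for r in blend(results)])  # b.example comes out on top
```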
The second pillar of the next web will be a systematic and appropriate implementation of crowdsourcing. Work will be envisioned, assigned, sliced, completed, recombined, and delivered, and will actually respond to a user’s true intent.