Uncategorized

The Case for Siri

Ever since Siri’s public debut as a key iPhone feature 18 months ago, I have kept getting pulled into conversations (read: heated arguments) with friends and colleagues, debating whether Siri is the Second Coming or the reason Apple stock lost 30%. I figure it’d be more efficient to just write some of this stuff down…


Due Disclosure:

I run Desti, an SRI International spin-out that utilizes post-Siri technology. However, despite some catchy headlines, Desti is not “Siri for Travel”, nor do I have any vested interest in Siri’s success. What Desti is, however, is the world’s most awesome semantic search engine for travel, and that does provide me some perspective on the technology.

Oh, and by the way, I confess, I’m a Siri addict.

Siri is great. Honest.

The combination of being very busy and very forgetful means there are at least 20 important things that go through my mind every day and get lost. Not forever – just enough to stump me a few days later. Having an assistant at my fingertips that allows me to do some things – typically set a reminder, or send an immediate message to someone – makes a huge difference in my productivity. The typical use-case for me is driving or walking, realizing there is something I forgot, or thinking up a great new idea and knowing that I will forget all about it by the time I reach my destination. These are linear use cases, where the action only has a few steps (e.g. set a reminder, with given text, at a given time) and Siri’s advantage is simply that it allows me to manipulate my iPhone immediately, hands-free, and complete the action in seconds. I also use Siri for local search, web search and driving directions.

Voice command on steroids – is that all it is?

Frankly – yes. When Siri made its public debut as an independent company, it was integrated with many 3rd party services. When Apple re-launched it, those were scrapped and replaced with deep integration with the iPhone platform. Despite my deep frustration with Siri not booking hotels these days, for instance (not), I think the decision to do one thing really well – provide a hands-free interface to core smartphone functionality (we used to call it PIM, back in the day) – was the right way to go. Done well, and marketed well, this makes the smartphone a much stronger tool.

But I hate Siri. It doesn’t understand Scottish and it doesn’t tell John Malkovich good jokes

As mentioned, I’ve run into a lot of Siri-bashers in the last year. Generally they break down into two groups: the people who say Siri never understands them, and the people who say Siri is stupid. I’m going to discuss the speech recognition story in a minute (SRI spin-out, right?) but regarding the latter point I have to say two things. First, most people don’t really know what the “right” use-cases for Siri are. Somewhere between questionable marketing decisions and too little built-in tutorial, I find that people’s expectations of Siri are often closer to a “talking replacement for Google, Wikipedia and the Bible” than to what Siri really is. That is a shame, because the bottom line is that it is under-appreciated by many people who could really put it to good use. Apple marketing is great, but it’s better at drawing a grand vision than it is at explaining specific features (did I mention my loss on my AAPL?). Second, while the Siri team has done great work at giving Siri a character, at the end of the day it should be a tool, not an entertainment app (my 8-year-old daughter begs to differ, though).

OK, but it still doesn’t understand ME

First, let me explain what Siri is. Siri is NOT voice-recognition software. Apple licenses this capability from Nuance. Siri is a system that takes the voice recognition output (“natural language”), figures out what the intent is (e.g. send an email), then goes through a certain conversational workflow to collect the info needed to complete that intent. Natural language understanding is a hard problem, and weaving multiple possible intents with all the possible different flows is complex. It is hard because there is a multitude of ways for people to express the same intent, and errors in the speech recognition add complexity. Siri is the first such system to do it well and certainly the first one to do it well on such a massive scale.
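To make that pipeline a bit more concrete, here is a toy sketch in Python of the two steps described above: guessing the intent from recognized text, then running a small conversational loop that asks for whatever is still missing. Everything in it (the `INTENTS` table, the trigger phrases, the slot names) is invented for illustration; it is not how Siri is actually built, and real systems use far more robust language understanding than substring matching.

```python
from dataclasses import dataclass, field

# Hypothetical intent definitions: trigger phrases plus the slots
# (pieces of information) each intent needs before it can be completed.
INTENTS = {
    "send_email": {"triggers": ("email", "send a mail"), "slots": ("recipient", "body")},
    "set_reminder": {"triggers": ("remind", "reminder"), "slots": ("text", "time")},
}


def detect_intent(utterance):
    """Return the first intent whose trigger phrase appears in the text, else None."""
    text = utterance.lower()
    for name, spec in INTENTS.items():
        if any(trigger in text for trigger in spec["triggers"]):
            return name
    return None


@dataclass
class Conversation:
    """Tracks one intent and the slots collected so far."""
    intent: str
    slots: dict = field(default_factory=dict)

    def missing(self):
        # Slots the intent requires that the user has not supplied yet.
        return [s for s in INTENTS[self.intent]["slots"] if s not in self.slots]

    def next_prompt(self):
        # Ask for the next missing slot, or return None when ready to act.
        missing = self.missing()
        return f"What is the {missing[0]}?" if missing else None


convo = Conversation(detect_intent("Remind me to call the office"))
first_prompt = convo.next_prompt()  # asks for the reminder text
convo.slots["text"] = "call the office"
convo.slots["time"] = "5pm"
done = convo.next_prompt() is None  # nothing left to ask
```

Even in this toy version you can see where the difficulty comes from: every new way of phrasing a request needs to map onto the same intent, and every intent brings its own little dialogue.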

So what? If it doesn’t understand what I said, it doesn’t help me.

That is absolutely true. If speech is not recognized – garbage in, garbage out. Personally I find that despite my accent Siri usually works well for me, unless I’m expressing foreign names, or there is significant ambient noise (unfortunately, we don’t all drive Teslas). There are however some design flaws that do seem to repeat themselves.

In order to improve the success rate of the automatic speech recognizer (ASR), Siri seems to communicate your address book to it. So names that appear in your address book are likely to be understood, despite the fact they may be very rare words in general. However this is often overdone, and these names start dominating the ASR output. One problem seems to be that Nuance uses the first and last names as separate words, so every so often I will get “I do not know who Norman Gordon is” because I have a Norman Winarsky and a Noam Gordon as contacts. I believe I see a similar flaw when words from one possible intent’s domain (e.g. sending an email) are recognized mistakenly when Siri already knows I’m doing something else (e.g. looking at movie listings).

This probably says something about the integration between the Nuance ASR and Apple’s Siri software. It looks like there is off-line integration – as in transferring my contacts’ names a-priori, but no real-time integration – in this case Siri telling the ASR that “Norman Gordon” is not a likely result. Such integration between the ASR and the natural language understanding software is possible, but often complex not just for technical reasons but for organizational reasons. It requires very close integration that is hard to achieve between separate companies.
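As a rough illustration of what that real-time feedback could look like, here is a small Python sketch. It assumes the ASR hands back a scored list of candidate names and that the understanding layer can demote any candidate that is not a full name in the address book. The scores, the `penalty` value and the `rescore` function are all made up for the example; I have no knowledge of how Nuance or Apple actually exchange this information.

```python
# Full contact names, as the assistant (not the ASR) knows them.
CONTACTS = {"Norman Winarsky", "Noam Gordon"}


def rescore(hypotheses, contacts, penalty=0.5):
    """Demote candidates that are not full contact names.

    hypotheses: list of (name, asr_score) pairs, higher score is better.
    Returns the list sorted by adjusted score, best first.
    """
    rescored = []
    for name, score in hypotheses:
        if name not in contacts:
            # A first name from one contact glued to a last name from
            # another: plausible word by word, unlikely as a whole.
            score *= penalty
        rescored.append((name, score))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Illustrative n-best list where the mixed-up name scored highest.
n_best = [("Norman Gordon", 0.60), ("Noam Gordon", 0.55), ("Norman Winarsky", 0.40)]
ranked = rescore(n_best, CONTACTS)
best_name = ranked[0][0]
```

With these invented numbers, the mixed-up "Norman Gordon" drops to the bottom and a real contact comes out on top. The point is only that the knowledge needed to make that call sits on the Siri side, so it has to flow back to the recognizer somehow.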

So when will it get better?

It will get better. Because it has to. Speech control is here to stay – in smartphones as well as TVs, cars and most other consumer electronics. ASRs are getting better, mostly for one reason. ASRs are trained by listening to people. The biggest hurdle is how much training data they have. In the early days of ASRs, decades ago, this consisted of “listening” to news commentators – people with perfect diction and accent, in a perfect environment. In the last year, more speech sample data was collected through apps like Siri than probably in the two decades prior, and this data is (can be?) tagged with location, context and user information, and is being fed back into these systems to train them. And as this explanation was borrowed from Adam Cheyer, Siri’s co-Founder and formerly Siri’s Engineering Director at Apple – you’d better believe it. We are nearing an inflection point, where great speech recognition is as pervasive as internet access.

So will Siri then do everything?

That’s actually not something I believe will happen as such. Siri is a user interface platform that has been integrated with key phone features and several web services. But to assume it will be the front-end to everything is almost analogous to assuming Apple will write all of the iOS apps. That is clearly not the case.

However – Siri as a gateway to 3rd party apps, as an API that allows other apps that need the hands-free, speech-driven UI to integrate into this user interface, could be really revolutionary. Granted – app developers will have to learn a few new tricks, like managing ontologies, resolving ambiguity, and generally designing natural language user experiences. Apple will need to build methodology and instruct iOS developers, and frankly this is a tad more complex than putting UI elements on the screen. Also I have no idea whether Siri was built as a platform this way, and can dynamically manage new intents, plugging them in and out as apps are installed or removed. But if and when it does, it enables a world where Siri can learn to do anything – and each thing it “learns”, it learns from a company that excels at doing it, because that is that third party’s core business.
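For the curious, here is a minimal Python sketch of what "plugging intents in and out as apps are installed or removed" might mean in practice. The `IntentRegistry` class, the app identifier and the `book_hotel` handler are purely hypothetical; this is not an Apple API, just the general shape of the idea.

```python
class IntentRegistry:
    """Maps intent names to handlers contributed by installed apps."""

    def __init__(self):
        self._handlers = {}  # intent name -> (app_id, handler function)

    def register(self, app_id, intent, handler):
        # Called when an app that supports a spoken intent is installed.
        self._handlers[intent] = (app_id, handler)

    def unregister_app(self, app_id):
        # Called when an app is removed: drop every intent it provided.
        self._handlers = {
            intent: entry
            for intent, entry in self._handlers.items()
            if entry[0] != app_id
        }

    def dispatch(self, intent, **slots):
        # Route a recognized intent, with its collected slots, to the app.
        if intent not in self._handlers:
            return f"Sorry, nothing installed can handle '{intent}'."
        _, handler = self._handlers[intent]
        return handler(**slots)


def book_hotel(city, nights):
    """Hypothetical handler supplied by a third-party travel app."""
    return f"Booked {nights} night(s) in {city}."


registry = IntentRegistry()
registry.register("com.example.travel", "book_hotel", book_hotel)
reply = registry.dispatch("book_hotel", city="Edinburgh", nights=2)

registry.unregister_app("com.example.travel")
after_removal = registry.dispatch("book_hotel", city="Edinburgh", nights=2)
```

The registry itself is the easy part. The hard part, as noted above, is everything the sketch leaves out: shared ontologies, ambiguity between apps that claim similar intents, and the dialogue design each app would have to bring along.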

… and then, maybe, a great jammy dodger bakery chain can solve the wee problem with Scotland with a Siri-enabled app.

Oh, and by the way – you can learn more about Siri, speech, semantic stuff and AI in general at my upcoming SXSW 2013 Panel – How AI is improving User Experiences. So come on, it will be fun.

Mobile Platforms, Online Media

My Birthday Gift: The Kindle Fire, and Why It’s The First Credible Android Tablet

Over the past 6 months, I’ve been watching perplexed as vendor after vendor launched Android Tablets into the market with no success. Perplexed for a simple reason – I could not understand how they expected consumers to buy their $559, $499 or even $399 tablets when they could get an iPad 2 for $499 and get the real deal – the TRUE status symbol, the best content & app eco-system. What were Samsung, Motorola, Dell and Asus thinking, I was wondering. Was it a shortage / price of components that pushed them to that price bracket? Was it protecting the brand at all costs, even failure?

A couple months ago, I asked a question on Quora and the results were staggering – over 20:1 for iPad.

So what has changed?  The $199 Kindle Fire. You can get two of those, and still have money for another holiday gift.

Amazon’s Kindle is an ecosystem, not a device. Amazon sees it as a way to make sure you buy all your content – books, music, TV – from Amazon. Just yesterday they announced the streaming deal with FOX TV – more free content for Amazon Prime subscribers. Guess which devices will feature it? Remember Sony’s Howard Stringer’s announcement a few weeks ago – “Apple makes an iPad, but does it make a movie?”. Amazon doesn’t make them, but it sure-as-hell moves them around. In a move right out of Steve Jobs’ books, Amazon is tying it all together – device, app store, content store, streaming rights (with free content for Prime members), e-commerce for physical goods, payment options (from one-click to credit cards), cloud storage, even a loyalty program!

Kindle now touches everything Amazon does, and so many other companies. It threatens Netflix streaming – Amazon is securing more content for Prime members, and has a sound pay-TV model with a complete eco-system around it. It also obliterates all other Android tablet manufacturers’ volume forecasts for the holiday season (a $200 rival with a strong brand behind it).

And it’s a credible contender for Apple’s eco-system. It is as broad, as far reaching, and goes even further with physical e-commerce embedded.

Probably the only risk is execution. If the software / hardware is good enough (defined as – better than most Android implementations), this will make a huge dent in the market. iPad will become the high-end product, but Android, through Kindle, could be the mass-market. Not so different from iPhones and Androids, actually.

My pre-order is in.

Mobile Platforms

How I Got It All Ass-Backwards, or How Android Got Free Again

Free!

Last week I wrote a piece about the huge cultural gap between Google and Motorola, and how Motorola is such a bad fit for the Google organization, and what it will do for its relationship with Android licensees. I also stated that if Google acquired Motorola for the patent portfolio alone – that’s not such a big deal in the marketplace.

Well, boy, was I wrong. A person who’s very close to the story saw fit to fill me in.

Google’s acquisition of Motorola was indeed all about the patents. But not necessarily Google’s lack thereof, but really its licensees’. What Google is trying to do to the handset market is what Microsoft did to PCs – give the hardware market to cheap Chinese / Taiwanese / Korean manufacturers, and thereby own the software platform. The catch? The incumbents – Nokia, Apple, Microsoft (and Motorola) own restrictive patents. And they sue / charge these manufacturers to a point where they are agnostic between Google’s “free” OS and Microsoft’s “pricey” one. The only player in the Android camp who was relatively safe was Motorola, who owns a nice portfolio developed over many years.

Solution – Google buys Motorola and promises Android licensees a defensive umbrella – it will fight their patent wars for them with its newly acquired arsenal.

Right there and then, Android is free again.

So what is Google to do with the Motorola organization one might ask?

This is where it gets pretty interesting. You see, Motorola is in Illinois: the state a certain president (and his associated mayor) come from. And 2012 is an election year. Who wants to see 10,000 layoffs in Illinois in an election year? Certainly not someone who wants to Do No Evil…

Mobile Platforms

Google acquires Motorola. Say again?!

With so many so-called experts (read: people who use Google and used to have a Motorola RAZR phone) providing different angles on this acquisition, I figured it’s time to chime in. I have a pretty good handle on Motorola (you can Google that!) and think I know something about Google too.

And what I don’t get is the culture clash. Truly. Motorola, like it or not, is an 83-year old Chicago (well Schaumburg) company, and no, the split to MMI and MMS did not change that. It is a slow-moving 18,000-employee corporation, with an organization that takes years to design products, and even under Sanjay Jha that could not change much.
You see, when a company is hit as badly as Motorola Mobility was hit in 2008-2009 (and by the way – that happened through their complacence over the success of the RAZR), the good, dynamic, innovative people tend to leave. Especially in a market where Google, Facebook and Groupon are snatching all the good people who’d still like to work for a “safe” company. The culture has not changed all of a sudden, nor was there a good reason for great people to join lately.

Google is, or aspires to be, a fast-mover Silicon Valley company with a flat hierarchy, a market-driven (really numbers-driven) no-nonsense approach, with little respect for old-world processes. And it wants to retain this culture while growing to 25,000 employees.

See the issue?

So if, as some people have suggested, Google is only after the patents and will spin out Motorola again as a stand-alone device manufacturer, not so much has happened in the market (but congratulations to all the lawyers, accountants, bankers and management consultants who’re going to get the fat checks).

But if Google is truly looking to become the anti-Apple and the Motorola team is its weapon-of-choice… well, good luck with that.

P.S.: I especially like the theory that Microsoft was going to buy Motorola which forced Google to buy them first. It’s just lovely.

Mobile Platforms

Amazon’s Android Appstore (Tries) To Take Care of Business

So – the fabled Amazon Android Appstore (not App Store! That’s an App-le trademark!) is here. And almost as expected – these guys get the big things right, but the small things…

First thing you’ll notice – the Amazonian design. Besides the obvious branding elements, it is a much more effective design than Google’s. It is meant to generate sales. As soon as you open the store, you’re faced with credible alternatives – stuff you may well want to download, because everyone else does. The screen space is used efficiently, and navigation is simple and easy. Very little innovation over, say, iTunes, but also no clear disadvantages. The desktop web store is similar in approach, and not very far from the Amazon website that is so effective with retail shoppers in general.

The main attraction is a featured, “bonus” download, updated every day (i.e. a product that is usually not free being given away for free). Amazon takes care of business. To make an app store a business, you need paying customers. This requires people to have a payment method. That’s a hassle. That requires an incentive – give them something for free. But force them to connect a payment method to get the free stuff. Makes perfect sense. And gets me Angry Birds Rio for free. It also keeps me coming back every day for something else. Yes, it costs Amazon something. But probably not a lot. You see app developers have a great incentive to be providing these downloads for virtually (or literally) free – that day you were featured and provided as a free app, is going to put you very high in the Top Downloads chart – which will get you paid customers the following weeks (note the Top Downloads in the screenshot – yesterday’s free download, and today’s…). So even if Amazon pays virtually (or literally) nothing – it’s still a great deal. Everyone wins.

Caveat emptor – this also needs to work. The Appstore requires you to set up one-click mobile purchasing to get the download (as it should). However – no matter how many different ways I tried to do it, and despite the fact that all my info is shown, and my account shows mobile one-click purchasing activated (even when I connect on my desktop through a browser) – it still asks me again and again to “please add a payment method in your 1-click settings”. Now I am a loyal Amazon customer – Prime, Amazon Store Card and all that. My guess is that Amazon is not accepting its own store card on its own Appstore. Otherwise I don’t see how such a blatant bug could have slipped past their people.

So – as expected, these people mean business, and know how to do it. They’ll have to cross a few t’s and dot a few i’s before they do, however.

Mobile Platforms

Post MWC: Android’s Tour-de-force. Is that the shape of things to come?

Over the last week I’ve had several discussions with colleagues about MWC 2011. The general gist of things was “wow, how far Android has gone”. And indeed, Android’s presence at the conference was impressive, to say the least. The usual Android suspects were there, of course – HTC, Motorola, Samsung and others. But what was even more impressive was the vast number of unknown Android manufacturers, mainly Chinese, who’ve flocked to the free platform en-masse. Known names like ZTE and Huawei were to be expected, but upstarts like Malata (who seems to make impressive Android tablets, incidentally) were there by the dozen. And of course – given Nokia’s and Apple’s absence, and RIM’s limited presence, it sometimes seemed like Android is the only game in town.

Malata Android Tablet

The Nokia / Microsoft news just fanned the fire. Essentially while it is a feather in Windows Phone’s cap (not necessarily a beautiful peacock feather, incidentally), it means that Nokia will be out of the smartphone game for a long time. And to judge by the employees’ reaction – could be long indeed.

The general conclusion I heard drawn, then, is simple – Android is taking over the market, Android will define the shape of things to come, Android is where to take your mobile start-up / corporate mobile app first, because that’s where all the users will be. Right?

Sorry, it’s not that simple. Contrary to what some people think, Android to phones is not going to be Windows to PCs. At least not in the next 2-3 years. There are many reasons, but I think the most important one lies in the personal relationship between consumers and their phones. Unlike PCs (at the time), phones are a means for personal expression both explicitly (as in what you put on them / use them for) and implicitly (as in making sure your peers know what you have – just like cars). Most smartphone users associate their phone selection and habits with their identity. And with identity, a “one size fits all” strategy doesn’t work, fortunately. So as long as there are technologically credible alternatives with a well differentiated product (e.g. iPhone, BlackBerry), they will draw significant audiences.

Furthermore, the wider Android spreads as a mid-market solution, the less appealing it will be to some of these people who seek to distance themselves from “the middle”. Think the Mac cult of the ’90s and early ’00s but at a wholly different level. After all – these devices are used in the open. People see what you use, so better pick the “right” one.

So clearly – the fragmentation in the smartphone space is going to continue. Each platform’s market segment will be different demographically and psychographically, and these compositions will continue evolving. I expect we’ll keep seeing Android pandering mostly to the mid-market (with of course a meaningful number of power-users and high-end customers too). iPhone will generally remain a high-end phenomenon. BlackBerry may well lose its hold on the enterprise, but acquire new audiences amongst the young and price-conscious (free messaging). And when Nokia eventually rolls out Windows Phone handsets, it is quite possible that their considerable distribution clout in European and Emerging markets will make this a meaningful platform for those audiences.

I believe a very similar phenomenon will be seen in tablets. While Android tablets are improving, the good ones are still not meaningfully cheaper than the iPad. Apple only needs some minor improvements with the anticipated iPad 2 in order to stay in the lead. Only when significantly cheaper tablets (probably running Android 3.0) come to market can the balance be upset. And what will we have then? A similar market structure with iPad as the premium product and Android tablets as cheaper, “good enough” devices for mid-market consumers.

Where does this leave the Android makers? With the proliferation of Chinese manufacturers with great pricing power, we will see the PC-wars re-enacted. Margins will drop to low single digits for most manufacturers, probably leading to consolidation and elimination of key brands.

So essentially – nothing earth-shattering really came out of MWC. We will see even more Androids, and Symbian and MeeGo are dead (duh!), but there is little change to the fabric of the market as we’ve known it in 2010.