★ Google Home Now Recognizes Multiple Users by Voice (Apr 20, 2017)
This has been a long time coming – in fact, in just a few weeks it’ll be a year since Google debuted Home at its I/O developer conference and implied it would have multi-user support, though of course the feature was missing when the device actually launched in the fall. And that’s been a big limitation for a device that’s supposed to get to know you as an individual. So the fact that Google Home now recognizes distinct users by voice is a big deal, and an important differentiator over Amazon Echo. I’ve just tried it with my unit, and although it set up accounts for me and my daughter without problems, the app conked out when I tried to add my wife, so the results are mixed (I suspect it may be because my wife’s account is a Google Apps account). It does recognize the two voices we set up and will now serve us different responses, which is great. One big limitation, though, is that each user has to have a Google account and has to download the Google Home app onto their phone, which means it won’t recognize little kids who don’t have Google accounts. And given that it’s using voice recognition rather than, say, different trigger phrases, I can’t set up separate personal and work accounts. But for those who can use it, the Home will now be a much more useful device, serving up calendar information, music preferences and so on on an individualized basis rather than treating everyone in a home as the same person.
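Google hasn’t detailed how its voice matching works under the hood, but the general shape of speaker identification is simple to sketch: each enrolled user’s training utterances are reduced to an embedding vector (a “voiceprint”), and a new utterance is matched against those voiceprints by similarity, with a rejection threshold for unknown speakers. A minimal illustrative sketch in Python – the class, names and threshold here are my own assumptions, not Google’s implementation:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

class VoiceMatcher:
    """Toy speaker identification: match an utterance embedding
    against enrolled voiceprints, or reject if nothing is close."""

    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.voiceprints = {}  # name -> enrolled voiceprint vector

    def enroll(self, name, sample_embeddings):
        # Average several enrollment utterances into one voiceprint.
        self.voiceprints[name] = np.mean(sample_embeddings, axis=0)

    def identify(self, embedding):
        # Pick the closest enrolled voiceprint, or None below threshold.
        best_name, best_score = None, -1.0
        for name, voiceprint in self.voiceprints.items():
            score = cosine_similarity(embedding, voiceprint)
            if score > best_score:
                best_name, best_score = name, score
        return best_name if best_score >= self.threshold else None
```

In a real system the embeddings would come from a neural network trained on voice data; the threshold trades off false accepts (your kid pulling up your calendar) against false rejects, which is presumably why accounts without enrollment simply aren’t recognized.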
★ Amazon Scales Alexa Back-End by Opening Lex Voice and Text Service to All Developers (Apr 19, 2017)
So much of the focus of coverage of voice assistants and interfaces is on the dedicated consumer products which use them, and that’s natural: these are the most visible and measurable signs of a company’s success or failure in this space. And yet the scale of those dedicated voice products is still very small relative to smartphones, which carry their own voice assistants. And scale is vital if these products are to improve, because they require lots and lots of training to get better, so the more users there are training them, the better they become. As such, I suspect the next phase of competition in this space is going to be about developer voice platforms at least as much as about first-party hardware and software, and we’re starting to see signs of this from the big companies in the space, including Google and Amazon. Today, Amazon announced that Lex, a back-end service that combines many of the technologies behind Alexa, is opening up to all developers. But critically, this isn’t just a voice platform – it supports both text and voice processing, which means many developers might use it in chatbots or other environments that have nothing to do with voice but still help train Amazon’s natural language processing tools. Google is doing similar things with its own voice processing technology, but it’s doubtful whether Apple will ever open its voice tools up in the same way. That’s not a huge deal for Apple, because it has massive scale in voice on smartphones alone, but it may make a bigger difference over time as these other platforms benefit not only from growing first-party scale but from increasing third-party adoption and use too.
This actually isn’t news, at least if you paid attention a couple of weeks ago when Business Insider UK reported (and I noted) that Korean would be the launch language for Bixby, and that American English would follow in May, with British English later in the year. However, it appears that Samsung gave US press a somewhat different steer, telling them that the assistant would be available at launch on April 21st. News of the later US launch is now filtering out through US reps too, and will be received as bad news by those who pre-ordered the phone (apparently in large numbers) ahead of reviews and the release of this news. Given that Bixby is, at least on paper, one of the headline features, at least some of those early buyers will be disappointed, though the screen is another big selling point and should perform as advertised, with the caveats I mentioned in my first comment on the S8 and in the podcast episode I did on the Samsung announcements. Releasing Bixby late is better than shipping a buggy version that isn’t ready, but the delay had better not be too long, nor should the version that does eventually ship be too unpolished. Both are risks at this point.
This shouldn’t be a surprise to anybody – Google actually showed off what appeared to be multi-user support in its demo of Google Home at I/O last year, but the finished product didn’t support it when it was released in the fall. A little while back, rumors began surfacing that Google (and Amazon too) would add the feature soon, but that hasn’t materialized yet. The screenshot shared here suggests it’s now imminent. This is important, because assistants have to be personal if they’re to be really useful, and most people live in homes with other people, whether family members or roommates, so things like calendar, email and to-do lists aren’t much use unless the device can recognize individual users and serve each of them different results. That’s not easy to do, especially because these speakers tend to process voices before recognition takes place, which actually makes it harder to recognize the speaker, but the companies were bound to figure it out eventually. If Google does end up launching this before Amazon, it will be yet another performance advantage, even though its distribution disadvantage remains enormous.
via The Next Web
This is a fun little comparison done by a user in the UK of the ability of the two major home smart speaker units to answer 54 questions. Google Home wins in the end, with 32.5 answered correctly to 19.5 for Echo/Alexa. The questions were a mix of simple and challenging, and the user asked quite a few UK-specific questions, taking advantage of the fact that both devices recently launched there. But it’s a great illustration both of how Google has the existing skillset to do really well in this category, and of the fact that all these assistants still have some way to go to answer all the questions users might reasonably expect them to handle.
Google Develops Federated Machine Learning Method Which Keeps Personal Data on Devices (Apr 6, 2017)
This is an interesting new development from Google, which says it has created a machine learning method combining cloud and local elements in a way that keeps personal data on devices but feeds what’s learned from training back to the cloud, so that many devices operating independently can collectively improve the models they’re all working on. This would be better for user privacy as well as for efficiency and speed, and importantly Google is already testing the approach on a commercial product, its Gboard Android keyboard. It’s unusual to see Google focusing on a device-level approach to machine learning, as it has typically majored on cloud-based approaches, whereas it’s been Apple which has been more focused on device-based techniques. Interestingly, some have suggested that Apple’s approach limits its effectiveness in AI and machine learning, whereas this new technique from Google suggests a sort of best of both worlds is possible. That’s not to say Apple will adopt the same approach, and indeed it has favored differential privacy as a solution for using data from individual devices without attributing it to specific users. But this is a counterpoint both to the usual narrative about Google sacrificing privacy to data gathering and AI capabilities and to the narrative about device-based AI approaches being inherently inferior.
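Google calls the core loop Federated Averaging: each device trains on its own data and ships only a model update to the server, which averages the updates into the shared model. A toy sketch of that loop in Python – a linear model and made-up data, purely for illustration; Google’s actual system, training keyboard suggestion models across many phones, is far more elaborate:

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=20):
    """One device: run gradient descent for linear regression on its
    private data, returning only the weight delta, never the data."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w - weights

def federated_round(weights, devices):
    """Server: collect each device's update, average, and apply."""
    updates = [local_update(weights, X, y) for X, y in devices]
    return weights + np.mean(updates, axis=0)

# Two devices each hold private samples of the same function y = 2x.
rng = np.random.default_rng(0)
devices = []
for _ in range(2):
    X = rng.uniform(-1, 1, size=(50, 1))
    devices.append((X, X @ np.array([2.0])))

w = np.zeros(1)
for _ in range(30):
    w = federated_round(w, devices)
# w converges toward 2.0 even though the server never sees (X, y).
```

The key property is in `local_update`: the return value is a weight delta, so the raw training data never leaves the device, and the server only ever sees aggregated model changes.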
This change was reported by The Information a while back but has now been confirmed by Facebook: the M hybrid human-virtual assistant Facebook was testing last year has now been released in a much reduced and entirely AI-based role inside Messenger. That makes a ton of sense, and it sounds like Facebook has been testing the feature for a while with positive user response. The only worry I’d have is that it could be seen as invasive or intrusive, both in the sense of inserting itself into users’ conversations uninvited and in the sense that it will appear to be “listening” to those conversations for key words and phrases that trigger the intervention. Privacy isn’t a one-size-fits-all thing – everyone has their own preferences and tolerances for the tradeoffs online services can sometimes entail – so I’d expect to see a range of reactions from delight to outrage.
via The Verge
Apple’s Clips app offers promising & fun editing features, but confusing & difficult UI – 9to5Mac (Apr 6, 2017)
Apple announced the Clips app a couple of weeks ago along with the new iPad and other announcements, and when I commented on that announcement I said the proof would be in the pudding with regard to how well the app performed. We now have reviews (and the app itself is out now too), and it’s a bit of a mixed bag. The app is clever, with some nifty new features, but it may suffer from the same problem as some other Cook-era Apple product releases, in that it seems to try to do too much, and can therefore be confusing to use. Here, as with the Apple Watch, Apple Music, and other recent efforts, Apple might have been better served by starting simpler and adding functionality over time. The real test will be whether we start seeing Clips-generated videos showing up in a big way on Instagram, Snapchat, and Facebook, because since this app lacks its own social features the output needs to be shared elsewhere. I still suspect, as I said in my first comment, that this is a better fit for the older Facebook generations than the Snapchat and Instagram generation, but we’ll see.
It’s time to roll out that old Alan Kay maxim again: “people who are really serious about software should make their own hardware”. Google started working on its own machine learning chip, which it calls a Tensor Processing Unit or TPU, a few years back, and has now shared some performance characteristics, suggesting it’s faster and more efficient than CPUs and GPUs on the market today for machine learning tasks. While Nvidia and others have done very well out of selling GPU lines originally designed for computer graphics to companies doing machine learning work, Google is doing impressive work here too, and open-sourcing the software framework it uses for machine learning. As I’ve said before, it’s extremely hard to definitively answer the question of who’s ahead in AI and machine learning, but Google consistently churns out evidence that it’s moving fast and doing very interesting things in the space.
Microsoft launches Sprinkles, a silly camera app powered by machine learning – TechCrunch (Apr 4, 2017)
As I mentioned recently in the context of Microsoft’s Indian AI chatbot, the company appears to be in an experimental mood as regards AI, trying lots of things in lots of separate spaces, without pushing all that hard in any particular direction. There’s nothing wrong with experimentation, but there is a worry that Microsoft both spreads itself a little thin and risks diluting its brand, which has become more focused of late around productivity. There’s an argument to be made that this Sprinkles app fits its other, newer focus on creativity, but it’s probably a bit of a stretch given the minimal ties into any of its other offerings. On the consumer side, Microsoft’s biggest challenge continues to be not just producing compelling offerings but finding ways to monetize them.
Almost exactly two months ago, I wrote in my Techpinions column that Facebook’s next big opportunity was finally stepping beyond the idea of showing users only content shared by their friends, and using AI and machine learning to show them other content similar to what they’d previously engaged with. Doing this, I said, would dramatically expand the amount of interesting content that could be shown to users, thereby keeping them on the service for longer, and giving Facebook more time and places to show ads. And as I wrote almost exactly a year ago, this is just another consequence of Facebook becoming less of a social network and more of a content hub. Today, we’re seeing Facebook not only roll out a video tab (and a video app for TVs) with suggested videos, but also test a dedicated tab for recommended content of all kinds in its apps. This is yet another extension of Facebook’s increasing absorption of activity from across users’ lives into its various apps in an attempt to capture more of users’ time and advertisers’ dollars, and I suspect it’ll work pretty well if it’s managed right. Of course, Facebook has demonstrated several times lately that it has somewhat lost its touch in that department, so it will need to proceed carefully here to avoid alienating users.
Apple GPU Supplier Imagination Tech Says Apple Plans to Build its Own GPU in 1-2 Years (Apr 3, 2017)
This already feels likely to be one of the biggest news items of the week (incidentally, you can now use the Like button below to vote for this post if you agree – the posts that get the most votes are more likely to be included in my News Roundup Podcast at the end of the week). There have been ongoing reports that Apple would like to build more of its own in-house technology, and GPUs have seemed a likely candidate given that Apple was for a while said to be mulling an acquisition of the company, and has been bringing Imagination Tech employees on board since the deal didn’t go ahead. The GPU obviously has a number of existing applications, but GPU technology has increasingly been used for AI and machine learning, so that’s an obvious future direction, along with Apple’s reported investment in AR. Apple’s ownership of its A-series chips (and increasingly other chips like its M and W series) is a key source of competitive advantage, and the deeper it gets into other chip categories, the more it’s likely to extend that advantage. This is, of course, also a unique example of Apple making a direct statement about a future strategy (albeit via a third party): as Apple is IMG’s largest customer, Imagination had to disclose the guidance from Apple because it’s so material to its future prospects – the company’s share price has dropped 62% as of when I’m writing this.
The Samsung Galaxy S8 voice assistant Bixby can’t recognise British accents – Business Insider (Mar 30, 2017)
This is a great example of something I wrote about on Techpinions this week, which is that here in the US we often assume technologies available to us are ubiquitous globally, but that’s actually rarely the case. In this case, it’s the Bixby assistant / interface that ships with the Samsung Galaxy S8 which not only won’t work in languages other than English and Korean but won’t offer voice services at all in the UK, where of course accents are different. (Another tidbit in this piece is that it won’t actually work in US English until May). Building voice interfaces is tough to begin with, but localizing them for different accents and languages is another massive layer of work, often made harder by the fact that voice recognition technologies are trained on single languages like US English.
via Business Insider
Facebook will launch group chatbots at F8 – TechCrunch (Mar 29, 2017)
This is yet another sign that Facebook feels its initial bot strategy from last year isn’t panning out (something I predicted at the time) and that it needs to try alternative approaches. It’s iterated fairly rapidly since then and added some functions to make interacting with bots easier, and it now sounds like it’s trying a different tack, allowing developers to integrate bots into group conversations. But those bots won’t be interactive AI-type creatures; instead they’ll provide updates on events or processes, such as sporting matches or food orders. Like earlier pivots, this seems more modest in its ambitions but also more likely to be successful. But Facebook’s direction here stands in marked contrast to Microsoft’s, which continues to work on AI-based chatbots.
Microsoft launches Ruuh, yet another AI chatbot – ZDNet (Mar 29, 2017)
It’s fascinating to watch Microsoft continue to experiment with AI chatbots after its first effort, Tay, went so badly wrong. But the company’s response to that embarrassment is a sign of the culture changes that have happened at Microsoft over the last few years, as this piece from USA Today a while back points out. Microsoft isn’t afraid of failing, picking itself up, and trying again, and that’s admirable in an area as competitively intense as AI. It’s also interesting to watch these chatbots be launched into markets outside the US with other languages and/or accents (its other recent effort in this space is based in China). There’s a long way to go until these chatbots become really useful, but Microsoft seems determined to keep trying until it gets it right, while another early proponent, Facebook, seems to be changing its strategy lately.
Google’s AI Explosion in One Chart – MIT Technology Review (Mar 27, 2017)
One of the big problems with evaluating which company is ahead or behind in a field like AI is that there are few external signals – companies work on a variety of AI projects behind closed doors in their R&D departments, and many of them only surface when they’re built into products and services they bring to market. Some have suggested using patents as a way to measure leadership, and this article cites publication in scientific journals as another. Certainly, Google’s publishing is a sign that there’s lots of work going on, but it also reflects the (deliberately) quasi-academic culture at DeepMind, its big AI acquisition, while Apple is also slowly moving in this direction with regard to AI specifically. Neither patent filings nor academic papers, however, have a direct connection to using AI to provide better products and services, and that remains very difficult to measure.
Treasury Secretary Steve Mnuchin said in an interview that he felt AI taking Americans’ jobs was 50-100 years away, and wasn’t a concern in the present. Predictably, a whole raft of tech folk who work on AI and are very much aware of jobs being lost to it today reacted rather poorly to that statement. At best, this feels like yet another government official who doesn’t have a good grasp on technology, something that’s been a worry with the current administration since before it took office. But at worst, it means the government is far less likely to take any meaningful action on protecting American jobs that might be lost to AI, or on retraining workers so they can find new ones if their old ones go away. Whether you believe either of those things is the government’s job is largely a matter of your philosophy on the proper role of government, but at the very least you’d want the government to have a realistic sense of what kind of impact AI will have on jobs and when, in order to make an informed decision.
Intel Consolidates its AI Teams (Mar 23, 2017)
Intel is announcing that it’s taking its various AI teams and consolidating them under Naveen Rao, who ran the AI company Nervana which Intel acquired last year. This feels like a sensible move from a company which has dabbled in AI in various parts of the organization but hasn’t become known as an AI powerhouse. Where things get slightly less credible is when Intel talks in the announcement about rallying the industry around a set of standards for AI as it has with past computational trends. Whereas Intel was a major player in personal computing, one of the examples it cites, it’s nowhere near the same position of influence with regard to AI, so this feels like hubris rather than realism about Intel’s role. Intel also talks, though, about bringing AI to more people, which sounds a lot like the “democratization of AI” message we’ve been hearing from Microsoft lately, and which others including Google have also started echoing. This feels like it’s going to become an increasingly important theme in AI: less about individual companies owning capability and more about packaging that capability up and making it available to anyone who wants to use it.