Nelly is the name I gave my Tasker voice assistant back when I made it last year. She’s made cameos in other posts since then, often to make fun of Siri or other Apple products. I’ve also written extensively about how Siri is a lowest-common-denominator type of voice assistant, whereas something like Nelly is infinitely customizable to fit the user. Since it’s now been exactly one year since Nelly was created, and since I’ve seen many people attempting to make their own voice assistant this way, I thought I would do an update on her status.

As frequent readers of the site know, I’ve written quite a lot about AutoVoice recently (I made a dedicated section for it in the Tasker portal). Back when I made Nelly, Tasker’s built-in Get Voice action was the only way to make such a voice assistant, but the introduction of AutoVoice has both made it easier to make one and given us more tools to make it more powerful. While I still have the profiles and tasks for the old Nelly, they’re currently not being used, and I’ve essentially started from scratch with an AutoVoice-based assistant.

After a bit of trial and error, I settled on a couple of ways to trigger this assistant. There’s still the invisible home screen shortcut that I started out with, but I mostly use two other ways of triggering it these days. The first is a simple unlock ring shortcut on my lock screen, allowing me to quickly access the voice assistant without going any further than that. The second is a gesture that’s available from anywhere in the OS, done through GMD GestureControl, after I finally figured out the logic (or lack thereof) of some of the settings in that app. Both of those trigger a task that only contains an AutoVoice Recognize action, while the invisible home screen shortcut also throws in a Portal voice file for fun.

As for functionality, I’ve stripped a lot of it. The original Nelly was always more of a toy than anything else; while capable of some useful things, I found that I kept adding more and more novelty features, and started using less and less of the actually useful ones. In the end I had Nelly on my home screen, but didn’t use her for anything. That’s why I also wasn’t sure I would have any use for AutoVoice at first, but when I found out how different (in a good way) it is from Get Voice, I started adding more and more actually useful things.

Right now, the most used feature for me is home automation control. I’ve given Nelly some fairly advanced features in that regard, including the ability to turn on the light for a specified amount of time, or set it to a specified brightness level. Voice control has always been most useful for things that aren’t just easier to do with a button, and it’s easier to just tell it to set the lights to a certain level than to find the slider and set it to that level myself. The same goes for the timer functionality I’ve added, which allows me to set timers based on any combination of minutes and seconds, which is much, much faster than doing it through the timer app.
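
To give a rough idea of what the timer feature has to do with a recognized phrase, here is a minimal Python sketch. It is not how Nelly is actually built (she lives in Tasker and AutoVoice); the function name `parse_timer_seconds` is made up for illustration, and it assumes the recognizer hands back numbers as digits rather than spelled-out words.

```python
import re

# Hypothetical lookup: how many seconds each spoken unit is worth.
UNIT_SECONDS = {"minute": 60, "second": 1}

# Matches a number followed by a unit, e.g. "5 minutes" or "30 second".
DURATION_PATTERN = re.compile(r"(\d+)\s*(minute|second)s?")


def parse_timer_seconds(phrase):
    """Return the total duration in seconds, or None if no duration is found.

    Assumes numbers arrive as digits ("5"), not as words ("five").
    """
    matches = DURATION_PATTERN.findall(phrase.lower())
    if not matches:
        return None
    return sum(int(amount) * UNIT_SECONDS[unit] for amount, unit in matches)


if __name__ == "__main__":
    print(parse_timer_seconds("set a timer for 5 minutes and 30 seconds"))
```

With this sketch, a phrase containing both minutes and seconds is summed into one duration, and a phrase with no duration at all returns `None` so the caller can ignore it.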

Below are some videos from other recent articles, showing various features of the current Nelly.

I have also been playing with some other features, like being able to start music playback of specific tracks using voice control (as opposed to just starting music playback). This is something that Siri does well, but it’s hard to replicate in Tasker without a music player app that’s designed for it. In fact, several things I’ve tried to do with voice control have run into situations where you’d think there would be something that would do it for you out there, but there isn’t, because making your own voice assistant really isn’t that common.

Bottom line, my focus this time around is on making fewer, but more useful and powerful, features. Voice control is a long way from replacing traditional control methods, and I think the best way to use it is to use it for things it’s good at, and not try to forcibly use it for anything else.

The biggest difficulties I’ve had come from features that I’m hoping will be added to AutoVoice, but aren’t currently in there. One example is how it reacts to multiple suggested recognized phrases. When you tell it something, it receives multiple possible interpretations of what it heard, ranked according to how likely it is that you said that specific phrase. Profiles can, however, trigger on any one of those suggestions, not just the top one, and there’s currently no way to limit this. That’s a problem, since the top suggestion might be exactly what you said, while some obscure interpretation further down the list ends up triggering a completely different profile.

This happened to me once when I left the house, and told Nelly “goodbye” to initiate shutdown mode. Imagine my surprise when the confirmation for that was followed by a confirmation for nap mode being activated, and I didn’t understand what was going on at first. Turns out that one of the suggested interpretations of what I said had been “good night”, so while it triggered the profile it was supposed to using the correct interpretation “goodbye”, it also triggered a second one that it shouldn’t have.
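
The “goodbye” mishap can be modeled in a few lines of Python. This is only a sketch of the behavior described above, not AutoVoice’s actual matching code: the `PROFILES` table, the `triggered_profiles` function, and the candidate list are all made up, and the `top_only` switch stands in for the limiting option that doesn’t exist yet.

```python
# Hypothetical keyword -> profile table, loosely based on the example above.
PROFILES = {
    "goodbye": "shutdown mode",
    "good night": "nap mode",
}


def triggered_profiles(candidates, top_only=False):
    """Return the profiles that fire for a ranked list of interpretations.

    candidates is ordered from most to least likely. With top_only=False,
    every interpretation is allowed to trigger a profile (the behavior
    described in the post). With top_only=True, only the first one counts.
    """
    considered = candidates[:1] if top_only else candidates
    triggered = []
    for phrase in considered:
        for keyword, profile in PROFILES.items():
            if keyword in phrase.lower() and profile not in triggered:
                triggered.append(profile)
    return triggered


if __name__ == "__main__":
    heard = ["goodbye", "good bye", "good night"]  # invented ranking
    print(triggered_profiles(heard))
    print(triggered_profiles(heard, top_only=True))
```

With the invented ranking, the unrestricted call fires both shutdown mode and nap mode, while the `top_only` call fires only shutdown mode, which is what I actually wanted.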

Another missing feature is the ability to act on a situation where no other profiles have triggered. You can currently create a profile that handles scenarios where it doesn’t understand what you’re saying, but that doesn’t cover situations where it thinks it understands, but didn’t interpret it right. Such a feature would be great for being able to re-trigger the recognition box if no profiles trigger, as that likely means it misheard you.
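
The “nothing triggered” fallback I’m wishing for would behave roughly like the Python sketch below. Again, this is an illustration under assumptions, not something AutoVoice offers: `handle_recognition`, the `recognize` callback, and the retry limit are all hypothetical names and choices.

```python
def handle_recognition(recognize, profiles, max_attempts=3):
    """Ask for input again until some profile fires or attempts run out.

    recognize is a hypothetical callback that returns a list of
    interpretations for one spoken phrase. profiles maps a keyword to a
    profile name. Returns (fired_profiles, attempts_used).
    """
    for attempt in range(1, max_attempts + 1):
        candidates = recognize()
        fired = [
            name
            for keyword, name in profiles.items()
            if any(keyword in candidate.lower() for candidate in candidates)
        ]
        if fired:
            return fired, attempt
        # Nothing matched: most likely a mishearing, so loop and listen again.
    return [], max_attempts


if __name__ == "__main__":
    # Simulated recognizer: mishears the first time, gets it right the second.
    results = iter([["sat a time"], ["set a timer for 5 minutes"]])
    print(handle_recognition(lambda: next(results), {"timer": "timer profile"}))
```

The retry limit is there so a noisy room doesn’t leave the recognition box popping up forever.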

A final issue is that voice recognition is still fairly stupid. It constantly makes mistakes that no human in history would make, and that can be tiresome at times. For instance, I keep having issues where saying something like “set a timer for 5 minutes” is being interpreted as “set a timer 45 minutes,” which is a mistake that only a computer could ever make. From what I’ve seen, it’s also incapable of guessing words it doesn’t know or of learning new ones. That can be problematic, to say the least, for anyone who’s bilingual, as there’s no way to set it to understand two languages the way a human would.

At the end of the day, I still work on Nelly from time to time, but she’s changed a lot in the year she’s been alive. I still think that custom voice assistants are a great way to use Tasker, but it does require a certain level of proficiency to be able to do it well. Making it dynamic is especially difficult, as it’s always easier to have it react to certain key phrases in specific places than to have it extrapolate information from more human ways of saying things.