Windows Phone 7 Speech

Windows Phone 7 has some really nice voice control and speech recognition features, such as the ability to read out text messages and to reply to or write new messages to people in your contacts, to name just one. I’ve used the text messaging speech control on a couple of occasions in the car, but only really by fluke, because I had my phone connected to the car for playing music at the time.

You can read the official Microsoft page on speech control at http://www.microsoft.com/windowsphone/en-us/howto/wp7/get-started-speech.aspx.

I’ve never really been a big speech or voice control user, let alone a fan. I don’t spend a lot of time travelling in the car and my phone is typically with me, on my person, so I use my hands. After all, that big touch screen on the HTC HD7 is made for them.

As a Christmas gift, I bought my wife and me a Scala Rider Q2 Multiset Pro (http://www.cardosystems.com/scala-rider/scala-rider-q2-multiset), which is a helmet-mounted, voice-activated rider-to-pillion (and bike-to-bike) communication system. It also doubles as an FM radio and a Bluetooth headset, so I can connect my phone and satnav device to it and get hands-free calls, music and satnav directions through the helmet whilst riding.

I fitted my Scala Rider unit to my helmet yesterday and thought I would have a play with some of the speech controls of my Windows Phone as I would be using some of them now via the helmet.

The call commands are pretty intuitive and what you would expect: Call is the opening command, followed by the name of the person and, optionally, which number to call them on. For example, Call Richard Green Work would dial my work number. If you omit Work, Home or Mobile and the contact has multiple numbers, the phone will prompt you for which number to dial.

The text command is pretty simple too: Text is the opening command, followed by the name of the person. You will then be prompted to start speaking your message. Once you’re done, the phone will read back the transcript and, if you’re happy with it, you can say Send, or you can say Try Again to start over if it misheard you. On the receiving side, when a text arrives, the phone will announce that you have a new message and the name of the contact it is from, and you are given the option to have it read out loud and then reply if you wish.

The application commands are, again, simple and intuitive, and herein lies the problem. Say Open followed by the name of an application or feature on the phone and it will open it. For example, Open Zune will open the Music and Videos Hub (renamed from the Zune Hub in the Mango update). You can say Open Music and Videos too, but why would you when you can just say Zune? This works for any application, including third-party ones, so I can say Open Sky News or Open Endomondo and the app will promptly open. However, this is where it ends.

Once the Music and Videos Hub is open, there is no way to start playing music, play a particular artist, a playlist or anything.

I love my Windows Phone, as anyone remotely close to me will tell you. I love the style of it, the ease of use and the way it gives me the data I want quickly, in an easy-to-read form, with those big blocks of bold colour. Most of all, I have a passion for all things Microsoft, but this is one area that flops.

What is the purpose of being able to open an application on the phone via speech if you then can’t control the application beyond that? I know that Microsoft can’t be expected to implement deep speech control for third-party applications, and probably couldn’t even if they wanted to, because Microsoft have no understanding of the function and purpose of those applications or the code used to make them work (beyond the actual language used). But speech control for deep-rooted parts of the operating system such as music, messaging and phone should be there out of the box.

I’m ignoring the new Siri functionality on the iPhone 4S here, as it is different to what I’m covering. Using just the core platform controls, an iPhone user can tell the phone to shuffle all music or to play a particular album, artist or playlist, which is what you need. Going back to my original statement, I’ve never been a big speech user, so this one-upmanship from the iPhone didn’t faze me. However, with my shift in needs, it does.

Now, in my circumstances, the phone is safely inside my backpack while I’m riding, so touching the phone to operate it isn’t even remotely viable. If I wanted to listen to music on the road, I would have to start the music playing before I get all my gloves and other gear on so that it’s already rolling before I’m rolling. If I want to stop the music for any reason, I need to take off, at a minimum, my gloves and backpack so that I can get into the bag to stop it. While I’m on the subject of music on Windows Phone, why is the music volume linked to the system volume? There should be separate controls for the music, system and ringtone volumes. However, that’s a separate rant.

I still prefer my Windows Phone to any iPhone offering, because it does what I want, how I want it. This is the one exception, and on this occasion I do envy iPhone owners. I’ve read multiple rumours that the Windows Phone 7 Tango update, expected in 2012, will bring speech more in line with that seen in Siri. For me, right now, it can’t come soon enough.