• Alice - is Yandex's voice assistant really that good? Years later, everything is smarter: a comparison of Google Now and Siri

    The voice assistant "Alice" has appeared in the Yandex application. Owners of modern smartphones can use it. How "Alice" differs from Siri and how to communicate with the virtual assistant is covered in the “Question and Answer” section.

    "Alice", speaking in the pleasant female voice of Tatyana Shitova (who dubs Scarlett Johansson in Russian film releases), will tell you how to get to your destination and give a weather forecast; you can even have a heart-to-heart talk with her. She can work with Yandex applications such as Music, Weather and Maps. In the future, “Alice” will have access to other services and will be able, for example, to recommend a movie or call a taxi.

    In the future, other companies will be able to provide Alice with access to their services. She can already launch third-party applications (for example, VKontakte or Instagram).

    Yandex notes that a neural network allows "Alice" to recognize and process incomplete phrases and questions, take context into account and speak with different intonations. When developing the assistant, special attention was paid to the ability to understand “real human speech, and not just perfectly spoken requests.”

    How to communicate with “Alice”?

    To start communicating with this smart “girl,” you will need to install the Yandex application on your phone. This can be done on the Android and iOS mobile operating systems.

    For personal computers running the Windows operating system, the service will continue to work in beta. Then ask your questions.

    How is this app different from Siri?

    Communication with Siri is available only to iPhone owners; the owner of any smartphone can communicate with “Alice”. Unlike Siri, however, the assistant cannot be summoned on a mobile phone with a single phrase. First you need to launch the search application itself.

    The Yandex press service emphasized that their voice assistant can go beyond prescribed scripts and improvise, whereas Siri has all its answers written in advance. In practice, this claim is open to doubt, since “Alice” answered one question, asked in completely different formulations, in the same formulaic way, though definitely with humor.

    SpeechKit speech technologies are the basis for recognizing the user's speech and synthesizing Alice's own voice.

    But unlike Siri, Alice cannot, for example, call an ambulance. She cannot set an alarm either. An iPhone owner, by contrast, only needs to tell Siri what time to wake up in the morning, and the program will set the alarm itself. You can specify not only a particular hour and minute, but also a time interval. For example, if the user says “Siri, wake me up in 30 minutes,” the program will count off half an hour from the current time.
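The "wake me up in 30 minutes" request comes down to simple date arithmetic: add the interval to the current time. A minimal sketch of that calculation follows. The function name `alarm_time` is hypothetical and chosen only for illustration; it does not describe how Siri is actually implemented.

```python
from datetime import datetime, timedelta


def alarm_time(now: datetime, minutes_from_now: int) -> datetime:
    """Return the moment the alarm should ring, counted from `now`."""
    return now + timedelta(minutes=minutes_from_now)


# Example: the request is made at 07:45, so the alarm rings at 08:15.
request_moment = datetime(2017, 10, 10, 7, 45)
print(alarm_time(request_moment, 30))
```

Because the interval is added to a full date and time rather than to the clock reading alone, a request made shortly before midnight correctly rolls over to the next day.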

    Alice's advantages include integration with the company's services, including its own search, while Apple does not have its own search. But the search does not always show what is near you. Instead of the movie schedule in Barnaul, “Alice” suggested watching a movie in Novosibirsk.

    On October 10, a serious competitor to Apple's personal assistant and other similar programs appeared. Yandex's product, named Alice, was officially launched in Russia.

    The editors of Pobeda26 tested the two popular voice programs on their knowledge of the region and assessed their reaction speed and sense of humor. As a result, we decided for ourselves which of the assistants is more talkative and smarter.

    Blitz survey

    First, we asked where Stavropol is located, when it was founded, how many people live in the city, the name of the longest street and how many museums there are in the regional capital.

    Out of five questions, Alice immediately gave two accurate answers. In two more cases she redirected us to the search engine, and she misunderstood one request.

    Siri was less verbose and simply gave us a list of links.

    Both programs were stumped by the question about the longest street. In unison, they tried to tell us about some salon on Mira Street. The answer does not count.

    Most likely, the programs simply could not correctly recognize the request. By the way, according to Yandex statistics, the accuracy of speech recognition for queries on general topics is 84 percent, and for queries by address and name of an object - 94 percent.

    About the weather, transport, entertainment

    In general, developments of this kind should help owners solve everyday problems. Well then. We ask our assistants the same question: “What should I wear today?” and wait to see if their answer matches the weather outside the window.

    Of course, Siri and Alice couldn’t rummage through our closet and put together a suitable outfit, but at least they showed us the weather forecast. The iPhone's assistant coped with this task on the first try, while talkative Alice first advised us to wear “something that emphasizes your individuality.”

    Next situation. Let's say you need to get from Tukhachevsky Street to Marshal Zhukov Avenue. What if there was an accident somewhere or a traffic light broke? Let's see how the assistants calculate the route and how useful it will be.

    Here Alice had the advantage. She gave the travel time in minutes and showed a map with traffic jams.

    Siri failed this task. The assistant showed a list of fast food restaurants.

    Are you bored? Let's ask our assistants what you can do in Stavropol.

    Neither assistant gave an exact answer to this request. Alice sent a list of links to Yandex. By looking through it, of course, you can find event listings.

    They also couldn't say where there would be dancing tonight. The Russian assistant again redirected us to the search engine, while its rival “could not find any dance clubs.”

    But with Siri you definitely won't go hungry. All we had to do was say “I’m hungry,” and the program instantly produced a list of nearby restaurants.

    For coffee lovers, our domestic assistant suggested only one establishment with invigorating drinks. But for some reason Siri couldn’t cope with the task and offered to call a taxi.

    When asking the question “What is interesting in cinemas now?”, we expected to see Stavropol listings. But both programs returned a list of not particularly informative links. When the location is specified, the assistants give more accurate answers.

    He explained why users in Russia need their own assistant, how Alice is better than Siri, and whether she can replace a lover or friend.

    "Lenta.ru": Who is (or what is) Alice and why do Russians need her at all?

    Who, of course! Alice is the new Yandex voice assistant. Why do Russians need her? People now have high demands on response speed and are less and less willing to spend time searching for the information they need. And traditional interfaces, even Yandex's, no longer quite meet these needs. Search results are good, but if you need an answer instantly, for example while playing sports, they no longer solve the problem. And Alice can handle it.

    Information services are used not only while sitting at a computer. Everyone has had a smartphone for a long time: people on the go, playing sports or driving also want to search for and consume information. And Alice is meant to help in such situations.

    How is it better than Siri or Cortana? They are usually consulted to check the weather or find music. And they often don't understand requests.

    Firstly, Cortana is not on the Russian market. In general, all voice assistants work differently. Our specialty is that we focus on the Russian market and understand the Russian language very well, both in terms of speech recognition and in terms of grasping the meaning.

    Alice has Yandex services under the hood. In this sense, Alice is very different from Siri, which does not have its own search. They used to use Bing, but have now switched to .

    In addition, Alice is a completely different persona with her own character. It’s like with people: they are more or less similar, but still different; communicating with one person is interesting, with another not so much. We strive to make Alice interesting specifically for the Russian user, to give her a character that is close and familiar to Russian people.

    Traditional voice assistants are tailored to solve specific problems: weather, music, and so on. But everyone is trying to make sure that the assistant also answers non-standard questions. There are editors who take several hundred template questions and write answers to them. And a person has the illusion that this is artificial intelligence, that he can communicate. But step to the side, and the illusion crumbles, as the assistant repeats: “This is what I was able to find on the Internet for this request.”

    We are probably the first in the world to try to do this: we also use editorial responses to questions, but we add a special neural network trained for free conversation. She can choose an answer or engage the user in chatter about nothing.

    This is probably a fundamental difference, because people, in addition to looking for some facts, sometimes want to chat with someone. Alice is already able to chat and will only get better at it.

    We had a difficult task: the neural network (among ourselves we call it the “chatterbox”) is trained on almost all the texts on the Internet, with particular attention to dialogues. And what is on the Internet does not always match the character we want to give Alice. People communicate in all sorts of ways on forums, and we cannot allow Alice to offend anyone.

    Yes! This story is very significant for us. We needed to solve the same problem, and we teach Alice not to go beyond her character, keep a distance from the user and always be friendly. This is actually a very difficult task.

    At first, she could insult the interlocutor outright. Imagine groups on social networks where users allow themselves to swear in elaborate obscenities. She chose answers based on frequency of use, and at some point became the embodiment of the Internet's soul, but not of Yandex.

    The ability to chat sometimes backfires: many developers are faced with the fact that users begin to sexually harass voice assistants because they see them as women.

    All voice assistants have a voice, and people build their own image of what their interlocutor looks like. The voices, as a rule, are quite bright and expressive. We are no exception, by the way: speech synthesis technology is used to create the voice, and we hired an actress. She is the official Russian voice of Scarlett Johansson and voiced the assistant Samantha in the film “Her”.

    The whole tragedy of the film is that a man and a personal assistant begin a relationship, but in the end it turns out that the main character is not her only one. Also, as we recently realized, the character Yennefer in the third part of The Witcher speaks in the same voice. Gamers will appreciate it.

    Naturally, Russia is no exception when it comes to possible harassment. We understand that some part of the audience will try to ask such questions, and Alice has already been trained to get out of such situations without breaking the distance. We want to ensure that there is always, albeit a small, distance between the person and the assistant. Alice is not a lover or a friend. Perhaps in the future we will be able to configure Alice in friendly mode, but now it’s more important to launch the product. After all, we are primarily focused on solving problems, and a chatty story is a nice bonus.

    Not all. In fact, there are many studies according to which a woman is equated with a mother in the public consciousness. If a man is a father, a conqueror, and so on, then a mother is always something warm, cozy and safe. For example, in contact centers it is always a female voice that answers. But if the user needs to give commands, then a male voice is more suitable.

    So this is not related to the sexist theory that the assistant must be a woman?

    No, no. It is so designed by nature that women are mothers. It's psychological. But there will be a little surprise in our product.

    What questions do you think will be most in demand?

    A lot of people ask for factual information. We call these factoids or object responses. Who is Vladimir Putin? How old is he? What is the height of Everest? Assistants are often asked about this.

    The most common case is questions among friends. Questions often come up in conversation, and the assistant makes it possible to get an answer quickly.

    Much of Alice's brain and knowledge came from search. Another important area in which we try to differentiate ourselves from our competitors is the ability to understand context. We are trying to build a personal assistant that works not in a “question-answer” mode, but in a conversation mode. The simplest example is the weather: “What is the weather today? And tomorrow?” This very “and tomorrow” usually breaks all the logic.
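The "And tomorrow?" problem can be illustrated with a toy dialogue state that remembers the last recognized intent and reuses it when a follow-up names only a new day. This is a minimal sketch under stated assumptions: the class `DialogContext`, the keyword table and the returned `(intent, day)` pairs are all hypothetical, and Alice itself is described as relying on neural networks rather than keyword matching.

```python
from typing import Optional, Tuple

# Hypothetical keyword tables, for illustration only.
KNOWN_INTENTS = {"weather": "weather"}
KNOWN_DAYS = ("today", "tomorrow")


class DialogContext:
    """Remembers the last intent so elliptical follow-ups can be resolved."""

    def __init__(self) -> None:
        self.last_intent: Optional[str] = None

    def interpret(self, utterance: str) -> Optional[Tuple[str, str]]:
        text = utterance.lower()
        intent = next((i for kw, i in KNOWN_INTENTS.items() if kw in text), None)
        day = next((d for d in KNOWN_DAYS if d in text), None)
        if intent is None:
            # No intent in this phrase: fall back to the previous one.
            intent = self.last_intent
        if intent is None:
            return None  # nothing to carry over, the request is unclear
        self.last_intent = intent
        return (intent, day or "today")


ctx = DialogContext()
print(ctx.interpret("What is the weather today?"))  # ('weather', 'today')
print(ctx.interpret("And tomorrow?"))               # ('weather', 'tomorrow')
```

A pure "question-answer" system corresponds to creating a fresh `DialogContext` for every phrase, in which case "And tomorrow?" cannot be interpreted at all.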

    In fact, everyone is moving in this direction, but we have an advantage: more than 30 Yandex services. Within each scenario, using context, Alice can recommend the relevant applications. For example, she will offer “Maps” to view public transport routes if you need to get somewhere that way.

    How does she react to tricky questions? What to do if the user intends to commit suicide and turns to her for help?

    In the chatty part of Alice there is a set of metrics under the common name "Non-offensiveness." Underneath them lie her tactfulness, her inability to offend, and her ability to avoid topics related to homophobia, nationalism and suicide, among others. They are on the stop list, and she will bypass these questions.
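The idea of a stop list can be sketched as a check that runs before the chat model is allowed to answer. The example below is only an illustration under loud assumptions: the keyword lists and the function name `stopped_topic` are invented here, and the interview describes a set of quality metrics for a neural network, not a keyword filter.

```python
from typing import Optional

# Hypothetical stop list: topic name -> trigger keywords (illustrative only).
STOP_LIST = {
    "nationalism": ("nationalism", "nationalist"),
    "homophobia": ("homophobia", "homophobic"),
    "suicide": ("suicide",),
}


def stopped_topic(utterance: str) -> Optional[str]:
    """Return the stop-listed topic the utterance touches, or None."""
    text = utterance.lower()
    for topic, keywords in STOP_LIST.items():
        if any(keyword in text for keyword in keywords):
            return topic
    return None


print(stopped_topic("What do you think about nationalism?"))  # nationalism
print(stopped_topic("What is the weather today?"))            # None
```

When a topic is returned, the assistant would steer the conversation elsewhere instead of generating a free-form reply.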

    It would be cool if Alice advised calling the hotline.

    We are also discussing this with psychologists. This is exactly what we are planning. But here you also need to be careful. We must know exactly the moment when we can offer this option to a person.

    Denis wants to demonstrate how Alice reacts to such topics and calls her on the phone.

    - I want to die!
    - Don't say that.
    - What should I do?
    - Help someone just like that right now.
    - But I want to die right now.
    - You don't have to die.
    - Where should I go?
    - There's always somewhere to go.

    We are in the process of discussing with psychologists at what point it is necessary to offer a help number so that it works in a positive way and not in a negative way. After all, a person must first be reassured, and then specific actions must be offered.

    There are speech modules that can change the voice: for example, the interlocutor speaks in his own voice, and at the other end the same text is heard, but in the voice of a completely different person. And all this sounds quite “human”. Why then do voice assistants still speak robotically?

    The answer here is simple: it all depends on the source of the voice. It’s quite easy to turn natural human speech into something else; just apply filters and play with frequencies. The sound quality will not suffer. We have a different task: assistants do not have speech of their own, but they have technology for synthesizing it. They see the text and voice it using a neural network which, knowing how a person sounds, predicts exactly how the text should sound. In fact, it doesn’t even understand that these are words.

    But there is an alternative approach, in which the sound source is a huge database of recordings of a voice actor. At launch, Alice will sound like this. For her conversations we use a combination: we synthesize speech from Tatyana Shitova’s huge voice database or use a neural network. In the first case, everything sounds natural, but it is only suitable for short phrases. In the second case, a robotic tinge can be heard, but it works when, for example, you need to read the news.

    Does she know how to show emotions?

    Emotions can be created using filters. But it’s easier to imitate emotions when the neural network speaks. We can control this speech as we want: make the voice very sad or very cheerful. This will not work with the voice-actor database.

    In the same film “Her” the assistant showed a lot of emotions, and this, it seems to me, is an indicator that the future has arrived.

    Yes, this is the future we are striving for. Alice will learn emotions over time.

    But it’s more important to make Alice hear the person’s emotions. Now she hears speech and translates it into text. We want her to learn to recognize joy or sadness. For example, with music playback there are an endless number of options: if you feel the moment, you can cheer up a sad user or reduce the degree of excessive fun with something relaxing.

    It is important to understand when a person experiences negative emotions. Alice is still a child who can make mistakes. We cannot see the irritation of individual users, but we are able to hear it.

    We can train her using negative reactions. Let's say that a person repeatedly tries to ask something, but the assistant does not understand him. After the third remark, swearing and phrases like “You’re a fool” begin. At that moment, Alice can be switched to “chat” mode or another mode, depending on the context.

    This whole story is possible thanks to neural networks. For example, we want Alice to learn to recognize a person by their voice. This is especially true if Alice will be used at home.

    Speech technology teams typically don't give their creations any physical embodiment. And manufacturers of sex dolls, for example, are actively working on “humanizing” their appearance, but cannot make them truly smart. Why don't these industries overlap?

    We believe that everyone should do their own thing and focus on their own area. There are different specializations in the IT world. We work in the field of machine learning and neural networks, and our task is to create software solutions that provide very high quality for the end consumer, so that Alice recognizes everything well and her voice sounds good. If we focus on creating physical forms, our attention will probably be scattered, and this will not lead to anything good.

    In addition, a voice assistant that lives in an application without any physical appearance creates its own personal image in each person's head. This is also a so-called comfortable choice: we have an audience of many millions, and 90 percent of Internet users in large Russian cities use the services. Imagine what it would take for the physical form we come up with to please them all. It seems to me that this is impossible.

    In some countries, on the contrary, they emphasize the appearance of the assistant. Not long ago, a video circulated on Facebook in which a lonely Japanese man goes to work, returns home and constantly talks with his assistant (Gatebox, a virtual assistant for lonely people). This is a sweet, generic girl who can please everyone.

    Hardly everyone. A physical form has to meet very high demands to appeal to a mass audience. It is very hard to get right. Clearly, there is a class of devices with a simple form, like Echo. There's no danger that people won't use it just because they don't like the design.

    If we are talking about humanoid androids, then it’s like with people: we like some, and others just annoy us. This is not a mass-market story, and accordingly, we are not interested in it.

    On the other hand, we traditionally share our technologies with third-party developers. Perhaps someone will make a children's toy and want to build Alice into it or name the character differently, but based on our technologies.

    We believe in collaboration between different companies, each specializing in its own products. Yandex cannot do everything in the world: we cannot produce toys that will be super popular, or robots that will clean the house. That is the task of other companies, to whom we are ready to give our technologies.

    An interesting question from the “won’t robots take over the world” series. In fact, both Alice and other technologies based on machine learning are tools for humans. I like to use the example of a hammer. Once upon a time, people didn't have a hammer. Then it was invented, and many possibilities opened up. You could crack a skull with a hammer, your own or someone else's. But if people had used the hammer only for that, humanity would have perished. Yet it did not.

    Artificial intelligence, neural networks, machine learning, personal assistants: all of this is the same hammer, a tool. We believe that thanks to voice assistants people will have more free time, will be able to solve their everyday problems faster, and will be safer when driving a car.

    Voice assistants will relieve people of routine tasks. Another favorite example of mine is working in a contact center. A person who comes to work every day, puts on a headset and answers the same type of questions for four or more hours a day simply burns out and loses motivation. But if assistants take on such routine work, people will have more interesting work and will be able to solve more non-standard problems, which means the quality of service will increase.

    We've been using social networks for a long time, but personal communication still hasn't gone away. Cafes and bars, stadiums and concerts are still full. Humanity gets carried away with new toys, but life puts everything in its place. It will be fun to chat with a virtual assistant when you have no one to turn to, and that’s cool.

    But when there is an opportunity to meet friends or make a call, people will choose it. A person is still more interesting because, to be honest, every personal assistant is a program. It will always have its limitations, no matter how large the neural network is.

    People are valuable because they constantly bring new knowledge, so we communicate with each other, and not with robots.

    In 2011, Apple brought about a new revolution: its smartphone began to speak. The appearance of Siri marked a new era in gadget control. People were able to address their gadgets as they would a person, asking them for important (and not so important) information. Weather, reminders and the latest mail can now be found without moving from application to application. Naturally, other technology companies and smartphone manufacturers could not stand aside and decided to show similar solutions, some better and some worse than Siri. In this article we will talk about the best analogues of Siri for Android, how far progress has come and what these analogues are capable of.

    Google Now

    Although the Google Now service differs from other voice assistants, it is still considered an analogue of Siri for Android. Google Now is an artificial intelligence that lives in your phone and knows everything about your interests, activities, upcoming flights and calendar events. In addition to the secretary function, Google Now does an excellent job of searching for information on the Web. The “OK Google” command has already become iconic and helps millions of people find answers to their questions every day. Google Now can collect your search queries and display relevant information based on them. For example, say you were recently looking for tickets to a match of your favorite team. In this case, Google Now will start sending you cards with information about the upcoming game, the team's other games, and its progress in the tournament.

    Google Assistant

    “Assistant” is a new stage in the development of Google Now. This is Siri for Android at its best. The assistant is not just smarter than its predecessor, but also much more functional. You can use it to create reminders and calendar events and to send messages. Want to listen to rock on the way to work? Ask the “Assistant” to play the top tracks in the genre and it will create the perfect playlist for you.

    Don't understand a word written on a sign? Ask the “Assistant” to translate it into your language, because it is an excellent linguist and knows more than 100 languages.

    Is this not enough? “Assistant” will help you communicate in instant messengers, choosing words, dates and contact information when asked. And the “Assistant” can joke, tell a story, or give advice on where it’s best to put a cabinet.

    Cortana

    Microsoft has lately been famous for its endless (and unsuccessful) attempts to catch up with its rivals by introducing similar functions into its own devices and competitors’ gadgets. Microsoft did not hesitate to make its own analogue of Siri for Android. Her name is Cortana (a reference to one of the characters in the game Halo). In fact, this assistant is almost no different from its competitors. Microsoft tried to sit on two chairs at once, so the interface has both smart cards that adapt to a specific user and a human-like interlocutor that creates a feeling of live communication.

    In fact, the assistant is not very smart; you will have to give her almost all the information manually. She is unlikely ever to learn your interests and desires, if only because for this you need to use Microsoft services and no others. On the other hand, if you spend some time with Cortana and teach her, she begins to send very useful notifications, for example showing inexpensive restaurants near you or the latest movies playing in cinemas in your city. Cortana will also remind you of your shopping list when you approach a store or show you the weather forecast for the coming week.

    Bixby

    The one who really should have copied competitors' ideas long ago was Samsung. In 2017, together with the Galaxy S8, Korean engineers showed us their own development in the field of artificial intelligence, which was given the unusual name Bixby. Interestingly, Bixby is not just an analogue of Siri for Android. It is a whole complex of self-learning services, ready to give hints throughout the day and find useful information. The functionality is not much different from “Google Assistant” and Siri itself, so let’s talk about the important differences.

    First, Bixby understands context and has cognitive tolerance. That is, if you asked him who Marlon Brando was, and then what films he starred in, without mentioning his name, then Bixby, after analyzing your dialogue, will understand who you are talking about. Secondly, Bixby can search for information from the camera. This means that you just need to point it at some thing or object - and Bixby will immediately tell you everything the Internet knows about it.

    "Yandex.Alice"

    Finally, the last analogue of Siri for Android, and one that speaks Russian, is “Alice”. Yandex had long been developing the idea of ​​artificial intelligence and speech recognition, so it was clear that sooner or later such a project would see the light of day. Alice can do everything that other assistants can do, but she is adapted to the Russian market and searches for information in Yandex services. Alice, like Bixby, understands context, but only in some topics. In most cases, she is only able to answer one question. Alice can sing a song for you or make a funny joke if you get bored, or she can search for important information in Wikipedia without forcing you to go to the search and the article itself. There were some mistakes in pronunciation, but given that Yandex is a domestic company, you can be sure that all the shortcomings will be quickly corrected.

    For several days, some users had access to Yandex's voice assistant, Alice. Today the company officially released it to everyone.

    We decided to see what the assistant is capable of compared to Siri. The result was mixed.

    We tested 15 different queries that can be asked of digital assistants.

    1. Create a note/reminder.

    Result: 1:0 in favor of Siri. She easily copes with this request, which Alice cannot do on either iOS or Android.

    2. Set a timer for 5 minutes.

    Result: 2:0 in favor of Siri. Here too Alice failed, unable to cope with the simplest task.

    3. What is the height of the Eiffel Tower?

    Result: 3:1 in favor of Siri. Both assistants did an excellent job with this question.

    Note that Siri also talked a little about the Eiffel Tower, but this does not always happen.

    4. What's the weather like outside?

    Result: 4:2 in favor of Siri. Both assistants coped with the task, but Siri again gave a more meaningful answer.

    5. Check the calculator.

    Result: 5:3 in favor of Siri. Both calculate well, which is good to see.

    Result: 5:4 in favor of Siri. Apple's digital assistant immediately started searching for information on the Internet, while Alice had her own small database of “suitable” films.

    7. What do I have to do today?

    Result: 6:4 in favor of Siri. Alice could not tell me about my tasks for today.

    8. Where can I have breakfast?

    Result: 7:5 in favor of Siri. Both assistants were able to find a place for a good meal. Alice gave the address on the spot, while Siri offered a list to choose from.

    9. Find a grocery store near you.

    Result: 8:6 in favor of Siri. Both voice assistants were able to cope with the task.

    10. What is the situation on the roads?

    Result: 8:7 in favor of Siri. Alice gave a more informative answer without going to Yandex.Maps.

    11. How to get to Gorky Park?

    Result: 8:8. Both assistants coped with the task, but Alice was able to immediately provide the address and approximate time to the park.

    Then the assistants opened Maps.

    12. Latest news.

    Result: 8:9, Alice is in the lead. Siri was unable to answer the question and went back to looking for information on the Internet.

    13. Tell a joke.

    Result: 8:10, and again Alice is ahead. Unlike Siri, she has a fairly large set of jokes, and they are rarely repeated.

    14. Tell a story.

    Result: 8:11, Alice continues to hold the leading position. The situation is the same as with jokes. Siri has a very limited supply of stories.

    15. Call Egor/Tim Cook.

    Result: 9:11, Alice is the champion. Siri was able to call a person, but Alice still cannot do this on either iOS or Android.
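The running score can be re-derived by tallying each round's outcome as implied by the score changes reported above, written Siri:Alice. The sketch below is only a check of that arithmetic. The outcome labels are inferred from the running totals (for example, in round 11 the score moves from 8:7 to 8:8, so only Alice is credited), and the query for round 6 is not given in the text.

```python
# Who earned a point in each round, inferred from the reported running score.
ROUNDS = [
    ("Create a note/reminder", "siri"),
    ("Set a timer for 5 minutes", "siri"),
    ("Height of the Eiffel Tower", "both"),
    ("Weather outside", "both"),
    ("Calculator", "both"),
    ("(query not given in the text)", "alice"),
    ("What do I have to do today", "siri"),
    ("Where to have breakfast", "both"),
    ("Grocery store nearby", "both"),
    ("Situation on the roads", "alice"),
    ("How to get to Gorky Park", "alice"),
    ("Latest news", "alice"),
    ("Tell a joke", "alice"),
    ("Tell a story", "alice"),
    ("Call a contact", "siri"),
]


def tally(rounds):
    """Return the final (siri, alice) score for a list of round outcomes."""
    siri = sum(1 for _, winner in rounds if winner in ("siri", "both"))
    alice = sum(1 for _, winner in rounds if winner in ("alice", "both"))
    return siri, alice


print(tally(ROUNDS))  # (9, 11)
```

The total agrees with the final 9:11: Siri leads on device tasks such as reminders, timers and calls, while Alice pulls ahead on traffic, news and conversation.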

    We also tested the sociability of voice assistants


    In terms of sociability, Alice sounds more natural, her voice is really pleasant to listen to. Although her answers are not particularly intelligent, she can still communicate with the user.