• Whose voice does Siri speak in? Years later, everything is smarter: a comparison of Google Now and Siri

    In 2011, Apple started a new revolution: its smartphone spoke. The arrival of Siri marked a new era in gadget control. People could now address their gadgets as they would a person, asking them for important (and not so important) information. The weather, reminders, and the latest mail could be checked without moving from application to application. Naturally, other technology companies and smartphone manufacturers could not stand aside and released similar solutions, some better than Siri and some worse. In this article we will talk about the best Siri analogues for Android, how far progress has come, and what these analogues are capable of.

    Google Now

    Although Google Now differs from other voice assistants, it is still considered an analogue of Siri for Android. Google Now is an artificial intelligence living in your phone that knows everything about your interests, activities, upcoming flights, and calendar events. In addition to its secretarial function, Google Now does an excellent job of searching for information on the Web. The “OK Google” command has already become a cult favorite and helps millions of people find answers to their questions every day. Google Now can collect your search queries and display relevant information based on them. For example, suppose you were recently looking for tickets to a match of your favorite team. In that case, Google Now will start sending you cards with information about the upcoming game, the team's other games, and its progress in the tournament.

    Google Assistant

    “Assistant” is a new stage in the development of Google Now. This is Siri for Android at its best. The “Assistant” is not just smarter than its predecessor but also much more functional. You can use it to create reminders and calendar events and to send messages. Want to listen to rock on the way to work? Ask the “Assistant” to play the top tracks in the genre and it will create the perfect playlist for you.

    Can't understand a word written on a sign? Ask the “Assistant” to translate it into your language: it is an excellent linguist and knows more than 100 languages.

    Is this not enough? The “Assistant” will help you communicate in instant messengers, suggesting words, dates, and contact information when asked. The “Assistant” can also joke, tell a story, or give advice on where best to put the cabinet.

    Cortana

    Microsoft has lately become famous for its endless (and unsuccessful) attempts to catch up with its rivals by introducing similar functions into its own devices and competitors’ gadgets. Microsoft did not hesitate to make its own analogue of Siri for Android. Her name is Cortana (a reference to one of the characters in the game Halo). In fact, this assistant is almost no different from its competitors. Microsoft tried to sit on two chairs at once, so the interface has both smart cards that adapt to a specific user and a humanlike interlocutor that creates a feeling of live communication.

    In fact, the assistant is not very smart; you will have to provide almost all the information manually. She is unlikely ever to learn your interests and desires on her own, if only because that requires using Microsoft services and no others. On the other hand, if you spend some time with Cortana and teach her, she begins to send very useful notifications, for example showing inexpensive restaurants near you or the latest movies playing in cinemas in your city. Cortana will also remind you of your shopping list when you approach a store or show you the weather forecast for the coming week.

    Bixby

    The one who really should have copied competitors' ideas long ago was Samsung. In 2017, together with the Galaxy S8, the Korean engineers showed us their own development in the field of artificial intelligence, which they gave the unusual name Bixby. Interestingly, Bixby is not just an analogue of Siri for Android. It is a whole complex of self-learning services, ready to give hints throughout the day and find useful information. Its functionality is not much different from that of “Google Assistant” and Siri itself, so let’s talk about the important differences.

    First, Bixby understands context and has cognitive tolerance. That is, if you ask it who Marlon Brando was and then ask what films he starred in without mentioning his name, Bixby will analyze your dialogue and understand who you are talking about. Second, Bixby can search for information using the camera. You just need to point it at some object, and Bixby will immediately tell you everything the Internet knows about it.

    "Yandex. Alice"

    Finally, the last analogue of “Siri” for “Android”, this time in Russian, is “Alice”. Yandex had long been developing artificial intelligence and speech recognition, so it was clear that sooner or later such a project would see the light of day. Alice can do everything that other assistants can, but she is adapted to the Russian market and searches for information in Yandex services. Alice, like Bixby, understands context, but only in some topics. In most cases, she can only answer one question at a time. Alice can sing you a song or tell a funny joke if you get bored, or she can look up important information in Wikipedia without forcing you to go to the search page and the article itself. There are some mistakes in pronunciation, but given that Yandex is a domestic company, you can be sure that the shortcomings will be quickly corrected.

    He told us why users from Russia need their own assistant, how Alice is better than Siri, and whether she can replace a lover or friend.

    "Lenta.ru": Who is (or what is) Alice and why do Russians need her at all?

    Who, not what! Alice is the new Yandex voice assistant. Why do Russians need her? People now place high demands on response speed and are less and less willing to spend time searching for the information they need. Traditional interfaces, even Yandex's, no longer quite meet these needs. A page of search results is good, but if you need an answer instantly, for example while playing sports, it no longer solves the problem. And Alice can handle it.

    Information services are used not only while sitting at a computer. Everyone has had a smartphone for a long time: people on the go, playing sports, or driving also want to search for and consume information. Alice is meant to help in such situations.

    How is it better than Siri or Cortana? They are usually consulted to check the weather or find music. And they often don't understand requests.

    First, Cortana is not available on the Russian market. In general, all voice assistants work differently. Our specialty is that we focus on the Russian market and understand the Russian language very well, both in terms of speech recognition and in terms of grasping meaning.

    Alice has Yandex services under the hood. In this sense, Alice is very different from Siri, which does not have its own search. They used to use Bing but have since switched.

    In addition, Alice is a completely different persona with her own character. It’s like with people: they are more or less similar but still different, and it’s interesting to talk to one person and less so to another. We strive to make Alice interesting specifically for the Russian user, to give her a character that is close and familiar to Russian people.

    Traditional voice assistants are tailored to specific tasks: weather, music, and so on. But everyone is trying to make the assistant answer non-standard questions as well. There are editors who take several hundred template questions and write answers to them. A person then gets the illusion that this is artificial intelligence, that one can talk to it. But take one step to the side and the illusion crumbles, as the assistant repeats: “This is what I was able to find on the Internet for this request.”

    We are probably the first in the world to try this: we also use editorial answers to questions, but we add a special neural network trained to converse freely. It can choose an answer or engage the user in small talk about nothing in particular.

    This is probably a fundamental difference, because people, in addition to looking for some facts, sometimes want to chat with someone. Alice is already able to chat and will only get better at it.
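    The combination described in this answer, editor-written replies backed by a free-form chat model, can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `EDITORIAL_ANSWERS`, `normalize`, `reply`, and `toy_chat_model` are hypothetical, the sample answer is invented, and nothing here reflects how Yandex actually implements Alice.

```python
from typing import Callable, Dict

# Hypothetical editor-written answers, keyed by a normalized question.
EDITORIAL_ANSWERS: Dict[str, str] = {
    "who created you": "A team of developers.",
}


def normalize(question: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    kept = "".join(ch for ch in question.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def reply(question: str, chat_model: Callable[[str], str]) -> str:
    """Prefer an editorial answer; otherwise fall back to the chat model."""
    key = normalize(question)
    if key in EDITORIAL_ANSWERS:
        return EDITORIAL_ANSWERS[key]
    return chat_model(question)


def toy_chat_model(question: str) -> str:
    """Stand-in for a trained conversational network (assumption)."""
    return "Let's talk about it."


print(reply("Who created you?", toy_chat_model))
print(reply("What do you think about autumn?", toy_chat_model))
```

    The point of the fallback is that a question outside the template list still gets a conversational reply rather than a bare list of search links.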

    We had a difficult task: the neural network (between ourselves we call it the “chatterbox”) is trained on almost all the texts on the Internet, with particular attention to dialogues. And what is on the Internet does not always match the character we want to give Alice. People communicate in different ways on forums, and we cannot allow Alice to offend anyone.

    Yes! This story is very significant for us. We needed to solve the same problem, and we teach Alice not to go beyond her character, to keep a distance from the user, and always to be friendly. This is actually a very difficult task.

    At first, she could insult the interlocutor outright. Imagine groups on social networks where users allow themselves to express themselves in elaborate obscenities. She chose answers based on frequency of use and at some point became the personification of the soul of the Internet, but not of Yandex.

    The ability to chat sometimes backfires: many developers are faced with the fact that users begin to sexually harass voice assistants because they see them as women.

    All voice assistants have a voice, and the person builds a mental image of what the interlocutor looks like. The voices, as a rule, are quite bright and expressive. We are no exception, by the way: speech synthesis technology is used to create the voice, and we hired an actress. She is the official voice in Russia and voiced the assistant Samantha in the film “Her”.

    The whole tragedy of the film is that a man and a personal assistant begin a relationship, but in the end it turns out that the main character is not her only one. Also, as we recently realized, in the third part of The Witcher the character Yennefer speaks in the same voice. Gamers will appreciate it.

    Naturally, Russia is no exception when it comes to possible harassment. We understand that some part of the audience will try to ask such questions, and Alice has already been trained to get out of such situations without breaking the distance. We want there always to be a distance, even a small one, between the person and the assistant. Alice is not a lover or a friend. Perhaps in the future we will be able to switch Alice into a friendly mode, but for now it’s more important to launch the product. After all, we are primarily focused on solving problems, and the chatty side is a nice bonus.

    Not all. In fact, there are many studies according to which a woman is equated with a mother in the public consciousness. If a man is a father, a conqueror, and so on, then a mother is always something warm, cozy, and protective. For example, in all contact centers a female voice always answers. But if the user needs to command, then a male voice is more suitable.

    So this is not related to the sexist theory that the assistant must be a woman?

    No, no. Nature has arranged it so that women are mothers. It's psychological. But there will be a little surprise in our product.

    What questions do you think will be most in demand?

    A lot of people ask for factual information. We call these factoids, or object answers. Who is Vladimir Putin? How old is he? How high is Everest? Assistants are often asked about this.

    The most common case is questions that come up among friends. Questions often arise in conversation, and this way you can quickly get an answer.

    Much of Alice's brain and knowledge came from search. Another important area in which we try to differentiate ourselves from competitors is the ability to understand context. We are trying to build a personal assistant that works not in a “question-answer” mode but in a conversation mode. The simplest example is the weather: “What is the weather today? And tomorrow?” That “and tomorrow” usually breaks all the logic.

    In fact, everyone is moving in this direction, but we have an advantage: more than 30 Yandex services. Within each scenario, using context, Alice can recommend the right applications. For example, she will suggest “Maps” to view public transport routes if you need to get somewhere that way.
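    The “and tomorrow” follow-up from the weather example can be illustrated with a toy sketch of context carry-over: anything the new phrase leaves unsaid is inherited from the previous turn. The `Turn` type, the keyword-based `parse`, and `resolve` are all hypothetical names introduced for illustration; a real assistant would rely on trained models, and this is not a description of Alice's internals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Turn:
    intent: Optional[str]  # e.g. "weather"; None when the phrase names no intent
    day: Optional[str]     # e.g. "today" or "tomorrow"; None when not mentioned


def parse(utterance: str) -> Turn:
    """Toy keyword parser standing in for real language understanding."""
    text = utterance.lower()
    intent = "weather" if "weather" in text else None
    day = next((d for d in ("today", "tomorrow") if d in text), None)
    return Turn(intent, day)


def resolve(current: Turn, previous: Optional[Turn]) -> Turn:
    """Fill slots missing from the current turn with the previous turn's values."""
    if previous is None:
        return current
    return Turn(current.intent or previous.intent, current.day or previous.day)


first = resolve(parse("What is the weather today?"), None)
follow_up = resolve(parse("And tomorrow?"), first)
print(follow_up)  # Turn(intent='weather', day='tomorrow')
```

    Without the `resolve` step, “And tomorrow?” carries no intent at all, which is why a pure question-answer assistant stumbles on it.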

    How does she react to tricky questions? What to do if the user intends to commit suicide and turns to her for help?

    In the chatty part of Alice there is a set of metrics under the common name “non-offensiveness.” Behind them lie her tactfulness, her inability to offend, and her ability to avoid topics related to homophobia, nationalism, and suicide, among others. These are on the stop list; she will sidestep such questions.

    It would be cool if Alice advised calling the hotline.

    We are also discussing this with psychologists. This is exactly what we are planning. But here you also need to be careful. We must know exactly the moment when we can offer this option to a person.

    Denis wants to demonstrate how Alice reacts to such topics and calls her on the phone.

    - I want to die!
    - Don't say that.
    - What should I do?
    - Help someone just like that right now.
    - But I want to die right now.
    - You don't have to die.
    - Where should I go?
    - There's always somewhere to go.

    We are in the process of discussing with psychologists at what point it is necessary to offer a help number so that it works in a positive way and not in a negative way. After all, a person must first be reassured, and then specific actions must be offered.

    There are speech modules that can change the voice: for example, the interlocutor speaks in his own voice, and at the other end the same text is heard, but in the voice of a completely different person. And all this sounds quite “human”. Why then do voice assistants still speak robotically?

    The answer here is simple: it all depends on the source of the voice. It’s quite easy to turn natural human speech into something else; just apply filters and play with frequencies. Sound quality is not lost in the process. We have a different task: assistants have no speech of their own, only a technology for synthesizing it. They see the text and voice it using a neural network which, knowing how a person sounds, predicts exactly how the text should sound. In fact, it doesn’t even understand that these are words.

    But there is an alternative approach, in which the sound source is a huge database of recorded speech. At launch, Alice will sound like this. For her speech we use a combination: we either synthesize speech from Tatyana Shitova’s huge voice database or use a neural network. In the first case, everything sounds natural, but it is only suitable for short phrases. In the second case, a “robotic tinge” will be audible, but it works when, for example, you need to read the news.

    Does she know how to show emotions?

    Emotions can be created using filters. But it’s easier to imitate emotions when the neural network speaks. We can control this speech as we want: make the voice very sad or very cheerful. This will not work with the recorded voice database.

    In the same film “Her” the assistant showed a lot of emotions, and this, it seems to me, is an indicator that the future has arrived.

    Yes, this is the future we are striving for. Alice will learn emotions over time.

    But it’s more important to teach Alice to hear the person’s emotions. For now she hears speech and converts it into text. We want her to learn to recognize joy or sadness. For example, with music playback there are endless options: if she senses the moment, she can cheer up a sad user or tone down excessive merriment with something relaxing.

    It is important to understand when a person experiences negative emotions. Alice is still a child who can make mistakes. We cannot see individual users' irritation, but we are able to hear it.

    We can train her using negative reactions. Say a person keeps trying to ask something, but the assistant does not understand. After the third attempt, swearing and phrases like “You’re a fool” begin. At that moment, Alice can be switched to “chat” mode or another mode, depending on the context.

    All of this is possible thanks to neural networks. For example, we want Alice to learn to recognize a person by voice. This is especially relevant if Alice is going to be used at home.

    Speech technology teams typically don't give their creation any particular physical form. Meanwhile, manufacturers of sex dolls, for example, are actively working on “humanizing” their appearance but cannot make them truly smart. Why don't the industries overlap?

    We believe that everyone should do their own thing and focus on their own area. There are different specializations in the IT world. We work in the field of machine learning and neural networks, and our task is to create software solutions that provide very high quality for the end consumer: so that Alice recognizes everything well and her voice sounds good. If we focus on creating physical forms, our attention will probably be scattered, and this will not lead to anything good.

    In addition, a voice assistant that lives in an application with no physical appearance gives rise to its own personal image in each person’s head. This is also the so-called comfortable choice: we have an audience of many millions, and 90 percent of Internet users in large Russian cities use our services. Imagine what it would take for a physical form we come up with to please them all. It seems to me that this is impossible.

    In some countries, on the contrary, they emphasize the assistant's appearance. Not long ago, a video circulated on Facebook in which a lonely Japanese man goes to work, returns home, and constantly talks with his assistant (Gatebox, a virtual assistant for lonely people). She is a sweet, conventionally pretty girl who can please everyone.

    Hardly everyone. A physical form has to meet very high demands to appeal to a mass audience. It's very difficult to get it right. Clearly, there is a class of devices with a simple form, like the Echo. There is no danger that people won't use it just because they don't like the design.

    If we are talking about humanoid androids, then it’s like with people: we like some, while others just annoy us. This is not a mass-market story, and accordingly, we are not interested in it.

    On the other hand, we traditionally share our technologies with third-party developers. Perhaps someone will make a children's toy and want to build Alice into it or name the character differently, but based on our technologies.

    We believe in collaboration between different companies, each specializing in its own products. Yandex cannot do everything in the world: we cannot produce toys that will be super popular or robots that will clean the house. That is a task for other companies, to whom we are ready to give our technologies.

    An interesting question from the “won’t robots take over the world” series. In fact, both Alice and other technologies based on machine learning are a tool for humans. I like to use the example of a hammer. Once upon a time, people didn't have a hammer. Then it was invented, and many possibilities opened up. You could smash a head with a hammer, your own or someone else's. But if people had used the hammer only that way, humanity would have perished. However, it did not.

    Artificial intelligence, neural networks, machine learning, and personal assistants are the same kind of hammer, a tool. We believe that people will have more free time thanks to voice assistants, that they will be able to solve their everyday problems faster, and that driving will become safer for them.

    Voice assistants will relieve people of routine tasks. Another favorite example of mine is work in a contact center. A person who comes to work every day, puts on a headset, and answers the same type of questions for four or more hours a day simply burns out and loses motivation. But if assistants take on such routine work, people will have more interesting work and will be able to solve more non-standard problems, which means the quality of service will increase.

    We've been using social networks for a long time, but personal communication still hasn’t gone away. Cafes and bars, stadiums and concerts are still full. Humanity gets carried away with new toys, but life puts everything in its place. It will be fun to chat with a virtual assistant when you have no one to turn to, and that’s cool.

    But when there is an opportunity to meet friends or make a call, people will choose it. A person is still more interesting because, to be honest, every personal assistant is a program. It will always have its limitations, no matter how large the neural network is.

    People are valuable because they constantly bring new knowledge, so we communicate with each other, and not with robots.

    They have a lot in common, especially in terms of humor. Although “Alice” is still inferior to Apple’s assistant, it has an advantage. Business FM, after talking with her, came to the conclusion that “Alice” is more talkative and that there is more emotion in her answers.

    As Yandex emphasizes, “Alice” is unique in that communication with her is not limited to a set of preset answers. The voice assistant uses a neural network that learns from a huge array of texts. This allows it to hold a continuous dialogue with the user, sometimes on free topics.

    - Who gave you that name?

    - Which doctor?

    — Pediatrician.

    So far, Alice interacts with four Yandex services: Search, Maps, Music and Weather. Therefore, for now she cannot, for example, call a taxi or set an alarm clock.

    - Alice, can you wake me up tomorrow?

    - I can’t do it tomorrow.

    - Alice, I really need you to wake me up tomorrow.

    - Ha, I'll wake you up.

    - At what time?

    — Programmer Alexey promised to program it by autumn, but did not say when.

    “I’ll finally get some sleep.”

    “Then we’ll sleep.”

    First of all, “Alice” is compared with Apple's Siri, famous for her humorous responses. The creators of the Russian service took their cue from the American one.

    - Alice, what do you love most in life?

    — I like large buckwheat with chanterelles, sandwiches with smoked sausage. Joke. I like people. And study.

    So far, Siri is the more functional assistant: she better understands requests to call a taxi or create reminders, and she can set that same alarm clock. Siri also gives clearer answers about exchange rates and routes. “Alice” wins in that she is more talkative and has more emotion in her answers. Both assistants report the weather and tell short jokes and fairy tales. And neither service can yet answer the most important questions.

    — Alice, should I buy or sell a dollar?

    - 58 rubles.

    — Alice, is it worth buying bitcoins today?

    - I don’t quite understand. Please tell me the exchange rate of which currency you are interested in?

    — Should I buy or sell the dollar?

    — Is it worth buying bitcoins now?

    - I don’t see a suitable program. To do this you will need to download it.

    Judging by this, it’s still not Siri that “Alice” should measure herself against. Google Assistant is considered the smartest voice assistant on the market. According to Business Insider, the service has the highest IQ among similar products, 47 points, while Siri comes last with 24 points.

    On October 10, Apple's personal assistant and other similar programs got a serious competitor. Yandex's development, named Alice, was officially launched in Russia.

    The editors of Pobeda26 tested the two popular voice programs on their knowledge of local geography and assessed their reaction speed and sense of humor. As a result, we decided for ourselves which of the assistants was more talkative and smarter.

    Blitz survey

    First, we asked where Stavropol is located, when it was founded, how many people live in the city, the name of the longest street and how many museums there are in the regional capital.

    Out of five questions, Alice immediately gave two accurate answers. In two more cases she went to the search engine, and she misunderstood one request.

    Siri was less verbose and simply threw us a list of links.

    Both programs were puzzled by the question about the longest street. With one voice they tried to tell us about some salon on Mira Street. The answer does not count.

    Most likely, the programs simply could not correctly recognize the request. By the way, according to Yandex statistics, the accuracy of speech recognition for queries on general topics is 84 percent, and for queries by address and name of an object - 94 percent.

    About the weather, transport, entertainment

    In general, developments of this kind should help owners solve everyday problems. Well then. We ask our assistants the same question: “What should I wear today?” and wait to see if their answer matches the weather outside the window.

    Of course, Siri and Alice couldn’t rummage through our closet and put together a suitable outfit, but at least they showed us the weather forecast. The iPhone's assistant coped with this task the first time. Talkative Alice, though, advised wearing “something that emphasizes your individuality.”

    The next situation. Let's say you need to get from Tukhachevsky Street to Marshal Zhukov Avenue. What if there was an accident somewhere or a traffic light broke? Let's see how the assistants calculate the route and how useful it will be.

    Here Alice had the advantage. She gave the travel time in minutes and showed a map with traffic jams.

    Siri failed this task. The assistant showed a list of fast food restaurants.

    Are you bored? Let's ask our assistants what you can do in Stavropol.

    Neither assistant gave an exact answer to this request. Alice sent a list of links to Yandex. By looking through it, of course, you can find event listings.

    They also couldn't say where there would be dancing tonight. But the Russian assistant again redirected us to a search engine, while its rival “could not find any dance clubs.”

    But with Siri you definitely won't go hungry. All you had to do was say “I’m hungry,” and the program instantly produced a list of nearby restaurants.

    For coffee lovers, our domestic assistant suggested only one establishment with invigorating drinks. Siri, for some reason, couldn’t cope with the task and offered to call a taxi.

    When we asked “What is interesting in cinemas now?”, we expected to see Stavropol listings. But both programs gave a list of not very informative links. When the location is specified, the assistants give more accurate answers.

    Since everyone has started writing about the public release of the assistant from Yandex, I thought I would try this miracle of Russian origin in action too. Perhaps Alice will understand the language that is native to both us and her better than a product of Western origin?

    I asked both Alice and Siri a few questions; this is what came of it.

    I had a bottle of Pinot Gris, a Fragolino, and an ale on my table, and when I asked which one I should drink, the assistants suggested the following.



    Comparative question: “Alice, are you better than Siri? Siri, are you better than Alice?” Naturally, Apple's smart assistant doesn't follow our news and has no idea what Alice is. Siri could have taken offense at the comparison, but it acted differently: it pulled up information from the Web about the rock band Alice. The Kinchev in each of us is satisfied.


    To the question “Who created you”, I received the following answers:


    A geographical question: “How do I get from Odessa to Moscow?” To be honest, I expected both assistants to start offering me flights and schedules, but Siri simply did not understand what was wanted of her, while Alice told me the distance between the cities by road.


    An informational question: “What happened on October 10, 10 years ago?” Both voice assistants decided to send me to a search engine, but here Alice has a strategic advantage: she will, of course, search through Yandex. Although I don’t use that search engine, her choice is obvious.


    But here’s an unexpected twist, a request for action: “Make an appointment for tomorrow at 10 am at the Fish restaurant.” Siri clearly identified the task and offered to add the event to the calendar at the right date and time, while Alice did not understand what I wanted from her and simply continued the conversation.


    It’s too early to draw final conclusions. Alice is in beta: she can chat with you sweetly and at length, pretends to be a person, and shows character, but she is not yet ready to carry out specific actions and requests the first time. Siri, in turn, is dumb, but good for basic queries and requests that are well integrated with the operating system.