The trajectory of Virtual Assistants

2021-04-04 - Virtual assistants and HCI trajectory

- Virtual assistants like Siri and Alexa are a step towards accepting ambiguity in human–computer interaction
  - Very rigid interfaces like a command line don't accept any ambiguity
  - GUIs accept a bit more, but mostly they just offer many ways to do the same thing
- Starting out, virtual assistants just recognized speech (not a trivial accomplishment) and then used it for commands. This is like "voice commands," where you have to learn what command does what.
- A little further down the line (in the present), assistants can get meaning from most statements. "I want to see a movie" can be interpreted as the request "show me movie times nearby." This expands ambiguity a bit, from commands to a kind of rudimentary command intuition: the machine can translate most things into one of its available abilities. (A rough sketch of this mapping is in the first code block below.)
- But holding a back-and-forth is still really hard.
  - Limited memory presents a frustrating problem. After saying the above, you can't ask "do that thing I just asked for, but with dinner" and get nearby restaurants. (The second code block below sketches the missing piece.)
  - There's also a sense that memory state is volatile. (Compare to the discussion of volatile state in [[Tools not working properly is frustrating because It breaks immersion with the higher-level work]].) It's hard to be natural and relaxed when you know the first half of your request, "set an alarm," could be lost if you don't answer the follow-up "when should I set the alarm for?" right away and without too much hesitation. On top of that, it's hard to go back and change the alarm afterwards if you don't get it right the first time.
- Back-and-forths are really natural in human communication.
  - When your friend sitting across the room says "I went out and bought a sm--rh-k the other day," you don't respond with "Sorry, I didn't get that." You say "You bought a WHAT?" and your friend clarifies: "I bought a shark-skin rug" (you may want to un-become friends with this person).
- Back-and-forth clarification like this is collaborative
  - It makes for better intelligence between groups and within teams than commands to be followed or rejected.
  - Good example of this power in Siri: done drinking and outdoor silent-discoing at 12:45 at night, "Tomorrow remind me to drink water." "Did you mean later today or tomorrow?" "Oh yeah, later today." (The third code block below sketches this kind of clarification subdialogue.)
  - Basically, it allows the listener a chance to pitch in to the task: "Sounds good, but should I do it this way or that way?" "Oh, I hadn't thought about that... I guess this way would be better."
- References: Douglas R. Hofstadter, Gödel, Escher, Bach, 1979 (p. 297, "Cushioning the User and Protecting the System")
  - "When you stop to think what most people use computers for, you realize that it is to carry out very definite and precise tasks, which are too complex for people to do. If the computer is to be reliable, then it is necessary that it should understand, without the slightest chance of ambiguity, what it is supposed to do. It is also necessary that it should do neither more nor less than it is explicitly instructed to do. If there is, in the cushion underneath the programmer, a program whose purpose is to 'guess' what the programmer wants or means, then it is quite conceivable that the programmer could try to communicate his task and be totally misunderstood. So it is important that the high-level program, while comfortable for the human, still should be unambiguous and precise."
- This last point only holds if your computer can't have a low-friction back-and-forth with you to clarify an ambiguity, the way you're used to doing with your human peers.
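To make the "rudimentary command intuition" step concrete, here's a minimal sketch of mapping free-form utterances onto a fixed set of abilities. Everything here is hypothetical and for illustration only: the ability names are made up, and real assistants use trained intent classifiers rather than keyword overlap.

```python
# Sketch: map a free-form utterance onto one of a fixed set of
# abilities. Ability names and keywords are hypothetical.
from typing import Optional

ABILITIES = {
    # ability name -> keywords that suggest it
    "find_showtimes": {"movie", "film", "cinema"},
    "find_restaurants": {"dinner", "restaurant", "eat", "food"},
    "set_reminder": {"remind", "reminder"},
    "set_alarm": {"alarm", "wake"},
}

def interpret(utterance: str) -> Optional[str]:
    """Pick the ability whose keywords best overlap the utterance."""
    words = set(utterance.lower().split())
    best, best_score = None, 0
    for ability, keywords in ABILITIES.items():
        score = len(words & keywords)
        if score > best_score:
            best, best_score = ability, score
    return best  # None means "Sorry, I didn't get that."

print(interpret("I want to see a movie"))  # find_showtimes
print(interpret("I'm hungry for dinner"))  # find_restaurants
```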
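The "do that thing I just asked for, but with dinner" failure is a missing-memory problem: the assistant throws away the last request instead of keeping it around to be amended. A sketch of what keeping it would look like, again with hypothetical intent and slot names:

```python
# Sketch: remember the previous request so a follow-up like
# "do that, but with dinner" amends it instead of starting over.
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class Request:
    intent: str  # e.g. "find_nearby"
    topic: str   # e.g. "movies" or "dinner"

class Assistant:
    def __init__(self) -> None:
        self.last_request: Optional[Request] = None

    def handle(self, request: Request) -> str:
        self.last_request = request  # don't let state evaporate
        return f"Searching nearby for {request.topic}..."

    def amend(self, **changes) -> str:
        """'Do that thing I just asked for, but with <changes>.'"""
        if self.last_request is None:
            return "Do what? I don't remember a previous request."
        return self.handle(replace(self.last_request, **changes))

a = Assistant()
print(a.handle(Request("find_nearby", "movies")))  # Searching nearby for movies...
print(a.amend(topic="dinner"))                     # Searching nearby for dinner...
```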
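And the midnight reminder example is a clarification subdialogue: instead of silently guessing (Hofstadter's worry) or rejecting the command, the assistant holds the partial request and asks. A sketch, with the ambiguity window and function names invented for illustration:

```python
# Sketch of a clarification back-and-forth. At 12:45 a.m., "tomorrow"
# is ambiguous: the speaker may still mean the day that just started.
# Instead of guessing or failing, hold the partial request and ask.
from datetime import datetime

def interpret_tomorrow(now: datetime, clarify) -> int:
    """Return a day offset for 'tomorrow', asking when it's ambiguous."""
    if now.hour < 4:  # just past midnight: genuinely unclear
        answer = clarify("Did you mean later today or tomorrow?")
        return 0 if "today" in answer.lower() else 1
    return 1

# Simulate the exchange; a real assistant would speak and listen.
def fake_user(question: str) -> str:
    print("Assistant:", question)
    return "Oh yeah, later today"

offset = interpret_tomorrow(datetime(2021, 4, 4, 0, 45), clarify=fake_user)
print(f"Reminder set {offset} day(s) out.")  # Reminder set 0 day(s) out.
```

This is the shape of the answer to the Hofstadter passage above: the assistant never silently "guesses" what was meant; it surfaces the ambiguity and lets the human resolve it, the way a peer would.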