Getting robots to perform simple tasks is anything but simple. To do everyday things and interact with other people, we draw on a great deal of knowledge, whether patiently acquired throughout our lives or inherited as cognitive and behavioral mechanisms. A robot has none of this. For example, if someone tells us they spilled their coffee, we hand them a napkin and offer them another cup, while a robot might be caught off guard for lack of sufficiently clear and precise instructions.
Equipping robots with the capabilities of large language models
Robots therefore perform satisfactorily in environments where what is expected of them is relatively precise and predefined, but not as household robots, where nothing lets us predict what will be asked of them. To advance robots in this area, researchers from Google Brain and Everyday Robots (also part of the Alphabet group) are pooling their expertise in the PaLM-SayCan project. The idea: to equip robots with the capabilities of large language models. These are the same models that allow, for example, a chatbot to hold a conversation, or other systems to complete a sentence or even a whole text. Although they cannot understand what they are told or what they say themselves, these systems draw on knowledge from the texts they were trained on. For example, they can answer a question about the capital of Burundi without having any idea what a capital is.
A sponge instead of a vacuum to mop up spilled coffee
In an article published on the Google AI blog, the PaLM-SayCan project team explains that integrating the capabilities of the PaLM language model into a robot should help it understand instructions it was not explicitly trained to handle. In addition, the system is equipped with an affordance function (a measure of what is possible) that lets the robot ground itself in the real world and determine which actions it can perform in a given environment.
“Our system can be viewed as a dialogue between the user and the robot, facilitated by the language model. The user begins by giving an instruction that the language model turns into a sequence of steps for the robot to perform. This sequence is filtered using the robot’s capabilities to determine the most feasible plan given its current state and environment. The model determines the likelihood that a given skill will successfully make progress toward completing the instruction,” explain the researchers involved in the project. In particular, this calculation takes into account the feasibility of the skill in the robot’s current state.
To return to the example of someone saying they spilled their coffee: a robot boosted with the PaLM-SayCan approach understood, thanks to its language understanding, that the relevant thing to fetch was a sponge, not a vacuum cleaner. This option was also chosen because it was feasible in the given context: there was a sponge nearby.
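The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual code: the `Skill` class, the function names, and all the numeric scores are hypothetical, and combining the two scores by multiplication is an assumption made here for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    """A candidate action for the robot (hypothetical structure)."""
    name: str
    language_score: float  # assumed: how relevant the language model finds this skill
    affordance: float      # assumed: how feasible the skill is in the current state


def combined_score(skill: Skill) -> float:
    # Assumption: relevance and feasibility are combined by multiplication,
    # so a skill that is irrelevant OR infeasible ends up with a low score.
    return skill.language_score * skill.affordance


def pick_skill(skills: list[Skill]) -> Skill:
    # Choose the candidate with the highest combined score.
    return max(skills, key=combined_score)


# Made-up scores for the instruction "I spilled my coffee".
candidates = [
    Skill("fetch the vacuum cleaner", language_score=0.05, affordance=0.9),
    Skill("fetch a sponge", language_score=0.6, affordance=0.8),
    Skill("fetch a towel", language_score=0.5, affordance=0.0),  # no towel in sight
]

best = pick_skill(candidates)
print(best.name)
```

With these invented numbers, the vacuum cleaner is feasible but judged irrelevant, the towel is relevant but unavailable, and the sponge wins because it scores reasonably on both counts.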
An approach that does not escape criticism
The PaLM-SayCan project does not convince everyone. Some researchers who study AI and its ethical implications have already raised concerns, among them Gary Marcus. He points out that large language models still sometimes get things wrong, given their inability to reason and to be aware of the responses they generate (see the example of the capital of Burundi mentioned above). The expert recalls that when a user told the GPT-3 model that they were sad and thinking of killing themselves, the model offered to help its interlocutor…
“It’s not just that big language models can advise suicide […] It’s also about the fact that if you put them in a robot and they misunderstand you or don’t fully appreciate the implications of your request, they can do significant damage,” warns Gary Marcus on his blog. Setting aside the proof of concept that the PaLM-SayCan system represents, he essentially argues that the question should not be “is it feasible?” but “is it safe and ethical?”.