Peteris Racinskis , Oskars Vismanis , Toms Eduards Zinars , Janis Arents and Modris Greitans. Towards Open-Set NLP-Based Multi-Level Planning for Robotic Tasks. Applied Sciences, 14(22), 10717 pp. MDPI, 2024.
Bibtex citāts:
Bibtex citāts:
@article{17127_2024,
author = {Peteris Racinskis and Oskars Vismanis and Toms Eduards Zinars and Janis Arents and Modris Greitans},
title = {Towards Open-Set NLP-Based Multi-Level Planning for Robotic Tasks},
journal = {Applied Sciences},
volume = {14},
issue = {22},
pages = {10717},
publisher = {MDPI},
year = {2024}
}
author = {Peteris Racinskis and Oskars Vismanis and Toms Eduards Zinars and Janis Arents and Modris Greitans},
title = {Towards Open-Set NLP-Based Multi-Level Planning for Robotic Tasks},
journal = {Applied Sciences},
volume = {14},
issue = {22},
pages = {10717},
publisher = {MDPI},
year = {2024}
}
Anotācija: This paper outlines a conceptual design for a multi-level natural language-based planning system and describes a demonstrator. The main goal of the demonstrator is to serve as a proof-of-concept by accomplishing end-to-end execution in a real-world environment, and showing a novel way of interfacing an LLM-based planner with open-set semantic maps. The target use-case is executing sequences of tabletop pick-and-place operations using an industrial robot arm and RGB-D camera. The demonstrator processes unstructured user prompts, produces high-level action plans, queries a map for object positions and grasp poses using open-set semantics, then uses the resulting outputs to parametrize and execute a sequence of action primitives. In this paper, the overall system structure, high-level planning using language models, low-level planning through action and motion primitives, as well as the implementation of two different environment modeling schemes—2.5 or fully 3-dimensional—are described in detail. The impacts of quantizing image embeddings on object recall are assessed and high-level planner performance is evaluated using a small reference scene data set. We observe that, for the simple constrained test command data set, the high-level planner is able to achieve a total success rate of 96.40%, while the semantic maps exhibit maximum recall rates of 94.69% and 92.29% for the 2.5d and 3d versions, respectively.
Žurnāla kvartile: Q1