MIT Team Builds a Speech-to-Reality System that Turns Spoken Prompts into Physical Objects Within Minutes
MIT researchers have developed a system that allows users to speak a request aloud and receive a fabricated object minutes later, demonstrating how natural language, generative AI, and robotics can combine to produce on-demand manufacturing.

According to MIT, the work, presented at the ACM Symposium on Computational Fabrication, shows that the system can assemble simple furniture and decorative items from modular parts without requiring users to know 3D modeling or robotic programming.

Researchers at MIT's Center for Bits and Atoms, led by graduate student Alexander Htet Kyaw with collaborators Se Hwan Jeon and Miana Smith, built a workflow that begins with speech recognition and a large language model. The model interprets the user's request, such as asking for a stool, and passes the result to a 3D generative AI system that produces a digital representation of the object. A voxel-based process then breaks that form into discrete components suitable for robotic assembly. After geomet...
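To make the voxel-based decomposition stage concrete, here is a minimal sketch in Python. It assumes the generative model's output can be queried as an occupancy function over a unit cube; the stool-like test shape, the `Voxel` type, and the grid resolution are illustrative stand-ins, not details of the MIT system.

```python
# Sketch of a voxel-based decomposition step, under the assumption that the
# generated 3D shape is available as an occupancy test. The stool-like shape
# below is a hypothetical stand-in for the generative model's output.
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Voxel:
    """Grid indices of one modular part in the assembly lattice."""
    x: int
    y: int
    z: int


def inside_stool(x: float, y: float, z: float) -> bool:
    """Toy occupancy function: a flat seat resting on four corner legs."""
    if 0.8 <= z <= 1.0:  # seat slab at the top of the unit cube
        return True
    near_edge = lambda v: v <= 0.2 or v >= 0.8  # corner test on one axis
    return z < 0.8 and near_edge(x) and near_edge(y)  # four legs below


def voxelize(occupancy: Callable[[float, float, float], bool],
             resolution: int) -> list[Voxel]:
    """Sample the shape at cell centers; occupied cells become parts."""
    step = 1.0 / resolution
    parts = []
    for i in range(resolution):
        for j in range(resolution):
            for k in range(resolution):
                cx, cy, cz = (i + 0.5) * step, (j + 0.5) * step, (k + 0.5) * step
                if occupancy(cx, cy, cz):
                    parts.append(Voxel(i, j, k))
    return parts


if __name__ == "__main__":
    parts = voxelize(inside_stool, resolution=5)
    # Bottom-up ordering, so no module is placed above an empty cell.
    for v in sorted(parts, key=lambda v: (v.z, v.x, v.y)):
        print(f"place module at grid cell ({v.x}, {v.y}, {v.z})")
```

Sorting the placements bottom-up reflects the kind of sequencing a robotic assembler needs, but the actual system's assembly planning and part geometry are not described in this excerpt.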