Nishanth J. Kumar

AI, ML, and Robotics researcher + Ph.D. Student @ MIT CSAIL


I am currently a final-year Ph.D. student with the LIS Group within MIT CSAIL. My official advisors are Leslie Kaelbling and Tomás Lozano-Pérez, but I have the pleasure of collaborating with many other wonderful people within CSAIL’s Embodied Intelligence Initiative. I’m extremely grateful for support from the NSF Graduate Research Fellowship. I’ve also been lucky to intern at FAIR @ Meta, NVIDIA Research, the RAI Institute, Vicarious AI, and Uber ATG. Previously, I received an S.M. degree from MIT, an Sc.B. with honors from Brown University, and completed the IB Diploma in my hometown of Coimbatore, India.

Outside of research, I like to lift heavy things, read, play basketball, philosophize, cook, and write both fiction and non-fiction. If you’re interested in learning more about me or reading some of my writing, check out my blog, fiction writing, or social links in the website footer. If you’d like to get in contact, see this page. If you’d like to leave me some anonymous feedback (preferably constructive!), see this form.

news

Mar 12, 2026 I’ve accepted a full-time job offer to be an AI Research Scientist at Meta Robotics Studio. Very excited to work with Jonathan Tompson, Ning Li, Sangbae Kim, and the rest of the team on making helpful, deployable robots a reality!
Jan 25, 2026 I’m on the industry job market for research scientist and engineer positions! Feel free to check out my Resume and CV above, and do reach out if you think I’d be a good fit for your org.
Jan 15, 2026 Two new papers on open-world TAMP and learning symbolic world models from demonstration accepted at RA-L.
Jan 10, 2026 I’ve officially wrapped up my internship at Meta. I learned a lot about research and computer-use agents, and had an incredible time in NYC. Stay tuned for an eventual paper release!
Jan 20, 2025 I’m excited to spend Summer 2025 as an intern at FAIR in NYC working on improving long-horizon generation and decision-making for LLMs with Mary Williamson, Jimmy Yang, and Yuandong Tian.

research

I'm broadly interested in creating AI systems that can autonomously solve complex, long-horizon tasks in the real world. Much of my research has been at the intersection of learning, planning, and foundation models. I've sought to create general-purpose agents that scale well with data at training time and with additional computation at test time.

I've listed some selected representative publications below. For a complete and up-to-date list of papers, please see my Google Scholar page.

William Shen*, Nishanth Kumar*, Sahit Chintalapudi, Jie Wang, Christopher Watson, Edward S. Hu, Jing Cao, Dinesh Jayaraman, Leslie Pack Kaelbling, Tomás Lozano-Pérez
arXiv, 2026
TiPToP is a modular, open-vocabulary planning system that accepts natural language commands and raw pixel inputs to execute long-horizon manipulation tasks—without any robot-specific training data. It compares favorably to a state-of-the-art end-to-end trained VLA, and generalizes across multiple robot platforms with minimal setup required.
Ashay Athalye*, Nishanth Kumar*, Tom Silver, Yichao Liang, Jiuguang Wang, Tomás Lozano-Pérez, Leslie Pack Kaelbling
RA-L, 2026
We learn symbolic predicates and operators from human video demonstrations, constructing a "symbolic world model" that enables zero-shot generalization to new scenes and tasks via efficient planning.
Nishanth Kumar, William Shen, Fabio Ramos, Dieter Fox, Tomás Lozano-Pérez, Leslie Pack Kaelbling, Caelan Reed Garrett
RA-L, 2026
We use VLMs to write constraints directly into TAMP systems at runtime. This leverages the common-sense and code generation capabilities of VLMs to enable TAMP to solve 'open-world' tasks it does not originally have the concepts to tackle.
Yichao Liang, Nishanth Kumar, Hao Tang, Adrian Weller, Joshua B. Tenenbaum, Tom Silver, João F. Henriques, Kevin Ellis
ICLR, 2025
We introduce “Neuro-Symbolic Predicates” that merge symbolic and neural knowledge representations through a first-order abstraction language. Our algorithm invents predicates and learns abstract world models, achieving better sample complexity, stronger out-of-distribution generalization, and improved interpretability compared to hierarchical RL, VLM planning, and other baseline symbolic approaches.
William Shen, Caelan Garrett, Nishanth Kumar, Ankit Goyal, Tucker Hermans, Leslie Pack Kaelbling, Tomás Lozano-Pérez, Fabio Ramos
RSS, 2025
cuTAMP frames task and motion planning as a backtracking bilevel search over plan skeletons, using differentiable optimization to simultaneously explore thousands of candidate solutions on the GPU. This enables solving highly constrained manipulation problems with non-convex constraints in seconds, substantially improving upon traditional serial approaches.
Jiafei Duan, Wilbert Pumacay, Nishanth Kumar, Yi Ru Wang, Shulin Tian, Wentao Yuan, Ranjay Krishna, Dieter Fox, Ajay Mandlekar, Yijie Guo
ICLR, 2025
AHA is a vision-language model for detecting and reasoning over failures in robotic manipulation. Using FailGen, a simulation framework that synthetically generates failure demonstrations by perturbing successful trajectories, we create a large-scale dataset for instruction-tuning. AHA outperforms six state-of-the-art alternatives and generalizes to real-world scenarios for improved error recovery.
Aidan Curtis*, Nishanth Kumar*, Jing Cao, Tomás Lozano-Pérez, Leslie Pack Kaelbling
CoRL, 2024
LLMs are great at common sense but bad at physics; constraint solvers are the reverse. PRoC3S pairs the two so a robot can go from a natural-language instruction and a camera image to a long-horizon manipulation plan, with no task-specific engineering required.
Nishanth Kumar*, Tom Silver*, Willie McClinton, Linfeng Zhao, Stephen Proulx, Tomás Lozano-Pérez, Leslie Pack Kaelbling, Jennifer Barry
RSS, 2024
We use planning to guide exploration and data collection in deployment environments so a robot can efficiently and autonomously learn to handle unexpected deployment failures when solving long-horizon tasks.
Nishanth Kumar*, Willie McClinton*, Rohan Chitnis, Tom Silver, Tomás Lozano-Pérez, Leslie Pack Kaelbling
CoRL, 2023
Most learned transition models predict everything; ours learns to ignore what doesn't matter. Using program synthesis on demonstrations, we discover compact planning operators that focus only on decision-relevant state—making abstract planning dramatically more efficient.
PDDL Planning with LLMs
Tom Silver, Varun Hariprasad, Reece Shuttleworth, Nishanth Kumar, Tomás Lozano-Pérez, Leslie Pack Kaelbling
FMDM Workshop @ NeurIPS, 2022
We investigate few-shot prompting of pretrained LLMs for solving PDDL planning problems across 18 domains. We find that LLM performance stems not only from syntactic pattern matching but also from commonsense understanding of English terms in the PDDL, and propose a simple mechanism for using LLM outputs to guide a heuristic-search planner.
Arthur Wandzel, Yoonseon Oh, Michael Fishman, Nishanth Kumar, Lawson L. S. Wong, Stefanie Tellex
ICRA, 2019
We formulate Multi-Object Search (MOS) as an Object-Oriented POMDP, enabling a mobile robot to find multiple objects under uncertainty. The OO-POMDP structure supports factoring the agent’s belief into independent object distributions, allowing belief size to scale linearly rather than exponentially in the number of objects.