Bala Kumaravel

I am a Senior Researcher at Microsoft Research, Redmond, in the Interactive Multimodal AI Systems group. I work on leveraging Generative AI models (Multimodal Large Language Models and Diffusion models) to enhance user productivity and collaboration in business-critical applications. I am particularly interested in customizing, finetuning, and aligning generative AI models for specific end-user applications.
Over the years I’ve worked on projects spanning multimodal copilots that accelerate productivity and collaboration in business-critical workflows; unified, natively multimodal AI copilots for Microsoft Office AI that work across Word, PowerPoint, and Excel, among other formats; generative pipelines and creative tooling for Bing Creative Ads; live AI agents that assist players in games such as Minecraft; vision perception systems that enable spatial understanding in AR/VR and robotics; and generative approaches that improve meeting experiences through multimodal understanding and content generation.
Before joining Microsoft, I completed my Ph.D. at the University of California, Berkeley, where I was advised by Prof. Björn Hartmann. My research at Berkeley focused on Virtual and Augmented Reality, exploring applications ranging from AR/VR-assisted robotics interactions to enhancing learning experiences. Before that, I completed my Bachelor’s at the Indian Institute of Technology, Madras, where my Bachelor’s thesis won the best interdisciplinary thesis project amongst all engineering departments and the best thesis in the department.
During my Ph.D. I got to spend time at various places and work with amazing collaborators across Microsoft, Adobe, and Autodesk: Cuong Nguyen, Stephen DiVerdi, Fraser Anderson, Tovi Grossman, George Fitzmaurice, and Andy Wilson.
If you’re exploring multimodal LLMs, diffusion models, or embodied AI for enhancing human-AI interactions, I’d love to connect.
news
Jul 15, 2025 | Our work, ‘Grounding Task Assistance with Multimodal Cues from a Single Demonstration’, was accepted and presented at ACL’25 Findings link |
---|---|
Oct 21, 2024 | I will be speaking at a panel discussion at the IEEE International Symposium on Emerging Metaverse on Oct 21st, 2024 link |
Oct 16, 2024 | We presented our work on BlendScape and SpaceBlender at UIST 2024. BlendScape won an Honorable Mention Award at UIST 2024. Check out the works at BlendScape and SpaceBlender. |
May 11, 2024 | We presented our work on SharedNeRF at CHI 2024. SharedNeRF won an Honorable Mention Award at CHI 2024. Check out the work at SharedNeRF. |
Mar 16, 2024 | Moved to the Interactive Multimodal AI Systems team at Microsoft Research, Redmond |
selected publications
2025
- Out of Sight, Not Out of Context? Egocentric Spatial Reasoning in VLMs Across Disjoint Frames. arXiv preprint arXiv:2505.24257, 2025
2024
- BlendScape: Enabling End-User Customization of Video-Conferencing Environments through Generative AI. In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, Jul 2024
- SpaceBlender: Creating Context-Rich Collaborative Spaces Through Generative 3D Scene Blending. In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, Jul 2024
- BlendScape: Enabling Unified and Personalized Video-Conferencing Environments through Generative AI. Jul 2024
2023
- StreamFunnel: Facilitating Communication Between a VR Streamer and Many Spectators. Jul 2023