Bala Kumaravel

I am a Senior Researcher in the Interactive Multimodal AI Systems group at Microsoft Research, Redmond. I work on leveraging Generative AI models (Multimodal Large Language Models and Diffusion models) to enhance user productivity and collaboration in business-critical applications. I am particularly interested in customizing, fine-tuning, and aligning generative AI models for specific end-user applications.

Before joining Microsoft, I completed my Ph.D. at the University of California, Berkeley, where I was advised by Prof. Björn Hartmann. My research at Berkeley focused on Virtual and Augmented Reality, exploring applications ranging from AR/VR-assisted robotics interactions to enhancing learning experiences. Before that, I completed my Bachelor’s degree at the Indian Institute of Technology, Madras, where my Bachelor’s thesis won the best interdisciplinary thesis project among all engineering departments and the best thesis in the department.

During my Ph.D., I spent time at Microsoft, Adobe, and Autodesk, working with amazing collaborators: Cuong Nguyen, Stephen DiVerdi, Fraser Anderson, Tovi Grossman, George Fitzmaurice, and Andy Wilson.

news

Oct 21, 2024	I will be speaking on a panel at the IEEE International Symposium on Emerging Metaverse on Oct 21st, 2024.
Oct 16, 2024	We presented our work on BlendScape and SpaceBlender at UIST 2024. BlendScape won an Honorable Mention Award at UIST 2024. Check out the works at BlendScape and SpaceBlender.
May 11, 2024	We presented our work on SharedNeRF at CHI 2024. SharedNeRF won an Honorable Mention Award at CHI 2024. Check out the work at SharedNeRF.
Mar 16, 2024	Transferred to the Interactive Multimodal AI Systems team at Microsoft Research, Redmond
Sep 11, 2022	Started working as a Senior Researcher on the Extended Perception, Interaction and Cognition (EPIC) team at Microsoft Research, Redmond

selected publications

2024

BlendScape: Enabling End-User Customization of Video-Conferencing Environments through Generative AI

Shwetha Rajaram, Nels Numan, Balasaravanan Thoravi Kumaravel, Nicolai Marquardt, and Andrew D Wilson

In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, 2024


Today’s video-conferencing tools support a rich range of professional and social activities, but their generic meeting environments cannot be dynamically adapted to align with distributed collaborators’ needs. To enable end-user customization, we developed BlendScape, a rendering and composition system for video-conferencing participants to tailor environments to their meeting context by leveraging AI image generation techniques. BlendScape supports flexible representations of task spaces by blending users’ physical or digital backgrounds into unified environments and implements multimodal interaction techniques to steer the generation. Through an exploratory study with 15 end-users, we investigated whether and how they would find value in using generative AI to customize video-conferencing environments. Participants envisioned using a system like BlendScape to facilitate collaborative activities in the future, but required further controls to mitigate distracting or unrealistic visual elements. We implemented scenarios to demonstrate BlendScape’s expressiveness for supporting environment design strategies from prior work and propose composition techniques to improve the quality of environments.
SpaceBlender: Creating Context-Rich Collaborative Spaces Through Generative 3D Scene Blending

Nels Numan, Shwetha Rajaram, Balasaravanan Thoravi Kumaravel, Nicolai Marquardt, and Andrew D Wilson

In Proceedings of the 37th Annual ACM Symposium on User Interface Software and Technology, 2024


There is increased interest in using generative AI to create 3D spaces for Virtual Reality (VR) applications. However, today’s models produce artificial environments, falling short of supporting collaborative tasks that benefit from incorporating the user’s physical context. To generate environments that support VR telepresence, we introduce SpaceBlender, a novel pipeline that utilizes generative AI techniques to blend users’ physical surroundings into unified virtual spaces. This pipeline transforms user-provided 2D images into context-rich 3D environments through an iterative process consisting of depth estimation, mesh alignment, and diffusion-based space completion guided by geometric priors and adaptive text prompts. In a preliminary within-subjects study, where 20 participants performed a collaborative VR affinity diagramming task in pairs, we compared SpaceBlender with a generic virtual environment and a state-of-the-art scene generation framework, evaluating its ability to create virtual spaces suitable for collaboration. Participants appreciated the enhanced familiarity and context provided by SpaceBlender but also noted complexities in the generative environments that could detract from task focus. Drawing on participant feedback, we propose directions for improving the pipeline and discuss the value and design of blended spaces for different scenarios.
SharedNeRF: Leveraging Photorealistic and View-dependent Rendering for Real-time and Remote Collaboration

Mose Sakashita, Balasaravanan Thoravi Kumaravel, Nicolai Marquardt, and Andrew David Wilson

In Proceedings of the CHI Conference on Human Factors in Computing Systems, 2024


Collaborating around physical objects necessitates examining different aspects of design or hardware in detail when reviewing or inspecting physical artifacts or prototypes. When collaborators are remote, coordinating the sharing of views of their physical environment becomes challenging. Video-conferencing tools often do not provide the desired viewpoints for a remote viewer. While RGB-D cameras offer 3D views, they lack the necessary fidelity. We introduce SharedNeRF, designed to enhance synchronous remote collaboration by leveraging the photorealistic and view-dependent nature of Neural Radiance Field (NeRF). The system complements the higher visual quality of the NeRF rendering with the instantaneity of a point cloud and combines them through carefully accommodating the dynamic elements within the shared space, such as hand gestures and moving objects. The system employs a head-mounted camera for data collection, creating a volumetric task space on the fly and updating it as the task space changes. In our preliminary study, participants successfully completed a flower arrangement task, benefiting from SharedNeRF’s ability to render the space in high fidelity from various viewpoints.
BlendScape: Enabling Unified and Personalized Video-Conferencing Environments through Generative AI

Shwetha Rajaram, Nels Numan, Balasaravanan Thoravi Kumaravel, Nicolai Marquardt, and Andrew D. Wilson

2024

2023

StreamFunnel: Facilitating Communication Between a VR Streamer and Many Spectators

Haohua Lyu, Cyrus Vachha, Qianyi Chen, Balasaravanan Thoravi Kumaravel, and Björn Hartmann

2023

2022

Shaping the new future of work through mixed reality

Bala Kumaravel

2022

Interactive Cross-Dimensional Media for Collaboration and Guidance in Mixed Reality Environments

Balasaravanan Thoravi Kumaravel

University of California, Berkeley, 2022

Modeling and Influencing Human Attentiveness in Autonomy-to-Human Perception Hand-offs

Yash Vardhan Pant, Balasaravanan Thoravi Kumaravel, Ameesh Shah, Erin Kraemer, Marcell Vazquez-Chanlatte, Kshitij Kulkarni, Bjoern Hartmann, and Sanjit A Seshia

In 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), 2022

DreamStream: Immersive and Interactive Spectating in VR

Balasaravanan Thoravi Kumaravel and Andrew D Wilson

In Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems, 2022


2020

TransceiVR: Bridging asymmetrical communication between VR users and external collaborators

Balasaravanan Thoravi Kumaravel, Cuong Nguyen, Stephen DiVerdi, and Bjoern Hartmann

In Proceedings of the 33rd Annual ACM Symposium on User Interface Software and Technology, 2020


2019

TutoriVR: A Video-Based Tutorial System for Design Applications in Virtual Reality

Balasaravanan Thoravi Kumaravel, Cuong Nguyen, Stephen DiVerdi, and Bjoern Hartmann

In CHI ’19: Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 2019

Loki: Facilitating remote instruction of physical tasks using bi-directional mixed-reality telepresence

Balasaravanan Thoravi Kumaravel, Fraser Anderson, George Fitzmaurice, Bjoern Hartmann, and Tovi Grossman

In Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology, 2019
