Story
Research, Systems & Serendipity
Research is rarely a straight line. Mine has moved from an early cloud startup to database internals, scientific platforms, high-performance computing, and AI-enabled discovery.
Systems story
The page moves from cloud startup work to database systems, scientific platforms, distributed computing, and AI-enabled discovery. Each scene is designed as a systems diagram rather than a card grid.
Story
01
First Employee
A startup beginning that shaped how I think about cloud platforms, operations, and shipping under uncertainty.
I joined Kaavo at the earliest stage and helped turn an idea into a working cloud platform.
The work included core product architecture, releases, monitoring, and helping establish the India engineering operation that supported the product over time.
Story
02
Before the Lakehouse Era
A federated query layer for relational stores, graph systems, text search, scientific files, and APIs.
This scene represents the AWESOME Polystore period, where heterogeneous sources had to behave like a single coherent system.
The interesting problem was not only storage. It was planning, optimization, provenance, and execution across different data models without hiding the system’s real behavior.
Story
03
Two Database Patents
The patent work centers on data ingestion and cross-system query processing, two themes that remain the backbone of the site’s technical narrative.
The patent reveal frames the core technical bet: make heterogeneous data ingestible, queryable, and operationally understandable.
Those ideas still show up today in scientific data platforms, search systems, and diagnostics-oriented infrastructure.
Story
04
One Systems Foundation, Many Sciences
The same architecture can support biomedical research, wearables, public health, quantum materials, environmental science, and computational social science.
The systems foundation stays consistent even when the scientific domain changes.
That is the connective tissue behind TemPredict, Quantum Data Hub, National Data Platform Search, and other platform work across the site.
Story
05
Data from People and Machines
Scientific systems sit between human-generated signals and machine-generated telemetry.
The same platform has to understand publications, clinical records, annotations, wearables, APIs, telemetry, scientific instruments, and HPC logs.
That spectrum is what makes the work useful and hard at the same time.
Story
06
The Query Planner Inside an AI Agent
Query planners in database systems and planners in AI agents share a surprising amount of structure.
Goal, plan, cost, tools, state, execution, validation, and provenance are not only database ideas. They are also useful abstractions for reliable agents.
An agent is not only a language model. It is also a planner, executor, and evaluator.
Query Optimizer
Structured planning under constraints
- Goal
- Plan
- Cost
- Tools
- State
- Execution
- Validation
- Provenance
AI Agent
Goal-directed orchestration across tools
- Goal
- Plan
- Cost
- Tools
- State
- Execution
- Validation
- Provenance
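The shared structure in the two lists above can be sketched in a few lines. This is an illustrative sketch, not an implementation from any of the systems described on this page: the `Step`, `Plan`, and `Run` classes, the tool names, and the cost numbers are all hypothetical. It shows the same loop serving both roles: enumerate candidate plans, pick the cheapest by estimated cost, execute each step through a named tool, validate the result, and record provenance.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Step:
    tool: str        # name of the tool (or physical operator) to invoke
    est_cost: float  # estimated cost of this step, in arbitrary units


@dataclass
class Plan:
    name: str
    steps: list

    @property
    def cost(self) -> float:
        # Plan cost is modeled as the sum of its step estimates (an assumption).
        return sum(step.est_cost for step in self.steps)


@dataclass
class Run:
    state: object                                   # current intermediate result
    provenance: list = field(default_factory=list)  # which plan and tool produced it


def choose_plan(candidates: list) -> Plan:
    """Pick the candidate with the lowest estimated cost."""
    return min(candidates, key=lambda plan: plan.cost)


def execute(plan: Plan, tools: dict, state: object,
            validate: Callable[[object], bool]) -> Run:
    """Run each step through its tool, record provenance, then validate."""
    run = Run(state=state)
    for step in plan.steps:
        run.state = tools[step.tool](run.state)
        run.provenance.append(f"{plan.name}:{step.tool}")
    if not validate(run.state):
        raise ValueError(f"plan {plan.name!r} produced an invalid result")
    return run


# Hypothetical goal: return the even values from a small table.
tools = {
    "scan": lambda rows: list(rows),
    "filter_even": lambda rows: [r for r in rows if r % 2 == 0],
    "index_even": lambda rows: [r for r in rows if r % 2 == 0],
}
plans = [
    Plan("full_scan", [Step("scan", 10.0), Step("filter_even", 5.0)]),
    Plan("index", [Step("index_even", 3.0)]),
]

best = choose_plan(plans)
run = execute(best, tools, [1, 2, 3, 4],
              validate=lambda rows: all(r % 2 == 0 for r in rows))
```

Swapping the lambdas for API calls or model invocations turns the same loop into an agent step executor, while the cost, validation, and provenance hooks stay where they are.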
Story
07
Systems That Explain Their Own Behavior
A useful platform does not just run. It can explain where cost accumulates and what to do next.
This scene turns a query execution trace into an operational interface.
High scan cost becomes a recommendation, not just a diagnostic warning.
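One way to read that step is as a small rule over an execution trace. The sketch below is an assumption-laden illustration: the `OperatorStat` record, the `ACTIONS` rule table, and the 50% threshold are hypothetical and not taken from any platform described here. It flags an operator whose share of total cost crosses the threshold and pairs the observation with a suggested action.

```python
from dataclasses import dataclass


@dataclass
class OperatorStat:
    operator: str  # e.g. "scan", "join", "sort"
    cost: float    # observed cost in arbitrary units


# Hypothetical rule table mapping a costly operator to a suggested action.
ACTIONS = {
    "scan": "Improve partition pruning",
}


def recommend(trace: list, threshold: float = 0.5) -> list:
    """Return one finding per operator whose cost share meets the threshold."""
    total = sum(op.cost for op in trace)
    if total == 0:
        return []
    findings = []
    for op in trace:
        share = op.cost / total
        if share >= threshold and op.operator in ACTIONS:
            findings.append({
                "observed": f"High {op.operator} cost",
                "share": round(share, 2),
                "possible_action": ACTIONS[op.operator],
            })
    return findings


# Hypothetical trace in which the scan dominates the total cost.
trace = [
    OperatorStat("scan", 80.0),
    OperatorStat("join", 15.0),
    OperatorStat("sort", 5.0),
]
findings = recommend(trace)
```

Keeping the rules in a plain table means each recommendation can be traced back to the observation that triggered it.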
Observed
High scan cost
Possible action
Improve partition pruning
Story
08
From National Data Infrastructure to Individual Health
The same design instincts apply to large discovery platforms and to health signals from a single person.
My work spans systems that help discover national scientific resources and systems that interpret signals from one person.
The scale changes, but the need for disciplined data plumbing, provenance, and reliable interpretation does not.
Story
09
Digital Twins Are Also Data Systems
Digital twins depend on lifecycle management as much as they depend on simulation.
A digital twin has a lifecycle: register, discover, configure, execute, monitor, evaluate, archive.
Identity, provenance, versioning, reproducibility, orchestration, and lifecycle management make the twin trustworthy.
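The lifecycle can be sketched as a small state machine. This is a minimal sketch under stated assumptions: it treats the seven stages as strictly sequential, and the `TwinRecord` class and its fields are hypothetical names for illustration only. The record carries an identity, a version, and a history so that every transition stays inspectable.

```python
from dataclasses import dataclass, field

# Stage names come from the lifecycle above; strict ordering is an assumption.
STAGES = ["register", "discover", "configure", "execute",
          "monitor", "evaluate", "archive"]


@dataclass
class TwinRecord:
    twin_id: str            # stable identity for the twin
    version: int = 1        # version of the twin's configuration
    stage: str = "register"
    history: list = field(default_factory=lambda: ["register"])

    def advance(self, next_stage: str) -> None:
        """Move to the next stage, rejecting skipped or repeated stages."""
        next_index = STAGES.index(self.stage) + 1
        if next_index >= len(STAGES):
            raise ValueError(f"{self.twin_id} is archived and cannot advance")
        if next_stage != STAGES[next_index]:
            raise ValueError(
                f"expected {STAGES[next_index]!r} after {self.stage!r}, "
                f"got {next_stage!r}"
            )
        self.stage = next_stage
        self.history.append(next_stage)


# Hypothetical twin walked through the first few stages.
twin = TwinRecord("twin-001")
for stage in ["discover", "configure", "execute"]:
    twin.advance(stage)
```

A real platform would likely allow loops, such as returning from evaluate to configure. The linear order here only keeps the sketch short.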
Story
10
The Best Problems Live Between Fields
The whole page resolves into one research space where databases, workflows, HPC, and AI meet.
My research space lives at the intersection of database systems, scientific workflows, high-performance computing, and AI systems.
That intersection is where query planning, scientific agents, knowledge graphs, digital twins, performance engineering, and reproducibility become one conversation.
Closing map
Good systems do more than process data.
They help us understand where information came from, how it was transformed, and whether the result can be trusted.