Story

Research, Systems & Serendipity

Research is rarely a straight line. Mine has moved from an early cloud startup to database internals, scientific platforms, high-performance computing, and AI-enabled discovery.

Cloud Startup, Database Systems, Scientific Platforms, Distributed Computing, AI and Agents, Future Research

Startup execution, Polystore research, Scientific platforms, AI planning

Systems story

The page moves from cloud startup work to database systems, scientific platforms, distributed computing, and AI-enabled discovery. Each scene is designed as a systems diagram rather than a card grid.

First Employee

A startup beginning that shaped how I think about cloud platforms, operations, and shipping under uncertainty.

I joined Kaavo at the earliest stage and helped turn an idea into a working cloud platform.

The work included core product architecture, releases, monitoring, and helping establish the India engineering operation that supported the product over time.

Read the startup story | Kaavo IMOD project

Before the Lakehouse Era

A federated query layer for relational stores, graph systems, text search, scientific files, and APIs.

This scene represents the AWESOME Polystore period, where heterogeneous sources had to behave like a single coherent system.

The interesting problem was not only storage. It was planning, optimization, provenance, and execution across different data models without hiding the system’s real behavior.
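As a rough illustration of planning across data models while keeping the system's behavior visible, here is a minimal sketch. It is not the AWESOME implementation: the source names, cost units, and the cheapest-source rule are all assumptions made up for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Source:
    name: str             # hypothetical source label
    model: str            # data model: "relational", "graph", "text", ...
    cost_per_call: float  # made-up cost unit, for illustration only


@dataclass
class Step:
    source: str
    operation: str
    cost: float


@dataclass
class Plan:
    steps: list[Step] = field(default_factory=list)

    @property
    def total_cost(self) -> float:
        return sum(step.cost for step in self.steps)

    def explain(self) -> list[str]:
        # Expose which source runs which operation instead of hiding it.
        return [
            f"{i}. [{s.source}] {s.operation} (cost={s.cost})"
            for i, s in enumerate(self.steps, 1)
        ]


def plan_query(subqueries: dict[str, str],
               catalog: dict[str, list[Source]]) -> Plan:
    """Map each data model's sub-operation to its cheapest registered source."""
    plan = Plan()
    for model, operation in subqueries.items():
        candidates = catalog.get(model)
        if not candidates:
            raise LookupError(f"no source registered for model {model!r}")
        best = min(candidates, key=lambda s: s.cost_per_call)
        plan.steps.append(Step(best.name, operation, best.cost_per_call))
    return plan


# Hypothetical catalog: two relational stores and one text index.
catalog = {
    "relational": [Source("pg_a", "relational", 4.0),
                   Source("pg_b", "relational", 2.5)],
    "text": [Source("search_idx", "text", 1.0)],
}
plan = plan_query(
    {"relational": "filter records by cohort",
     "text": "match abstracts on keyword"},
    catalog,
)
explanation = plan.explain()
```

The `explain` method is the part that matters here: the plan records which source handles which operation and at what estimated cost, so the federation layer does not hide what actually ran.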

Two Database Patents

The patent work centers on ingestion and query processing across systems, two themes that still form the backbone of the site’s technical narrative.

The patents frame the core technical bet: make heterogeneous data ingestible, queryable, and operationally understandable.

Those ideas still show up today in scientific data platforms, search systems, and diagnostics-oriented infrastructure.

One Systems Foundation, Many Sciences

The same architecture can support biomedical research, wearables, public health, quantum materials, environmental science, and computational social science.

The systems foundation stays consistent even when the scientific domain changes.

That is the connective tissue behind TemPredict, Quantum Data Hub, National Data Platform Search, and other platform work across the site.

Data from People and Machines

Scientific systems sit between human-generated signals and machine-generated telemetry.

The same platform has to understand publications, clinical records, annotations, wearables, APIs, telemetry, scientific instruments, and HPC logs.

That spectrum is what makes the work useful and hard at the same time.

Spectrum, from human-generated to machine-generated: Publications, Clinical Records, Human Annotations, Wearables, APIs, Telemetry, Scientific Instruments, HPC Logs

The Query Planner Inside an AI Agent

The planning machinery in database systems and AI agents shares a surprising amount of common structure.

Goal, plan, cost, tools, state, execution, validation, and provenance are not only database ideas. They are also useful abstractions for reliable agents.

An agent is not only a language model. It is also a planner, executor, and evaluator.

Query Optimizer: structured planning under constraints

AI Agent: goal-directed orchestration across tools

Shared by both: Goal, Plan, Cost, Tools, State, Execution, Validation, Provenance
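To make the shared structure concrete, here is a minimal sketch of an agent loop written in optimizer vocabulary: pick the cheapest candidate plan, execute it step by step, validate the state after each step, and record provenance. The tool names, costs, and validation rule are hypothetical, and no language model is involved; the point is only the planner, executor, and evaluator split.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass(frozen=True)
class PlanStep:
    tool: str      # name of a registered tool (hypothetical)
    arg: int
    est_cost: int  # estimated cost, in made-up units


@dataclass
class Run:
    state: int
    provenance: list[dict] = field(default_factory=list)


def choose_plan(candidates: list[list[PlanStep]]) -> list[PlanStep]:
    """Planner: cost-based choice, the cheapest estimated plan wins."""
    return min(candidates, key=lambda plan: sum(s.est_cost for s in plan))


def execute(plan: list[PlanStep],
            tools: dict[str, Callable[[int, int], int]],
            initial_state: int,
            validate: Callable[[int], bool]) -> Run:
    """Executor and evaluator: run each step, check it, log what happened."""
    run = Run(state=initial_state)
    for step in plan:
        before = run.state
        run.state = tools[step.tool](before, step.arg)
        ok = validate(run.state)
        run.provenance.append(
            {"tool": step.tool, "input": before,
             "output": run.state, "valid": ok}
        )
        if not ok:
            break  # stop on the first invalid state
    return run


# Goal (assumed for the example): reach 10 starting from 2.
tools = {"add": lambda s, a: s + a, "scale": lambda s, a: s * a}
plan_a = [PlanStep("add", 3, 1), PlanStep("scale", 2, 1)]  # total cost 2
plan_b = [PlanStep("add", 8, 5)]                           # total cost 5

chosen = choose_plan([plan_a, plan_b])
run = execute(chosen, tools, initial_state=2, validate=lambda s: s <= 10)
```

Both plans reach the goal, so the choice is made on estimated cost alone, and the provenance list keeps enough detail to replay or audit each step afterwards.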

Systems That Explain Their Own Behavior

A useful platform does not just run. It can explain where cost accumulates and what to do next.

This scene turns a query execution trace into an operational interface.

High scan cost becomes a recommendation, not just a diagnostic warning.

01 SQL Query
02 Execution Plan
03 Operators
04 Bottleneck Detected
05 Recommendation

Observed: High scan cost
Possible action: Improve partition pruning
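The step from an execution trace to a recommendation can be sketched in a few lines. This is an illustrative sketch only: the operator costs, the 50% threshold, and the fallback message are assumptions, and the single scan rule simply mirrors the example in this scene.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OperatorStat:
    name: str    # operator kind from the trace, e.g. "scan" or "join"
    cost: float  # cost attributed to that operator, in made-up units


# Mapping from a dominant operator to a possible action.
# Only the scan rule comes from this scene; extend as needed.
SUGGESTIONS = {"scan": "Improve partition pruning"}


def diagnose(trace: list[OperatorStat],
             threshold: float = 0.5) -> Optional[dict]:
    """Return an observation and possible action if one operator dominates."""
    total = sum(op.cost for op in trace)
    if total <= 0:
        return None
    worst = max(trace, key=lambda op: op.cost)
    share = worst.cost / total
    if share < threshold:
        return None  # no single operator dominates the cost
    return {
        "observed": f"High {worst.name} cost",
        "share": round(share, 2),
        "possible_action": SUGGESTIONS.get(
            worst.name, "Inspect operator manually"
        ),
    }


# Hypothetical trace in which the scan accounts for 70% of the cost.
trace = [
    OperatorStat("scan", 70.0),
    OperatorStat("join", 20.0),
    OperatorStat("aggregate", 10.0),
]
report = diagnose(trace)
```

Returning `None` when no operator dominates keeps the interface quiet unless there is something actionable to say.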

From National Data Infrastructure to Individual Health

The same design instincts apply to large discovery platforms and to health signals from a single person.

My work spans systems that help discover national scientific resources and systems that interpret signals from one person.

The scale changes, but the need for disciplined data plumbing, provenance, and reliable interpretation does not.

Institutional Data Nodes, Federated Catalogs, National Discovery Platform, Distributed Datasets

Shared Data-Platform Center

Wearable Sensor, Individual Timeline, Clinical Records, Predictive Analytics

Digital Twins Are Also Data Systems

Digital twins depend on lifecycle management as much as they depend on simulation.

A digital twin has a lifecycle: register, discover, configure, execute, monitor, evaluate, archive.

Identity, provenance, versioning, reproducibility, orchestration, and lifecycle management make the twin trustworthy.

Digital Twin: Identity, Provenance, Versioning, Reproducibility, Orchestration, Data Lifecycle
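The lifecycle above can be sketched as a small state machine that refuses out-of-order transitions and records each stage with the twin's version. This is a sketch under stated assumptions: the strictly linear ordering, the `twin_id` format, and the history layout are illustrative choices, not a description of any particular twin platform.

```python
from dataclasses import dataclass, field
from typing import Optional

# Stage order taken from the lifecycle listed in this scene.
STAGES = ("register", "discover", "configure", "execute",
          "monitor", "evaluate", "archive")


@dataclass
class DigitalTwin:
    twin_id: str      # identity (hypothetical format)
    version: int = 1  # versioning
    history: list[tuple[str, int]] = field(default_factory=list)  # provenance

    @property
    def stage(self) -> Optional[str]:
        return self.history[-1][0] if self.history else None

    def advance(self, stage: str) -> None:
        """Move to the next stage; assumes a strictly linear lifecycle."""
        done = len(self.history)
        expected = STAGES[done] if done < len(STAGES) else None
        if stage != expected:
            raise ValueError(
                f"cannot move to {stage!r}; expected {expected!r}"
            )
        self.history.append((stage, self.version))


twin = DigitalTwin("twin-001")
for name in STAGES[:4]:  # register, discover, configure, execute
    twin.advance(name)

# Skipping monitor and evaluate is rejected.
try:
    twin.advance("archive")
    skipped = True
except ValueError:
    skipped = False
```

A real lifecycle would likely allow loops, such as returning from evaluate to configure; the linear version keeps the example short.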

The Best Problems Live Between Fields

The whole page resolves into one research space where databases, workflows, HPC, and AI meet.

My research space lives at the intersection of database systems, scientific workflows, high-performance computing, and AI systems.

That intersection is where query planning, scientific agents, knowledge graphs, digital twins, performance engineering, and reproducibility become one conversation.

Database Systems, Scientific Workflows, High-Performance Computing, AI Systems

My Research Space

Query Planning, Scientific Agents, Data Platforms, Knowledge Graphs, Digital Twins, Performance Engineering, Reproducibility

Closing map

Good systems do more than process data.

They help us understand where information came from, how it was transformed, and whether the result can be trusted.
