Story

Research, Systems & Serendipity

Research is rarely a straight line. Mine has moved from an early cloud startup to database internals, scientific platforms, high-performance computing, and AI-enabled discovery.

Systems story

The page moves from cloud startup work to database systems, scientific platforms, distributed computing, and AI-enabled discovery. Each scene is designed as a systems diagram rather than a card grid.

01

First Employee

A startup beginning that shaped how I think about cloud platforms, operations, and shipping under uncertainty.

I joined Kaavo at the earliest stage and helped turn an idea into a working cloud platform.

The work included core product architecture, releases, monitoring, and helping establish the India engineering operation that supported the product over time.

Diagram: Startup beginning timeline. A line connects four stages: First Employee (01), Core Cloud Platform (02), Multi-Cloud Management (03), India Engineering Operation (04).

02

Before the Lakehouse Era

A federated query layer for relational stores, graph systems, text search, scientific files, and APIs.

This scene represents the AWESOME Polystore period, where heterogeneous sources had to behave like a single coherent system.

The interesting problem was not only storage. It was planning, optimization, provenance, and execution across different data models without hiding the system’s real behavior.

Diagram: Polystore query orbit. A central Unified Query and Execution Layer with six systems orbiting it: Relational Database, Graph Database, Text Search, Analytical Engine, Scientific Files, APIs.
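
As a rough illustration of the federated idea, the sketch below puts two very different stores behind one lookup interface and tags every row with the source it came from. It is a minimal sketch under stated assumptions, not the AWESOME Polystore design: the `Source` protocol, the two adapter classes, `federated_lookup`, and the sample data are all hypothetical names invented for this example.

```python
import sqlite3
from typing import Dict, List, Protocol


class Source(Protocol):
    """Hypothetical adapter interface: every store answers a key lookup."""

    def fetch(self, key: str) -> List[dict]: ...


class RelationalSource:
    """Relational store backed by an in-memory SQLite table (sample data)."""

    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE authors (name TEXT, institution TEXT)")
        self.conn.execute("INSERT INTO authors VALUES ('ada', 'Example University')")

    def fetch(self, key: str) -> List[dict]:
        cur = self.conn.execute(
            "SELECT name, institution FROM authors WHERE name = ?", (key,)
        )
        return [{"name": n, "institution": i} for n, i in cur.fetchall()]


class GraphSource:
    """Graph store reduced to an adjacency dict (sample data)."""

    def __init__(self) -> None:
        self.edges = {"ada": ["grace"]}

    def fetch(self, key: str) -> List[dict]:
        return [{"name": key, "coauthor": c} for c in self.edges.get(key, [])]


def federated_lookup(sources: Dict[str, Source], key: str) -> List[dict]:
    """Ask every source for the key and record which source produced each row."""
    results = []
    for label, source in sources.items():
        for row in source.fetch(key):
            results.append({**row, "_source": label})
    return results


rows = federated_lookup(
    {"relational": RelationalSource(), "graph": GraphSource()}, "ada"
)
```

Keeping the `_source` tag on every row is the smallest possible form of provenance: the caller sees one result set but can still tell which system answered.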

03

Two Database Patents

The patent work centers on ingestion and query processing across systems, two themes that remain the backbone of the site’s technical narrative.

The patent reveal frames the core technical bet: make heterogeneous data ingestible, queryable, and operationally understandable.

Those ideas still show up today in scientific data platforms, search systems, and diagnostics-oriented infrastructure.

Diagram: Database patent reveal. Two stylized patent documents, Heterogeneous Data Ingestion and Cross-System Query Processing, slide apart from a central Database Core and connect to Data, Plans, Execution, and Results.

04

One Systems Foundation, Many Sciences

The same architecture can support biomedical research, wearables, public health, quantum materials, environmental science, and computational social science.

The systems foundation stays consistent even when the scientific domain changes.

That is the connective tissue behind TemPredict, Quantum Data Hub, National Data Platform Search, and other platform work across the site.

Diagram: Systems foundation across scientific domains. A central Systems Core with six scientific domains illuminating in sequence: Biomedical Research, Wearable Sensing, Public Health, Quantum Materials, Environmental Science, Computational Social Science.

05

Data from People and Machines

Scientific systems sit between human-generated signals and machine-generated telemetry.

The same platform has to understand publications, clinical records, annotations, wearables, APIs, telemetry, scientific instruments, and HPC logs.

That spectrum is what makes the work useful and hard at the same time.

Spectrum, from Human-Generated to Machine-Generated:
Publications
Clinical Records
Human Annotations
Wearables
APIs
Telemetry
Scientific Instruments
HPC Logs

06

The Query Planner Inside an AI Agent

The planning machinery in database systems and in AI agents shares a surprising amount of structure.

Goal, plan, cost, tools, state, execution, validation, and provenance are not only database ideas. They are also useful abstractions for reliable agents.

An agent is not only a language model. It is also a planner, executor, and evaluator.

Query Optimizer

Structured planning under constraints

  • Goal
  • Plan
  • Cost
  • Tools
  • State
  • Execution
  • Validation
  • Provenance

AI Agent

Goal-directed orchestration across tools

  • Goal
  • Plan
  • Cost
  • Tools
  • State
  • Execution
  • Validation
  • Provenance
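
The shared vocabulary above can be sketched in a few lines. This is a minimal sketch, assuming a toy setting where candidate plans are already enumerated and costs are given: `Step`, `Plan`, `Result`, and `execute` are hypothetical names, and the rows and index are made-up sample data. The same loop reads as a tiny query optimizer (pick the cheapest plan, run its operators) or as a tiny agent (pick the cheapest tool sequence, run it, validate against the goal).

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    tool: str                     # name of the tool or operator to invoke
    run: Callable[[dict], dict]   # takes the current state, returns a new state
    cost: float                   # estimated cost in arbitrary units


@dataclass
class Plan:
    steps: List[Step]

    @property
    def cost(self) -> float:
        return sum(step.cost for step in self.steps)


@dataclass
class Result:
    state: dict
    valid: bool
    provenance: List[str] = field(default_factory=list)


def execute(goal: Callable[[dict], bool], candidates: List[Plan], state: dict) -> Result:
    """Pick the cheapest candidate plan, run it, validate, and record provenance."""
    plan = min(candidates, key=lambda p: p.cost)
    provenance = []
    for step in plan.steps:
        state = step.run(state)
        provenance.append(step.tool)
    return Result(state=state, valid=goal(state), provenance=provenance)


# Sample data: three rows and a prebuilt index on "site".
rows = [{"id": 1, "site": "A"}, {"id": 2, "site": "B"}, {"id": 3, "site": "A"}]
index = {"A": [rows[0], rows[2]], "B": [rows[1]]}

scan = Step("full_scan", lambda s: {**s, "rows": list(rows)}, cost=10.0)
filt = Step(
    "filter",
    lambda s: {**s, "rows": [r for r in s["rows"] if r["site"] == s["site"]]},
    cost=1.0,
)
lookup = Step("index_lookup", lambda s: {**s, "rows": index[s["site"]]}, cost=2.0)

result = execute(
    goal=lambda s: all(r["site"] == "A" for r in s["rows"]),
    candidates=[Plan([scan, filt]), Plan([lookup])],
    state={"site": "A"},
)
```

Here the index lookup wins on estimated cost, the goal check confirms the output, and `result.provenance` records which tools actually ran.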

07

Systems That Explain Their Own Behavior

A useful platform does not just run. It can explain where cost accumulates and what to do next.

This scene turns a query execution trace into an operational interface.

High scan cost becomes a recommendation, not just a diagnostic warning.

01 SQL Query
02 Execution Plan
03 Operators
04 Bottleneck Detected
05 Recommendation

Observed

High scan cost

Possible action

Improve partition pruning
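
The path from trace to recommendation can be sketched as a small rule lookup. This is a minimal sketch under stated assumptions: the `Operator` record, the `diagnose` function, the 50% threshold, and the sample timings are hypothetical, and the only rule included is the scan example from this scene.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Operator:
    name: str       # operator label from the execution plan
    kind: str       # coarse category, e.g. "scan" or "aggregate"
    cost_ms: float  # observed time spent in this operator


# Only the scan rule comes from the scene above; other kinds have no rule here.
POSSIBLE_ACTIONS = {"scan": "Improve partition pruning"}


def diagnose(trace: List[Operator], threshold: float = 0.5) -> Optional[dict]:
    """Report the dominant operator if it accounts for at least `threshold` of total cost."""
    total = sum(op.cost_ms for op in trace)
    if total == 0:
        return None
    worst = max(trace, key=lambda op: op.cost_ms)
    share = worst.cost_ms / total
    if share < threshold:
        return None
    return {
        "observed": f"High {worst.kind} cost",
        "operator": worst.name,
        "share": round(share, 2),
        "possible_action": POSSIBLE_ACTIONS.get(worst.kind),
    }


# Made-up trace: the scan dominates the query.
trace = [
    Operator("TableScan(events)", "scan", 800.0),
    Operator("Filter", "filter", 50.0),
    Operator("Aggregate", "aggregate", 150.0),
]
finding = diagnose(trace)
```

Returning `None` when no operator dominates keeps the interface honest: it only recommends something when the trace supports it.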

08

From National Data Infrastructure to Individual Health

The same design instincts apply to large discovery platforms and to health signals from a single person.

My work spans systems that help discover national scientific resources and systems that interpret signals from one person.

The scale changes, but the need for disciplined data plumbing, provenance, and reliable interpretation does not.

Institutional Data Nodes
Federated Catalogs
National Discovery Platform
Distributed Datasets
Shared Data-Platform Center
Wearable Sensor
Individual Timeline
Clinical Records
Predictive Analytics

09

Digital Twins Are Also Data Systems

Digital twins depend on lifecycle management as much as they depend on simulation.

A digital twin has a lifecycle: register, discover, configure, execute, monitor, evaluate, archive.

Identity, provenance, versioning, reproducibility, orchestration, and lifecycle management make the twin trustworthy.

Digital Twin
Identity, Provenance, Versioning, Reproducibility, Orchestration, Data Lifecycle
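
The lifecycle named in this scene can be sketched as a small state machine. This is a minimal sketch, assuming the stages advance strictly in the listed order (a real twin may loop between execute, monitor, and evaluate): `TwinRecord`, its fields, and the `twin-001` identifier are hypothetical names for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Stage names are taken from the scene text; strict ordering is an assumption.
STAGES = ["register", "discover", "configure", "execute", "monitor", "evaluate", "archive"]


@dataclass
class TwinRecord:
    twin_id: str                  # identity
    version: int = 1              # versioning
    stage: str = "register"
    history: List[str] = field(default_factory=lambda: ["register"])  # provenance

    def advance(self) -> str:
        """Move to the next lifecycle stage and record it in the history."""
        position = STAGES.index(self.stage)
        if position == len(STAGES) - 1:
            raise ValueError("twin is already archived")
        self.stage = STAGES[position + 1]
        self.history.append(self.stage)
        return self.stage


twin = TwinRecord("twin-001")
while twin.stage != "archive":
    twin.advance()
```

After the loop, `twin.history` holds every stage in order, which is the kind of record that lets someone later reconstruct what happened to the twin and when.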

10

The Best Problems Live Between Fields

The whole page resolves into one research space where databases, workflows, HPC, and AI meet.

My research space lives at the intersection of database systems, scientific workflows, high-performance computing, and AI systems.

That intersection is where query planning, scientific agents, knowledge graphs, digital twins, performance engineering, and reproducibility become one conversation.

Database Systems, Scientific Workflows, High-Performance Computing, AI Systems
My Research Space
Query Planning, Scientific Agents, Data Platforms, Knowledge Graphs, Digital Twins, Performance Engineering, Reproducibility

Closing map

Good systems do more than process data.

They help us understand where information came from, how it was transformed, and whether the result can be trusted.

Diagram: Connected systems map. A closing map tying the career scenes into one technical story: Kaavo, Polystore, Patents, Scientific Platforms, Data Spectrum, Agents, Instrumentation, Scale, Digital Twins, Interdisciplinary Research.