> ## Documentation Index
> Fetch the complete documentation index at: https://docs.startree.ai/llms.txt
> Use this file to discover all available pages before exploring further.

# Query Analyzer API Reference

> REST API reference for the Query Analyzer static and runtime analysis endpoints.

<Warning>
  **Beta feature** — Query Analyzer is disabled by default and available on demand. Contact your StarTree account team to have it enabled for your environment.
</Warning>

Query Analyzer exposes two REST endpoints on the StarTree Cloud data-manager service. Both return the same unified response structure.

## Endpoints

| Method | Path                                                    | Purpose                                                                                 |
| ------ | ------------------------------------------------------- | --------------------------------------------------------------------------------------- |
| `POST` | `/api/data/manager/api/query-analyzer/static-analysis`  | Analyze a query using table metadata and explain plan — no execution required           |
| `POST` | `/api/data/manager/api/query-analyzer/runtime-analysis` | Execute the query, collect stage statistics, then analyze with evidence-backed findings |

## Authentication and Headers

All requests require a valid StarTree Cloud Bearer token. The `workspace` header scopes the request to a specific workspace.

| Header          | Required | Description                                                                  |
| --------------- | -------- | ---------------------------------------------------------------------------- |
| `Authorization` | Yes      | `Bearer <token>`                                                             |
| `Content-Type`  | Yes      | `application/json`                                                           |
| `workspace`     | No       | Workspace ID. When omitted, the default workspace for the principal is used. |

***

## POST /api/data/manager/api/query-analyzer/static-analysis

Analyzes a SQL query using only table metadata and explain plan. Does not execute the query. Returns results within seconds.

### Request Body

```json theme={null}
{
  "sql": "SELECT COUNT(*) FROM orders WHERE status = 'shipped' AND ts > 1700000000",
  "timeout": "PT60S"
}
```

| Field     | Type   | Required | Description                                                                                        |
| --------- | ------ | -------- | -------------------------------------------------------------------------------------------------- |
| `sql`     | string | Yes      | The SQL query to analyze                                                                           |
| `timeout` | string | No       | ISO-8601 duration override for the LLM call (e.g. `"PT60S"`, `"PT5M"`). Min: 10s. Max: 10 minutes. |

### Example Request

```bash theme={null}
curl -X POST "https://<your-domain>/api/data/manager/api/query-analyzer/static-analysis" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -H "workspace: <workspace-id>" \
  -d '{
    "sql": "SELECT COUNT(*) FROM orders WHERE status = '\''shipped'\'' AND ts > 1700000000"
  }'
```

### Example Response

```json theme={null}
{
  "querySummary": "Counts all orders with status 'shipped' after a given Unix timestamp.",
  "analysisType": "STATIC",
  "findings": [
    {
      "type": "MISSING_INVERTED_INDEX",
      "severity": "HIGH",
      "description": "Column 'status' is used in an equality predicate but has no inverted index. Every segment will be fully scanned.",
      "evidence": [
        {
          "type": "MISSING_INDEX",
          "detail": "status: no inverted index found"
        }
      ],
      "confidence": "HIGH"
    }
  ],
  "recommendations": [
    {
      "rank": 1,
      "title": "Add inverted index on 'status'",
      "type": "ADD_INVERTED_INDEX",
      "description": "An inverted index on 'status' will allow Pinot to skip segments that do not contain the predicate value, eliminating the full scan.",
      "suggestedChange": {
        "before": "\"invertedIndexColumns\": []",
        "after": "\"invertedIndexColumns\": [\"status\"]"
      },
      "expectedImpact": "HIGH",
      "confidence": "HIGH",
      "tradeoffs": "Adds a small amount of index storage per segment. Requires a segment reload to take effect on existing data."
    }
  ]
}
```

***

## POST /api/data/manager/api/query-analyzer/runtime-analysis

Executes the query against your Pinot cluster using the multi-stage engine (MSE), collects stage execution statistics, then produces evidence-backed findings tied to measured operator metrics.

If the query is not MSE-compatible, the service automatically falls back to static analysis. The response `analysisType` field reflects which mode was used, and `mseIncompatibleReason` is populated on fallback.

### Request Body

```json theme={null}
{
  "sql": "SELECT status, COUNT(*) FROM orders WHERE ts BETWEEN 1700000000 AND 1710000000 GROUP BY status",
  "timeout": "PT120S"
}
```

| Field     | Type   | Required | Description                                                             |
| --------- | ------ | -------- | ----------------------------------------------------------------------- |
| `sql`     | string | Yes      | The SQL query to execute and analyze                                    |
| `timeout` | string | No       | ISO-8601 duration override for the LLM call. Min: 10s. Max: 10 minutes. |

<Warning>
  Runtime analysis executes your query against the cluster. For very large or expensive queries, run static analysis first to catch structural issues before spending compute.
</Warning>

### Example Request

```bash theme={null}
curl -X POST "https://<your-domain>/api/data/manager/api/query-analyzer/runtime-analysis" \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -H "workspace: <workspace-id>" \
  -d '{
    "sql": "SELECT status, COUNT(*) FROM orders WHERE ts BETWEEN 1700000000 AND 1710000000 GROUP BY status"
  }'
```

### Example Response

```json theme={null}
{
  "querySummary": "Counts orders grouped by status within a time window.",
  "analysisType": "RUNTIME",
  "findings": [
    {
      "type": "MISSING_RANGE_INDEX",
      "severity": "HIGH",
      "description": "Column 'ts' is used in a BETWEEN predicate but has no range index. All segments are being scanned.",
      "evidence": [
        {
          "type": "MISSING_INDEX",
          "detail": "ts: no range index found"
        },
        {
          "type": "HIGH_POST_FILTER_SCAN",
          "detail": "numEntriesScannedPostFilter: 48,200,000"
        }
      ],
      "confidence": "HIGH",
      "operator": "TABLE_SCAN",
      "metrics": {
        "executionTimeMs": 3420,
        "memoryAllocatedBytes": 0,
        "rowsProduced": 12400
      }
    }
  ],
  "recommendations": [
    {
      "rank": 1,
      "title": "Add range index on 'ts'",
      "type": "ADD_RANGE_INDEX",
      "description": "A range index on 'ts' allows Pinot to prune segments outside the time window, reducing the scan to only matching segments.",
      "suggestedChange": {
        "before": "\"rangeIndexColumns\": []",
        "after": "\"rangeIndexColumns\": [\"ts\"]"
      },
      "expectedImpact": "HIGH",
      "confidence": "HIGH",
      "tradeoffs": "Adds index storage per segment. Requires a segment reload to take effect on existing data."
    }
  ]
}
```

***

## Response Schema

Both endpoints return a `QueryAnalysisResponse` object.

| Field                   | Type   | Description                                                                                                                                                        |
| ----------------------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `querySummary`          | string | Plain-English description of what the query does                                                                                                                   |
| `analysisType`          | string | `RUNTIME` when MSE execution succeeded; `STATIC` when the service fell back to static analysis                                                                     |
| `mseIncompatibleReason` | string | Reason MSE execution was not possible. Null when `analysisType` is `RUNTIME`.                                                                                      |
| `findings`              | array  | Risk flags detected, each with type, severity, description, evidence, confidence, and (runtime only) operator and metrics                                          |
| `recommendations`       | array  | Ranked, actionable suggestions with title, type, description, suggested change, expected impact, confidence, evidence references, tradeoffs, and correctness notes |
| `error`                 | object | Structured error. Null on success. See [Error Response](#error-response).                                                                                          |

***

## Error Response

When a request fails, the response body contains an `error` object and the HTTP status reflects the failure category.

```json theme={null}
{
  "error": {
    "errorType": "BAD_REQUEST",
    "description": "Invalid timeout value 'PT1S'. Expected ISO-8601 duration (e.g. PT60S, PT5M)."
  }
}
```

| Field         | Type   | Description                                                                                       |
| ------------- | ------ | ------------------------------------------------------------------------------------------------- |
| `errorType`   | string | Machine-readable error category (e.g. `BAD_REQUEST`, `PINOT_UNAVAILABLE`, `LLM_ERROR`, `UNKNOWN`) |
| `description` | string | Human-readable explanation of the failure                                                         |
| `cause`       | object | Optional nested cause chain for debugging                                                         |

Common HTTP status codes:

| Status | When                                                                                    |
| ------ | --------------------------------------------------------------------------------------- |
| `200`  | Analysis completed successfully                                                         |
| `400`  | Malformed request (e.g. missing `sql`, invalid `timeout` format, timeout out of bounds) |
| `401`  | Missing or invalid authentication token                                                 |
| `403`  | The authenticated principal does not have access to the requested tables                |
| `500`  | Internal error — contact support if it persists                                         |

***

## Related Documentation

<CardGroup cols={2}>
  <Card title="How to Use Query Analyzer" icon="book-open" href="/corecapabilities/query_data/query-analyzer/usage">
    Step-by-step guide to running analysis from the Query Console and via API.
  </Card>

  <Card title="Understanding Recommendations" icon="list-check" href="/corecapabilities/query_data/query-analyzer/reading-results">
    Full reference for every finding type, recommendation category, operator type, and evidence field in the response.
  </Card>

  <Card title="Security and Compliance" icon="shield-check" href="/corecapabilities/query_data/query-analyzer/security-and-compliance">
    What data is sent to the LLM, access controls, and platform compliance posture.
  </Card>
</CardGroup>
