Adding Interactive API's for MSQ engine #14416

Merged 17 commits on Jun 28, 2023

@cryptoe cryptoe commented Jun 13, 2023

The current SQL endpoint is more suited toward hot queries where the client blocks for the results. The current task endpoint is more of a fire-and-forget API, primarily meant for ingestion.

This PR exposes a new API at
@Path("/druid/v2/sql/statements/"), which takes the same payload as the current "/druid/v2/sql" endpoint and lets users fetch results asynchronously.

The statement execution API runs a single SQL statement which can be:

SELECT

DDL

CREATE [TEMPORARY|EXTERNAL] TABLE AS 

INSERT 

POST SQL statement

POST /sql/statements

Executes a SQL statement.


   mode: Determines how the query results are fetched. It takes one of three values: SYNC, ASYNC, or AUTO. For the MVP, only mode=ASYNC is supported.

   execTimeout: The maximum amount of time that the query can execute. Handy to prevent “runaway queries.” Must be lower than any system-configured value. This is distinct from the results expiration timeout. Optional. (Not supported in the initial version.)
   
  • Response
#http 200
{
  "state":"RUNNING",
  "queryID":"queryIDXX",
  "createdAt":"2023-03-07T12:58:17Z",
  "schema": [
    {
      "name": "col1",
      "type": "varchar"
    },
    {
      "name": "col2",
      "type": "varchar"
    }
  ]
}
#http 200
{
  "state": "FINISHED",
  "createdAt": "2023-03-07T12:58:17Z",
  "queryID": "queryIDXX",
  "durationInMS": 20,
  "schema": [
    {
      "name": "col1",
      "type": "varchar"
    },
    {
      "name": "col2",
      "type": "varchar"
    }
  ],
  "result": {
    "totalRows": 12412,
    "totalSize": 12000,
    "format":"json",
    "header":true,
    "sampleRecords": []
  }
}
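
As a rough illustration of the submit call described above, here is a minimal Java sketch that builds (but does not send) the POST request. The broker address, the class and method names, and the simplified JSON escaping are all assumptions made for this example; the only payload field used is "query", as in the existing /druid/v2/sql endpoint. How mode is passed is not specified in this description, so it is left out.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side helper, not part of this PR.
final class StatementSubmit {

    // Builds the POST request for the statements endpoint without sending it.
    // brokerBaseUrl (for example "http://localhost:8082") is an assumed value.
    static HttpRequest buildSubmitRequest(String brokerBaseUrl, String sql) {
        String body = "{\"query\":\"" + escapeJson(sql) + "\"}";
        return HttpRequest.newBuilder(URI.create(brokerBaseUrl + "/druid/v2/sql/statements"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    // Simplified escaping for illustration only: handles backslash, double quote
    // and newline. A real client should use a JSON library instead.
    static String escapeJson(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n");
    }
}
```

The request can then be sent with java.net.http.HttpClient, and the JSON body of the reply would carry the "state" and "queryID" fields shown in the sample responses.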

Query execution failures

This section is the same as https://druid.apache.org/docs/latest/querying/querying.html#query-execution-failures
If a query fails, Druid returns a response with an HTTP response code and a JSON object with the following structure:

{
  "error" : "Query timeout",
  "errorMessage" : "Timeout waiting for task.",
  "errorClass" : "java.util.concurrent.TimeoutException",
  "host" : "druid1.example.com:8083"
}

Status

GET /sql/statements/{id}

Returns the same response as the POST API.
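
Since the status call returns the same "state" field as the POST response, a client would typically poll it until the statement stops running. The sketch below shows one way to do that; the class name, the Supplier-based design, and the choice of FINISHED and FAILED as the terminal states are assumptions for illustration (the state names are the ones that appear in this PR).

```java
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical polling helper, not part of this PR.
final class StatementPoller {

    // Assumed terminal states, taken from the names used in this conversation.
    private static final Set<String> TERMINAL = Set.of("FINISHED", "FAILED");

    // Calls fetchState (for example a GET on /sql/statements/{id}) up to
    // maxAttempts times, sleeping between attempts. Returns the last state seen.
    static String pollUntilTerminal(Supplier<String> fetchState, int maxAttempts, long sleepMillis) {
        String state = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            state = fetchState.get();
            if (TERMINAL.contains(state)) {
                return state;
            }
            if (attempt < maxAttempts - 1) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop polling.
                    Thread.currentThread().interrupt();
                    return state;
                }
            }
        }
        return state;
    }
}
```

Passing the state lookup as a Supplier keeps the loop independent of any particular HTTP client, which also makes it easy to exercise with canned states.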

Results

GET /sql/statements/{id}/results?offset=x&numRows=y&size=s&timeout=z

Returns a page of results if they are available.

    id: Query ID.
    offset: Row offset of the first row to fetch.
    numRows: Number of rows to fetch.
    size: Size limit of the results, in bytes. (Optional: size limit in rows. The smallest value wins.) (Not part of the initial implementation.)
    timeout: The timeout, in ms, to wait for results.
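
To make the paging parameters concrete, here is a small Java sketch that builds the results URL for one page and computes the offset of the next page. The helper names and the broker address are hypothetical, and the sketch assumes the query ID contains only URL-safe characters. Only offset and numRows are used, since size is not part of the initial implementation.

```java
import java.net.URI;

// Hypothetical paging helper, not part of this PR.
final class ResultsPage {

    // Builds the GET URL for one page of results.
    static URI resultsUri(String brokerBaseUrl, String queryId, long offset, long numRows) {
        if (offset < 0 || numRows <= 0) {
            throw new IllegalArgumentException("offset must be >= 0 and numRows must be > 0");
        }
        return URI.create(
                brokerBaseUrl + "/druid/v2/sql/statements/" + queryId
                + "/results?offset=" + offset + "&numRows=" + numRows);
    }

    // Offset of the page that follows the one starting at offset.
    static long nextOffset(long offset, long numRows) {
        return Math.addExact(offset, numRows);
    }
}
```

A client could stop paging once nextOffset reaches the "totalRows" value reported in the status response.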

Close Query

DELETE /sql/statements/{id}

Cancels a running/accepted query.
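
For completeness, a cancel request could be built along the same lines. This is only a sketch: the class name and broker address are assumptions, and the request is built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side helper, not part of this PR.
final class StatementCancel {

    // Builds the DELETE request that cancels a running or accepted statement.
    static HttpRequest buildCancelRequest(String brokerBaseUrl, String queryId) {
        return HttpRequest.newBuilder(URI.create(brokerBaseUrl + "/druid/v2/sql/statements/" + queryId))
                .DELETE()
                .build();
    }
}
```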

TODO

  • Add more docs, both user-facing and Javadocs
  • Unit tests

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@cryptoe cryptoe marked this pull request as draft June 13, 2023 15:01
@vogievetsky vogievetsky added the Needs web console change Backend API changes that would benefit from frontend support in the web console label Jun 14, 2023
@cryptoe cryptoe changed the title [Draft] Adding Interactive API's for MSQ engine Adding Interactive API's for MSQ engine Jun 19, 2023
@cryptoe cryptoe marked this pull request as ready for review June 19, 2023 06:24

TaskStatusResponse taskResponse = overlordClient.getTaskStatus(queryId);
if (taskResponse == null) {
  return Response.status(Response.Status.NOT_FOUND).build();
Contributor

It will be useful to add a message that includes queryId in it.


TaskStatusPlus statusPlus = taskResponse.getStatus();
if (statusPlus == null || !MSQControllerTask.TYPE.equals(statusPlus.getType())) {
  return Response.status(Response.Status.NOT_FOUND).build();
Contributor

it will be useful to add an error message along with response code.

  );
} else if (sqlStatementState == SqlStatementState.FAILED) {
  return buildNonOkResponse(
      Response.Status.NOT_FOUND.getStatusCode(),
Contributor

it shouldn't be a 404 IMO. the status code should come from statusPlus itself.

      new QueryException(null, statusPlus.getErrorMsg(), null, null),
      queryId
  );
} else {
Contributor

you can avoid this nested code by using the following structure:
if (abc)
  return xyz
if (abcd)
  return something_else
code for default
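
The structure suggested above is the usual guard-clause (early return) pattern. A minimal Java sketch of it follows; the method, its inputs, and the returned strings are made up for illustration and do not reflect the actual SqlStatementResource code.

```java
// Hypothetical example of the early-return structure, not code from this PR.
final class GuardClauseExample {

    // Each special case returns immediately, so the default path is not nested
    // inside an else branch.
    static String describe(String state, String errorMsg) {
        if (state == null) {
            return "unknown query";
        }
        if ("FAILED".equals(state)) {
            return "failed: " + errorMsg;
        }
        // Default handling for every other state.
        return "state: " + state;
    }
}
```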

Comment on lines +641 to +644
TaskStatusPlus statusPlus = taskResponse.getStatus();
if (statusPlus == null || !MSQControllerTask.TYPE.equals(statusPlus.getType())) {
  return Optional.empty();
}
Contributor

is this expected? If not, can we log some warning?

Contributor Author

I think we should not add a warning since it's bad user input.

cryptoe added 5 commits June 20, 2023 08:26
1. Adding more sanity post test
2. Adding rows and size to response.
3. Adding dataSource to response.
4. Adding exceptionDetails to response.
# Conflicts:
#	sql/src/test/java/org/apache/druid/sql/http/SqlResourceTest.java
@GET
@Path("/enabled")
@Produces(MediaType.APPLICATION_JSON)
public Response doGetEnabled(@Context final HttpServletRequest request)
Contributor

this context will be helpful in the javadocs.

if (numRows != 1) {
  throw new RE("Expected a single row but got [%d] rows. Please check broker logs for more information.", numRows);
}
String taskId = (String) rows.get(0)[0];
Contributor

please add a size check on rows.get(0)
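
A size check of the kind requested here could look roughly like the sketch below. It is a standalone illustration: the class and method names are hypothetical, and it throws IllegalStateException rather than the Druid RE exception used in the snippet under review.

```java
import java.util.List;

// Hypothetical standalone version of the check, not code from this PR.
final class TaskIdExtractor {

    // Returns the task id from a single-row result, validating both the number
    // of rows and the width of the first row before indexing into it.
    static String extractTaskId(List<Object[]> rows) {
        if (rows == null || rows.size() != 1) {
            int found = rows == null ? 0 : rows.size();
            throw new IllegalStateException("Expected a single row but got [" + found + "] rows.");
        }
        Object[] row = rows.get(0);
        if (row == null || row.length < 1) {
            throw new IllegalStateException("Expected at least one column in the result row.");
        }
        return (String) row[0];
    }
}
```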

for (Map.Entry<String, Object> worker : counterMap.entrySet()) {
  Object workerChannels = worker.getValue();
  if (workerChannels == null || !(workerChannels instanceof Map)) {
    return Optional.empty();
Contributor

do you ever expect this though?

Contributor Author

No. Just defensive coding.

  return FutureUtils.getUnchecked(future, true);
}
catch (RuntimeException e) {
  throw new QueryException(null, "Unable to contact overlord " + e.getMessage(), null, null, null);
Contributor

what's the expected action associated with this error message?

Contributor Author

@cryptoe cryptoe Jun 28, 2023

This can happen for n reasons. Changed it to a developer exception.

DruidException.forPersona(DruidException.Persona.DEVELOPER)
                          .ofCategory(DruidException.Category.UNCATEGORIZED)
                          .build("Unable to contact overlord " + e.getMessage());
    }

MSQControllerTask msqControllerTask = getMSQControllerTaskOrThrow(queryId, authenticationResult.getIdentity());
Optional<List<ColNameAndType>> signature = getSignature(msqControllerTask);
if (!signature.isPresent()) {
  return Response.ok().build();
Contributor

hmm. what does it mean for signature to be absent?

Contributor Author

Signature should not be empty, but that's for the MSQ engine to decide.
We can't return results if the signature is empty. So if the results call gets an empty signature, we just return an empty result.

if (!signature.isPresent()) {
  return Response.ok().build();
}
Optional<List<Object>> results = getResults(getPayload(overlordWork(overlordClient.taskReportAsMap(queryId))));
Contributor

so the results are in the task report?

Contributor Author

Yes.

Contributor

@adarshsanjeev adarshsanjeev left a comment

Partially reviewing this, still going through a few classes. Looks good to me overall. Had a few nits.

@@ -2056,7 +2056,7 @@ private <T> T deserializeResponse(Response resp, Class<T> clazz) throws IOExcept
return JSON_MAPPER.readValue(responseToByteArray(resp), clazz);
}

- private byte[] responseToByteArray(Response resp) throws IOException
+ public static byte[] responseToByteArray(Response resp) throws IOException
Contributor

Minor nit: Could be pushed up to CalciteTestBase so that other classes can use it in the future

Contributor Author

@cryptoe cryptoe Jun 28, 2023

I generally avoid pushing stuff to CalciteTestBase until necessary, since it's already a big class. We can probably do that if this method ends up being used in other places in the future.

@@ -52,5 +53,7 @@ public interface OverlordClient

ListenableFuture<Map<String, Object>> taskReportAsMap(String taskId);

ListenableFuture<TaskPayloadResponse> taskPayload(String taskId);
Contributor

nit: This function might need a better name to avoid confusing it with the task payload containing results, since we have a payload() function in SqlStatementResource already. Maybe taskDefinition?

Contributor Author

This is borrowed from the existing indexingServiceClient, hence I did not want to change the name.

final long last = SqlStatementResourceHelper.getLastIndex(numberOfRows, start);


getStatementStatus(queryId, authenticationResult.getIdentity(), false);
Contributor

Is this call required? It seems to be checked after this already.

Contributor Author

Oh yes that needs to be removed. Thanks for the catch.

Contributor

@adarshsanjeev adarshsanjeev left a comment

Looks good to me overall. I don't see any blocking issues here.

{

try {
Access authResult = AuthorizationUtils.authorizeAllResourceActions(
Contributor Author

@abhishekagarwal87 I am following the pattern from

final Access authResult = AuthorizationUtils.authorizeAllResourceActions(
.

We might need to add a new method to AuthorizationUtils to set the req header, but that can be done as part of a follow-up PR.

Contributor Author

@cryptoe cryptoe Aug 20, 2023

@abhishekagarwal87 Raised a PR : #14878

@cryptoe cryptoe added the Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 label Jun 28, 2023
@cryptoe cryptoe merged commit cb3a9d2 into apache:master Jun 28, 2023
Contributor Author

cryptoe commented Jun 28, 2023

Thanks @abhishekagarwal87 @adarshsanjeev for the reviews. User-facing docs will be coming soon.

cryptoe added a commit that referenced this pull request Jul 7, 2023
…14527)

One of the most requested features in Druid is the ability to download big result sets.
As part of #14416, we added the ability for MSQ to be queried via a query-friendly endpoint. This PR builds upon that work and adds the ability for MSQ to write select results to durable storage.

We write the results to the durable storage location <prefix>/results/<queryId> in the druid frame format. This is exposed to users by
/v2/sql/statements/:queryId/results.
abhishekagarwal87 pushed a commit that referenced this pull request Jul 17, 2023
This PR catches the console up to all the backend changes for Druid 27

Specifically:

Add page information to SqlStatementResource API #14512
Allow empty tiered replicants map for load rules #14432
Adding Interactive API's for MSQ engine #14416
Add replication factor column to sys table #14403
Account for data format and compression in MSQ auto taskAssignment #14307
Errors take 3 #14004
abhishekagarwal87 pushed a commit that referenced this pull request Jul 17, 2023
This PR catches the console up to all the backend changes for Druid 27

Specifically:

Add page information to SqlStatementResource API #14512
Allow empty tiered replicants map for load rules #14432
Adding Interactive API's for MSQ engine #14416
Add replication factor column to sys table #14403
Account for data format and compression in MSQ auto taskAssignment #14307
Errors take 3 #14004

Co-authored-by: Vadim Ogievetsky <[email protected]>
@abhishekagarwal87 abhishekagarwal87 added this to the 27.0 milestone Jul 19, 2023
@vogievetsky vogievetsky removed the Needs web console change Backend API changes that would benefit from frontend support in the web console label Aug 8, 2023