Showing 6 changed files with 284 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -29,6 +29,7 @@ API Reference | |
api/scalar | ||
api/builder | ||
api/table | ||
api/c_abi | ||
api/compute | ||
api/tensor | ||
api/utilities | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,44 @@ | ||
.. Licensed to the Apache Software Foundation (ASF) under one | ||
.. or more contributor license agreements. See the NOTICE file | ||
.. distributed with this work for additional information | ||
.. regarding copyright ownership. The ASF licenses this file | ||
.. to you under the Apache License, Version 2.0 (the | ||
.. "License"); you may not use this file except in compliance | ||
.. with the License. You may obtain a copy of the License at | ||
.. http://www.apache.org/licenses/LICENSE-2.0 | ||
.. Unless required by applicable law or agreed to in writing, | ||
.. software distributed under the License is distributed on an | ||
.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
.. KIND, either express or implied. See the License for the | ||
.. specific language governing permissions and limitations | ||
.. under the License. | ||
============ | ||
C Interfaces | ||
============ | ||
|
||
ABI Structures | ||
============== | ||
|
||
.. doxygenstruct:: ArrowSchema | ||
:project: arrow_cpp | ||
|
||
.. doxygenstruct:: ArrowArray | ||
:project: arrow_cpp | ||
|
||
.. doxygenstruct:: ArrowArrayStream | ||
:project: arrow_cpp | ||
|
||
C Data Interface | ||
================ | ||
|
||
.. doxygengroup:: c-data-interface | ||
:content-only: | ||
|
||
C Stream Interface | ||
================== | ||
|
||
.. doxygengroup:: c-stream-interface | ||
:content-only: |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,218 @@ | ||
.. Licensed to the Apache Software Foundation (ASF) under one | ||
.. or more contributor license agreements. See the NOTICE file | ||
.. distributed with this work for additional information | ||
.. regarding copyright ownership. The ASF licenses this file | ||
.. to you under the Apache License, Version 2.0 (the | ||
.. "License"); you may not use this file except in compliance | ||
.. with the License. You may obtain a copy of the License at | ||
.. http://www.apache.org/licenses/LICENSE-2.0 | ||
.. Unless required by applicable law or agreed to in writing, | ||
.. software distributed under the License is distributed on an | ||
.. "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
.. KIND, either express or implied. See the License for the | ||
.. specific language governing permissions and limitations | ||
.. under the License. | ||
.. highlight:: c | ||
|
||
.. _c-stream-interface: | ||
|
||
============================ | ||
The Arrow C stream interface | ||
============================ | ||
|
||
.. warning:: | ||
This interface is experimental and may evolve based on feedback from | ||
early users. ABI stability is not guaranteed yet. Feel free to | ||
`contact us <https://arrow.apache.org/community/>`__. | ||
|
||
The C stream interface builds on the structures defined in the | ||
:ref:`C data interface <c-data-interface>` and combines them into a higher-level | ||
specification so as to ease the communication of streaming data within a single | ||
process. | ||
|
||
Semantics | ||
========= | ||
|
||
An Arrow C stream exposes a streaming source of data chunks, each with the | ||
same schema. Chunks are obtained by calling a blocking pull-style iteration | ||
function. | ||
|
||
Structure definition | ||
==================== | ||
|
||
The C stream interface is defined by a single ``struct`` definition:: | ||
|
||
struct ArrowArrayStream { | ||
// Callbacks providing stream functionality | ||
int (*get_schema)(struct ArrowArrayStream*, struct ArrowSchema* out); | ||
int (*get_next)(struct ArrowArrayStream*, struct ArrowArray* out); | ||
const char* (*get_last_error)(struct ArrowArrayStream*); | ||
|
||
// Release callback | ||
void (*release)(struct ArrowArrayStream*); | ||
|
||
// Opaque producer-specific data | ||
void* private_data; | ||
}; | ||
|
||
The ArrowArrayStream structure | ||
------------------------------ | ||
|
||
The ``ArrowArrayStream`` provides the required callbacks to interact with a | ||
streaming source of Arrow arrays. It has the following fields: | ||
|
||
.. c:member:: int (*ArrowArrayStream.get_schema)(struct ArrowArrayStream*, struct ArrowSchema* out) | ||
*Mandatory.* This callback allows the consumer to query the schema of | ||
the chunks of data in the stream. The schema is the same for all | ||
data chunks. | ||
|
||
This callback must NOT be called on a released ``ArrowArrayStream``. | ||
|
||
*Return value:* 0 on success, a non-zero | ||
:ref:`error code <c-stream-interface-error-codes>` otherwise. | ||
|
||
.. c:member:: int (*ArrowArrayStream.get_next)(struct ArrowArrayStream*, struct ArrowArray* out) | ||
*Mandatory.* This callback allows the consumer to get the next chunk | ||
of data in the stream. | ||
|
||
This callback must NOT be called on a released ``ArrowArrayStream``. | ||
|
||
*Return value:* 0 on success, a non-zero | ||
:ref:`error code <c-stream-interface-error-codes>` otherwise. | ||
|
||
On success, the consumer must check whether the ``ArrowArray`` is | ||
marked :ref:`released <c-data-interface-released>`. If the | ||
``ArrowArray`` is released, then the end of stream has been reached. | ||
Otherwise, the ``ArrowArray`` contains a valid data chunk. | ||
|
||
.. c:member:: const char* (*ArrowArrayStream.get_last_error)(struct ArrowArrayStream*) | ||
*Mandatory.* This callback allows the consumer to get a textual description | ||
of the last error. | ||
|
||
This callback must ONLY be called if the last operation on the | ||
``ArrowArrayStream`` returned an error. It must NOT be called on a | ||
released ``ArrowArrayStream``. | ||
|
||
*Return value:* a pointer to a NUL-terminated character string (UTF-8 encoded). | ||
NULL can also be returned if no detailed description is available. | ||
|
||
The returned pointer is only guaranteed to be valid until the next call of | ||
one of the stream's callbacks. The character string it points to should | ||
be copied to consumer-managed storage if it is intended to survive longer. | ||
|
||
.. c:member:: void (*ArrowArrayStream.release)(struct ArrowArrayStream*) | ||
*Mandatory.* A pointer to a producer-provided release callback. | ||
|
||
.. c:member:: void* ArrowArrayStream.private_data | ||
*Optional.* An opaque pointer to producer-provided private data. | ||
|
||
Consumers MUST NOT process this member. The lifetime of this member | ||
is handled by the producer, and especially by the release callback. | ||
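The advice given for ``get_last_error`` (copy the string if it must survive the next callback) can be sketched as follows. This is a minimal illustration, not part of the interface: the helper name ``copy_last_error``, the fallback text, and the ``toy_*`` producer callbacks are all hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct ArrowSchema;  // defined by the C data interface
struct ArrowArray;   // defined by the C data interface

struct ArrowArrayStream {
  int (*get_schema)(struct ArrowArrayStream*, struct ArrowSchema* out);
  int (*get_next)(struct ArrowArrayStream*, struct ArrowArray* out);
  const char* (*get_last_error)(struct ArrowArrayStream*);
  void (*release)(struct ArrowArrayStream*);
  void* private_data;
};

// Hypothetical helper: returns a malloc()ed copy of the last error
// description, or of a fallback text if the producer returned NULL.
// The caller owns the copy and must free() it. Returns NULL only if
// the allocation itself fails.
static char* copy_last_error(struct ArrowArrayStream* stream) {
  const char* desc;
  size_t size;
  char* copy;

  assert(stream->release != NULL);  // never call on a released stream
  desc = stream->get_last_error(stream);
  if (desc == NULL) desc = "(no error description available)";
  size = strlen(desc) + 1;
  copy = (char*)malloc(size);
  if (copy != NULL) memcpy(copy, desc, size);
  return copy;
}

// Toy producer callbacks, for demonstration only (an assumption of this
// sketch): the error text is simply stored in private_data.
static const char* toy_get_last_error(struct ArrowArrayStream* stream) {
  return (const char*)stream->private_data;
}

static void toy_release(struct ArrowArrayStream* stream) {
  stream->release = NULL;
}
```

The copy stays valid even if the producer later overwrites or frees its own error buffer, which is the situation the pointer-validity rule warns about.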
|
||
|
||
.. _c-stream-interface-error-codes: | ||
|
||
Error codes | ||
----------- | ||
|
||
The ``get_schema`` and ``get_next`` callbacks may return an error in the form | ||
of a non-zero integer code. Such error codes should be interpreted like | ||
``errno`` numbers (as defined by the local platform). Note that the symbolic | ||
forms of these constants are stable from platform to platform, but their numeric | ||
values are platform-specific. | ||
|
||
In particular, it is recommended to recognize the following values: | ||
|
||
* ``EINVAL``: for a parameter or input validation error | ||
* ``ENOMEM``: for a memory allocation failure (out of memory) | ||
* ``EIO``: for a generic input/output error | ||
|
||
.. seealso:: | ||
`Standard POSIX error codes <https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/errno.h.html>`__. | ||
|
||
`Error codes recognized by the Windows C runtime library | ||
<https://docs.microsoft.com/en-us/cpp/c-runtime-library/errno-doserrno-sys-errlist-and-sys-nerr>`__. | ||
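Because only the symbolic names are portable, a consumer would typically compare against ``EINVAL``, ``ENOMEM`` and ``EIO`` rather than against literal numbers. The sketch below shows one way to do that; the helper name ``classify_stream_error`` and the category strings are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

// Hypothetical helper: maps a code returned by get_schema / get_next to a
// short description. It switches on the symbolic errno constants, since
// their numeric values differ between platforms.
static const char* classify_stream_error(int errcode) {
  switch (errcode) {
    case 0:
      return "success";
    case EINVAL:
      return "invalid parameter or input";
    case ENOMEM:
      return "out of memory";
    case EIO:
      return "input/output error";
    default:
      // Any other errno value: defer to the platform's own description
      return strerror(errcode);
  }
}

// Usage example
static void classify_examples(void) {
  assert(strcmp(classify_stream_error(0), "success") == 0);
  assert(strcmp(classify_stream_error(ENOMEM), "out of memory") == 0);
}
```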
|
||
Result lifetimes | ||
---------------- | ||
|
||
The data returned by the ``get_schema`` and ``get_next`` callbacks must be | ||
released independently. Their lifetimes are not tied to that of the | ||
``ArrowArrayStream``. | ||
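To illustrate this independence, the following sketch obtains a schema, releases the stream first, and only then reads and releases the schema. The ``toy_*`` producer is hypothetical and deliberately incomplete (it leaves out ``get_next`` and ``get_last_error``, which a real producer must provide); the ``ArrowSchema`` layout is the one from the C data interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Layout as specified by the C data interface
struct ArrowSchema {
  const char* format;
  const char* name;
  const char* metadata;
  int64_t flags;
  int64_t n_children;
  struct ArrowSchema** children;
  struct ArrowSchema* dictionary;
  void (*release)(struct ArrowSchema*);
  void* private_data;
};

struct ArrowArray;  // not needed in this sketch

struct ArrowArrayStream {
  int (*get_schema)(struct ArrowArrayStream*, struct ArrowSchema* out);
  int (*get_next)(struct ArrowArrayStream*, struct ArrowArray* out);
  const char* (*get_last_error)(struct ArrowArrayStream*);
  void (*release)(struct ArrowArrayStream*);
  void* private_data;
};

// --- Toy producer (hypothetical) ---

static void toy_schema_release(struct ArrowSchema* schema) {
  free(schema->private_data);  // the heap-allocated format string
  schema->release = NULL;
}

static int toy_get_schema(struct ArrowArrayStream* stream,
                          struct ArrowSchema* out) {
  char* format = (char*)malloc(2);
  (void)stream;
  if (format == NULL) return ENOMEM;
  memcpy(format, "i", 2);  // "i" is the int32 format string
  memset(out, 0, sizeof(*out));
  out->format = format;
  out->name = "";
  out->private_data = format;  // owned by the schema, not by the stream
  out->release = toy_schema_release;
  return 0;
}

static void toy_stream_release(struct ArrowArrayStream* stream) {
  stream->release = NULL;
}

// --- Consumer ---

// Returns 1 if the schema is still readable after the stream was released.
static int schema_outlives_stream(void) {
  // get_next and get_last_error are omitted (NULL) in this sketch only
  struct ArrowArrayStream stream = {toy_get_schema, NULL, NULL,
                                    toy_stream_release, NULL};
  struct ArrowSchema schema;
  int ok;

  if (stream.get_schema(&stream, &schema) != 0) {
    stream.release(&stream);
    return 0;
  }
  stream.release(&stream);               // the stream is gone...
  ok = strcmp(schema.format, "i") == 0;  // ...but the schema is still valid
  schema.release(&schema);               // released independently
  return ok && schema.release == NULL && stream.release == NULL;
}
```

The design point is that the producer must not let the schema borrow memory owned by the stream's ``private_data``; here the schema carries its own allocation.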
|
||
Stream lifetime | ||
--------------- | ||
|
||
The lifetime of the C stream is managed using a release callback, which is used | ||
in the same way as in the :ref:`C data interface <c-data-interface-released>`. | ||
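A minimal sketch of that release protocol follows, assuming (as in the C data interface) that a released structure is marked by a NULL ``release`` pointer. The ``ToyState`` type and the ``toy_*`` and ``stream_*`` helpers are hypothetical, and the data callbacks are left out to keep the focus on lifetime handling.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct ArrowSchema;  // defined by the C data interface
struct ArrowArray;   // defined by the C data interface

struct ArrowArrayStream {
  int (*get_schema)(struct ArrowArrayStream*, struct ArrowSchema* out);
  int (*get_next)(struct ArrowArrayStream*, struct ArrowArray* out);
  const char* (*get_last_error)(struct ArrowArrayStream*);
  void (*release)(struct ArrowArrayStream*);
  void* private_data;
};

// Hypothetical producer-side state kept behind private_data
struct ToyState {
  int chunks_left;
};

// Producer: free everything owned by the stream, then mark it released
static void toy_release(struct ArrowArrayStream* stream) {
  free(stream->private_data);
  stream->private_data = NULL;
  stream->release = NULL;  // a NULL release pointer marks a released stream
}

// Producer: returns 0 on success, ENOMEM on allocation failure
static int toy_stream_init(struct ArrowArrayStream* stream, int n_chunks) {
  struct ToyState* state = (struct ToyState*)malloc(sizeof(*state));
  if (state == NULL) return ENOMEM;
  state->chunks_left = n_chunks;
  // Data callbacks omitted in this sketch; a real producer must set them
  stream->get_schema = NULL;
  stream->get_next = NULL;
  stream->get_last_error = NULL;
  stream->release = toy_release;
  stream->private_data = state;
  return 0;
}

// Consumer: has the stream already been released?
static int stream_is_released(const struct ArrowArrayStream* stream) {
  return stream->release == NULL;
}

// Consumer: release the stream unless that already happened
static void stream_release_once(struct ArrowArrayStream* stream) {
  if (!stream_is_released(stream)) {
    stream->release(stream);
    assert(stream_is_released(stream));
  }
}
```

Guarding the call this way makes cleanup paths safe to run more than once, for example from both an error handler and a normal exit path.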
|
||
|
||
C consumer example | ||
================== | ||
|
||
Let's say a particular database provides the following C API to execute | ||
a SQL query and return the result set as an Arrow C stream:: | ||
|
||
void MyDB_Query(const char* query, struct ArrowArrayStream* result_set); | ||
|
||
Then a consumer could use the following code to iterate over the results:: | ||
|
||
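#include <stdint.h> | ||
#include <stdio.h> | ||
#include <stdlib.h> | ||
#include <string.h> | ||
|
||
static void handle_error(int errcode, struct ArrowArrayStream* stream) { | ||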
// Print stream error | ||
const char* errdesc = stream->get_last_error(stream); | ||
if (errdesc != NULL) { | ||
fputs(errdesc, stderr); | ||
} else { | ||
fputs(strerror(errcode), stderr); | ||
} | ||
// Release stream and abort | ||
stream->release(stream); | ||
exit(1); | ||
} | ||
|
||
void run_query() { | ||
struct ArrowArrayStream stream; | ||
struct ArrowSchema schema; | ||
struct ArrowArray chunk; | ||
int errcode; | ||
|
||
MyDB_Query("SELECT * FROM my_table", &stream); | ||
|
||
// Query result set schema | ||
errcode = stream.get_schema(&stream, &schema); | ||
if (errcode != 0) { | ||
handle_error(errcode, &stream); | ||
} | ||
|
||
int64_t num_rows = 0; | ||
|
||
// Iterate over results: loop until error or end of stream | ||
while ((errcode = stream.get_next(&stream, &chunk)) == 0 && | ||
chunk.release != NULL) { | ||
// Do something with chunk... | ||
fprintf(stderr, "Result chunk: got %lld rows\n", (long long)chunk.length); | ||
num_rows += chunk.length; | ||
|
||
// Release chunk | ||
chunk.release(&chunk); | ||
} | ||
|
||
// Was it an error? | ||
if (errcode != 0) { | ||
handle_error(errcode, &stream); | ||
} | ||
|
||
fprintf(stderr, "Result stream ended: total %lld rows\n", (long long)num_rows); | ||
|
||
// Release schema and stream | ||
schema.release(&schema); | ||
stream.release(&stream); | ||
} |
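For the other side of the interface, here is a rough sketch of what a producer could look like. It is not taken from any real implementation: the ``toy_*`` names, the fixed ``CHUNK_ROWS`` size and the use of a plain counter as ``private_data`` are assumptions made for illustration. The stream yields a given number of int32 chunks, then signals end of stream by returning 0 with a released ``ArrowArray``.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Field order follows the C data and C stream interface definitions
struct ArrowSchema {
  const char *format, *name, *metadata;
  int64_t flags, n_children;
  struct ArrowSchema** children;
  struct ArrowSchema* dictionary;
  void (*release)(struct ArrowSchema*);
  void* private_data;
};
struct ArrowArray {
  int64_t length, null_count, offset, n_buffers, n_children;
  const void** buffers;
  struct ArrowArray** children;
  struct ArrowArray* dictionary;
  void (*release)(struct ArrowArray*);
  void* private_data;
};
struct ArrowArrayStream {
  int (*get_schema)(struct ArrowArrayStream*, struct ArrowSchema* out);
  int (*get_next)(struct ArrowArrayStream*, struct ArrowArray* out);
  const char* (*get_last_error)(struct ArrowArrayStream*);
  void (*release)(struct ArrowArrayStream*);
  void* private_data;
};

#define CHUNK_ROWS 4  // hypothetical fixed chunk size

static void toy_schema_release(struct ArrowSchema* schema) {
  schema->release = NULL;  // only static strings, nothing to free
}

static void toy_array_release(struct ArrowArray* array) {
  free((void*)array->buffers[1]);  // data buffer
  free((void*)array->buffers);     // buffer pointer table
  array->release = NULL;
}

static int toy_get_schema(struct ArrowArrayStream* stream,
                          struct ArrowSchema* out) {
  assert(stream->release != NULL);
  memset(out, 0, sizeof(*out));
  out->format = "i";  // int32
  out->name = "";
  out->release = toy_schema_release;
  return 0;
}

static int toy_get_next(struct ArrowArrayStream* stream,
                        struct ArrowArray* out) {
  int* chunks_left = (int*)stream->private_data;
  int32_t* values;
  const void** buffers;
  int i;

  memset(out, 0, sizeof(*out));  // out->release == NULL means end of stream
  if (*chunks_left == 0) return 0;

  values = (int32_t*)malloc(CHUNK_ROWS * sizeof(*values));
  buffers = (const void**)malloc(2 * sizeof(*buffers));
  if (values == NULL || buffers == NULL) {
    free(values);
    free((void*)buffers);
    return ENOMEM;
  }
  for (i = 0; i < CHUNK_ROWS; i++) values[i] = i;
  buffers[0] = NULL;  // no validity bitmap, since null_count is 0
  buffers[1] = values;
  out->length = CHUNK_ROWS;
  out->n_buffers = 2;
  out->buffers = buffers;
  out->release = toy_array_release;
  (*chunks_left)--;
  return 0;
}

static const char* toy_get_last_error(struct ArrowArrayStream* stream) {
  (void)stream;
  return NULL;  // no detailed description available
}

static void toy_stream_release(struct ArrowArrayStream* stream) {
  free(stream->private_data);
  stream->release = NULL;
}

// Fills *stream; returns 0 on success or ENOMEM
static int toy_stream_init(struct ArrowArrayStream* stream, int n_chunks) {
  int* chunks_left = (int*)malloc(sizeof(*chunks_left));
  if (chunks_left == NULL) return ENOMEM;
  *chunks_left = n_chunks;
  stream->get_schema = toy_get_schema;
  stream->get_next = toy_get_next;
  stream->get_last_error = toy_get_last_error;
  stream->release = toy_stream_release;
  stream->private_data = chunks_left;
  return 0;
}
```

A consumer loop like the one in ``run_query`` above would work unchanged against such a stream: each chunk owns its buffers and frees them in its own release callback, independently of the stream.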