
Unable to process all valid rows from a stream #1719

Open
lounsbrough opened this issue Jan 14, 2025 · 5 comments

Comments

@lounsbrough commented Jan 14, 2025

We have a use case where we are streaming thousands of records from a database and occasionally one record will have an error. We would love to have the option to keep streaming records even when a read fails. The current implementation always destroys the stream on any error:

this.destroy(err);
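
For context, here is a simplified sketch of how this looks from the consumer's side (the connection details and query are placeholders, not our actual code):

const oracledb = require('oracledb');

async function run() {
  const connection = await oracledb.getConnection({
    user: 'example_user',                  // placeholder credentials
    password: process.env.DB_PASSWORD,
    connectString: 'localhost/examplepdb'
  });

  const stream = connection.queryStream('SELECT col1, col2 FROM some_table');

  stream.on('data', (row) => {
    // forward the row downstream
  });

  stream.on('error', (err) => {
    // Today the stream is destroyed as soon as this fires,
    // so no further 'data' events arrive for the remaining rows.
    console.error(err);
  });

  stream.on('close', () => {
    connection.close();
  });
}

run();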

@cjbj (Member) commented Jan 14, 2025

What kind of error are you seeing?

@lounsbrough (Author) commented Jan 14, 2025

We first build a query that invokes a function in the select list, for example:

SELECT field1, problem_function(field2)
FROM some_table

We then hand that query to queryStream and pipe it into a Postgres copy stream.
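
Roughly, the plumbing looks like the sketch below. This is a simplified illustration rather than our exact code: the Postgres side is assumed to use pg and pg-copy-streams, the table and column names are made up, and a real pipeline also has to escape values for the COPY text format.

const oracledb = require('oracledb');
const copyFrom = require('pg-copy-streams').from;
const { Transform } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// connection: an open node-oracledb connection
// pgClient:   a connected Client from the 'pg' package
async function copyOracleToPostgres(connection, pgClient) {
  const source = connection.queryStream(
    'SELECT field1, problem_function(field2) FROM some_table'
  );

  // Turn each array-mode row into one line of COPY text format.
  // Simplified: real code must escape tabs, newlines and backslashes.
  const toCopyText = new Transform({
    writableObjectMode: true,
    transform(row, _enc, callback) {
      callback(null, row.join('\t') + '\n');
    }
  });

  const sink = pgClient.query(
    copyFrom('COPY target_table (field1, field2) FROM STDIN')
  );

  // If problem_function() raises (e.g. ORA-20100) while a row is being
  // fetched, the source stream errors and the whole pipeline aborts.
  await pipeline(source, toCopyText, sink);
}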

Many thousands of records get successfully piped into Postgres, but at some point the function in our SELECT throws an error on one record. That Oracle error causes the read to fail on the Node stream, and because of the way queryStream is implemented, the entire stream is destroyed. What would help us is the option to have queryStream simply emit an error event without destroying the stream. All the remaining records could continue to be processed, and the consumer could choose how to handle the emitted error events.
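
To illustrate what I mean, something like the sketch below is what I have in mind. The skipOnError option shown here is purely hypothetical; it does not exist in node-oracledb today.

// Hypothetical sketch only: 'skipOnError' is NOT an existing node-oracledb option.
const failedReads = [];
const stream = connection.queryStream(
  'SELECT field1, problem_function(field2) FROM some_table',
  [],
  { skipOnError: true }   // hypothetical flag
);

stream.on('error', (err) => {
  // Desired behaviour: record the failure for the bad row, but keep
  // receiving 'data' events for the remaining rows.
  failedReads.push(err);
});

stream.on('data', (row) => {
  // pipe or process rows as usual
});

stream.on('end', () => {
  console.log('rows that failed to read:', failedReads.length);
});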

Our specific error is ORA-20100, and it is coming from a proprietary chemistry cartridge in Oracle. I don't think the specific error is relevant to the issue, though; I think this would happen with any error thrown by Oracle while reading a record.

Hopefully that adds a little more context; let me know if you have more questions about what we're experiencing.

@cjbj (Member) commented Jan 15, 2025

It would be "interesting" to try to work out whether errors can be ignored, and which specific ones (user errors?) can be ignored. Perhaps querying could just continue and then fail on the next fetch if the DB stops sending rows altogether.

The simplest solution might be to keep all the data in Oracle DB :)

@lounsbrough (Author)

Yeah, I agree it's possible that what I'm asking for isn't feasible, depending on how the interaction between the getRow call and the database works. I was assuming that getRow would throw for the problematic row, but that the cursor(?) could then continue to iterate over the result set. Maybe that's not the case; it's just how I imagined it in my head. I'm not an Oracle expert, so I'm making a lot of assumptions as I think about this. I just wanted to throw this out as a helpful feature, if it's possible to add.
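
For what it's worth, my rough mental model is something like the sketch below, using a plain ResultSet instead of queryStream. Whether the result set actually stays usable after an ORA- error on one row is exactly the part I'm unsure about.

const MAX_ROW_ERRORS = 100;

async function fetchSkippingBadRows(connection) {
  const result = await connection.execute(
    'SELECT field1, problem_function(field2) FROM some_table',
    [],
    { resultSet: true }
  );
  const rs = result.resultSet;
  const errors = [];

  try {
    while (true) {
      let row;
      try {
        row = await rs.getRow();
      } catch (err) {
        // Assumption under discussion: getRow() throws for the bad row but the
        // cursor can keep going. It is not clear Oracle actually allows this.
        errors.push(err);
        // Guard so we don't spin forever if the cursor is really unusable.
        if (errors.length >= MAX_ROW_ERRORS) throw err;
        continue;
      }
      if (!row) break;   // no more rows
      // process row ...
    }
  } finally {
    await rs.close();
  }

  return errors;
}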

@cjbj (Member) commented Jan 15, 2025

The feedback about the goal and limitation is good to hear so that we can make changes, or help influence other Oracle teams' decisions.
