blocking node with large result sets #181
Comments
Yikes. The only thing I can think of is that postgres is pushing you rows as fast as possible and they're taking all the memory available. Though the API is evented, it can still get overwhelmed by a back-end system sending you 1x10^6 rows. Are you using a query with parameters or one without?
Hi, thanks for your response. Maybe the issue is somehow correlated with the following behavior. Here are my results: loading all rows into memory: average(10000) = 110.9 ms. The first encounter with this issue was the following situation: given a database (DB1) containing a table with 10^6 rows and another database (DB2) with an empty copy of the same table, we created two connections, one to each database (evented API), and tried to read from DB1 and stream the data into DB2 on each 'row' event. Did we abuse the driver? Or did we use it in a manner it was never meant to be used? Thanks in advance.
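For context, a minimal sketch of the pattern described above, using the older evented query API (pg ≤ 6) with invented table and column names: every 'row' event from DB1 immediately issues an INSERT on DB2, and since nothing slows the source down, the queued INSERTs (and their row data) pile up in memory.

```js
// Hedged sketch of the DB1 -> DB2 copy pattern described above (older evented
// API, pg <= 6). Connection strings, table, and columns are placeholders.
const pg = require('pg');

const source = new pg.Client('postgres://localhost/db1');
const target = new pg.Client('postgres://localhost/db2');

source.connect();
target.connect();

const query = source.query('SELECT id, payload FROM big_table');

query.on('row', (row) => {
  // Each INSERT is queued on the target client; with ~10^6 rows arriving as
  // fast as the source can push them, these queued queries accumulate in
  // memory because nothing applies back-pressure to the source.
  target.query('INSERT INTO big_table (id, payload) VALUES ($1, $2)', [row.id, row.payload]);
});

query.on('end', () => {
  source.end();
  target.end();
});
```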
I have an issue with Node blocking as well. I've structured my query method using both the callback and event-listener approaches, and while I get results back from both, the following scenario blocks. The client requests a query of about 180,000 rows. An instance of the pg client is created and the query is sent to a remote server on the same network. The query takes about 10 seconds to run. All my console.log debugging messages fire immediately, telling me that all my methods have finished except the query, which is how Node is supposed to work. The client sends a second request 1 second later for a dataset of 500 rows. The client uses $.post in both instances, and I know it isn't blocking on the request. My server will not take my second request until the dataset from the first request has been returned. Is there a problem with multiple async requests from the same client? Am I missing something? Here is my code. Thanks in advance.
The PostgreSQL server only supports one executing query per client.
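To make that concrete, here is a hedged sketch (using the current promise-based pg API rather than the 2013-era one; query text and pool size are made up): two queries issued on one Client are queued and run one after another, while a Pool gives each query its own connection, so the small request is not stuck behind the large one.

```js
// Illustrative sketch: queries on a single Client run serially; a Pool lets
// them run concurrently on separate connections. Queries are hypothetical.
const { Client, Pool } = require('pg');

async function withSingleClient() {
  const client = new Client();
  await client.connect();
  // Both queries are queued on the one connection: the 500-row query waits
  // until the 180,000-row query has fully returned.
  const big = client.query('SELECT * FROM measurements');       // long-running
  const small = client.query('SELECT * FROM sites LIMIT 500');  // queued behind it
  await Promise.all([big, small]);
  await client.end();
}

async function withPool() {
  const pool = new Pool({ max: 10 });
  // Each pool.query() checks out its own connection, so the small query can
  // finish while the big one is still streaming back.
  const [big, small] = await Promise.all([
    pool.query('SELECT * FROM measurements'),
    pool.query('SELECT * FROM sites LIMIT 500'),
  ]);
  await pool.end();
  return [big.rowCount, small.rowCount];
}
```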
Does this help? Single row mode in libpq is only available for 9.2+, though. Regardless, it seems like it would be worth implementing. I have been wondering how to handle very large result sets, and this seems like the solution as long as you are using 9.2+. You can also use FETCH.
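A hedged sketch of the FETCH approach mentioned above (modern pg API; table and cursor names are invented): declare a server-side cursor inside a transaction and pull a bounded batch at a time, so only batchSize rows are ever held in memory, and it works on PostgreSQL versions older than 9.2 as well.

```js
// Sketch of batched retrieval with DECLARE CURSOR + FETCH. Only `batchSize`
// rows are in memory at once. Table and cursor names are placeholders.
const { Client } = require('pg');

async function processInBatches(batchSize = 1000) {
  const client = new Client();
  await client.connect();
  try {
    await client.query('BEGIN');
    await client.query('DECLARE big_cur CURSOR FOR SELECT * FROM big_table');
    for (;;) {
      const { rows } = await client.query(`FETCH ${batchSize} FROM big_cur`);
      if (rows.length === 0) break; // cursor exhausted
      for (const row of rows) {
        // process row here; the next batch is not fetched until this one is done
      }
    }
    await client.query('CLOSE big_cur');
    await client.query('COMMIT');
  } finally {
    await client.end();
  }
}
```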
There is another use case for supporting single row mode which I just ran into. Let's say you want to do a COPY but want the output of a particular column reformatted. For example, "true" instead of "t" for boolean columns. As far as I am aware, you can't do this with COPY, or at least not easily. But you could do this with a query in single row mode.
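One way to approximate that, sketched here with pg-query-stream rather than single row mode (table, column, and file names are invented): stream the rows of an ordinary query and rewrite fields, e.g. booleans as "true"/"false", before writing each line out.

```js
// Hedged sketch: COPY-like export with reformatted booleans, via a streamed
// query instead of COPY. Table, columns, and output path are hypothetical.
const { Client } = require('pg');
const QueryStream = require('pg-query-stream');
const fs = require('fs');

async function exportWithBooleans(path) {
  const client = new Client();
  await client.connect();
  const out = fs.createWriteStream(path);
  const stream = client.query(new QueryStream('SELECT id, active FROM users'));
  for await (const row of stream) {
    // node-postgres parses bool columns to JS booleans, so this writes
    // "true"/"false" instead of the "t"/"f" that COPY would emit.
    out.write(`${row.id},${row.active}\n`);
  }
  out.end();
  await client.end();
}
```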
…+. This will enable single row mode only when the user wants to stream rows.
Single row mode would indeed help with this but still won't properly take into account back-pressure when you are reading rows faster than you can process them. The problem is that rows are read off the socket and emitted as fast as they arrive, regardless of whether the consumer can keep up.
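For comparison, a pull-based sketch using pg-cursor (assuming a pg-cursor version whose read() and close() return promises when no callback is given; the table name and processRow helper are placeholders): the next batch is only requested after the previous one has been processed, so a slow consumer naturally throttles how fast rows arrive.

```js
// Pull-based reading with pg-cursor: at most 100 rows are requested at a time,
// and no new batch is fetched until processing of the current one finishes.
const { Client } = require('pg');
const Cursor = require('pg-cursor');

async function consumeSlowly() {
  const client = new Client();
  await client.connect();
  const cursor = client.query(new Cursor('SELECT * FROM big_table'));
  try {
    for (;;) {
      const rows = await cursor.read(100); // pull at most 100 rows
      if (rows.length === 0) break;
      for (const row of rows) {
        await processRow(row); // hypothetical slow consumer
      }
    }
  } finally {
    await cursor.close();
    await client.end();
  }
}
```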
Hi @brianc, sorry to bother you here. Do you have plans to integrate async generators (cursors) into pg-cursor? (I cannot assume everybody is familiar with async generators (aka cursors), so here's a gist that browses your entire filesystem in a for await loop.)
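As an illustration of the idea, not an existing pg or pg-cursor API, here is a small async-generator wrapper over pg-cursor that lets callers use for await ... of; the batch size, table name, and the helper itself are hypothetical.

```js
// Hedged illustration: an async generator over pg-cursor, yielding rows one by
// one so back-pressure falls out of the consuming loop itself.
const { Client } = require('pg');
const Cursor = require('pg-cursor');

async function* queryRows(client, text, values, batchSize = 100) {
  const cursor = client.query(new Cursor(text, values));
  try {
    for (;;) {
      const rows = await cursor.read(batchSize); // assumes promise-based read()
      if (rows.length === 0) return;
      yield* rows;
    }
  } finally {
    await cursor.close(); // runs even if the caller breaks out of the loop early
  }
}

// Hypothetical usage:
async function main() {
  const client = new Client();
  await client.connect();
  for await (const row of queryRows(client, 'SELECT * FROM big_table')) {
    // handle one row at a time; the next batch is fetched only when needed
  }
  await client.end();
}
```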
We have a table with more than 1x10^6 entries. We use the evented API for data retrieval.
Even when ignoring the row event (for debugging purposes), the node server is blocked during data processing.
Using the non-native mode, the node server is not blocked but responds extremely slowly (up to 5 seconds instead of 100 ms).
In native mode, the node server is completely blocked and allocates more than 2 GB of RAM.
In order to dive into the problem we tried to use your own "large-datatset-bench.js" with a slightly different number of rows (1x10^6).
The execution time with respect to the number of items to be inserted does not increase linearly (as expected).
What we noticed during both test cases was that the Postgres backend processes did not use more than 1.5% of CPU time.
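A minimal reproduction sketch of that setup, using the older evented query API (pg ≤ 6) with a placeholder connection string and table name: select the rows, ignore them in the 'row' handler, and time how long it takes until 'end' fires.

```js
// Sketch of the reported test: fetch ~10^6 rows with the evented API, discard
// each row, and measure elapsed time. Connection string and table are made up.
const pg = require('pg');

const client = new pg.Client('postgres://localhost/db1');
client.connect(() => {
  const start = Date.now();
  const query = client.query('SELECT * FROM big_table');

  query.on('row', () => {
    // intentionally ignore the row, as in the test described above
  });

  query.on('end', () => {
    console.log('done in', Date.now() - start, 'ms');
    client.end();
  });
});
```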