Hacker News

Initially I was really excited by Redshift, but when I got a chance to play with it I found out that there is no JDBC support for any kind of bulk insert or trickle loading.

When you try to do batch inserts, the Postgres JDBC driver runs each statement individually, and you end up inserting tens of rows a second.

I wish they had gone with something like Vertica.



Look for the COPY command. Upload your raw data to S3 (gzipped if you like) and you can pull it straight in from there.


You can use the COPY command to do bulk imports from S3. We also support importing from DynamoDB.
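For anyone who hasn't seen it, here is a rough sketch of what such a COPY statement might look like, built as a string in Python. The table name, bucket path, and role ARN are made up for illustration, and the helper `build_copy_statement` is hypothetical, not part of any AWS library. It uses the `IAM_ROLE` form of authorization; check the COPY documentation for the options that apply to your cluster.

```python
def build_copy_statement(table, s3_path, iam_role, gzipped=True):
    """Return a Redshift COPY statement that loads `table` from `s3_path`.

    Hypothetical helper: no identifier or literal escaping is done here,
    so only pass trusted values.
    """
    options = " GZIP" if gzipped else ""
    return f"COPY {table} FROM '{s3_path}' IAM_ROLE '{iam_role}'{options};"


# All of these values are placeholders, not real resources.
statement = build_copy_statement(
    table="events",
    s3_path="s3://example-bucket/events/2013-02-15/",
    iam_role="arn:aws:iam::123456789012:role/ExampleRedshiftLoad",
)
print(statement)
```

You would then run the resulting statement over whatever SQL connection you already have to the cluster.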


What if DynamoDB doesn't solve my problem because I need transactions?

Why do I have to write code to perform an extra step and pay the extra cost and latency of pushing data through S3 just to get it into Redshift?

Not supporting trickle loading is a leaky abstraction IMO. It's not a ton of code to log statements until you have enough to justify an import, and you shouldn't push that complexity onto every database user.
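To give a sense of how little code that buffering is, here is a minimal sketch in Python. `BufferedLoader` and its `flush_fn` callback are hypothetical names; in practice `flush_fn` would be whatever writes the batch somewhere loadable and kicks off the import. The threshold of 3 is only there to keep the demo short.

```python
class BufferedLoader:
    """Collects rows in memory and hands them to `flush_fn` in batches."""

    def __init__(self, flush_fn, max_rows=10_000):
        self.flush_fn = flush_fn  # hypothetical callback that performs the bulk import
        self.max_rows = max_rows  # assumed threshold; tune for your workload
        self.rows = []

    def add(self, row):
        """Buffer one row and flush once the threshold is reached."""
        self.rows.append(row)
        if len(self.rows) >= self.max_rows:
            self.flush()

    def flush(self):
        """Hand the buffered rows to `flush_fn` and start a fresh buffer."""
        if not self.rows:
            return
        batch, self.rows = self.rows, []
        self.flush_fn(batch)


# Demo: collect the batches in a list instead of loading them anywhere.
batches = []
loader = BufferedLoader(flush_fn=batches.append, max_rows=3)
for i in range(7):
    loader.add((i, f"event-{i}"))
loader.flush()  # push the final partial batch
print([len(b) for b in batches])  # prints [3, 3, 1]
```

A real version would presumably also flush on a timer and decide what to do when the import fails, since this sketch drops the batch if `flush_fn` raises.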

Postgres supports copying from a binary stream. Why not support that?


I'd have to agree with arielweisberg here. Our organization was really excited about Redshift a few days ago, but after seeing each of our individual INSERTs take upwards of 2 seconds, and hearing that we should first upload to S3 or Dynamo, we decided the platform would not fit our needs.

Our goal is minimal architectural complexity, and uploading log files or other data to a file system before loading them into a data warehouse just doesn't make sense.

We're currently looking into Hadoop/HDFS/Impala due to cost constraints (Vertica would have been our primary choice). If anyone has any other suggestions it would be great to hear them.


It runs each statement individually? Or commits each statement individually?

Column stores are destined to have slower inserts, due to how the data is stored on disk. But if they are actually committing each statement individually, that is a problem.
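To make the distinction concrete, here is a small sketch of the two client-side patterns using Python's built-in `sqlite3` module. SQLite is only a stand-in here, so this says nothing about Redshift's actual timings; the `events` table and the function names are made up for illustration.

```python
import sqlite3

rows = [(i, f"event-{i}") for i in range(1000)]


def insert_commit_each(conn, rows):
    """Run each statement individually and commit after every one."""
    for row in rows:
        conn.execute("INSERT INTO events VALUES (?, ?)", row)
        conn.commit()


def insert_commit_once(conn, rows):
    """Send all rows inside one transaction and commit a single time."""
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    conn.commit()


def count_after(load_fn):
    """Load `rows` into a fresh in-memory table and return the row count."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER, name TEXT)")
    load_fn(conn, rows)
    (count,) = conn.execute("SELECT COUNT(*) FROM events").fetchone()
    conn.close()
    return count


print(count_after(insert_commit_each), count_after(insert_commit_once))
```

Both end up with the same data; the difference is how many commits the database has to make durable along the way, which is usually where the per-row cost comes from.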


That is not the case with Vertica. Trickle loading is table stakes functionality IMO.



