Initially I was really excited by Redshift, but when I got a chance to play with...

jasonkester · on Feb 15, 2013

Look for the COPY command. Upload your raw data to S3 (gzipped if you like) and you can pull it straight in from there.
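For anyone who hasn't seen it, here is a rough sketch of what that COPY statement could look like, built as a string in Python. The table name, bucket path, and role ARN are made-up placeholders, and `build_copy_statement` is a hypothetical helper, not part of any Redshift client library. It assumes pipe-delimited files and authorization through an `IAM_ROLE` clause; adjust for however your cluster is set up.

```python
def build_copy_statement(table, s3_uri, iam_role_arn, gzipped=True, delimiter="|"):
    """Return a Redshift COPY statement that loads delimited files from S3.

    Hypothetical helper for illustration only. Values are interpolated
    directly into the SQL, so pass trusted input only.
    """
    parts = [
        f"COPY {table}",
        f"FROM '{s3_uri}'",
        f"IAM_ROLE '{iam_role_arn}'",
        f"DELIMITER '{delimiter}'",
    ]
    if gzipped:
        # Tells COPY that the source files are gzip-compressed.
        parts.append("GZIP")
    return "\n".join(parts) + ";"


# Placeholder names: replace with your own table, bucket, and role.
statement = build_copy_statement(
    "events",
    "s3://my-bucket/events/",
    "arn:aws:iam::123456789012:role/MyRedshiftRole",
)
```

You would then run `statement` through whatever SQL client you already use to talk to the cluster.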

mallipeddi · on Feb 15, 2013

You can use the COPY command to do bulk imports from S3. We also support importing from DynamoDB.

arielweisberg · on Feb 16, 2013

What if DynamoDB doesn't solve my problem because I need transactions?

Why do I have to write code to perform an extra step and pay the extra cost and latency of pushing data through S3 just to get it into Redshift?

Not supporting trickle loading is a leaky abstraction IMO. It's not a ton of code to log statements until you have enough to justify an import, and you shouldn't push that complexity onto every database user.
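To make "log statements until you have enough to justify an import" concrete, here is a minimal sketch of that buffering logic. `TrickleBuffer` and its `flush` callback are hypothetical names; in practice the callback would presumably write the batch to S3 and issue a COPY, and a real version would also want a time-based flush so rows don't sit in memory indefinitely.

```python
from typing import Callable, List, Tuple


class TrickleBuffer:
    """Collect rows in memory and hand them off in batches.

    Illustrative only: `flush` is whatever bulk-load step you supply.
    """

    def __init__(self, flush: Callable[[List[Tuple]], None], max_rows: int = 1000):
        if max_rows < 1:
            raise ValueError("max_rows must be at least 1")
        self._flush = flush
        self._max_rows = max_rows
        self._rows: List[Tuple] = []

    def add(self, row: Tuple) -> None:
        """Buffer one row; trigger a bulk load once the batch is full."""
        self._rows.append(row)
        if len(self._rows) >= self._max_rows:
            self.flush()

    def flush(self) -> None:
        """Hand the buffered rows to the callback and reset the buffer."""
        if not self._rows:
            return
        batch, self._rows = self._rows, []
        self._flush(batch)


# Usage: collect batches in a list instead of loading them anywhere.
batches: List[List[Tuple]] = []
buf = TrickleBuffer(batches.append, max_rows=3)
for i in range(7):
    buf.add((i,))
buf.flush()  # push out the final partial batch
```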

Postgres supports copying from a binary stream; why not support that?

shanif · on March 1, 2013

I'd have to agree with arielweisberg here. Our organization was really excited about Redshift a few days ago, but after seeing each of our individual INSERTs take upwards of 2 seconds, and hearing that we should first upload to S3 or Dynamo, we decided the platform would not fit our needs.

Our goal is minimal architectural complexity, and uploading log files or other data to a file system before loading them into a data warehouse just doesn't make sense.

We're currently looking into Hadoop/HDFS/Impala due to cost constraints (Vertica would have been our primary choice). If anyone has any other suggestions it would be great to hear them.

darksaints · on Feb 15, 2013

It runs each statement individually? Or commits each statement individually?

Column stores are destined to have slower inserts, due to how the data is stored on disk. But if they are actually committing each statement individually, that is a problem.
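To illustrate the distinction between committing each statement and committing once per batch, here is a small sketch. It uses Python's built-in `sqlite3` purely as a stand-in because it ships with the standard library; it says nothing about how Redshift behaves or performs, and the `events` table and function names are made up for the example.

```python
import sqlite3

INSERT_SQL = "INSERT INTO events (id, msg) VALUES (?, ?)"


def make_db():
    """Create an in-memory database with a toy table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER, msg TEXT)")
    return conn


def insert_commit_each(conn, rows):
    """Commit after every INSERT: one transaction per row."""
    for row in rows:
        conn.execute(INSERT_SQL, row)
        conn.commit()


def insert_single_commit(conn, rows):
    """Run all INSERTs inside one transaction and commit once."""
    with conn:  # commits on success, rolls back on an exception
        conn.executemany(INSERT_SQL, rows)


rows = [(i, f"msg-{i}") for i in range(5)]
per_row_db = make_db()
batched_db = make_db()
insert_commit_each(per_row_db, rows)
insert_single_commit(batched_db, rows)
```

Both approaches end with the same data in the table; the difference is how many commits the database has to perform along the way.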

arielweisberg · on Feb 16, 2013

That is not the case with Vertica. Trickle loading is table-stakes functionality IMO.