
Feather (Arrow IPC) is zero copy and an order of magnitude simpler. Parquet has a lot of compatibility issues between readers and writers.

Arrow is also directly usable as the application memory model. It’s pretty common to read Parquet into Arrow for transport.



When you say compatibility issues, you mean they are more problematic or less?

> It’s pretty common to read Parquet into Arrow for transport.

I'm confused by this. Are you referring to Arrow Flight RPC? Or are you saying that distributed analytic engines use Arrow to transport Parquet data between queries?


Not the OP, but Parquet compatibility issues are usually due to the varying support of features across implementations. You have to take that into account when writing Parquet data (unless you go with the defaults, which can be conservative and suboptimal).

Recently we have started documenting this to better inform choices: https://parquet.apache.org/docs/file-format/implementationst...



