
I'm annoyed that I bothered to read the tutorial for this. The TLDR: "Write some generators or functions, put them in a list, and Bonobo will call them all for you in order. Look at the example files for more." The example files are all basic string transformations. The docs are mostly blank pages and missing sections. What little is written has more jokes and conversational tics than information.

What does this even do? There's mention of DAGs and different execution strategies if you really dig through the docs, but is that it? If so, why would you use this instead of joblib or some other established parallelism lib?



Bonobo runs each function in the pipeline in parallel and makes the FIFO queue plumbing and thread pool management completely transparent.

The TLDR would then be "Write some generators or functions, link them in a graph, and call them in order on each line of data as soon as the previous transformation node's output is ready." For example, if you have a database cursor that yields each row of a query as its output, Bonobo starts running the next step(s) in the graph as soon as the first result is ready (and the cursor keeps yielding from the database without waiting for the graph to finish the current row). I did not find this easy to do with the libraries I tried.
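To make the streaming idea above concrete, here is a minimal sketch using only the Python standard library. It is not Bonobo's API or implementation: `run_stage`, `run_pipeline`, `fake_cursor`, `double`, and `describe` are hypothetical names invented for illustration, and the one-thread-per-node layout is an assumption. It only shows how FIFO queues let a downstream step start on the first row while the source is still producing.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of a stream (illustrative, not from Bonobo)


def run_stage(func, inbox, outbox):
    """Pull items from inbox, apply the generator func, push each result to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # propagate end-of-stream to the next stage
            return
        for result in func(item):
            outbox.put(result)


def run_pipeline(source, *stages):
    """Run the source and every stage in its own thread, linked by FIFO queues."""
    queues = [queue.Queue() for _ in range(len(stages) + 1)]

    def feed():
        # The source keeps yielding rows without waiting for downstream stages.
        for row in source():
            queues[0].put(row)
        queues[0].put(_DONE)

    threads = [threading.Thread(target=feed)]
    for i, stage in enumerate(stages):
        threads.append(
            threading.Thread(target=run_stage, args=(stage, queues[i], queues[i + 1]))
        )
    for t in threads:
        t.start()

    # Collect whatever comes out of the last queue until the sentinel arrives.
    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results


def fake_cursor():
    # Stand-in for a database cursor that yields one row at a time.
    for i in range(5):
        yield i


def double(row):
    yield row * 2


def describe(row):
    yield f"row={row}"


rows = run_pipeline(fake_cursor, double, describe)
```

Because each stage here is a single thread reading from a FIFO queue, row order is preserved. The sketch has no error handling (an exception inside a stage would leave the collector waiting forever), which is part of the plumbing a real tool would have to take care of.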

The docs are clearly incomplete, to say the least. They need an example with a big dataset, one with long individual operations, and one with a non-linear graph, so it's more obvious that, of course, it's not made to convert strings to uppercase twice in a row.

Stay tuned. I'm very happy HN brought it to the homepage, though I did not really think that could happen at this stage, and I understand your point. But it's a good thing for the project to move forward.


This is really cool!

Python is my usual language of choice, but recently I picked up Go for some data processing because there were a lot of benefits to parallelising the task, which Go made easy.


Yeah, why does anyone need something to run some functions in order for them? I can do that, thanks. If it ran them on some... say 'big data' platform, that would be something. As is, this does not deserve to be front page. This is vaporware.


Yeah there seems to be a lot of marketing but I found a concise definition on the author's personal website: "extract transform load for python 3.5+". It could be noted that some of the earlier commit messages include "more marketing".


They mentioned it is in ALPHA, give it some time.


Pretty snappy website for a project that's in alpha.



