Pandas feels clunky coming from R. What about Haskell?

(mchav.github.io)

23 points | by mchav a day ago

7 comments

  • kermatt a day ago

    Although not quite the same without a pipe operator, https://pola.rs was an improvement for me when missing the R dataframe syntax.

    • minimaxir a day ago

      OP makes that concession in the first section of the post. (I may or may not have made a similar comment before deleting it in knee-jerk shame.)

  • rgavuliak 18 hours ago

    > This has a great SQL-ish API. Python is similar but starts to be a little clunky since it requires you to think about indices:

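    groupby has an as_index parameter for this very purpose: passing as_index=False keeps the group keys as ordinary columns instead of moving them into the index.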

    > Deducting the discount

    You focus on doing the subtraction during the group by. Is there any good reason for this? You could do it either as a step before grouping or after summing both columns. Putting too many things into one command is not good practice, yet you benchmark the language on how easy it is to do exactly that.
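
    A minimal sketch of both points, assuming a hypothetical orders table (the column names region, price, and discount are made up for illustration and do not come from the post):

```python
import pandas as pd

# Hypothetical data; the column names are assumptions for illustration only.
orders = pd.DataFrame({
    "region": ["east", "east", "west"],
    "price": [10.0, 20.0, 5.0],
    "discount": [1.0, 2.0, 0.5],
})

# as_index=False keeps "region" as an ordinary column rather than the index.
totals = orders.groupby("region", as_index=False)[["price", "discount"]].sum()

# Option 1: subtract after summing both columns, as a separate step.
totals["net"] = totals["price"] - totals["discount"]

# Option 2: subtract row by row first, then group and sum the result.
net_first = (
    orders.assign(net=orders["price"] - orders["discount"])
    .groupby("region", as_index=False)["net"]
    .sum()
)

print(totals)
print(net_first)
```

    Both options give the same net total per region here, since summing and subtracting commute for this calculation.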

    • mchav 16 hours ago

      I think the original author picked this example to broadly illustrate how easy it is to make ad hoc changes to your query without worrying a lot about implementation details. Polars, for example, converges on a similar API and gives you the same flexibility. You can iterate first, then easily refactor later into what you consider good practice.

      • rgavuliak 12 hours ago

        For me, piping made everything less readable and harder to debug compared to a sequence of separate commands.

  • internet_points 20 hours ago

    > You need to read the previous line to understand what

    that ended rather abruptly?
