1 comment

  • pranav1077 8 hours ago

    PyCanopy offers spatial query planning + adaptive spatial indexing in Polars-native format, without having to switch to SQL. On Apache's single-node spatial query performance benchmark, PyCanopy wins 12 of 24 test cases against industry-standard tools like Apache Sedona and DuckDB.

    I got interested in this recently through my undergrad research, where it became clear that we lack intuitive + performant data tooling for working with spatial data in dataframes. I thought it would be useful to have a library that offers some of the benefits of relational DBs (query planning, indexing, etc.) in Polars-native form, but abstracts the complexity away from the user and requires zero SQL.

    I've gone into more detail in the repo about how it performs on benchmarks + how it works, so feel free to take a look. I'd be glad to hear any thoughts / feedback.