In my niche corner of scientific computing it feels like Cython has largely been replaced by Numba and CFFI, or just Julia. Last I checked it still needed setup.py which is a bit of a deal breaker in 2025.
/? cython pyproject.toml: https://www.google.com/search?q=cython+pyproject.toml
From "Building cython extensions using only pyproject.toml (no setup.py)" https://github.com/pypa/setuptools/discussions/4154#discussi... :

[build-system]
requires = ["setuptools", "cython"]

[tool.setuptools]
ext-modules = [
    {name = "example", sources = ["example.pyx"]}  # You can also specify all the usual options like language or include_dirs
]
Pybind11 seems more popular in my area now. I still like Cython though in terms of the ease of wrapping anything in a Python-y interface.
extern "C" functions + ctypes are a personal favorite - it's the least "type-rich" approach by far, and I prefer poverty to this sort of riches
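A stdlib-only sketch of that pattern, calling the C math library instead of a custom shared object (the library name lookup is platform-dependent, so the soname fallback here assumes a typical Linux system):

```python
import ctypes
import ctypes.util

# Locate the C math library; find_library may return None on some
# platforms, so fall back to the common Linux soname.
libm_path = ctypes.util.find_library("m") or "libm.so.6"
libm = ctypes.CDLL(libm_path)

# ctypes assumes int arguments and returns by default, so spell
# out the C signature explicitly: double sqrt(double)
libm.sqrt.argtypes = [ctypes.c_double]
libm.sqrt.restype = ctypes.c_double

print(libm.sqrt(9.0))  # 3.0
```

The same declarations work unchanged against your own extern "C" shared library; only the path handed to CDLL differs.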
Obligatory Rust + PyO3/Maturin plug. Very ergonomic and easy to use.
That's true, but I still don't see that so much, because the core libraries are not as mature and are often just thin wrappers around the C/C++/Fortran API without examples. Just as an example, I'd count this SUNDIALS crate as one of those: https://docs.rs/sundials/0.3.2/sundials/
Nothing wrong with that as a starting point of course, but it's easier just to compile it as a dependency and look at the core documentation if you're familiar with C++; you'll need to be reading the C++ examples anyway to write Rust code with it.
And it will get even better with reflection: there are already a few talks on generating Python bindings with C++26 reflection.
Sorry, I can't find a relationship between Sundials and PyO3/Maturin. Am I missing something?
What I mean is that (at least in my experience) people are not so commonly writing serious numeric applications in Rust as Python extensions, because the numeric libraries you'd typically build on in a compiled language are not as well developed and are themselves often thin wrappers over C/C++ code at the moment. When you write an extension library you typically want all the 'slow' stuff to be done in a layer below the interpreted language for performance reasons.
So if you wanted to write a Python physics library that included, say, time integration with an implicit solver like those SUNDIALS provides (and SUNDIALS is the gold standard in this area), you have fewer well-used options for the time-integration part if you write the extension in Rust than if you do it in C/C++. Or you end up using the same library anyway.
It looks like Narwhals; "Narwhals and scikit-Lego came together to achieve dataframe-agnosticism" https://news.ycombinator.com/item?id=40950813 :
> Narwhals: https://narwhals-dev.github.io/narwhals/ :
>> Extremely lightweight compatibility layer between [pandas, Polars, cuDF, Modin]
> Lancedb/lance works with [Pandas, DuckDB, Polars, Pyarrow,]; https://github.com/lancedb/lance
SymPy has solvers for ODEs and PDEs, and other libraries do convex optimization. SymPy also has lambdify to compile a relatively slow symbolic expression tree into faster 'vectorized' functions.
From https://news.ycombinator.com/item?id=40683777 re: warp :
> sympy.utilities.lambdify.lambdify() https://github.com/sympy/sympy/blob/master/sympy/utilities/l... :
>>> """Convert a SymPy expression into a function that allows for fast numeric evaluation""" [with e.g. the CPython math module, mpmath, NumPy, SciPy, CuPy, JAX, TensorFlow, PyTorch (*), SymPy, numexpr, but not yet cmath]
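A minimal sketch of what lambdify does, assuming SymPy is installed (the expression is just an arbitrary example):

```python
import math
import sympy

x = sympy.symbols("x")
expr = sympy.sin(x) + x**2

# Compile the symbolic expression tree into a plain Python function
# backed by the stdlib math module; pass "numpy" instead to get a
# function that accepts arrays.
f = sympy.lambdify(x, expr, "math")

print(f(0.5))  # == math.sin(0.5) + 0.25
```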
I’m perfectly familiar with SymPy and it’s great, but it doesn’t have methods comparable in performance to CVODE on stiff PDEs, and it’s not parallelised either. CVODES offers sensitivity analysis, ARKODE offers multi-rate integrators for systems where the ODE can be decomposed into slow and fast rates, etc. etc. - it’s a much more sophisticated and specialist library.
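For readers wondering why implicit solvers matter for stiff problems, here is a stdlib-only backward-Euler sketch; it has nothing to do with SUNDIALS internals, and the implicit step is solved in closed form only because this test equation happens to be linear:

```python
import math

LAM = 1000.0  # stiffness parameter in y' = LAM * (cos(t) - y)

def backward_euler(h=0.01, t_end=1.0):
    # Implicit update y_{n+1} = y_n + h * LAM * (cos(t_{n+1}) - y_{n+1}),
    # solved exactly for y_{n+1} since the ODE is linear.
    y, t = 0.0, 0.0
    while t < t_end - 1e-12:
        t += h
        y = (y + h * LAM * math.cos(t)) / (1.0 + h * LAM)
    return y

def forward_euler(h=0.01, t_end=1.0):
    # Explicit update; unstable whenever h * LAM > 2.
    y, t = 0.0, 0.0
    while t < t_end - 1e-12:
        y = y + h * LAM * (math.cos(t) - y)
        t += h
    return y

# The true solution relaxes onto cos(t) almost immediately.
print(backward_euler())       # close to cos(1) ~ 0.5403
print(abs(forward_euler()))   # astronomically large: h * LAM = 10 here
```

With h * LAM = 10 the explicit method amplifies errors by a factor of about 9 per step while the implicit one damps them; that stability gap, not raw speed, is why specialist stiff integrators like CVODE exist.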
Thanks, but experimental support based off a Github comment is not what I'm looking for when I distribute software.
People who need pyproject.toml functionality could consider contributing tests, so that the free functionality might become adequate for their purposes.
I haven't kept track of numba in recent years. But there is a clear path to translate more and more scikit-learn to mojo, bypassing the python interpreter entirely. And then things become much more composable in a way that numba can't be.
We are heavily leaning on Julia, and to my mind Mojo is a major threat to the long term development of the Julia community. If people dissatisfied with Python+C(++)-Silos end up writing Mojo instead of Julia it will become even harder to grow the ecosystem and community.
That said, for now Julia has a number of big strengths for scientific work that don't seem to be in the focus of the Mojo devs...
> Mojo is a major threat to the long term development of the Julia community
Mojo has 3 disadvantages compared to Julia:
1) The core team is focused on the Linux+servers+AI combination, because that's where the money is.
2) Less composability due to the lack of multiple dispatch.
3) The license.
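To make point 2 concrete, here is a stdlib-only sketch of the gap (an illustration of Python's limits, not a statement about Mojo's design):

```python
from functools import singledispatch

# singledispatch specializes on the type of the *first* argument only;
# Julia-style multiple dispatch would pick a method based on all of them.
@singledispatch
def combine(a, b):
    return f"generic({a!r}, {b!r})"

@combine.register
def _(a: int, b):
    return a + b  # selected because a is an int, regardless of b's type

print(combine(1, 2))    # 3
print(combine("x", 2))  # generic('x', 2)
```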
Yeah, I went to JuliaCon last year, and it was clear that Julia really seems to have found its niche in the scientific computing world.
I like the language, but as I do ML, Python is really the only game in town, and Mojo is looking promising.
> Last I checked it still needed setup.py which is a bit of a deal breaker in 2025.
lolwut
A more interesting path is to keep dbscan_inner in pure Python with type annotations and then use … to translate.

Very interesting. I'm currently trading off whether to use Mojo or C++/pybind to accelerate simulations that combine matrix operations with fine-grained scalar calculations. I only recently learned that pybind + cppimport offers the integrated compile-on-import experience available in Mojo.
I would say it depends on how stable you need the code to be.
If it's something you need to put in production soon, C++/pybind might be the way to go, but if it's just a side-project, Mojo could work.
Mojo makes SIMD and GPU programming more ergonomic than what you would get from C++; I imagine this should factor into your decision process. The language is just less mature overall.
Depends on how much you care about working on Windows; if not at all, then Mojo can be considered.
Mojo is not open source, so how can it be realistic to use it in scikit-learn?
We spent decades getting out of the clutches of Mathworks, Microsoft, etc. Why are people eager to go back that way?
They want to open-source the language, and call me naive, but I do believe that they will.
The licence is a bit weird to me though. I do get it for their main product, Max, but it is an odd choice for a language.
> I think moving a lot of scikit-learn’s more computationally intensive code to Mojo could be an interesting project.
Only if you want to lose access to Windows users, as it is a low priority for Mojo development.
Fair, but it would also be a multi-year project, and I wouldn't take it seriously until Mojo reaches a 1.0
As per the current roadmap that seems to be around 2027, assuming everything goes as planned.
That's not too bad right, seeing as 2026 is getting pretty close?
Enough time for landscape changes, though.
Somehow just trying to navigate to this website makes my browser crash.
Firefox on Android with NoScript.
Something with NoScript is causing it. I was able to load the site fine, then installed NoScript and it suddenly crashed.
Mhh, any idea what I could do? It's my website.
I just use Quarto to create a static site, but I am also very clueless about web stuff.