> I am utterly puzzled by how you can see this killer feature as a downside.
It depends on the usage that you make of the language. I don't use Julia as a general-purpose programming language, but only for doing numeric computations. For that usage, the multiple dispatch feature is not necessary (all my variables are matrices of the same numeric type), and indeed it has downsides. For example, the infamous "time to first plot" is so large in Julia precisely due to the need to support multiple dispatch. If this benchmark is getting faster, it is only thanks to a huge and complex optimization effort. I would prefer that this effort be spent on improving support for sparse matrices, for example.
I acknowledge that "time to first plot" may be a useless benchmark for many, even for most, people. But you can also acknowledge that multiple dispatch is a useless feature for a few graybeards like me, for whom multiple dispatch is the main reason why Julia feels slow. Indeed, when you use Julia as an interpreted language (e.g., write a Julia script), execution is very slow due to the complexities introduced by multiple dispatch, which essentially force all libraries to be recompiled on each execution of my tiny script.
Multiple dispatch isn't useless for you. It's the reason the numerics are so good even when everything is the same type. Multiple dispatch makes it way easier to write generic code which means that all the libraries you use are about 10x easier to write.
Their point is that they're not trying to write generic code. In that case, Julia should act like an interactive Fortran, which would be nice, but then it needs to fix latency and binary building for the "this is pure Float64 and we know it" case. The only gain one really gets here is that type inference means you can write slightly simpler code, but that's not a huge win. That's a valid criticism that should hopefully be addressed soon, with https://github.com/JuliaLang/julia/pull/41936 being one of the biggest steps to getting there, along with further precompilation work.
Not to second-guess you (you know far more about Julia than I) but my experience is that people who think that their code is completely uniformly typed still win massively from Julia.
It starts with "just dense matrices with Float64" and then, like the poster above, "but with some nice sparse matrix support" and then "oh and special handling for Vandermonde matrices", "oh and tridiagonals" and eventually "my matrices are hyper-<mumble>-<mumble>-symmetric and I can compute vector dot products really fast". At that point, they have been using multiple dispatch for months without knowing it.
In my own case it was "I don't need multiple dispatch" until it was "oh... it surely is nice that the H3 geo-hashes work transparently with LibGEOS even though the underlying C libraries don't work together."
The point is that Julia makes things work together far better than anything else I have seen. It even makes completely asocial C libraries talk to each other.
Thanks for your answer! I'm the one who sees everything as "just float matrices". Do you have a simple example of a numeric algorithm where multiple dispatch really makes a difference (as opposed to simply allowing you to pass different types of numbers to your functions)? I have trouble imagining that.
In octave/matlab I can already use the "sum" function over dense and sparse matrices, but you wouldn't say that the language is multiple dispatch. It doesn't seem like a big deal. More importantly, I want the "sum" function to have exactly the same meaning regardless of the data type.
I abhor the idea of hiding algorithms inside types, on a very fundamental level. Allowing an algorithm to behave differently depending on the type of the input data seems totally wrong to me. The famous Stefan Karpinski talk at JuliaCon 2019 is one of the most horrifying videos I've ever seen. I still wake up at night trembling and in a cold sweat when I dream about this talk.
>I want the "sum" function to have exactly the same meaning regardless of the data type.
You really don't. If you have a sparse matrix, you don't want to spend 99% of your time adding zeros.
In general, the advantage of multiple dispatch is that you automatically get optimal algorithms for a variety of type combinations. To show why this matters, look at matrix multiplication. If you have a high-level function that multiplies matrices, you want to call the appropriate BLAS function (for dense inputs). That function will be one of SGEMM, SSYMM, STRMM, DGEMM, DSYMM, DTRMM, CGEMM, CSYMM, CTRMM, ZGEMM, ZSYMM, or ZTRMM depending on the type (and element type) of the matrix (this is a simplified example; in the real world you also might want to handle diagonal, banded, CSR, CSC, block, or any of 20 or so different matrix types). Without multiple dispatch, you either have to write a bunch of if-else statements to choose the appropriate one, or you just convert everything to a dense (and probably double precision) matrix first. The first one is totally unmaintainable, and the second one will make your program an order of magnitude slower when you ignore structure inherent to your problem. With multiple dispatch, you just call * and it does all the hard stuff for you.
The proof that multiple dispatch is necessary is that most numerics libraries that aren't in Julia make ad-hoc and slow implementations of it internally. For example, here is PyTorch's implementation: https://pytorch.org/tutorials/advanced/dispatcher.html
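To make the routine-selection idea concrete, here is a rough Python sketch of a hand-rolled dispatch table keyed on the types of both operands. Everything in it is hypothetical (the `Dense`, `Symmetric`, and `Triangular` classes, the `method` decorator, and `matmul_routine`); it only returns the name of the BLAS routine that would be chosen, never calls BLAS, and assumes both operands share one element type. It is not how Julia implements dispatch, just an illustration of what the language does for you.

```python
# Hypothetical wrapper types: the structure lives in the class,
# the element type in the `dtype` attribute.
class Matrix:
    def __init__(self, dtype):
        self.dtype = dtype

class Dense(Matrix): pass
class Symmetric(Matrix): pass
class Triangular(Matrix): pass

# BLAS naming convention: one letter per element type.
PREFIX = {"float32": "S", "float64": "D", "complex64": "C", "complex128": "Z"}

_methods = {}  # (type of left operand, type of right operand) -> implementation

def method(left_type, right_type):
    """Register an implementation for one pair of argument types."""
    def register(fn):
        _methods[(left_type, right_type)] = fn
        return fn
    return register

def matmul_routine(a, b):
    """Pick the implementation from the types of BOTH arguments."""
    try:
        impl = _methods[(type(a), type(b))]
    except KeyError:
        raise TypeError(
            f"no method for ({type(a).__name__}, {type(b).__name__})"
        ) from None
    return impl(a, b)

@method(Dense, Dense)
def _general(a, b):
    return PREFIX[a.dtype] + "GEMM"

@method(Symmetric, Dense)
def _symmetric(a, b):
    return PREFIX[a.dtype] + "SYMM"

@method(Triangular, Dense)
def _triangular(a, b):
    return PREFIX[a.dtype] + "TRMM"

print(matmul_routine(Dense("float64"), Dense("float64")))
print(matmul_routine(Symmetric("float32"), Dense("float32")))
```

Supporting a new structure means registering one more function rather than editing a central if-else chain, which is the maintainability argument in a nutshell.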
Thanks, here you presented an example that actually makes sense to me.
In the case of octave/matlab, of course computing the sum() of sparse matrices does not traverse all the zeros. But the result is the same as if it did, just much faster. There are also many types of sparse matrices; I wonder how it works internally, it must have a similar mechanism. Probably, since it is interpreted in real time, each time that the "sum" function is called, it checks the type of the arguments and calls the appropriate function.
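The kind of runtime check guessed at here could look roughly like the following Python sketch. It is purely illustrative: `SparseMatrix` and `total` are hypothetical names, and nothing in it reflects how Octave or Matlab are actually implemented.

```python
class SparseMatrix:
    """Hypothetical coordinate-format matrix: only nonzeros are stored."""
    def __init__(self, shape, entries):
        self.shape = shape
        self.entries = dict(entries)  # {(row, col): value}

def total(matrix):
    """Sum all elements, choosing the algorithm from the argument's type."""
    # The type check runs on every call, as an interpreter would do it.
    if isinstance(matrix, SparseMatrix):
        # Visit stored nonzeros only; the zeros are never touched.
        return sum(matrix.entries.values())
    # Dense fallback: a list of rows, every element is visited.
    return sum(sum(row) for row in matrix)

dense = [[0, 0, 3], [0, 5, 0], [0, 0, 0]]
sparse = SparseMatrix((3, 3), {(0, 2): 3, (1, 1): 5})

print(total(dense), total(sparse))
```

Both calls mean "the sum of all elements"; only the amount of work differs.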
> most numerics libraries that aren't in Julia make ad-hoc and slow implementations
The word "slow" has a relative meaning here... if you take into account the time of the first compilation. For example, calling an octave script that performs a matrix product is faster than the equivalent julia script.
>the result is the same as if it did
This isn't quite true. Differently ordered operations can lead to different results due to floating point math, but in general, the point of multiple dispatch isn't to produce different results for different types. The example from Stefan's talk does this when contrasting with operator overloading because it's easier to show a difference in results than a difference in performance.
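The floating-point caveat is easy to check with a small Python sketch. The values are chosen only to make the effect visible, and `naive_sum` is a hypothetical helper that adds strictly left to right (the built-in `sum` is avoided because recent Python versions use compensated summation for floats).

```python
def naive_sum(values):
    """Add the values strictly left to right in double precision."""
    total = 0.0
    for v in values:
        total += v
    return total

# The same three numbers in two different orders.
in_order = naive_sum([1e17, 1.0, -1e17])   # the 1.0 is absorbed by 1e17
reordered = naive_sum([1e17, -1e17, 1.0])  # the large terms cancel first

print(in_order, reordered)
```

The two results differ even though the inputs are the same set of numbers, which is why an algorithm that regroups additions is only equal "up to rounding".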
I don't know the specifics of how Matlab/Octave do this, but you're probably right about runtime checks.
It's true that compilation time will make Julia slower than Octave for a single multiplication, but if you are doing a bunch of them (especially if they are small), Julia can perform the computations with much lower overhead since it doesn't have to do type checks for type-stable programs. Also, Julia lets you define specialized matrix types which can be asymptotically faster than Octave in specific circumstances. A great example of this is BandedBlockBandedMatrix from https://github.com/JuliaMatrices/BlockBandedMatrices.jl, which is extremely useful for solving PDEs quickly but isn't implemented in any of the other languages, because they have to write all their dispatch systems manually for each algorithm.
Oh I definitely agree with you, and I think the OP's response to this post starts to go into the subjective ergonomic feels territory that is just, well, okay, everyone feels differently. But there is a technical point that precompiling and building binaries for the simple Float64 case is a bit harder than it should be compared to C or Fortran, so there is still a reason to grab Fortran every once in a while. It's not unfixable, but it's something we should acknowledge that we can improve.
It kind of is. Multiple dispatch means that a static binary has to deal with most of the downsides of heavily templated C++ code (but much worse since everything is templated). Fixing this is highly non-trivial.