Summary by Dan Luu of the question whether, for statically typed languages, objective advantages (like measurably fewer bugs, or solving problems in measurably less time) can be shown.
If I think about this, the authors of statically typed languages might, at the beginning, not even have claimed such advantages. Originally, the objective advantage was that compilers for languages like C or Pascal could run at all on computers like the PDP-11, which initially had only 4 K of memory and a 16-bit address space; and even later, C programs were much faster than the Lisp programs of that time. At that time, it was also considered an attribute of the programming language whether code was compiled to machine instructions or interpreted.
Today, with JIT compilation as in Java, and with the best Common Lisp implementations like SBCL being within a stone’s throw of the performance of Java programs, this distinction is not very relevant any more.
Further, opinions might have been biased by comparing C to memory-safe languages; in other words, where actual productivity gains were perceived, their causes might have been confused.
What seems to be more or less firm ground is that the fewer lines of code you need to write to cover a requirement, the fewer bugs that code will have. So more concise/expressive languages do have an advantage.
There are people who have looked at all the program samples in the benchmark game linked above and compared run-time performance and source-code size. This leads to interesting and sometimes really unintuitive insights - there are in fact large differences in code size for the same task between programming languages, and a couple of languages like Scala, JavaScript, Racket (PLT Scheme), and Lua come out quite well in the ratio of size to performance.
But given all this, how can one assess productivity, or the time to get from definition of a task to a working program, at all?
And the same kind of questions arise for testing. Most people would agree nowadays that automated tests are worth the effort, that they improve quality / shorten the time to get something working / lead to fewer bugs. (A modern version of the Joel Test might include automated testing, but, spoiler: >!Joel’s list does not contain it.!<)
Testing in small units also interacts positively with a “pure”, side-effect-free, or ‘functional’ programming style… with the caveat perhaps that this style might push complex I/O functions of a program to its periphery.
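To make that concrete, here is a tiny Rust sketch (the function and the numbers are made up for illustration) of why a small, pure unit is cheap to test - no setup, no mocks, no I/O:

```rust
/// A pure function: the result depends only on its inputs,
/// so a test needs no fixtures, mocks, or I/O.
fn net_price(gross: f64, discount_rate: f64) -> f64 {
    gross * (1.0 - discount_rate)
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn applies_discount() {
        // 25% off 100.0 should be 75.0 (compared with a small tolerance).
        assert!((net_price(100.0, 0.25) - 75.0).abs() < 1e-9);
    }
}
```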
It feels more solid to have a complex program covered by tests, yes, but how can this be confirmed in an objective way? And if it can, for which kind of software is this valid? Are the same methodologies adequate for web programming as for industrial embedded devices or a text editor?
An indisputable fact is that static typing and compilation virtually eliminate an entire class of runtime bugs that plague dynamically typed languages, and it’s not an insignificant class.
If you add a type checker to a dynamically typed language, you’ve re-invented a strongly typed, compiled language without any of the low-hanging-fruit gains of compilation.
Studies are important and informative; however, it’s really hard to fight against the Monte Carlo evidence of 60-ish years of organic evolution: there’s a reason why statically typed languages are considered more reliable and fast - it’s because they are. There isn’t some conspiracy to suppress Lisp.
This is aside from the main argument around static typing which claims advantages in developer productivity and eventual correctness of code.
Now, I think it is generally accepted that static typing helps with compilation to native code, which leads to faster executables.
Now, did you know that several good Lisp and Scheme implementations like SBCL, Chez Scheme, or Racket compile to native code, even though they are dynamically typed languages? This is done with the help of type inference.
And the argument is not that these are as fast as C or Rust - the argument is that the difference might be significantly smaller than what many people believe.
Compiled or not, inferred or not (Go has type inference; most modern, compiled languages do), the importance of strong typing is that it detects typing errors at compile time, not at run time. Pushing inference into the compile phase also has performance benefits. If a program does type checking in advance of execution, it is by definition strongly typed.
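To illustrate what “detects typing errors at compile time” means in practice, here is a minimal Rust sketch (the variable names are made up); the types are inferred, yet the mistake is still caught before the program ever runs:

```rust
fn main() {
    let count = 3;        // inferred as an integer
    let label = "items";  // inferred as &str

    // Uncommenting the next line is a compile-time error
    // ("cannot add `&str` to `{integer}`"); in a dynamically typed
    // language the equivalent mistake would only surface at run time.
    // let total = count + label;

    println!("{count} {label}");
}
```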
So, if type checking at compile time is universally good, why are there (to my knowledge) no modern and widely used languages, perhaps with the exception of Pascal and Ada, where all arrays or vectors have a size that is part of their type?
C, C++, and Rust come to mind as other languages with sizes as part of an array’s type. This is necessary for the compiler to know how much stack memory to reserve for the values; other languages that rely only on dynamically sized arrays and lists allocate those on the heap (with a few exceptions, like C#'s mostly unknown `stackalloc` keyword).

Maybe it depends on your definition of “part of”, but:

```go
a := make([]int, 5)
len(a) == 5
len("hello") == 5
```

Arrays in Go have associated sizes.
But that’s beside the point; what does array size metadata have to do with strong typing?
I did not mean slices or array-size checks at runtime.
The array size can be part of the static type, that is, one that can be checked at compile time. Rust, Pascal, and Ada have such arrays.
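To make this concrete, a minimal Rust sketch (the function is made up): `[f32; 3]` and `[f32; 4]` are different types, so passing an array of the wrong length is a compile-time type error.

```rust
// The length is part of the static type: [f32; 3] is not [f32; 4].
fn dot3(a: [f32; 3], b: [f32; 3]) -> f32 {
    a.iter().zip(b.iter()).map(|(x, y)| x * y).sum()
}

fn main() {
    let v = [1.0, 2.0, 3.0];
    println!("{}", dot3(v, [0.0, 1.0, 0.0]));

    // let w = [1.0, 2.0, 3.0, 4.0];
    // dot3(w, w); // error: expected `[f32; 3]`, found `[f32; 4]`
}
```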
But if static typing is always better, why are these types rarely used?
There’s a false equivalency here. Array sizes have nothing to do with static typing. For evidence, look at your own words: if undisputed strongly typed languages don’t support X, then X probably doesn’t have anything to do with strong typing. You’re conflating constraints or contracts or specific type features with type theory.
On the topic of array sizes, are you suggesting that size isn’t part of the array type in Go? Or that the compiler can’t perform some size constraint checks at compile time? Are you suggesting that Rust can perform compile time array bounds checking for all code that uses arrays?
I’ll answer this question: no.
But it does at least optimize away some unnecessary bounds checks that are written in the code, for example around iterators.
But yes it does runtime bounds checking where necessary.
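To sketch what that looks like (the function names are made up, and whether a given check is actually removed depends on the optimizer):

```rust
// Indexed loop: every `v[i]` is a bounds-checked access, but in a loop
// like this the optimizer can usually prove `i < v.len()` and drop the check.
fn sum_indexed(v: &[i32]) -> i32 {
    let mut total = 0;
    for i in 0..v.len() {
        total += v[i];
    }
    total
}

// Iterator loop: the iterator never yields an out-of-range position,
// so there is no per-element bounds check to elide in the first place.
fn sum_iter(v: &[i32]) -> i32 {
    v.iter().sum()
}

fn main() {
    let data = vec![1, 2, 3, 4];
    assert_eq!(sum_indexed(&data), sum_iter(&data));
}
```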
Actually, Rust checks arrays, which have a static size, at compile time, and slices and vectors, which have a dynamic size, at run time.
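A small sketch of that split (variable names made up): a constant out-of-range index into a fixed-size array is rejected by the compiler, while slices and `Vec` are checked when the access happens.

```rust
fn main() {
    let arr = [10, 20, 30]; // type [i32; 3], length known statically

    // Rejected at compile time (by default): the index is a constant and the
    // length is part of the type, so rustc can see it is out of range.
    // let x = arr[7];

    let v = vec![10, 20, 30]; // Vec<i32>, length known only at run time
    let i = 7;
    // This would compile but panic at run time with "index out of bounds":
    // let y = v[i];

    println!("{:?} {:?}", arr, v.get(i)); // `get` returns None instead of panicking
}
```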
Look here, under “typing discipline”:
https://en.m.wikipedia.org/wiki/Lisp_(programming_language)
Then why is the SBCL implementation of Common Lisp about as fast as modern Java? I linked the benchmarks.
Java is still interpreted. It compiles to bytecode for a virtual machine, which then executes it on a simulated CPU. The bytecode interpreter has gotten very good and optimizes the bytecode as it runs; nearly every Java benchmark excludes warm-up, because it takes time for the huge VM to load up and for the optimization code to analyze and settle.
Java is not the gold standard for statically typed compiled languages. It’s gotten good, but it barely competes with far younger, far less mature statically typed compiled languages.
You’re comparing a language that has existed since before C and has had decades of tuning and optimization to a language created when Lisp was already venerable, and which only started to get the same level of performance tuning decades after that. Neither of them can come close to Rust or D, which are practically infants. Zig is an infant; it’s still trying to be a complete language with a complete standard library, and it’s still faster than SBCL. Give it a decade and some focus on performance tuning, and it’ll leap ahead. SBCL is probably about as fast as it will ever get.
Modern JVM implementations use just-in-time (JIT) compilation of bytecode to native machine code. That can be faster than C code optimized without profiling (because the optimizer gets relevant additional information), and, for example, in the Debian Computer Language Benchmarks Game, Java typically runs numerically intensive tasks at about half the speed of the best and most heavily optimized C programs.
And now for a bummer: these most heavily optimized C programs are not written in idiomatic C. They are written with inline assembly, heavy use of compiler intrinsics, CPU-dependent code, manual loop unrolling, and the like.
TIL there’s such a thing as idiomatic C.
Jokes aside, microbenchmarks are not very useful, and even JS can compete in the right microbenchmark. In practice, C has the ability to give more performance in an application than Java or most other languages, but it requires way more work to do that, and it is unrealistic for most devs to try to write the same applications in C that they would use Java to write.
But both are fast enough for most applications.
A more interesting comparison to me is Rust and C, where the compiler can make more guarantees at compile time and optimize around them than a C compiler can.
Well, going back to the original article by Dan Luu and the literature he reviews, then why do we not see objective, reproducible advantages from this?
Partly because it’s from 2014, so the modern static typing renaissance was barely starting (TypeScript was only two years old; Rust hadn’t hit 1.0; Swift was mere months old). And partly because true evidence-based software research is very difficult (how can you possibly measure the impact of a programming language on a large-scale project without having different teams write the same project in different languages?) and it’s rarely even attempted.
Because it is hard to design a study that would capture it. Because it is hard to control many variables that affect the “bugs/LOC” variable.
Just another data point, for amusement: there is a widely known language that eliminated not only memory-management bugs like “use after free”, but also data races and similar concurrency bugs, in 2007. No, I am not talking about Rust, but about Clojure. It does prevent data races by using so-called persistent data structures. These are like Python’s strings (in that they can be inputs to operations but never change), but for all the basic kinds of collections in Clojure, namely lists, vectors, dictionaries / hash maps, and sets.
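Clojure’s collections are implemented as tree-shaped structures (hash array mapped tries and the like); just to sketch the underlying idea of structural sharing, here is a minimal persistent cons list, in Rust rather than Clojure, purely for illustration:

```rust
use std::rc::Rc;

/// A minimal persistent (immutable) singly linked list.
/// "Adding" an element never mutates the existing list; it creates a new
/// head node that shares the old tail via reference counting.
enum List {
    Nil,
    Cons(i32, Rc<List>),
}

fn push(list: &Rc<List>, value: i32) -> Rc<List> {
    Rc::new(List::Cons(value, Rc::clone(list)))
}

fn sum(list: &Rc<List>) -> i32 {
    match list.as_ref() {
        List::Nil => 0,
        List::Cons(head, tail) => head + sum(tail),
    }
}

fn main() {
    let empty = Rc::new(List::Nil);
    let a = push(&empty, 1); // (1)
    let b = push(&a, 2);     // (2 1) - shares the (1) node with `a`
    // `a` is unchanged and still usable; both versions coexist.
    println!("{} {}", sum(&a), sum(&b)); // 1 3
}
```

Clojure’s tree-shaped collections make this kind of sharing efficient even for large vectors and maps.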
And Clojure is dynamically typed. So, dynamically typed languages had that feature earlier. (To be fair, Rust did adopt its borrow checker from research languages which appeared earlier.)
“But”, you could say, “Java was invented in 1995 and had already memory safety! Surely that shows the advantage of statically typed languages?!”
Well, Lisp was invented in 1960, had garbage collection, and was hence memory-safe.