Posts of varying effort on technology, cybersecurity, transhumanism, rationalism, self-improvement, DIY, and other stereotypical technologist stuff.

Will superintelligent AGI even care about OCaml?

Every now and then I think of this guy who used to be the first to post solutions to the weekly Project Euler problems. In assembler. Was he a madman, or is someone working at that tier of genius just so good they don’t need crutches like Python or C or OCaml?

By the way, this post assumes AI isn’t a grift. If you believe AI can’t add value to programmers or write code as good as, if not better than, what humans write right now, this post is probably not for you.

This post was born of a recent X (formerly Twitter) exchange with Yaron Minsky, my ex-boss and the patron saint of industrial OCaml. I love OCaml, but I think I’m right, and I’m pretty sure my argument has nothing to do with publicly showing that my ex-boss is wrong.

Here we go. Think of this as registering a prediction. I can’t really prove any of this since we’re talking about something that doesn’t exist yet, and that we can’t predict the shape of in advance.

My feeling, my vibe if you will, is that a superintelligent AGI would find high-level programming languages like C++ or Python or OCaml a distraction. It would simply write everything in assembler. If it needed to rework something, it could just pull the code back in, lift it into whatever level of abstraction is needed, say, lambda calculus or any other kind of ad hoc mathematical object (maybe one discovered just for this task), rework it, and then blast assembler back out.
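To make the lift, rework, blast-back-out loop concrete, here is a toy sketch in OCaml. Everything in it is invented for illustration: the "assembler" is a four-instruction stack machine, the lifted object is a small expression tree, and the rework step is plain constant folding. Lifting real x86 is vastly harder than this; the sketch only shows the shape of the idea.

```ocaml
(* Toy "assembler": a stack machine. Hypothetical, not a real ISA. *)
type instr = Push of int | Load of string | Add | Mul

(* The lifted form: an expression tree. *)
type expr =
  | Const of int
  | Var of string
  | Plus of expr * expr
  | Times of expr * expr

(* Lift: symbolically execute the instruction stream to rebuild the tree.
   Returns None if the code underflows the stack or leaves extra values. *)
let lift (code : instr list) : expr option =
  let step stack ins =
    match stack, ins with
    | Some st, Push n -> Some (Const n :: st)
    | Some st, Load v -> Some (Var v :: st)
    | Some (b :: a :: st), Add -> Some (Plus (a, b) :: st)
    | Some (b :: a :: st), Mul -> Some (Times (a, b) :: st)
    | _ -> None
  in
  match List.fold_left step (Some []) code with
  | Some [ e ] -> Some e
  | _ -> None

(* Rework: fold constant subexpressions on the tree. *)
let rec simplify = function
  | (Const _ | Var _) as e -> e
  | Plus (a, b) -> (
      match simplify a, simplify b with
      | Const x, Const y -> Const (x + y)
      | a', b' -> Plus (a', b'))
  | Times (a, b) -> (
      match simplify a, simplify b with
      | Const x, Const y -> Const (x * y)
      | a', b' -> Times (a', b'))

(* Lower: emit stack code again with a post-order walk. *)
let rec lower = function
  | Const n -> [ Push n ]
  | Var v -> [ Load v ]
  | Plus (a, b) -> lower a @ lower b @ [ Add ]
  | Times (a, b) -> lower a @ lower b @ [ Mul ]

(* The whole round trip: lift, simplify, lower. *)
let rework (code : instr list) : instr list option =
  Option.map (fun e -> lower (simplify e)) (lift code)
```

Calling `rework [Load "x"; Push 2; Push 3; Mul; Add]` lifts the code to `x + (2 * 3)`, folds the constant product, and lowers it to `[Load "x"; Push 6; Add]`. Note that even in this toy, the reworking happens on the tree, not on the instruction list.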

One counterargument is that programming languages are themselves a form of intelligence, so even a superintelligence would lever up by using OCaml instead of C++ instead of assembler.

Another counterargument is that there probably isn’t much fruit left to pick by writing everything in assembler, as optimizing compilers like LLVM are freakishly good. Doing everything in assembler probably buys a minor speed-up at best, at a massive cognitive burden.

Finally, one could say “superintelligent AGI” is really hand-wavy and proves too much. Even quasi-infinite IQ doesn’t solve NP-hardness, and finding the right abstraction from a binary executable blob might be intractable.

Programming languages as crystallized intelligence

As someone who has spent 15+ years now as an OCaml elitist (🧐), this argument resonates with me. Programming languages let you work natively at levels of abstraction that unlock things you’d never be able to solve at the Turing-machinist banging-rocks-together-to-make-compute level. Any intelligence, human or machine, surely benefits from plugging into a ready-made tool designed for these abstractions.

Furthermore, languages have compressed lessons from failures! Design patterns that once had to be laboriously redone by hand are now first-class features of modern languages!

Reasoning costs scale with representation size. A 200-line OCaml program might be 50,000 lines of assembler. Superintelligences still have finite compute; the laws of physics don’t cease to apply just because we have superintelligence. Abstractions prune the search space! Superintelligence can’t work around NP-hardness, and strong types do a lot of work to make gigantic swaths of the search space unrepresentable and therefore excluded from consideration.
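For readers who haven’t seen the "unrepresentable" trick in practice, here is a minimal OCaml sketch. The connection example and all of its names (`loose`, `conn`, `describe`) are hypothetical, chosen only to show the shape of the argument.

```ocaml
(* Loose representation: three independent fields. Most combinations are
   nonsense, e.g. is_connected = false together with session = Some "abc",
   yet every one of them type-checks and has to be reasoned about. *)
type loose = {
  is_connected : bool;
  session : string option;
  retries : int option;
}

(* Tight representation: each state carries only the data that is valid
   for it. The nonsense combinations cannot be written down at all. *)
type conn =
  | Disconnected
  | Connecting of { retry_count : int }
  | Connected of { session_id : string }

(* The compiler checks that every case is handled, and there are exactly
   three cases to consider. *)
let describe (c : conn) : string =
  match c with
  | Disconnected -> "disconnected"
  | Connecting { retry_count } ->
      Printf.sprintf "connecting (retry %d)" retry_count
  | Connected { session_id } -> "connected as " ^ session_id
```

Any program that manipulates `conn` values is searching a smaller space than one that manipulates `loose` values, which is the pruning this argument is pointing at.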

I counter that a superintelligence may simply not need compressed lessons. It can re-derive them ad hoc, as needed. If it’s smart enough, the bundle of compressed lessons is more like a prison than a basket of tools.

It’s true reasoning costs scale with representation size for humans. A superintelligence can hold a control flow graph in its head whether it’s 200 lines or 50,000 lines. And what matters is the algorithmic complexity, not the crude weight of the code.

Yes, types prune the search space, but it’s relative to the set of programs the language can express. The optimal program might live in a region that the type system can’t even express! Self-modifying code or very fancy JIT stuff, or even just plain dynamism with full duck-typed hideousness.

But you can’t IQmaxx your way out of NP-hardness!

The strongest objection is that “superintelligent AGI” is doing a lot of heavy lifting. It’s not really a falsifiable claim, I suppose.

Indeed, anyone who thinks you can just sprinkle magic IQ pixie dust on a problem until it falls over gets a stern talking-to from Stephen Wolfram, who’d remind you of computational irreducibility. Lifting a stripped binary back into a clean refactor is harder the more clever the compiler was. Anyone who has spent time looking at a disassembly can tell you this.

To be clear, I’m not saying the superintelligence breaks NP-hardness, or cracks computational irreducibility, or somehow unlocks infinite compute[1]. I am saying something more boring: high-level languages do cognitive work, for humans specifically. They chunk programs into pieces small enough for human working memory. They scaffold attention because human attention is narrow. None of this is compute savings so much as it is cognition savings to work around the confines of thinking meat.

Most of the performance fruit has been picked by LLVM

You can find no shortage of people who post their tips and tricks on how to hand-optimize C code in assembly, only for someone to come along and say ackshually, LLVM has this heuristic built in already, and many others, and can spot it even in very counter-intuitive examples. You have made things at best only more complicated to reason about by trying to hand-write it in assembler.

Is a superintelligence really going to beat LLVM, both in breadth and depth? Wouldn’t it be better off just targeting LLVM?

I’m not sure it’s so far-fetched.

Maybe you have heard the lore about RollerCoaster Tycoon. Chris Sawyer wrote 99% of it in x86 assembler, and it was insanely performant.

Back in the 90s and early 00s, the demoscene was full of impressive 3d engines that fit in 4k of code.

The performance of these programs was otherworldly, and I contend that’s because they were working at the lowest level from every possible angle.

Yes, we all know the law of performance that you should not prematurely optimize because 90%+ of your program spends its time in just a few small hot loops, but what if that’s only true if you’ve already taken the high-level language pill? Hardly anyone commissions rewrites of large mature applications in assembler by a crack team, mostly because it’s so expensive and hard. But what if LLMs could do it well, for modest cost?

SIMD instructions have existed for decades, but they’re famously, hilariously underutilized. They’re very architecture-specific, programmers are often unskilled at targeting them and hardly ever think about them, and compilers are not much help. Your List.map might be a lot better off as an in-place SIMD vector operation, but that’s almost never emitted. And there might be opportunities like this everywhere. Not just SIMD, but in all things.
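Here is a small OCaml sketch of the List.map point. The function names are made up, and nothing here is actual SIMD: as far as I know, mainline ocamlopt does not auto-vectorize either version. The second form is simply the memory layout a vectorizing backend or a hand-written SIMD kernel would want to start from.

```ocaml
(* A float list is a chain of heap cells, each pointing at a separately
   boxed float. Mapping over it allocates a fresh cell and a fresh boxed
   float per element, and walks pointers the whole way. *)
let scale_list (k : float) (xs : float list) : float list =
  List.map (fun x -> k *. x) xs

(* A floatarray stores raw doubles contiguously. The same computation
   becomes an in-place loop over adjacent memory with no allocation. *)
let scale_in_place (k : float) (xs : floatarray) : unit =
  for i = 0 to Float.Array.length xs - 1 do
    Float.Array.set xs i (k *. Float.Array.get xs i)
  done
```

Both compute the same numbers. The difference is what the hardware gets to see: scattered boxes in the first case, a dense run of doubles in the second.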

In fact, the field already admits compilers have limits: to get the most out of a GPU, you’re expected to write completely separate code in CUDA. That is, if you have the skills for it you write CUDA, and most programmers don’t (myself included!).

Anyway, high-level languages might well be destroying information the hardware needs. Boxing kills SIMD, and GC pointer-chasing kills cache locality. Going higher-level doesn’t strictly lever you up. It may also be its own lossy compression of intent.

We’re already seeing what AI agents do when they code

Human SWEs react in horror when, after letting their AI agents run off-leash for a few weeks, they crack open the code the agents have written: massive functions and god objects. Don’t these things know anything about good software engineering discipline?

Yes, they do. They just don’t need it[2], because that stuff is for humans. Everything we think of as a necessary engineering abstraction (types, immutability, DRY, encapsulation, separation of concerns, etc.) could just be a workaround for the feebleness of human cognition.

WARNING: This post was written by a human, me, except for this next paragraph. I had written the next paragraph but it was very weak. So, I asked Claude to rewrite it in the voice of Steve Yegge drunk-posting.

Okay so look. Right around 2022, 2023 — the GPT-3.5 era, the dark ages, may God rest those little baby brains — there was a whole cottage industry of papers arguing strong types make LLMs hallucinate less. And those papers were CORRECT. I’m not litigating methodology. Those models were absolute lunatics. They hallucinated functions that did not exist. They invented APIs that no human being on planet earth had ever shipped. They returned JSON that mocked the very concept of structure. Types were the only adult in the room, and the papers said so in tables full of pass@1 numbers, and they were right about the world as it then existed. The thing is, those models are extinct now. They are paleontology. They are the trilobites of language models. You cannot point at a fossil and go “see, this is why we need OCaml,” because the great-grandchildren of that fossil are out there right now writing untyped Python that compiles in their head and runs the first time. The research isn’t wrong. It’s just describing a dead world.

Back to me. We’re assuming there’s an enormous gap between our brains and the limits of computation, and that programming languages exist to fill it. But a different kind of mind, one that isn’t thinking meat, might find that gap is much smaller than it looks, and these higher-level languages are just unnecessary fluff that gets in the way.

Right now we rewrite in assembly very selectively, because it’s extraordinarily expensive for us to write and for us to reason about. But if the cost of writing it plummets with superintelligent AGI, and it can reason about it just as easily, it’ll just write the thing in assembly.

Closing

Some think the guy on Project Euler who would solve the weekly problems first in assembler was a madman. I prefer to think he was an alien superintellect slumming it in our timeline.

In the near future, programming will mostly look like his: lift the problem into whatever mathematical object makes sense, twiddle with it, blast assembler back out. Don’t fret. The superintelligence will be there to rewrite the result in OCaml for us if we want to understand it.

Will AGI care about OCaml? Yes, in much the same way we care about Latin: it will appreciate the beauty but won’t have much use for it. And really, I mean all human programming languages.

Footnotes

  1. That said, P vs NP has been open for decades. Computational irreducibility is a Wolframism, not a proof. Even laws of physics are mostly Lindyism in a trench coat. If superintelligence cracks these we’re going to have a much, much weirder conversation than this one. 

  2. It is possible that they’re only doing the god object/large functions thing because they’re partially learning it from all of the human slop on GitHub, but it’s still the case that AI agents work with it just fine.
