Why would you tolerate an unreliable compiler with no assured relationship between its inputs and its outputs? Have people just gotten too comfortable with the C++ model of "UB means I can insert a security bug for you"?
In a hypothetical future where the reliability of LLMs improves, I can imagine the model being able to craft optimizations that a traditional compiler cannot.
Like there are already cases where hand-rolling assembly can eke out performance gains, but few people do it because it’s so arduous. If the LLM could do it reliably, it’d be a huge win.
It’s a big if, but not outside the realm of possibility.
I agree it is currently a pipe dream. But if I were looking for a doctoral research idea, it might be fun to work on something like that.
Lots of potential avenues to explore, e.g. going from a high-level language to some IR, from some IR to bytecode, or straight from high-level to machine code.
I mean, -O3 is already so much of a black box that I can't understand it. And the tedium of hand-optimizing massive chunks of code is why we automate it at all. Boredom is something we don't expect LLMs to suffer, so having one pore over some kind of representation and apply optimizations seems totally reasonable. And if it had some kind of "emergent behavior" based on intelligence that allowed it to beat the suite of algorithmic optimizations we program into compilers, it could actually be a benefit.