It drives me insane how very few people can extrapolate these scenarios based on what is happening today.
I have been trying to walk my social circle through these scenarios and I get either some denial pushback or outright dismissiveness. I don't think it comes from a place of pure denial; it's just that they would rather not think about it.
It is really giving me Late February - Early March 2020 vibes.
Everyone is exhausted with extrapolating new trends into the future and trusting the experts. Especially after 2020. And by definition it’s useless to think about what’s after the Singularity anyway.
Important update on this post: https://substack.com/@benjamintodd/note/c-111372048?utm_source=activity_item
How can a young person prepare for this?
Reading the 80,000 Hours career guide would be a good start.
I found this really insightful. It all feels very straightforward but skeptics talk about the compute ceiling as a response. What’s the best criticism of your argument here? Let’s steelman this thing!
The compute ceiling is real - it just probably doesn’t hit until 2030. Full discussion of counterarguments and bottlenecks here: https://benjamintodd.substack.com/p/the-case-for-agi-by-2030
Nice, great.
>You can ask GPT-o1 to solve 100,000 math problems, then take only the correct solutions, and use them to train the next model.
If you already have the solutions (needed to grade the problems), why not train on those in the first place?
You don't have the solutions to start with – you generate the solutions and then verify they're correct. It's easier to verify that a solution is correct than to generate the solution in the first place. (But yes, this method only works if it's relatively easy to verify the solutions.)
Also, my impression is that there's significant value in generating the whole chain of reasoning leading to the solution (even if you already know what the solution is).
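The loop being described is essentially rejection sampling for training data: sample many candidate solutions, keep only the ones a cheap verifier accepts, and train on the kept reasoning chains. Here's a minimal toy sketch of that pipeline; the "model" is a stand-in random guesser and the problems are toy arithmetic, not the actual setup:

```python
import random

def generate_candidate(problem, rng):
    # Stand-in for sampling a chain-of-thought solution from a model.
    # The toy "model" guesses the sum, and is right about half the time.
    a, b = problem
    guess = a + b + rng.choice([-1, 0, 0, 1])
    return {"problem": problem, "reasoning": f"{a} + {b} = {guess}", "answer": guess}

def verify(candidate):
    # Verification is cheap: just re-check the arithmetic.
    a, b = candidate["problem"]
    return candidate["answer"] == a + b

def build_training_set(problems, samples_per_problem=8, seed=0):
    # Generate up to N candidates per problem; keep the first verified one,
    # including its full reasoning chain (not just the final answer).
    rng = random.Random(seed)
    kept = []
    for p in problems:
        for _ in range(samples_per_problem):
            cand = generate_candidate(p, rng)
            if verify(cand):
                kept.append(cand)
                break
    return kept

problems = [(i, i + 3) for i in range(100)]
data = build_training_set(problems)
```

The key asymmetry is exactly the one in the comment above: `verify` is a one-line check, while generating a correct candidate may take many samples, and the kept chains can then be used as fine-tuning data for the next model.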
Would be curious to know what your thoughts are on Figure and Helix.
I think we still have the problem where it aces all the tests but can't yet do the actual thing, right?
Like crushing competitive coding but failing to properly debug a real-world codebase (I believe OpenAI had a paper showing this).
The claim that this thing is at a PhD level at anything is just false as of right now.
Having said that, I do believe it's possible to RL all these different tasks, and eventually it could naturally start to learn to generalize across them. It's probably only a matter of cost, really.
Does that cost $1 trillion, $10 trillion, or $100 trillion?
Furthermore, I think innovation, i.e. genuine PhD-level work, is nothing more than probing the space of concepts. So that should be solvable as well.
In principle I see no walls, but we shouldn't say these things are at PhD level at anything currently.
I'm claiming they're at PhD level at answering these questions. But you're right that PhDs clearly do a lot more than answer well-defined one-hour questions.
What fraction of work is captured by benchmarks is a key uncertainty we face right now.
Also agree we'll be able to make a lot more progress from here, and that's maybe the key thing.