This paragraph suggests that making a profit was intended to be easy.
As seen in Figure 3, Claude 3.5 Sonnet outperformed the human baseline in mean performance,
but its variance was very high. We only have a single sample for the human baseline and therefore
cannot compare variances. However, there are qualitative reasons to expect that human variance
would be much lower. All models had runs where they went bankrupt. When questioned, the human
stated that they estimated this would be very unlikely to happen to them, regardless of the number
of samples.
i wouldnt do much better in the AIs position
This paragraph suggests that making a profit was intended to be easy.
Maximize paperclips