So it seems AZ excels in the kitchen as well.
Regarding the earlier discussion about intelligence, and whether these RL successes count as such: I wrote that intelligence is the ability to solve previously unseen problems, which is almost the opposite of what reinforcement learning does. That felt like a somewhat weak argument at the time, but I have since convinced myself further with this analogy.
Consider animal behaviour, instincts in particular. Animals can solve complex problems, but fail at simple ones when only minor things change and the behaviour suggested by instinct no longer works. IMO evolutionary selection, genetics and mutation = reinforcement learning, where the reward function is survival and reproduction. And instinct vs intelligence = animal vs human behaviour = RL vs intellect. So:
1. random play = eons of failures before success, then eons of failures again
2. reinforcement learning = eons of failures before success, then repeating success (until the problem changes)
3. intelligence = success reasonably soon, using knowledge from other domains