I roughly measured the number of loop iterations required on my hardware.
At DELAY(1):
rtc0: first read took 0 iterations
rtc0: second read took 156 iterations
rtc0: third read took 169 iterations
At DELAY(10):
rtc0: first read took 0 iterations
rtc0: second read took 127 iterations
rtc0: third read took 117 iterations
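
For reference, a minimal sketch of how iteration counts like these could be gathered. This is not the driver's actual code: poll_count(), ready_fn, delay_fn, struct fake_dev, fake_ready() and fake_delay() are all hypothetical names made up for illustration, and the delay callback merely stands in for DELAY(). The loop polls a "ready" condition, waits between polls, and returns how many waits were needed, so a result of 0 means the first poll already succeeded.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical callbacks: stand-ins for a register read and for DELAY(). */
typedef bool (*ready_fn)(void *ctx);
typedef void (*delay_fn)(unsigned usec);

/*
 * Poll ready() until it reports true, calling delay(usec) between polls.
 * Returns the number of delay iterations that were needed (0 if the very
 * first poll succeeded), or -1 if max_iter delays were not enough.
 */
int
poll_count(ready_fn ready, void *ctx, delay_fn delay, unsigned usec,
    int max_iter)
{
	int i;

	assert(max_iter >= 0);
	for (i = 0; ; i++) {
		if (ready(ctx))
			return (i);
		if (i >= max_iter)
			return (-1);
		delay(usec);
	}
}

/* Fake device for illustration: reports "not ready" a fixed number of times. */
struct fake_dev {
	int remaining;
};

bool
fake_ready(void *ctx)
{
	struct fake_dev *d = ctx;

	if (d->remaining == 0)
		return (true);
	d->remaining--;
	return (false);
}

/* No-op delay; real code would busy-wait for usec microseconds here. */
void
fake_delay(unsigned usec)
{
	(void)usec;
}
```

Calling poll_count(fake_ready, &dev, fake_delay, 1, 1000) with dev.remaining set to 3 returns 3. The max_iter bound keeps the loop from spinning forever if the condition never becomes true.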