Delay in the physical world always comes from the time sound takes to travel and reflect. Sound travels to the edge of the space, bounces back, and reaches the listener's ear later than the direct sound. That causes delay. This is the basic mechanism behind all time-related effects, such as echo, flanging, and the Haas effect.
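To put a number on "later than the direct sound", here is a minimal sketch that turns the extra distance a reflection travels into a delay time. The function name, the example distances, and the speed of sound of 343 m/s (dry air at about 20 °C) are my own assumptions for illustration.

```python
# Assumed value: speed of sound in dry air at about 20 degrees Celsius.
SPEED_OF_SOUND_M_S = 343.0


def reflection_delay_ms(direct_path_m, reflected_path_m):
    """Return how much later the reflection arrives than the direct sound, in ms."""
    extra_distance_m = reflected_path_m - direct_path_m
    return extra_distance_m / SPEED_OF_SOUND_M_S * 1000.0


# Hypothetical example: the direct path is 1 m, the reflected path is 4.43 m.
delay = reflection_delay_ms(1.0, 4.43)
print(f"{delay:.1f} ms")
```

With these made-up distances the reflection arrives roughly 10 ms after the direct sound, because it travels 3.43 m further.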
When my friend and I talk in an elevator, his voice bounces between the smooth surfaces, and some of the reflections (early reflections) reach my ear slightly later than his direct voice. Combined with the direct sound, they cancel part of the energy at certain frequencies, which makes his voice sound like it is coming from inside a can. The same thing happens in a conversation inside a car.
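The cancellation at certain frequencies is usually called comb filtering. As a rough sketch, assume the ear receives the direct sound plus a single delayed copy with relative gain `g` (a simplification, since a real elevator has many reflections). The combined response is then 1 + g·e^(−j2πfτ), which dips wherever the delayed copy arrives half a cycle out of phase, at f = (2k + 1) / (2τ). The function names and the 1 ms delay below are hypothetical.

```python
import cmath
import math


def notch_frequencies(delay_s, max_hz):
    """List the frequencies up to max_hz where direct + delayed sound cancel most."""
    notches = []
    k = 0
    while True:
        f = (2 * k + 1) / (2.0 * delay_s)  # odd multiples of 1 / (2 * delay)
        if f > max_hz:
            break
        notches.append(f)
        k += 1
    return notches


def combined_magnitude(freq_hz, delay_s, gain=1.0):
    """Magnitude of the direct sound plus one delayed copy at a given frequency."""
    return abs(1.0 + gain * cmath.exp(-2j * math.pi * freq_hz * delay_s))


# Hypothetical reflection arriving 1 ms after the direct voice.
for f in notch_frequencies(0.001, 3000.0):
    print(round(f), "Hz ->", round(combined_magnitude(f, 0.001), 6))
```

For a 1 ms delay the dips fall near 500, 1500, and 2500 Hz, while frequencies in between (such as 1000 Hz) are reinforced. That uneven pattern is what gives the "in a can" colour.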
When the acoustic space is bigger, like a bedroom, and I clap my hands really loudly, I can clearly hear the delay that comes from the early reflections. What I mostly hear is the high frequencies bouncing between the walls (possibly because a single clap is similar to an impulse signal, which is full of high-frequency content). I can also notice that the volume of the sound decays quickly.
Delay in the virtual world mostly depends on the network: the route from my local computer to the server and back again. On a network call, transmission takes time, and if the other side is using a speaker and a sensitive microphone, the network delay lets me hear my own voice again, just as if I were talking to myself.
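Hearing my own voice come back can be sketched as a simple delay line: the output is my original signal plus a quieter copy shifted by the round-trip time. The function names, the 200 ms round trip, the 48 kHz sample rate, and the gain are all assumed values for illustration, not measurements of any real call.

```python
def ms_to_samples(delay_ms, sample_rate_hz):
    """Convert a delay in milliseconds to a whole number of samples."""
    return round(delay_ms / 1000.0 * sample_rate_hz)


def add_echo(signal, delay_samples, gain):
    """Mix a delayed, scaled copy of the signal back onto the signal itself."""
    out = [0.0] * (len(signal) + delay_samples)
    for n, sample in enumerate(signal):
        out[n] += sample                          # what I say
        out[n + delay_samples] += gain * sample   # what comes back later
    return out


# Hypothetical call: 200 ms round trip at a 48 kHz sample rate.
print(ms_to_samples(200, 48000), "samples of delay")

# Tiny example with a 2-sample delay so the result is easy to read.
print(add_echo([1.0, 0.0, 0.0, 0.0], delay_samples=2, gain=0.5))
```

In the tiny example, the single impulse at the start shows up again two samples later at half the level, which is the same thing I hear on the call, only stretched out to a fraction of a second.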