Everything about this topic is reading and guesswork. I thought I had all the bases covered. It took me forever to suss out this problem.
The program grabbed some data, wrote it to a file, and then transferred the file to another system. Sounds easy.
This was how I coded the program.
- Delete the file. Yep, if the delete failed because the file didn’t exist that was okay, because I didn’t want a file on the file system. If it failed because the file was in use and therefore locked from deletion, that told me that this file was in use and I didn’t want to mess with it.
- Create the file with exclusive, write-only access. Exclusive access stops two instances of the program from running simultaneously, because of the delete in the initial step. Exclusive access also stops something from reading the file when it is partially written.
- Write all of the records to the file.
- Close the file. This releases the file. It is no longer accessed exclusively. There is danger here, but this kind of timing issue wasn’t the base of the intermittent problem we encountered.
- Open the file with exclusive, read-only access. This is where it failed. The process was reporting that the file could not be found.
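The write side of the steps above can be sketched roughly as follows. This is a minimal sketch in Python, not the original program: the name `write_records` and the one-record-per-line format are hypothetical. It also assumes that `O_EXCL` is an acceptable stand-in for exclusive access. `O_EXCL` only guarantees that this call created the file; it is not the same thing as a sharing lock that keeps readers out. The `os.fsync` call is likewise an assumption about what "flush" should mean here.

```python
import os


def write_records(path, records):
    """Delete any old file, then create and write a fresh one.

    Hypothetical helper; `records` is assumed to be an iterable of strings.
    """
    # Step 1: delete the file. A missing file is fine. Any other failure
    # (for example, a file that is locked by someone else) is allowed to
    # propagate, because that means the file is in use.
    try:
        os.remove(path)
    except FileNotFoundError:
        pass

    # Step 2: create the file write-only. O_EXCL makes the call fail if
    # the file already exists, so two instances cannot both create it.
    # Assumption: this approximates, but does not equal, exclusive access.
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)

    # Step 3: write all of the records, one per line.
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        for record in records:
            f.write(record + "\n")
        # Push buffered data to the operating system and ask it to
        # commit the data to the storage device.
        f.flush()
        os.fsync(f.fileno())
    # Step 4: leaving the "with" block closes the file.
```

Calling `write_records("outgoing.txt", ["first", "second"])` twice in a row replaces the earlier file, which matches the delete-then-create order in the list.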
My best guess was I overlooked something or screwed up using the file system. So I started thinking about how file systems worked. This was put together from my reading of file system implementations.
I read that file systems are there to handle the slowness of the physical media. Commands are queued. The file system receives the commands and responds to the program.
I dialed my work phone from my cell phone while I was sitting at my desk. The cell phone told me it was ringing. The work phone was not ringing. It did not ring until the cell phone had rung about three or four times. The phone network was providing me the “phone is ringing” feedback. There was a lag before the work phone received the signal to ring. This is the lag I’m calling latency. It isn’t normally visible unless you conduct an experiment like this to expose it.
So the file system knows enough about the state of the underlying media to give the writing program correct feedback, even if the media has not caught up yet. So here is where I made a big guess. I put the statement that opens the file for reading in a loop. I would try to open the file, and if that did not work, I would delay for a couple of seconds and then try again. The loop would try about ten times before the sending program gave up and exited with a file-not-found error.
The problem disappeared. So there seemed to be some lag between when the writing program flushed and closed the file and when the file appeared so that it could be opened and read by the sending process.
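The retry loop described above can be sketched like this. It is a sketch under assumptions, not the original code: the function name `open_with_retry`, its parameters, and the defaults of ten attempts and two seconds are hypothetical, chosen to match the description. It opens the file with an ordinary read, not an exclusive one, and the `sleep` parameter exists only so the delay can be swapped out when trying the function.

```python
import time


def open_with_retry(path, attempts=10, delay_seconds=2.0, sleep=time.sleep):
    """Open `path` for reading, retrying while the file is not found.

    Hypothetical helper. Returns an open binary file object, or raises
    FileNotFoundError after the last attempt fails.
    """
    for attempt in range(1, attempts + 1):
        try:
            return open(path, "rb")
        except FileNotFoundError:
            if attempt == attempts:
                # Out of attempts: report the file-not-found error.
                raise
            # Give the file system time to catch up, then try again.
            sleep(delay_seconds)
```

The sending side would then use something like `with open_with_retry("outgoing.txt") as f:` and read the file as usual. Only the not-found case is retried; any other error is raised right away.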
All of this may be wrong, but it is the current model of the way things work that I carry in my head.
Now I see latency everywhere and try to impute the mechanism involved. One place is the TV news. A reporter at the central office asks a question of some remote reporter. The remote reporter stands staring into the camera as the audible question is digitized into packets and sent to the remote location, where the packets are reassembled into the audible question the remote reporter hears. If this persists for too many seconds, say twenty seconds, the central reporter calls it a communication problem, cuts to another story, and tries again later. Otherwise the remote reporter responds within a second or two and there is that short lag between question and response.
Maybe the central reporter could press a button that starts a swirling browser ball, that says we sent a request and are waiting for the response. Then the remote reporter could press a button that will kill the swirling browser ball when the remote reporter starts talking.
Sometimes pictures are worth a thousand words, but I can’t figure out what images would go well with this topic, so I give you a choice of search engine image queries: a bing query or a google query or a duckduckgo query.
I found the following video on youtube. It explains latency very well. It can be found at https://www.youtube.com/watch?v=UWeMWIoUWQA on the techquickie channel. Warning: Some violent game images.
That’s all for now.