Why Windows Can Always Lose Logs
Note: The following post applies to Windows versions prior to Vista. I have not researched how logging has changed in versions greater than Vista. I can only assume and hope that is has changed for the better in new versions.
I’m going to let you in on something that not many people realize. There’s nothing you can do, no setting you can configure and no possible way not to lose logs in Windows. Due to the way Windows is designed, there’s always a chance of missing logs on an otherwise running and functional system. Let’s explore why this is. Hang on to your hat, we’re going into the weeds.
The Windows Eventlog service uses memory-mapped i/o and each active log is fully opened in memory. The eventlog service resides within the services.exe process, which, due to architectural limitations, only allocates 2GB to the process on a 32-bit system.
Within this 2GB, logs are allocated in 64k chunks. New logs take a 64k chunk of memory and take further chunks as needed, ultimately loading the entire log into memory.
So far this isn’t too earth shattering. Smart people know that although you can configure each log to grow to 4GB individually in Event Viewer (see KB 183097), because of this memory-mapped i/0 thing, limited memory is available. So, as the KB article suggests, they configure the log size to 300MB and “Overwrite as needed.” Depending on how busy the host is this could mean a retention of one day or one year.
Here’s where things get funky. Recall that the services.exe process has limited memory available to it (normally 2GB). You have taken that into account by configuring a smaller log size than what Windows says you can configure. You have “Overwrite events as needed” configured to ensure that you’re under this limit. But did you know that this process isn’t just used for Windows logging? The services.exe process also plays host to other services, each of which may use memory for data, dlls and so-on. All of those have to be taken into account when configuring the log size. If you don’t, you’ll hit that 2GB wall a lot sooner than you expect.
The question is: how do you know how much memory to plan for? How do you know what data other applications may load? How can you properly scope the memory needed for logging, other applications, their data and dlls? The short answer is, practically speaking, you can’t. There are just too many variables out of your control.
So what actually happens when Windows can’t allocate enough memory for logging via the services.exe process? Recall that logs are allocated in 64k chunks. Each allocation must be contiguous; if there is not a contiguous chunk of memory available, even if there is sufficient memory available, logging will fail. If allocation fails, it is a silent error. A log that is below its configured size, but which is not given another chunk of contiguous memory to work with (even if non-contiguous memory is available), will simply fail to log anything until another contiguous chunk is available. Logs will be missing and you will be none the wiser. A log that is already at its size limit will, indeed, “overwrite as needed,” and as long as those chunks of memory are available, logs won’t be missing.
Unless perhaps you have “Retain x days” configured.
“Retain x days” sounds like a pretty good idea, right? Your security policy states that you need to have 30 days of logging available online, so you configure Windows to retain 30 days of logs. Simple enough. The problem is that new events in a full log will overwrite events older than x days, but if there are no events older than x days, new events will be discarded. There will likely be unexplainable gaps in your logs.
Taking the above into consideration, what can you do to better your odds of not losing logs? Here are my recommendations:
- Don’t use the “Retain x days” setting. Simply abandon it. It’s not worth the hassle and confusion. You’re better off not using it at all.
- Use “Overwrite as needed.” Although this won’t keep you from losing logs, it’s the best of what is available.
- Increase your RAM to more than 4GB. This will lesson the odds of you running out of memory before that 2GB limit is reached.
- Use the x64 architecture. 64-bit systems offer a higher memory limit per process than 32-bit. This gives you more potential allocated memory per-process to work with (not to be confused with total system memory.)
- Use the /3GB switch. Windows has an option which allows for applications on a 32-bit system to have an extra GB allocated to them, at the expense of one less GB for the operating system (4GB being the total available for 32-bit to work with). Instructions on how to do this can be found here.
- Use a central logging server. This is perhaps the best countermeasure against losing logs. If logs are sent to a central server in real-time, a relatively low log size can be configured in the Event Viewer. This also lowers the chance of running into a memory limitation.
None of these suggestions can fix the underlying problem. The problem is rooted deep within the design of Windows and, to the best of my knowledge (please let me know if I am incorrect), there is nothing that can be done to absolutely ensure that logs won’t be lost on an otherwise healthy system. I imagine this has been corrected on Windows versions Vista and above. At least, I hope it has. For now, may you enjoy the Holiday season and may you have enough memory for proper logging!