Why Your Windows Log Size Settings May Be Too Big
Awhile back, I posted about how certain versions of Windows always have the capability to lose logs. I encourage you to read the full post to understand the issues involved, then come back here and continue reading.
The basic problem is that Windows loads the full log into a shared memory space, and if it’s too big, then logs will be silently dropped. That’s why it is very important to have a centralized logging solution with logs sent in real-time or pretty close to it.
So how does centralized logging affect the local retention settings? To know that, you have to make certain assumptions about log size, average number of events generated per day, and so-on. Let’s start with this quote from Microsoft’s Threats and Countermeasures Guide:
Although there is no simple equation to determine the best log size for a particular server, you can calculate a reasonable size. The average event takes up about 500 bytes within each log, and the log file sizes must be a multiple of 64 KB. If you can estimate the average number of events that are generated each day for each type of log in your organization, you can determine a good size for each type of log file.
For example, if your file server generates 5,000 events per day in its Security log and you want to ensure that you have at least 4 weeks of data available at all times, then you would want to configure the size of that log to about 70 MB. (500 bytes * 5000 events/day * 28 days = 70,000,000 bytes.)
70MB doesn’t sound so bad. It’s probably well within the architectural limits of this memory-mapped I/O thing, but even that may be too big.
A better way of calculating what the local log size should be is to consider the recovery time objective and recovery point objective of the centralized log server. That is, consider how long it might take you to get your log server back online in the event of a disaster, then base your local Windows logging settings on that.
For example, if your recovery point objective is three days, you want to make sure that the local Windows logs will collect for at least three days, but not much more than that. Ideally, you also want to make sure that the solution you’re using can forward the logs collected during the downtime, or have a high-availability solution as part of the plan.
Remember, the goal is to get the local log sizes as small as possible, not as large as possible. You want to make sure you’re not losing any logs, and the two best ways to achieve that are to use centralized logging and try to avoid maxing out the services.exe process. This is counter-intuitive, but when you think it through, you’ll realize that it makes sense.