Sunday, June 2, 2013

DataStore.edb and Windows Vista

This is not so much a how-to as a use case.

So I got to look at a Windows Vista system that had been running for quite a long time, with just 1 (one) gigabyte of RAM.

At every operating system start, it would search for updates and the hard disk would thrash a lot.

With Resource Monitor I discovered that the oft-accessed file was DataStore.edb, located at
%WINDIR%\SoftwareDistribution\DataStore\

DataStore.edb had ballooned to 325 MB.

The file is a log in database format, listing the history of all updates installed on the system; it also includes the current status of updates waiting to be installed.

Now, the typical solution [citation needed] that I had seen on the interwebs was to create a backup of this file and then delete it. After that, Windows Update won't list the history of updates installed on the system, and that would be that. (Well, there's more to it than that.)

Before manipulating DataStore.edb, stop the Windows Update service, because while that service is running, the DataStore file is in use.
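As a quick sketch of that step (assuming the service name `wuauserv`, which is the Windows Update service), from a Command Prompt launched as Administrator:

```shell
rem Stop the Windows Update service so DataStore.edb is no longer in use:
net stop wuauserv

rem ... back up or defragment DataStore.edb here ...

rem Start the service again afterwards:
net start wuauserv
```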

esentutl

For a short while I thought that defragmenting this one file with a command-line tool called esentutl would be the solution, so I went ahead with it. The catch was that the tool could not copy the defragmented file back to its original location and yielded an error about it, noting that the defragmented temp file could still be moved manually to the original location, replacing the original file. I didn't do that and left it at that.

By the way, the esentutl tool does not say where the .TMP file is located. I eventually found out that it ends up in the profile folder of the logged-in user:
C:\Users\username

For a long time I couldn't put my finger on what was not working; a month or two later it turned out that I had not been running the esentutl command as Administrator.

The full command for defragmenting the file went like this:
esentutl /d %windir%\SoftwareDistribution\DataStore\DataStore.edb

So I launched Command Prompt as Administrator and the tool did its job as expected. So that was that.

But checking for updates in Windows Update took a lot of time anyway, and the hard disk still kept thrashing when checking for updates.

What happens when you remove DataStore.edb

Note that you will still have to back up the file, just in case...
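For completeness, a hedged sketch of that backup-then-delete approach (using the file location given earlier and the `wuauserv` service name; run as Administrator, and note that the rest of this section argues the deletion is pointless):

```shell
rem Stop the Windows Update service first, so the database file is not in use:
net stop wuauserv

rem Back up the database, then delete it so Windows Update rebuilds it:
copy "%WINDIR%\SoftwareDistribution\DataStore\DataStore.edb" "%TEMP%\DataStore.edb.bak"
del "%WINDIR%\SoftwareDistribution\DataStore\DataStore.edb"

rem Restart the service; a fresh DataStore.edb is created on the next update check:
net start wuauserv
```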

Well, Windows Update then knows no history of previous updates and takes a long time to check for them, maybe an hour or so, presumably because it checks updates file-by-file for nearly all Microsoft software present. Before, when the still-large DataStore.edb was there, checking for updates seemed to take much, much less time. Since I didn't measure or compare the actual minutes, I can't give any reliable numbers on the difference in update-checking times.

Anyways, after installing the updates, DataStore.edb was recreated and grew to about the same size as before (over 315 MB). So there is really no point in deleting the file.

Worse, there doesn't seem to be any built-in way to merge two separate DataStore.edb files into one cohesive database if the older database was absent for a short while and another one was created anew.

Then I just moved the recently backed-up DataStore.edb back, and checking for updates in Windows Update took about ten minutes, including a reasonably minor database refresh on account of the May updates installed in the interim, which were not yet reflected there.

To keep Windows Update from thrashing the hard drive anyway, it's best not to run the Windows Update service at all on such a low amount of RAM (1 GB), except perhaps on the second Tuesday of each month.

So much for now.

28.04.2014 update:

I don't have that Vista computer at hand, or any other Vista computer in any useful proximity, so I can't tell with precision where to optimize with regard to Windows Update.

The point is that it's possible to keep Windows Update from checking for updates at every startup by changing its scheduling.
  • I remember there being a Task Scheduler snap-in in Microsoft Management Console, which in Windows Vista replaced the Scheduled Tasks folder of Windows XP. The Task Scheduler snap-in is a lot more complex and allows very granular configuration options.
  • Turning off automatic updates is useful only for very experienced users. If the computer has behaved well, then I've usually set the Windows Update service to delayed start. Attempt this at your own risk: even if the risk is low, it doesn't account for unintended behaviour, especially when making large updates, such as Windows service pack upgrades.
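A sketch of that delayed-start setting from an elevated Command Prompt (again assuming the `wuauserv` service name; the space after `start=` is required by the `sc` syntax):

```shell
rem Set the Windows Update service to delayed automatic start:
sc config wuauserv start= delayed-auto

rem To restore the default automatic start later:
sc config wuauserv start= auto
```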

I also hazily remember a separate update, not advertised on Windows Update itself, that must also have improved the situation, if only a bit. The overall result was that the DataStore file was touched only when I launched the Windows Update program myself.

3 comments:

Anonymous said...

Found this post too late, otherwise I could have saved myself some effort in trying to reduce DataStore.edb. I ran into exactly the same problems as mentioned above. After renaming the folder and running Windows Update, the .edb was as big as it was before.
Thanks to Mardus for the clear description.

Anonymous said...

Thanks. I've had this problem for ages; I only recently learned how to use Resource Monitor to find that it was the DataStore.edb file.

I'm going to try turning automatic updates off as suggested, and put a reminder in my calendar.

I don't know if adding more RAM will help anyone, I've got 4GB.

Mardus said...

4 GB is reasonably good.

I usually also have indexing turned off, or at the very least targeting only very specific folders.

With the example machine that I had at my disposal, Vista with 1 GB will run, but even with most apps that would otherwise run at startup removed, the operating system itself still uses a third or a fourth of that one gigabyte.

If the release of Vista had been delayed for a year, giving more time for testing and optimizing to resolve some of the things that became serious issues only after an extended period of usage, then the end result would have been better.