Sönke Ludwig • Tue, 16 Jan 2024

Stability and Performance #2 · Aspect Preview 35

Following up on the previous release, preview 35 again focuses on improving stability. Most importantly, we did additional work on memory consumption issues, to the point where ingesting large numbers of photos now works reliably within the expected bounds for memory consumption, under all circumstances that we could reproduce.

As a quick warning, this announcement has turned out more technical than usual. If you don't know what a memory leak or a garbage collector is, the important message is just that the application now runs more reliably when adding many images to the library, especially on computers without a huge amount of RAM. However, for anyone interested in the technical details, this is the background:

After fixing what seemed to be a memory leak in the previous release, and thinking that we were done, reports of excessive memory use unfortunately still turned up for the new release, happening either while scanning for images or during the image analysis process. Fortunately, after testing on different machines and with different collections of images, we were eventually able to reliably reproduce this on a MacBook Pro, so that we could start to investigate the source(s) of the problem (also thanks to a particularly useful bug report!). As it turned out, the reason for this behavior was actually very similar to the one fixed in the previous release.

Under heavy load, with all CPU cores working (which is the case both during scanning and during the image analysis process), the garbage collector, which is responsible for managing the vast number of memory allocations within the application, sometimes does not collect memory frequently enough to keep up with new allocations, causing more and more memory to be reserved for the application. The fact that this depends heavily on the actual workload, as well as on the speed of the disk and the number of CPU cores, is what made it so hard to reproduce.

It should be said that this is not a true memory leak, as the garbage collector will eventually be able to reuse the memory for other tasks. However, it is not able to fully give the memory back to the operating system, so that it will still cause the computer to slow down due to excessive swapping once all of the available RAM is used up.

Now, finding the places that caused this behavior was not an easy task. It involved the use of a custom debugger (which I also plan to publish eventually, as it can be a huge help in other situations, too) and tapping into the garbage collector to find out which allocations might be responsible, of which there are a LOT. After days of testing and looking at debug output, we finally started to track down a handful of places that allocated temporary buffers, ranging from a few hundred kilobytes to a few megabytes in size.

After changing those, one by one, to manually allocate and free the memory instead of relying on the garbage collector, we could see the problem gradually shrinking, until we eventually ended up at pretty much exactly the amount of memory use that would be expected!
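For readers who want to see what "manually allocate and free" means in practice, here is a minimal sketch in C. It is not the application's actual code: the function name `process_image`, its parameters and the checksum it computes are hypothetical stand-ins for any per-image step that needs a temporary buffer. The only point it illustrates is that the buffer is handed back the moment the step is done, instead of waiting for a collector to notice that it is unused.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-image step that needs a temporary scratch buffer.
   Returns 0 on success, -1 if the buffer could not be allocated. */
int process_image(const uint8_t *pixels, size_t size, uint32_t *checksum_out)
{
    assert(checksum_out != NULL);
    assert(pixels != NULL || size == 0);

    /* Nothing to do for an empty image; avoid a zero-sized allocation. */
    if (size == 0) {
        *checksum_out = 0;
        return 0;
    }

    /* Explicitly allocate the temporary buffer for this step only. */
    uint8_t *scratch = malloc(size);
    if (scratch == NULL)
        return -1;

    /* Placeholder work: copy the pixels and sum the bytes. */
    memcpy(scratch, pixels, size);
    uint32_t sum = 0;
    for (size_t i = 0; i < size; i++)
        sum += scratch[i];

    /* Release the buffer immediately, so its memory can be reused by the
       next image rather than piling up until a later collection. */
    free(scratch);

    *checksum_out = sum;
    return 0;
}
```

With this pattern, the peak memory held for such buffers is bounded by the number of steps running at the same time, rather than by how far allocations get ahead of collection.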

Together with the other bug fixes in this release, we are taking large steps toward the end of the beta stage. The change log lists all of the fixes in this release, which also include fixes for a few crashes that have been reported through the crash reporter.
