BHeapSampler: Arndt's Java Heap Analysis Tool FAQ
Questions & Answers
What is BHeapSampler?
BHeapSampler is a Java memory analysis tool that presents a graph view of a Java heap dump.
How does it work?
It takes a size-weighted random sample of objects, computes a root path for each sampled object, and adds these paths up into a graph.
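A minimal sketch of the two steps that can be shown without the dump parser: drawing objects with probability proportional to their size, and adding root paths up into weighted edges. This is not BHeapSampler's code; the class HeapSamplingSketch and its methods are hypothetical names, the root-path computation itself is left out, and paths are assumed to arrive as lists of class names with the root first.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical illustration, not part of BHeapSampler.
final class HeapSamplingSketch {
    private final long[] cumulative; // cumulative[i] = sizes[0] + ... + sizes[i]

    /** sizes[i] is the size in bytes of object i; assumes at least one object and all sizes > 0. */
    HeapSamplingSketch(long[] sizes) {
        cumulative = new long[sizes.length];
        long sum = 0;
        for (int i = 0; i < sizes.length; i++) {
            sum += sizes[i];
            cumulative[i] = sum;
        }
    }

    long totalSize() {
        return cumulative.length == 0 ? 0 : cumulative[cumulative.length - 1];
    }

    /** Maps a byte position in [0, totalSize) to the index of the object covering it. */
    int indexFor(long bytePosition) {
        int pos = Arrays.binarySearch(cumulative, bytePosition);
        // An exact hit on cumulative[i] is the first byte of object i + 1.
        return pos >= 0 ? pos + 1 : -pos - 1;
    }

    /** Draws one object index with probability proportional to its size. */
    int sample(Random rnd) {
        long position = (long) (rnd.nextDouble() * totalSize());
        // Clamp in case floating-point rounding lands exactly on totalSize.
        return indexFor(Math.min(position, totalSize() - 1));
    }

    /** Adds up root paths (class names, root first) into edges counted by number of paths. */
    static Map<String, Integer> addUp(List<List<String>> rootPaths) {
        Map<String, Integer> edgeCounts = new LinkedHashMap<>();
        for (List<String> path : rootPaths) {
            for (int i = 0; i + 1 < path.size(); i++) {
                edgeCounts.merge(path.get(i) + " -> " + path.get(i + 1), 1, Integer::sum);
            }
        }
        return edgeCounts;
    }
}
```

Because the sample is size-weighted, each sampled path stands for roughly the same number of bytes, so under these assumptions the count on an edge is proportional to the memory reached through it.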
Why a graph? How does it compare to Eclipse-MAT's Dominator Tree?
BHeapSampler generates a class-level graph. Eclipse-MAT's Dominator Tree is an instance-level tree.
These are two totally different concepts. BHeapSampler achieves information compaction
by projecting the instance graph onto a graph whose nodes more or less correspond to classes.
"More or less" because a node's identity can be anything between instance
and class, subject to configuration. Most often it is class identity, and second most often it is
"parent identity", meaning that the node inherits its identity from its referrers.
How is the result presented?
BHeapSampler does not include a graph layouter. The resulting graph is stored in the DOT language
(see http://en.wikipedia.org/wiki/DOT_language). Graphical layout can be done using
GraphViz, e.g. for PDF format via the command line:
"dot -Tpdf -omemory_graph.pdf memory_graph.dot"
Why statistical and not exact? What about the statistical error?
Just for algorithmic reasons. The graph view was derived from a similar tool
for performance analysis based on statistical stack sampling; I just replaced the
stack traces with memory paths. That's why it's still statistical. I never thought
about an exact approach; maybe that's possible.
However, statistical error is not the problem when analysing a heap dump. Conceptual problems in assigning
memory allocation to responsible structures are where the headache comes from.
Statistical error can always be made as low as needed by calculating enough paths.
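As a rough guide to "enough paths", here is a textbook estimate rather than anything taken from the tool. It assumes the n sampled paths are independent, that p is the true share of the heap reached through a given node, and that k is the number of sampled paths passing through that node:

```latex
\hat{p} = \frac{k}{n}, \qquad
\operatorname{SE}(\hat{p}) = \sqrt{\frac{p\,(1-p)}{n}} \;\le\; \frac{1}{2\sqrt{n}}
```

Under these assumptions, 10,000 paths bound the standard error of any node's share at 0.5 percentage points, and halving the error takes four times as many paths.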
What is the maximum size of heap dumps BHeapSampler can process?
BHeapSampler loads part of the heap dump into its own memory, but never allocates more memory than 2/3 of the
size of the dump. So there's no limit on dump size, just a 2-billion limit on object count (32-bit max-int).
Just give the tool enough heap memory via -Xmx.
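The 2/3 rule can be turned into a quick upper bound for the -Xmx value. The sketch below is only arithmetic on the dump's file size; HeapBudgetSketch and its methods are hypothetical helpers, not something shipped with BHeapSampler, and the result does not include any headroom for the JVM itself.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of BHeapSampler.
final class HeapBudgetSketch {
    private static final long MIB = 1024L * 1024L;

    /** Upper bound from the 2/3 rule, rounded up to whole MiB. */
    static long xmxUpperBoundMiB(long dumpSizeBytes) {
        long twoThirds = (2 * dumpSizeBytes + 2) / 3; // ceil(2/3 * size)
        return (twoThirds + MIB - 1) / MIB;           // round up to whole MiB
    }

    /** Builds a JVM flag such as "-Xmx2048m" from the size of the dump file. */
    static String xmxFlagFor(String dumpFile) throws IOException {
        return "-Xmx" + xmxUpperBoundMiB(Files.size(Paths.get(dumpFile))) + "m";
    }
}
```

For a 3 GiB dump this gives 2048 MiB. Adding some margin on top is a judgment call.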
Which heap-dump-formats are supported?
BHeapSampler reads heap dumps in HPROF binary format. The extended HPROF format used by Android/Dalvik
can be processed as well (tested for Gingerbread only; use the hprof-conv tool if it fails).
Does BHeapSampler ignore weak references in path-finding?
Yes and no. It uses sorted avoid-class-lists, which by default avoid Weak-/Soft-Refs
and the finalizer queue, and it prefers static over dynamic roots. So weak paths are found
if and only if no strong path exists, and the finalizer queue is found if and only
if the object is finalizable.
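One way to get the "weak path only if no strong path exists" behaviour is to search with the full avoid list first and relax it step by step when nothing is found. The sketch below is an assumption about how such a search could look, not the tool's actual algorithm: AvoidListPathFinder, its methods, the string node ids, and the relaxation order (drop entries from the end of the list) are all hypothetical, and the preference for static over dynamic roots is not modelled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration, not part of BHeapSampler.
final class AvoidListPathFinder {
    /** BFS from target towards any root, never stepping onto a node whose class is avoided. */
    static List<String> shortestPath(String target, Set<String> roots,
            Map<String, List<String>> referrers, Map<String, String> classOf,
            Set<String> avoided) {
        Map<String, String> cameFrom = new HashMap<>(); // node -> neighbour nearer the target
        Deque<String> queue = new ArrayDeque<>();
        queue.add(target);
        cameFrom.put(target, null);
        while (!queue.isEmpty()) {
            String node = queue.poll();
            if (roots.contains(node)) {
                List<String> path = new ArrayList<>();
                for (String n = node; n != null; n = cameFrom.get(n)) {
                    path.add(n);
                }
                return path; // root first, target last
            }
            for (String ref : referrers.getOrDefault(node, List.of())) {
                if (cameFrom.containsKey(ref) || avoided.contains(classOf.get(ref))) {
                    continue; // already visited, or on the avoid list
                }
                cameFrom.put(ref, node);
                queue.add(ref);
            }
        }
        return null; // no path under this avoid set
    }

    /** Tries the full avoid list first, then drops entries from the end until a path exists. */
    static List<String> findPath(String target, Set<String> roots,
            Map<String, List<String>> referrers, Map<String, String> classOf,
            List<String> avoidList) {
        for (int keep = avoidList.size(); keep >= 0; keep--) {
            Set<String> avoided = new HashSet<>(avoidList.subList(0, keep));
            List<String> path = shortestPath(target, roots, referrers, classOf, avoided);
            if (path != null) {
                return path;
            }
        }
        return null; // unreachable from any root
    }
}
```

With java.lang.ref.WeakReference on the avoid list, an object that is reachable both through a short weak path and a longer strong path is reported via the strong one; the weak path only shows up once the strong reference is gone.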
How can I find an alternative path if the graph shows a non-exclusive, shortest path?
You can either use the random-walk path finder, or you can modify the avoid-class-lists
to avoid the path that you already know.
Why are there two different avoid class lists?
The avoidGhostList is used for ghost-like references (Weak/Soft/Finalizers). The avoidClassList
is used to specify preferences between normal, strong paths. The technical difference is that
paths avoided via the ghost-list do not prevent the strong path from being displayed as exclusive.
Is BHeapSampler better than Eclipse-MAT?
BHeapSampler is powerful at what it does: presenting an intuitive view of the dominating structures of a Java heap.
It can do that in production environments where transferring the heap dump to the developer's desk may
fail due to size or security restrictions. However, it's just a command-line tool and not
a full-featured memory debugger, so comparing it with other tools is comparing apples and oranges.
What is the development status of BHeapSampler?
Statistical heap analysis is a new idea, and BHeapSampler was started as a proof of concept with
a limited time budget. It has evolved into a field-proven tool, but there's no ongoing active development
other than bugfixes. It is presented here as a binary "run-and-see" version with limited configuration
(e.g. with a hardcoded identity policy) mainly to present the concept of getting an intuitive view of
memory structures via a class-level graph. It would be nice to see the community pick up the idea for
an open-source project or for integration into existing tools.
You provide BHeapSampler as an obfuscated binary. How can I be sure that it does not do
any bad things like calling home or tampering with my system?
It does nothing like that. It just reads the dump and writes two files. I did not use any aggressive
obfuscation options and it's only 20k of bytecode, so you can just decompile it and read what it's doing.