Refactor ifdefs so that format_backtrace() is exposed for _WIN32, and remove _WIN32 ifdefs from code outside Platform.cpp.
In shell commands emitted by format_backtrace, change the addr2line command to specify the full path to the addr2line binary, to avoid people getting hung up by the broken shell alias in /root/.bashrc. As far as I can tell this alias exists to invoke gcc- or clang-specific addr2line binaries, which is easy enough to solve in the .cpp code.
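For illustration, a minimal sketch of emitting such a command. `buildAddr2LineCommand` is a hypothetical helper, not the actual format_backtrace() code, and the addr2line path used with it is an assumption; the real code would pick the gcc- or clang-specific binary.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not the real format_backtrace()): builds an addr2line
// command that names the tool by absolute path, so a shell alias called
// "addr2line" is never consulted.
std::string buildAddr2LineCommand(const std::string& addr2linePath,
                                  const std::string& binaryPath,
                                  const std::vector<std::uintptr_t>& addrs) {
    // Only a path starting with '/' is guaranteed to bypass alias lookup.
    assert(!addr2linePath.empty() && addr2linePath[0] == '/');
    std::ostringstream cmd;
    // -e: binary, -p: pretty-print, -C: demangle, -f: function names, -i: inlines
    cmd << addr2linePath << " -e " << binaryPath << " -p -C -f -i";
    cmd << std::hex;
    for (std::uintptr_t a : addrs) {
        cmd << " 0x" << a;
    }
    return cmd.str();
}
```

The binary name is passed through unchanged, so the caller decides which fdbserver binary the command refers to.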
Delete logic that naively appends "debug" related suffixes to the fdbserver binary name. The code doesn't know if it's a debug binary or not, and if it's not, the most obvious solution is to rebuild everything. Copying binaries around manually is not needed and the emitted commands shouldn't encourage this, as far as I can tell. They should just use the given binary name.
Tested the addr2line invocations by inserting some segfaults and running the commands (with minor updates to give the absolute path to the fdbserver binary). Tested on gcc and clang debug builds.
* Specify absolute path to addr2line binaries, to avoid a broken shell alias. Also, move _WIN32 ifdef'd code out of general-purpose source files and into Platform.cpp.
* s/brain dead/not helpful/
* fix comment
* Platform.actor.cpp: reduce size of __unixish__ block to enable format_backtrace to be defined for _WIN32
* format code
* fix i vs j bug in _WIN32 code; obviously not compiled or tested by us
* Refactor getDiskStats() now that it returns about 10 values.
Compute byte stats from sector level stats by assuming sectors are 512 bytes
and using multiplication. Rationale for why sectors can be assumed to be 512
bytes is given in a comment.
Add byte level rate stats to the trace output emitted in customSystemMonitor().
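A rough sketch of the sector-to-byte arithmetic described above. The function names are hypothetical and this is not the actual getDiskStats() code; the 512-byte sector size is the assumption stated in this commit.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Assumption taken from the commit text: sector counters are in 512-byte units.
constexpr std::uint64_t kAssumedSectorBytes = 512;

// Converts a sector count to bytes by multiplication.
std::uint64_t sectorsToBytes(std::uint64_t sectors) {
    // Guard against overflow of the 64-bit product.
    assert(sectors <= std::numeric_limits<std::uint64_t>::max() / kAssumedSectorBytes);
    return sectors * kAssumedSectorBytes;
}

// Byte rate between two cumulative sector counters. Returns 0 if the interval
// is not positive or the counter went backwards (for example after a reset).
double bytesPerSecond(std::uint64_t prevSectors, std::uint64_t curSectors, double elapsedSeconds) {
    if (elapsedSeconds <= 0.0 || curSectors < prevSectors) {
        return 0.0;
    }
    return static_cast<double>(sectorsToBytes(curSectors - prevSectors)) / elapsedSeconds;
}
```

For example, 200 sectors over a 2 second interval works out to 200 * 512 / 2 = 51200 bytes per second.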
Testing: passed 99995 runs here:
20250715-001140-gglass-90a10b1f1bdd221a compressed=True data_size=41293265 duration=5559400 ended=100000 fail=5 fail_fast=10 max_runs=100000 pass=99995 priority=100 remaining=0 runtime=1:26:20 sanity=False started=100000 stopped=20250715-013800 submitted=20250715-001140 timeout=5400 username=gglass
I couldn't figure out the cause of the 5 failures, but I don't think they are related to my change.
* Fix compile error on macOS relating to readMilliSecs et al. now being members of diskStats
---------
Co-authored-by: Gideon Glass <gglass_glass@apple.com>
* Print an asan heap profile on OOM
* Use 32KiB stacks for boost coro
* Print 100%, 10 max contexts for asan OOM
* Lower machineCount to 30 in DataLossRecovery test
* Add asanMachineCount override to control ASAN memory usage
Per #7952, content in /sys/fs/cgroup/cpu,cpuacct/cpu.stat is reported in
MachineMetrics.
If the file does not exist, reports `NoCpuStatFile`.
If the file is not parsable, reports `CpuStatFileParseError`.
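A minimal sketch of the two error cases, assuming the file consists of whitespace-separated `key value` lines. `parseCpuStat`, `readCpuStat` and the enum are hypothetical names for illustration, not the actual implementation; only the two error names come from this commit.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical result type; the two error names mirror the reported values.
enum class CpuStatResult { Ok, NoCpuStatFile, CpuStatFileParseError };

// Parses "key value" lines from an already-opened stream into `out`.
CpuStatResult parseCpuStat(std::istream& in, std::map<std::string, std::uint64_t>& out) {
    if (!in) {
        return CpuStatResult::NoCpuStatFile; // stream failed to open
    }
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        std::istringstream fields(line);
        std::string key, extra;
        std::uint64_t value = 0;
        // Exactly two fields are expected; anything else is a parse error.
        if (!(fields >> key >> value) || (fields >> extra)) {
            return CpuStatResult::CpuStatFileParseError;
        }
        assert(!key.empty()); // holds whenever extraction succeeded
        out[key] = value;
    }
    return CpuStatResult::Ok;
}

// Opens `path` and parses it; a missing file maps to NoCpuStatFile.
CpuStatResult readCpuStat(const std::string& path, std::map<std::string, std::uint64_t>& out) {
    std::ifstream file(path);
    return parseCpuStat(file, out);
}
```

Parsing from a stream rather than a path keeps the parse logic testable without a cgroup filesystem.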
Changing the `memory` option to limit resident memory instead of virtual memory, both in the config file and in the fdbserver/fdbbackup/fdbcli command-line argument. Since `rlimit` doesn't support limiting resident memory, the current implementation has both fdbmonitor and the fdbserver/fdbbackup process check the process RSS periodically, and kill and restart the process if the limit is exceeded.
Adding a new `memory_vsize` option to limit virtual memory, if backward-compatible behavior is desired.
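A sketch of what one iteration of such a periodic RSS check could look like on Linux. It assumes RSS is taken from the second field of /proc/<pid>/statm (resident pages) and that a limit of 0 means "no limit"; both are illustrative assumptions, and the helper names are hypothetical rather than the actual fdbmonitor code.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Extracts resident bytes from the text of Linux /proc/<pid>/statm, whose
// first two fields are total program size and resident set size, in pages.
// Returns false if the text cannot be parsed.
bool residentBytesFromStatm(const std::string& statm, std::uint64_t pageSize, std::uint64_t& rssBytes) {
    assert(pageSize > 0);
    std::istringstream in(statm);
    std::uint64_t totalPages = 0, residentPages = 0;
    if (!(in >> totalPages >> residentPages)) {
        return false;
    }
    rssBytes = residentPages * pageSize;
    return true;
}

// Decision made on each periodic check. Assumption: limitBytes == 0 disables the limit.
bool shouldRestartForMemory(std::uint64_t rssBytes, std::uint64_t limitBytes) {
    return limitBytes != 0 && rssBytes > limitBytes;
}
```

A monitor loop would read the statm text, call these two helpers on each tick, and kill and restart the process when the second one returns true.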
closes #6671, closes #6672
* Add contrib/debug_determinism
Add an instrumentation-based technique for debugging unseed mismatches. Also guard a few existing sources of nondeterminism that don't affect unseed with the DEBUG_DETERMINISM macro.
Also change the simulated run loop to not run as the only task inside the real run loop, since that was a source of nondeterminism.
Also fix nondeterminism from calling timer_int
* Add StorageMetadataType::currentTime
Basically a deterministic-in-simulation version of timer_int that we can
use instead of timer_int for StorageMetadataType::createdTime
1. Introduce processDiskReadSeconds and processDiskWriteSeconds, which stand for disk read/write times `since the last logging`. They can only be obtained on Linux and macOS, and will be 0 on Windows and FreeBSD;
2. Rename `busyTicks` to `IOMilliSecs`;
3. On FreeBSD, the metrics should be collected across all devices.
This change seems to be incorrect since afaict INetwork::timer isn't
guaranteed to be monotonic. Maybe we can make that guarantee or add an
INetwork::timer_monotonic symbol?
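One way to "make that guarantee" could be to clamp whatever timer is available so that it never goes backwards. The sketch below is only an illustration of that idea: `MonotonicTimer` is a hypothetical wrapper, not an existing INetwork API, and it hides backward jumps rather than correcting them.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical wrapper: turns any timer source into a non-decreasing one by
// returning the largest value observed so far.
class MonotonicTimer {
public:
    explicit MonotonicTimer(std::function<double()> source) : source_(std::move(source)) {
        assert(source_); // a callable source is required
        last_ = source_();
    }

    // Never returns a value smaller than a previously returned one.
    double now() {
        last_ = std::max(last_, source_());
        return last_;
    }

private:
    std::function<double()> source_;
    double last_ = 0.0;
};
```

Because the wrapper takes the source as a parameter, a simulated timer could be passed in under simulation, which should keep the result deterministic as long as the source itself is.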