84 Commits

Author SHA1 Message Date
Jingyu Zhou
2d2a2144f4 Update copyright years to 2013-2026 (#12653)
No functional changes.
2026-01-22 10:49:41 -08:00
Zhe Wang
eefe635119 Use simple counter to replace recently added net2 counters (#12358)
* use simple counter to replace recently added net2 counters

* allow unit test to use SimpleCounter Trace event
2025-09-05 18:30:36 -07:00
Zhe Wang
a2940a7e89 add more network counters (#12349) 2025-09-04 10:33:47 -07:00
gxglass
b4f6716434 Updates to Platform.cpp relating to format_backtrace (#12342)
Refactor ifdefs to make format_backtrace() be exposed for _WIN32, and remove WIN32 ifdefs in non-Platform.cpp code.

In shell commands emitted by format_backtrace, change the addr2line command to specify the full path to the addr2line binary, to avoid people getting hung up by the broken shell alias in /root/.bashrc. As far as I can tell this alias exists to invoke gcc- or clang-specific addr2line binaries, which is easy enough to solve in the .cpp code.

Delete logic that naively appends "debug" related suffixes to the fdbserver binary name. The code doesn't know if it's a debug binary or not, and if it's not, the most obvious solution is to rebuild everything. Copying binaries around manually is not needed and the emitted commands shouldn't encourage this, as far as I can tell. They should just use the given binary name.

Tested the addr2line invocations by inserting some segfaults and running the commands (with minor updates to give the absolute path to the fdbserver binary). Tested on gcc and clang debug builds.


* Specify absolute path to addr2line binaries, to avoid a broken shell alias.  Also, move _WIN32 ifdefd code out of general purpose source files and into Platform.cpp

* s/brain dead/not helpful/

* fix comment

* Platform.actor.cpp: reduce size of __unixish__ block to enable format_backtrace to be defined for _WIN32

* format code

* fix i vs j bug in _WIN32 code; obviously not compiled or tested by us
2025-09-02 16:55:07 -07:00
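As an illustration of the addr2line change described in the commit above, here is a minimal sketch of building such a command with an absolute path to the binary. The helper name `buildAddr2LineCommand` and its parameters are hypothetical and are not taken from Platform.cpp; the flags `-e`, `-p`, `-C`, `-f` and `-i` are standard addr2line options.

```cpp
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not the actual format_backtrace code): builds an
// addr2line command line for a list of return addresses.
// `addr2linePath` is assumed to be an absolute path chosen by the caller,
// so a shell alias named "addr2line" cannot interfere.
std::string buildAddr2LineCommand(const std::string& addr2linePath,
                                  const std::string& binaryPath,
                                  const std::vector<std::uintptr_t>& addresses) {
    std::ostringstream cmd;
    // -e: binary to inspect, -p: one line per frame, -C: demangle names,
    // -f: show function names, -i: show inlined frames.
    cmd << addr2linePath << " -e " << binaryPath << " -p -C -f -i";
    for (std::uintptr_t addr : addresses) {
        cmd << " 0x" << std::hex << addr;
    }
    return cmd.str();
}
```

The binary name is passed through unchanged, matching the commit's point that no "debug" suffix should be appended.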
Zhe Wang
80d7ff6c3e Add connection counters (#12292)
* add connection counters

* address comments
2025-08-11 10:49:56 -07:00
gxglass
91574960c2 Add server-side metrics for TLS handshakes on side threads and on the main thread. (#12269)
Rename existing (just-added) metrics to indicate that they are client-side.

Co-authored-by: Gideon Glass <gglass_glass@apple.com>
2025-07-22 20:41:31 -07:00
gxglass
8d9f762346 Add metrics to count how often we run TLS handshake on the main thread vs side threads. (#12268)
Also add a comment suggesting that ever using the main thread for this is a bad idea.

Co-authored-by: Gideon Glass <gglass_glass@apple.com>
2025-07-22 16:42:02 -07:00
gxglass
a602543df2 Refactor getDiskStatistics() and add stats for bytes read and bytes written (#12247)
* Refactor getDiskStats() now that it returns about 10 values.

Compute byte stats from sector level stats by assuming sectors are 512 bytes
and using multiplication.  Rationale for why sectors can be assumed to be 512
bytes is given in a comment.

Add byte level rate stats to the trace output emitted in customSystemMonitor().

Testing: passed 99995 runs here:

  20250715-001140-gglass-90a10b1f1bdd221a            compressed=True data_size=41293265 duration=5559400 ended=100000 fail=5 fail_fast=10 max_runs=100000 pass=99995 priority=100 remaining=0 runtime=1:26:20 sanity=False started=100000 stopped=20250715-013800 submitted=20250715-001140 timeout=5400 username=gglass

I couldn't figure out the cause of the 5 failures, but I don't think they are related to my change.

* Fix compile error on macOS relating to readMilliSecs et al now being members of diskStats

---------

Co-authored-by: Gideon Glass <gglass_glass@apple.com>
2025-07-15 14:02:10 -07:00
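The sector-to-byte conversion described in the commit above can be sketched as follows. This is a rough illustration only: the struct and function names are hypothetical and do not come from getDiskStatistics(), and the 512-byte sector size is the assumption the commit message itself states.

```cpp
#include <cstdint>

// Assumption stated in the commit message: sector counters are in
// 512-byte units.
constexpr std::uint64_t kSectorBytes = 512;

// Hypothetical result type; the real diskStats struct has more members.
struct DiskByteStats {
    std::uint64_t bytesRead;
    std::uint64_t bytesWritten;
};

// Converts cumulative sector counters into cumulative byte counters.
constexpr DiskByteStats sectorsToBytes(std::uint64_t sectorsRead, std::uint64_t sectorsWritten) {
    return DiskByteStats{ sectorsRead * kSectorBytes, sectorsWritten * kSectorBytes };
}

// Turns two cumulative byte samples into a rate. Returns 0 when the interval
// is not positive or the counter went backwards (for example, after a reset).
inline double bytesPerSecond(std::uint64_t previousBytes, std::uint64_t currentBytes, double elapsedSeconds) {
    if (elapsedSeconds <= 0.0 || currentBytes < previousBytes) {
        return 0.0;
    }
    return static_cast<double>(currentBytes - previousBytes) / elapsedSeconds;
}
```

For example, 8 sectors read correspond to 4096 bytes, and 4096 bytes over a 2-second interval give a rate of 2048 bytes per second.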
Syed Paymaan Raza
c3e7542cda Update end year in copyright header 2024-08-02 09:40:11 -07:00
Johannes M. Scheuermann
b2b3a8e791 Fix compile issue for ALLOC_INSTRUMENTATION 2024-02-01 14:48:04 +01:00
Andrew Noyes
218cda3cf6 Lower ASAN memory usage (#9216)
* Print an asan heap profile on OOM

* Use 32KiB stacks for boost coro

* Print 100%, 10 max contexts for asan OOM

* Lower machineCount to 30 in DataLossRecovery test

* Add asanMachineCount override to control ASAN memory usage
2023-01-24 13:04:47 -08:00
sfc-gh-tclinkenbeard
5a1a969343 Trace data hall id in MachineMetrics events 2023-01-11 10:02:31 -08:00
Sreenath Bodagala
774fc1168e - Log FoundationDB version as part of "ProcessMetrics". 2022-12-12 21:16:30 +00:00
Xiaoge Su
970463223c Merge branch 'main' into main 2022-09-20 16:56:56 -07:00
A.J. Beamon
4fd64630e8 Convert literal string ref instances to use _sr suffix 2022-09-19 11:35:58 -07:00
Xiaoge Su
8130fce97f Update code per comments
Also sort #include in Platform.actor.cpp
2022-09-12 11:44:41 -07:00
Xiaoge Su
92eaf53da3 Reads and reports cpu.stat
Per #7952, content in /sys/fs/cgroup/cpu,cpuacct/cpu.stat is reported in
MachineMetrics.

If the file does not exist, reports `NoCpuStatFile`.

If the file is not parsable, reports `CpuStatFileParseError`.
2022-09-12 11:43:10 -07:00
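A minimal sketch of the behavior described in the commit above, assuming cpu.stat consists of whitespace-separated "key value" lines. The function names and the "Ok" status are hypothetical; only `NoCpuStatFile` and `CpuStatFileParseError` come from the commit message.

```cpp
#include <cstdint>
#include <fstream>
#include <istream>
#include <map>
#include <optional>
#include <sstream>
#include <string>

using CpuStat = std::map<std::string, std::uint64_t>;

// Parses "key value" lines. Returns std::nullopt if any non-empty line does
// not consist of exactly one key followed by one unsigned integer.
std::optional<CpuStat> parseCpuStat(std::istream& in) {
    CpuStat stats;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) {
            continue;
        }
        std::istringstream fields(line);
        std::string key;
        std::uint64_t value = 0;
        std::string extra;
        if (!(fields >> key >> value) || (fields >> extra)) {
            return std::nullopt;
        }
        stats[key] = value;
    }
    return stats;
}

// Hypothetical wrapper: returns the status to report and fills `out` on success.
// "Ok" is an invented placeholder for the success case.
std::string readCpuStat(const std::string& path, CpuStat& out) {
    std::ifstream file(path);
    if (!file) {
        return "NoCpuStatFile";
    }
    std::optional<CpuStat> parsed = parseCpuStat(file);
    if (!parsed) {
        return "CpuStatFileParseError";
    }
    out = *parsed;
    return "Ok";
}
```

Separating parsing from file access keeps the parser testable against an in-memory stream.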
Yi Wu
994b8c92f8 Add option to limit resident memory and remove default memory limit (#6719)
Change the `memory` option to limit resident memory instead of virtual memory, in the config file and in the fdbserver/fdbbackup/fdbcli command-line argument. Since `rlimit` doesn't support limiting resident memory, the current implementation has both fdbmonitor and the fdbserver/fdbbackup process check the process RSS periodically, and kill and restart the process if the limit is exceeded.

Add a new `memory_vsize` option to limit virtual memory, if backward-compatible behavior is desired.

closes #6671, closes #6672
2022-04-06 20:06:24 -07:00
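The periodic RSS check described in the commit above could look roughly like the sketch below. It assumes Linux, where the second field of /proc/<pid>/statm is the resident set size in pages. The function names are hypothetical, and treating a limit of 0 as "no limit" is an assumption made for this example, not something the commit states.

```cpp
#include <cstdint>
#include <optional>
#include <sstream>
#include <string>

// Parses the contents of /proc/<pid>/statm (Linux). The first field is the
// total program size and the second is the resident set size, both in pages.
// Returns std::nullopt if the two leading fields cannot be read.
std::optional<std::uint64_t> residentBytesFromStatm(const std::string& statm, std::uint64_t pageSize) {
    std::istringstream in(statm);
    std::uint64_t totalPages = 0;
    std::uint64_t residentPages = 0;
    if (!(in >> totalPages >> residentPages)) {
        return std::nullopt;
    }
    return residentPages * pageSize;
}

// Assumption for this sketch: a limit of 0 means "no limit configured".
bool exceedsMemoryLimit(std::uint64_t residentBytes, std::uint64_t limitBytes) {
    return limitBytes != 0 && residentBytes > limitBytes;
}
```

A monitoring loop would read the file at a fixed interval, call these two functions, and trigger the kill-and-restart path when `exceedsMemoryLimit` returns true.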
Trevor Clinkenbeard
6390d93efd Merge pull request #6646 from sfc-gh-tclinkenbeard/fix-copyright-headers
Update copyright header dates
2022-03-21 16:49:20 -07:00
sfc-gh-tclinkenbeard
a71099471b Update copyright header dates 2022-03-21 13:36:23 -07:00
Steve Atherton
074698cdb1 Added 16k magazine size to memory stats. (#6639) 2022-03-21 13:30:27 -07:00
Andrew Noyes
7a9217a392 Add contrib/debug_determinism (#6389)
* Add contrib/debug_determinism

Add an instrumentation-based technique for debugging unseen mismatches. Also guard a few existing sources of nondeterminism that don't affect unseen with the DEBUG_DETERMINISM macro.

Also change the simulated run loop to not run as the only task inside the real run loop, since that was a source of nondeterminism.

Also fix nondeterminism from calling timer_int

* Add StorageMetadataType::currentTime

Basically a deterministic-in-simulation version of timer_int that we can
use instead of timer_int for StorageMetadataType::createdTime
2022-02-25 12:54:31 -08:00
Renxuan Wang
4a8e2a80e6 Improve/fix disk metrics.
1. Introduce processDiskReadSeconds and processDiskWriteSeconds, which stand for disk read/write times `since the last logging`. They can only be obtained on Linux and macOS, and will be 0 on Windows and FreeBSD;
2. Rename `busyTicks` to `IOMilliSecs`;
3. On FreeBSD, the metrics should be collected among all devices.
2022-01-27 14:40:32 -08:00
Yao Xiao
c8e6819a10 Add FastAlloc memory utilization trace. (#5739)
Co-authored-by: Yao Xiao <yaoxiao@Yaos-MacBook-Pro.local>
2021-10-11 15:06:43 -07:00
Zhe Wu
c07a07dbbe Take uptime into account when making failover decision 2021-10-07 11:19:34 -07:00
Trevor Clinkenbeard
0120a6ba72 Merge pull request #4936 from sfc-gh-tclinkenbeard/remove-string-copies
Remove unnecessary std::string copies from flow
2021-06-14 13:49:20 -07:00
sfc-gh-tclinkenbeard
399c2c96f0 Remove unnecessary std::string copies from flow 2021-06-09 11:40:01 -07:00
sfc-gh-tclinkenbeard
f28ac955c3 Remove unnecessary temporary objects while growing objects of type std::vector<std::pair<A, B>> 2021-05-10 16:32:50 -07:00
FDB Formatster
df90cc89de apply clang-format to *.c, *.cpp, *.h, *.hpp files 2021-03-10 10:18:07 -08:00
Andrew Noyes
877997632d Merge branch 'release-6.3' into anoyes/merge-release-6.3-master
Include conflict markers for review purposes
2020-12-04 01:38:07 +00:00
Andrew Noyes
1f541f02be Merge branch 'anoyes/merge-6.2-to-6.3' into anoyes/release-6.3-merge
Merge, leaving conflict markers for now
2020-11-24 16:55:34 +00:00
David Youngworth
d64cf8b9e3 Merge branch 6.3 into master 2020-11-17 11:22:45 -08:00
David Youngworth
d0391db862 Merge branch 'release-6.2' into release-6.3 2020-11-16 10:15:23 -08:00
Xiaoge Su
3a6948c199 Report histogram periodically 2020-11-12 17:04:33 -08:00
Vishesh Yadav
2c56d379b2 Merge pull request #3998 from dongxinEric/misc/attach-dcid-to-process-metrics-when-possible
Attach datacenter id to process, network, machine and memory metrics.
2020-11-06 10:54:23 -08:00
Andrew Noyes
c50e997f60 Make status tests deterministic
This change seems to be incorrect since afaict INetwork::timer isn't
guaranteed to be monotonic. Maybe we can make that guarantee or add an
INetwork::timer_monotonic symbol?
2020-11-05 17:05:34 +00:00
Russell Sears
32c87bbb33 Lightweight, power of two spaced histogram implementation + automatic reporting 2020-11-02 11:13:16 -08:00
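To illustrate what "power of two spaced" means in the commit above, here is a minimal sketch of such a histogram. It is not the implementation from that commit: the class name, the 32-bucket layout and the method names are assumptions chosen for the example.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Illustrative histogram with 32 buckets. Bucket i counts samples in
// [2^i, 2^(i+1)); the value 0 is also placed in bucket 0.
class PowerOfTwoHistogram {
public:
    // Returns floor(log2(value)) for value >= 1, and 0 for value == 0.
    static std::size_t bucketIndex(std::uint32_t value) {
        std::size_t index = 0;
        while (value > 1) {
            value >>= 1;
            ++index;
        }
        return index;
    }

    void sample(std::uint32_t value) { ++buckets_[bucketIndex(value)]; }

    std::uint64_t count(std::size_t bucket) const { return buckets_[bucket]; }

private:
    std::array<std::uint64_t, 32> buckets_{};
};
```

Because the bucket count is fixed and small, recording a sample costs only a few shifts and one increment, which is what makes this kind of histogram cheap enough to report periodically.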
Xin Dong
e73d189f88 Attach datacenter id to process, network, machine and memory metrics. 2020-10-30 11:20:40 -07:00
Young Liu
8cc3e4d3c6 Merge release-6.3 into master 2020-10-19 22:51:56 -07:00
Meng Xu
4dff55c4ea Add comment for PriorityStarved metrics
Metrics include:
PriorityStarvedBelowX, PriorityMaxStarvedBelowX and PriorityBusyX
2020-10-16 13:45:03 -07:00
sfc-gh-tclinkenbeard
0ac08f6a9b Replace NULL with nullptr in flow 2020-09-20 11:31:49 -07:00
Evan Tschannen
2f5359fa13 fix: lastRunLoopBusyness did count the currently active time 2020-08-31 09:21:44 -07:00
Evan Tschannen
f6f9aea09e fix: runLoopBusyness was always zero 2020-08-28 09:29:54 -07:00
Evan Tschannen
9e2ee1ed4c fixed lastedZeroBusy; added a knob 2020-08-17 23:16:59 -07:00
Evan Tschannen
c72068d6b5 clients load balance across proxies based on process busyness instead of number of requests 2020-08-12 17:17:21 -07:00
A.J. Beamon
d8690d31cd Merge branch 'master' into per-priority-busy-logging
# Conflicts:
#	flow/Net2.actor.cpp
2020-04-15 08:31:30 -07:00
A.J. Beamon
b1172417f5 Merge branch 'master' into per-priority-busy-logging
# Conflicts:
#	flow/Knobs.cpp
#	flow/Knobs.h
#	flow/Net2.actor.cpp
2020-04-14 14:22:12 -07:00
A.J. Beamon
e104a2e3a6 Merge commit 'cf01233f28a2c42908656a39f458a4475c1d44a3' into run-loop-busy-profiler
# Conflicts:
#	documentation/sphinx/source/release-notes.rst
#	fdbclient/NativeAPI.actor.h
#	fdbserver/fdbserver.actor.cpp
#	flow/Net2.actor.cpp
2020-04-14 14:02:24 -07:00
Alex Miller
04498cbc0e Make policy failures be reported as per 1s and not over 5s. 2020-03-13 02:49:06 -07:00
Alex Miller
75e2fffe5a Add a ProcessMetrics.TLSPolicyFailures metric
This reports the number of policy failures over the past 5s interval.
It also is step 1 towards getting this information into status json.
2020-03-13 02:24:37 -07:00