Commit Graph

303 Commits

Author SHA1 Message Date
Syed Paymaan Raza
c146ee0869 [fdbserver] Use STL contains method and std::find for containment checks (#11702) 2024-10-15 11:40:02 -07:00
Syed Paymaan Raza
c3e7542cda Update end year in copyright header 2024-08-02 09:40:11 -07:00
Jingyu Zhou
d9e4c49503 Fix more -Wunused-variable warnings 2024-07-17 15:35:49 -07:00
Jingyu Zhou
3a3ee247ab Fix Wunused-but-set-variable warnings 2024-07-17 13:09:32 -07:00
neethuhaneesha
8de8dd4281 RKUpdate metrics changes. 2024-05-21 11:57:58 -07:00
yaoxiao-github
f4af16ecf7 Retry when Ratekeeper failed to get StorageQueueInfo. 2024-01-10 16:07:27 -08:00
Jingyu Zhou
266a3f018b Disable hot shard throttling for restore
Otherwise, the restored database is not consistent.

To reproduce, at commit 45c459cfc of PR #11064:

-f ./tests/restarting/from_7.3.0/DrUpgradeRestart-1.toml -s 4210130489 -b off
-f ./tests/restarting/from_7.3.0/DrUpgradeRestart-2.toml --restarting -s 4210130490 -b on
2023-11-17 13:56:58 -08:00
Dan Lambright
af53e9a532 Log ignored zones and reasons in RkUpdate (#11067) 2023-11-17 16:07:48 -05:00
Sreenath Bodagala
d8f0a21ecc Merge remote-tracking branch 'apple-upstream/main' 2023-11-07 21:10:43 +00:00
Sreenath Bodagala
58c0e79874 - Prevent failover when storage servers are behind. 2023-11-03 21:45:48 +00:00
Dan Lambright
015167c17e Throttle commits against hot shards (#10970)
* throttle hot shards

* expire throttled shards over time

* add backoff

* Parallelize messaging from RK to CP

* Obtain shards from a single SS

* handle expired transactions

* bump transaction_throttled_hot_shard

* Change SevError to SevWarn for CannotMonitorHotShardForSS

* Add log per request
2023-10-31 12:01:34 -04:00
sfc-gh-tclinkenbeard
2228cd3320 Monitor multiple write tags in StorageQueueInfo::refreshCommitCost 2023-08-02 15:52:57 -07:00
Hao Fu
a5f4d53c45 Remove SS entries from RateKeeper once it is down (#10627)
* Remove SS entries from RateKeeper once it is down

Before the change, certain data structures in RateKeeper would
not delete data associated with a deleted/cancelled SS, which
caused significant unnecessary CPU usage and degraded GRV proxy
performance.  This change fixes it.
2023-07-24 13:47:23 -07:00
sfc-gh-tclinkenbeard
9c6b365267 Simplify limiting rate calculation for GlobalTagThrottler 2023-07-16 22:33:13 -07:00
sfc-gh-tclinkenbeard
d33d0ece55 GlobalTagThrottler should decay throughput from missing storage servers 2023-06-29 20:53:09 -07:00
sfc-gh-tclinkenbeard
ae6167e576 Update StorageQueueInfo::getTagThrottlingRatio implementation 2023-05-26 16:10:37 -07:00
sfc-gh-tclinkenbeard
9639192a88 Add GLOBAL_TAG_THROTTLING_REPORT_ONLY knob 2023-04-21 11:13:42 -07:00
sfc-gh-tclinkenbeard
568518b6a3 Update semantics of MIN_TAG_WRITE_PAGES_RATE to reflect name 2023-04-17 11:58:50 -07:00
Josh Slocum
aef5130da2 adding system priority option to getDatabaseConfiguration, and several debugging improvements (#9864) 2023-04-06 15:08:40 -05:00
Josh Slocum
3748693a28 fixing txn flags in new bw rk function (#9813) 2023-03-27 15:56:15 -05:00
Josh Slocum
33c0b35ee6 No RK throttling on blob workers if no blob ranges (#9425) 2023-02-21 15:23:40 -06:00
sfc-gh-tclinkenbeard
c0fcf59a8c Fix bug in GlobalTagThrottler::getLimitingTps.
Also add comments to GlobalTagThrottler unit tests
2022-10-27 21:18:39 -07:00
sfc-gh-tclinkenbeard
6a8c6e83e4 Rename StorageQueueInfo::getTagThrottlingRatio 2022-10-27 21:18:39 -07:00
sfc-gh-tclinkenbeard
950ac1c867 Improve encapsulation for TLogQueueInfo and StorageQueueInfo 2022-10-27 21:18:35 -07:00
A.J. Beamon
4fd64630e8 Convert literal string ref instances to use _sr suffix 2022-09-19 11:35:58 -07:00
sfc-gh-tclinkenbeard
82adc1e856 Make g_simulator a pointer 2022-09-15 09:00:33 -07:00
Josh Slocum
9721de70b6 Adding knob and increasing delay for simulation ratekeeper throttling assert 2022-08-31 09:08:27 -05:00
sfc-gh-tclinkenbeard
9df990e375 Remove global_tag_throttler status section 2022-08-29 23:17:20 -07:00
A.J. Beamon
2907d2d4dd Merge pull request #8004 from sfc-gh-ajbeamon/fix-ub
Fix some undefined behavior in RK and a unit test
2022-08-29 09:16:11 -07:00
Evan Tschannen
8314e80371 Fixed a few bugs which caused ratekeeper to unnecessarily throttle a cluster (#8006)
* do not count recently created change feeds for throttling

* fix: blocked assignments were not decremented when force purging

* fix: created needs to be updated when the changefeed is reset

* added asserts to detect if ratekeeper is throttled on blob workers
2022-08-26 15:38:31 -07:00
A.J. Beamon
0e782412a8 Fix some undefined behavior: 1) a unit test was not initializing members of the WorkloadContext it was using, and 2) very large ratekeeper limits for batch priority were overflowing the types used to log them 2022-08-26 14:17:01 -07:00
Evan Tschannen
493771b6a8 Throttle the cluster if the blob manager cannot assign ranges (#7900)
* Throttle the cluster if the blob manager cannot assign ranges

* fixed a number of different bugs which caused ratekeeper to throttle to zero because of blob worker lag

* fix: do not mark an assignment as blocked if it is cancelled

* remove asserts to merge bug fixes

* fix formatting

* restored old control flow to storage updater

* storage updater did not throw errors

* disable buggify to see if it fixes CI
2022-08-23 13:33:46 -05:00
Evan Tschannen
a9d3c9f9b3 Added throttling when a blob worker falls behind (#7751)
* throttle the cluster when blob workers fall behind

* do not throttle on blob workers if they are not enabled

* remove an unnecessary actor

* fixed a compile error

* fetch blob worker metrics at the same interval as the rate is updated, avoid fetching the complete blob worker list too frequently

* fixed another compilation bug

* added a 5 second delay before bw throttling to prevent false positives caused by the 100e6 version jump during recovery. Lower the throttling thresholds to react much quicker to bw lag.

* fixed a number of problems

* changed the minBlobVersionRequest to look at storage server versions since this will be a lot more efficient

* fix: do not let desired go backwards

* fix: track the version of notAtLatest changefeeds for throttling

* ratekeeper now throttles blob workers by estimating the transactions-per-second throughput of the blob workers

* added metrics for blob worker change feeds

* added a knob to disable bw throttling

* fixed the transaction options in blob manager
2022-08-12 13:15:56 -07:00
Trevor Clinkenbeard
583021c2d9 Merge pull request #7772 from sfc-gh-tclinkenbeard/global-tag-throttling6
Add status section for global tag throttler
2022-08-11 17:38:31 -03:00
sfc-gh-tclinkenbeard
66373f1e74 Addressed review comments 2022-08-10 21:44:12 -03:00
Jingyu Zhou
eba77d78f4 Add knobs for min/max Ratekeeper limit
The default has no effect.
2022-08-08 15:27:21 -07:00
sfc-gh-tclinkenbeard
1bd47a07b2 Add ENFORCE_TAG_THROTTLING_ON_PROXIES knob 2022-08-05 00:40:10 -07:00
Jingyu Zhou
84d483605b Merge pull request #7431 from xis19/main
Let the storage server report busiest write tag
2022-08-04 10:23:31 -07:00
sfc-gh-tclinkenbeard
2699439282 Add global_tag_throttler section to status 2022-08-02 16:53:03 -07:00
Xiaoge Su
fd3c3f0774 fixup! Reformat source 2022-08-01 18:56:50 -07:00
Xiaoge Su
195890dd7b Add ratekeeper ID for storage server busiest write tag report 2022-08-01 18:56:50 -07:00
Xiaoge Su
aa69f5f36e fixup! Update per code review 2022-08-01 18:56:50 -07:00
Xiaoge Su
90b887f394 fixup! Update per comments 2022-08-01 18:56:50 -07:00
Xiaoge Su
ec40c6bfec fixup! Add a wrapper of ResourceWeakRef for better support of self pointer 2022-08-01 18:56:50 -07:00
Xiaoge Su
cf04afe925 fixup! Non-owning reference to an object
See documents in flow/OwningResource.h
2022-08-01 18:56:50 -07:00
Xiaoge Su
542b5e61cf Let the storage server report busiest write tag
Issue #7258

The ratekeeper is recording the busiest write tag for *all* storage
servers, which throttles the traceevent. Distributing the busiest write
tag to the corresponding storage servers should reduce this throttling
issue.
2022-08-01 18:56:50 -07:00
sfc-gh-tclinkenbeard
20ac60fb11 Set throttling ratio in GlobalTagThrottler::tryUpdateAutoThrottling 2022-07-19 17:04:04 -07:00
sfc-gh-tclinkenbeard
b49c36f0b0 Add StorageQueueInfo::getWriteQueueSizeLimitRatio method 2022-07-19 16:28:27 -07:00
Markus Pilman
1de37afd52 Make TEST macros C++ only (#7558)
* proof of concept

* use code-probe instead of test

* code probe working on gcc

* code probe implemented

* renamed TestProbe to CodeProbe

* fixed refactoring typo

* support filtered output

* print probes at end of simulation

* fix missed probes print

* fix deduplication

* Fix refactoring issues

* revert bad refactor

* make sure file paths are relative

* fix more wrong refactor changes
2022-07-19 13:15:51 -07:00
sfc-gh-tclinkenbeard
086e4bff06 Merge remote-tracking branch 'origin/main' into global-tag-throttling3 2022-06-28 10:18:13 -07:00