* [Release-7.3] TeamRedundant and TeamUnhealthy data moves choose best destination with probability (#11668)
* team redundant and unhealthy data moves can choose best dest with probability
* nits
* nits
* enable wantTrueBestIfMoveout
* fix getteam stuck
* [Release-7.3] Delay team remover when space pivot is low (#11665)
* [Release-7.3] Validate ServerTeam count per server in simulation (#11678)
* validate server team count in simulation
* change naming (not relevant to the PR title)
* address comments and add a new trace event BuildTeamsLastBuildTeamsFailed, triggered when buildTeam fails
* range lock framework
* improve the framework
* persist to txnStateStore
* fix bugs
* code clean
* code clean
* bug fix
* address comments
* add complex test workload and fix bugs found by the workload
* add workload correctness check and fix bugs
* code clean up
* add random range lock injection
* fix bugs in RandomRangeLock.actor.cpp
* enable random range lock injection in general workloads
* add rangelockcycle test
* disable random range lock in backup workloads
* nits
* add range lock ownership concept
* enable lock ownership to rangeLock
* api deal with tenant
* fix CI
* add test for multiple rangeLock owners
* nits
* address comments and renaming
* address comments
* [fdbserver][simulator] Add remoteDesiredTLogCount option
* [fdbserver][simulator] Allow explicitly specifying number of stateless classes in each DC
* [fdbserver][gray_failure] RemoteTLog lagging SS simulation test
* [fdbserver][gray_failure] Consider remote processes + CC inter/intra latency awareness
* [fdbserver][cc] Make processInSameDC O(1)
* Add assertions to code paths with txsTag
txsTag should be obsolete by now: it was used in 6.1, and upgrades from 6.1
are no longer supported.
* Actually remove txsTag usage
20240926-225930-jzhou-7ed3304c415ae65e
* Remove more code
20240926-235242-jzhou-7ed3304c415ae65e
* Disable two verbose trace events
They can cause TraceTooManyLines errors.
* Add rocksdb, sharded rocksdb to configure workload
Also remove mentions of ssd-redwood-1-experimental.
* Fix test failure when SHARD_ENCODE_LOCATION_METADATA is off
The S3 token is read from local disk and might be expired or invalid.
Before this change, backup retried uploading data to S3 indefinitely,
which wasted network bandwidth.
Now, on an S3 token error, backup retries with a GET request that lists all
buckets, and retries the upload only once the token error disappears.
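
The retry policy described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual backup code: `TokenError`, `upload_with_token_probe`, and the `upload` / `list_buckets` callables are hypothetical names introduced here, and the fixed delay and optional probe limit are assumptions.

```python
import time


class TokenError(Exception):
    """Hypothetical error standing in for an S3 credential/token failure."""


def upload_with_token_probe(upload, list_buckets, sleep=time.sleep,
                            delay=1.0, max_probes=None):
    """Try the upload; on a token error, poll with the cheap list_buckets
    request until it succeeds, and only then retry the upload."""
    probes = 0
    while True:
        try:
            return upload()
        except TokenError:
            pass  # fall through to probing instead of re-sending the data
        while True:
            if max_probes is not None and probes >= max_probes:
                raise TokenError("token still invalid after %d probes" % probes)
            probes += 1
            try:
                list_buckets()  # lightweight request, no payload
                break           # token works again: retry the upload
            except TokenError:
                sleep(delay)
```

The point of the design is that the payload is sent again only after a cheap request has shown that the token is accepted, so a stale token costs small probe requests rather than repeated full uploads.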
There is one left that does not seem to have a good way to convert. To make
sure the converted code behaves correctly, I added a few CodeProbes to
ensure code coverage.
E.g., StatusBuilderPerf and TLogVersionMessagesOverheadFactor are more like
performance tests, which shouldn't run so many times.
Without the change, a 100k-run executes these tests this many times:
1318 tests/rare/CycleWithKills.toml
1591 tests/rare/TLogVersionMessagesOverheadFactor.toml
1647 tests/rare/ConfigDBUnitTest.toml
1839 tests/rare/StatusBuilderPerf.toml
After the change, a 100k-run has:
129 tests/rare/TLogVersionMessagesOverheadFactor.toml
151 tests/rare/CycleWithKills.toml
160 tests/rare/StatusBuilderPerf.toml
375 tests/rare/ConfigDBUnitTest.toml
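
As a quick check of the effect, the reduction factor per test can be computed from the two listings above. This is a small Python sketch; the dictionary names are illustrative only and the counts are copied from the listings.

```python
# Run counts per 100k-run, copied from the listings above.
before = {
    "tests/rare/CycleWithKills.toml": 1318,
    "tests/rare/TLogVersionMessagesOverheadFactor.toml": 1591,
    "tests/rare/ConfigDBUnitTest.toml": 1647,
    "tests/rare/StatusBuilderPerf.toml": 1839,
}
after = {
    "tests/rare/TLogVersionMessagesOverheadFactor.toml": 129,
    "tests/rare/CycleWithKills.toml": 151,
    "tests/rare/StatusBuilderPerf.toml": 160,
    "tests/rare/ConfigDBUnitTest.toml": 375,
}

# Factor by which each test's run count dropped (before / after).
reduction = {name: before[name] / after[name] for name in before}

for name, factor in sorted(reduction.items()):
    print(f"{name}: {factor:.1f}x fewer runs")
```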