* Do not check for PROXY_USE_RESOLVER_PRIVATE_MUTATIONS in rangeLockEnabled
* Don't modify knob proxy_use_resolver_private_mutations in range lock tests
---------
Co-authored-by: Dan Lambright <hlambright@apple.com>
* disable version vector with range lock tests
* turn off rangeLock if versionvector is on (#11747)
---------
Co-authored-by: Dan Lambright <hlambright@apple.com>
Co-authored-by: Zhe Wang <zhe.wang@wustl.edu>
* range lock framework
* improve the framework
* persist to txnStateStore
* fix bugs
* code clean
* code clean
* bug fix
* address comments
* add complex test workload and fix bugs found by the workload
* add workload correctness check and fix bugs
* code clean up
* add random range lock injection
* fix bugs in RandomRangeLock.actor.cpp
* enable random range lock injection in general workloads
* add rangelockcycle test
* disable random range lock in backup workloads
* nits
* add range lock ownership concept
* enable lock ownership to rangeLock
* api deal with tenant
* fix CI
* add test for multiple rangeLock owners
* nits
* address comments and renaming
* address comments
* [fdbserver][simulator] Add remoteDesiredTLogCount option
* [fdbserver][simulator] Allow explicitly specifying number of stateless classes in each DC
* [fdbserver][gray_failure] RemoteTLog lagging SS simulation test
* [fdbserver][gray_failure] Consider remote processes + CC inter/intra latency awareness
* [fdbserver][cc] Make processInSameDC O(1)
This fixes an issue with the old tlog format from 6.3, which is no longer supported in 7.4.
As a side effect, 7.4 will not support upgrades from versions lower than 7.1.
Remove version vector upgrade tests before 7.3, since many changes and
fixes exist only after 7.3.
* [testing] Automatically discover unit tests and register them with ctest
This patch adds `collect_unit_tests()` to CMake, which searches the
codebase for all unit tests written with Flow's TEST_CASE macro and
registers them with ctest.
The tests can then be run using the ctest command or directly via the Test
Explorer in VSCode.
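
The discovery step described above can be illustrated with a minimal Python sketch. It is not the CMake code from this patch: the regular expression, the `.cpp` suffix filter, and the helper name `find_test_cases` are assumptions made for illustration, and the Python `collect_unit_tests` here only mirrors the name of the CMake function.

```python
import re
from pathlib import Path

# Assumed shape of the macro: TEST_CASE("/some/test/name").
TEST_CASE_RE = re.compile(r'TEST_CASE\(\s*"([^"]+)"\s*\)')


def find_test_cases(source_text):
    """Return the test names declared with TEST_CASE in one source text."""
    return TEST_CASE_RE.findall(source_text)


def collect_unit_tests(root, suffixes=(".cpp",)):
    """Walk `root` and return a sorted list of unique TEST_CASE names."""
    names = set()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            names.update(find_test_cases(path.read_text(errors="ignore")))
    return sorted(names)
```

Each discovered name could then be registered as a separate ctest entry, which is what makes the tests individually runnable from a test explorer.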
* Update CMakeLists.txt
* Check failed tests
* Update TestDirectory.py to create more unique directories
* Put the feature behind a flag
* Ignore FileNotFoundError when tearing down a temp cluster
ctest can fail with errors like:
[2024-08-22 20:37:15,780] - fdbcli_tests.py:231 - DEBUG - killall - Old generation: 8, New generation: 10
log-dir: /codebuild/output/src3881629096/src/github.com/apple/foundationdb/build_output/tmp/ZIpl2Hm2zri6nlpu/log
etc-dir: /codebuild/output/src3881629096/src/github.com/apple/foundationdb/build_output/tmp/ZIpl2Hm2zri6nlpu/etc
data-dir: /codebuild/output/src3881629096/src/github.com/apple/foundationdb/build_output/tmp/ZIpl2Hm2zri6nlpu/data
cluster-file: /codebuild/output/src3881629096/src/github.com/apple/foundationdb/build_output/tmp/ZIpl2Hm2zri6nlpu/etc/fdb.cluster
Traceback (most recent call last):
  File "/codebuild/output/src3881629096/src/github.com/apple/foundationdb/tests/TestRunner/tmp_cluster.py", line 146, in <module>
    print(f.read())
  File "/codebuild/output/src3881629096/src/github.com/apple/foundationdb/tests/TestRunner/tmp_cluster.py", line 45, in __exit__
    shutil.rmtree(self.tmp_dir)
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/shutil.py", line 718, in rmtree
    _rmtree_safe_fd(fd, path, onerror)
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/shutil.py", line 655, in _rmtree_safe_fd
    _rmtree_safe_fd(dirfd, fullname, onerror)
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/shutil.py", line 675, in _rmtree_safe_fd
    onerror(os.unlink, fullname, sys.exc_info())
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/shutil.py", line 673, in _rmtree_safe_fd
    os.unlink(entry.name, dir_fd=topfd)
FileNotFoundError: [Errno 2] No such file or directory: 'fdbmonitor.lock'
* Explicitly ignore FileNotFoundError
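
A minimal sketch of tolerating this race during teardown, assuming the file vanishes because another process removes it while `shutil.rmtree` is walking the directory. The helper name `remove_tree_ignoring_missing` and the retry count are hypothetical and are not taken from tmp_cluster.py.

```python
import os
import shutil
import tempfile


def remove_tree_ignoring_missing(path, attempts=3):
    """Remove `path` recursively, tolerating entries that vanish mid-walk.

    Only FileNotFoundError is swallowed; any other error still propagates.
    """
    for _ in range(attempts):
        try:
            shutil.rmtree(path)
            return
        except FileNotFoundError:
            # Someone else removed an entry (or the whole tree) first.
            # If the tree is gone there is nothing left to do; otherwise
            # retry to remove whatever remains.
            if not os.path.exists(path):
                return


if __name__ == "__main__":
    tmp_dir = tempfile.mkdtemp()
    remove_tree_ignoring_missing(tmp_dir)
    remove_tree_ignoring_missing(tmp_dir)  # already gone: no exception
```

Catching only `FileNotFoundError` keeps real teardown problems, such as permission errors, visible.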
E.g., StatusBuilderPerf and TLogVersionMessagesOverheadFactor are more like
performance tests, which shouldn't run so many times.
Without the change, a 100k-run executes these tests this many times:
1318 tests/rare/CycleWithKills.toml
1591 tests/rare/TLogVersionMessagesOverheadFactor.toml
1647 tests/rare/ConfigDBUnitTest.toml
1839 tests/rare/StatusBuilderPerf.toml
After the change, a 100k-run has:
129 tests/rare/TLogVersionMessagesOverheadFactor.toml
151 tests/rare/CycleWithKills.toml
160 tests/rare/StatusBuilderPerf.toml
375 tests/rare/ConfigDBUnitTest.toml
* Add usable region check per shard for encode shard location metadata
* nits
* nit
* address comments
* fix SS assertion failure caused by a wrong data move type generated by an old binary that does not encode the data move type in the data move id
* fix ClientTransactionProfilingCorrectness 7.3 upgrade test considering physical shard move compatibility
* code clean
* split CycleTestRestart in upgrading test from release-7.3
* address comments
* nits
This change also disables waitForQuiescenceEnd in the clientmetric test.
Other transactions are running concurrently, so there would be many
conflicts under the fdbClientInfo/client_latency prefix.
This change lowers the client metric sample rate to avoid
such conflicts. It also increases the number of keys written correspondingly
to make sure client latency data is still being written.
This restarting test starts with version 7.1 and sets up the sample rate
for the transaction log; it then runs with version 7.3 and verifies that
transaction logs are still being written.
This change can only be merged once it is known which 7.1 release has the
ClientMetric workload; the first phase of the restarting test needs to
run with at least that version.
* - Compare storage replicas on reads (in "loadBalance()")
* - Do consistency check on reads in loadbalance
* - Do replica consistency check in the case where loadBalance issues
requests to multiple storage servers
* - Address a state variable related bug
* - Code formatting
* - API simplification
* - Simplify code
* - Code formatting
* - Address a review comment
* acs framework
* code refactor and fix bugs
* add ss crash loop protector
* use sharedptr instead of raw pointer
* fix critical bugs and add private mutation acs to the framework
* enable ACS for all mutations except for clear serverTag mutation and fix bugs
* fix restarting tests
* refactor code and fix bugs
* fix AccumulativeChecksumState toString
* fix bugs
* allow all mutations in acs and fixed bugs
* fix bugs and code cleanup
* code clean up for adding recovery support
* simplify code and support recovery
* clear acs state at ss
* fix bug
* terminate validator if ss will be removed in the current batch
* simplify code
* add trace
* address comments
* optimize code
* deep copy when adding mutation to acs validator
* wrap encode and decode persist acs key
* make acstable private
* remove useless func
* remove useless func
* remove epoch in ACS validator
* add acs mutation counter in SS metrics
* code cleanup and make knob check better
* make mutation buffer global
* simplify code
* add comments
* make knob randomly set
* address comments
* ss reboot after acs mismatch found
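
The general idea of an accumulative checksum (acs) can be sketched as follows. This is an illustration only, not the framework added by these commits: CRC32 is used as a stand-in checksum, and the `accumulate` function and `AcsValidator` class are hypothetical names.

```python
import zlib


def accumulate(checksum, mutation):
    """Fold one mutation (as bytes) into a running CRC32 value."""
    return zlib.crc32(mutation, checksum)


class AcsValidator:
    """Receiver side: recompute the running checksum and compare on demand."""

    def __init__(self):
        self.checksum = 0
        self.count = 0

    def add_mutation(self, mutation):
        # bytes(...) copies the payload, so later changes to the caller's
        # buffer cannot affect what was checksummed.
        self.checksum = accumulate(self.checksum, bytes(mutation))
        self.count += 1

    def validate(self, expected_checksum):
        """Return True if the receiver saw the same mutation stream."""
        return self.checksum == expected_checksum


if __name__ == "__main__":
    mutations = [b"set k1 v1", b"clear k2", b"set k3 v3"]
    sender_checksum = 0
    validator = AcsValidator()
    for m in mutations:
        sender_checksum = accumulate(sender_checksum, m)
        validator.add_mutation(m)
    print(validator.validate(sender_checksum))
```

In this sketch the sender ships its running checksum alongside the mutation stream, and a mismatch on the receiver signals that the two sides did not apply identical mutations.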