Commit Graph

1566 Commits

Author SHA1 Message Date
Zhe Wang
ca4ab1eca9 Fix traceTooManyEvents and externalTimeouts in BulkLoad test (#11769) 2024-11-11 11:05:43 -08:00
Dan Lambright
4b3525f8a3 Do not test bulk loading and version vector simultaneously (#11759)
Co-authored-by: Dan Lambright <hlambright@apple.com>
2024-11-06 13:39:21 -08:00
Syed Paymaan Raza
84fb8f843c Gray failure allows storage servers to complain (#11753) 2024-11-05 16:53:02 -08:00
Dan Lambright
5716e4e7c2 Do not check for PROXY_USE_RESOLVER_PRIVATE_MUTATIONS in rangeLockEnabled (#11752)
* Do not check for PROXY_USE_RESOLVER_PRIVATE_MUTATIONS in rangeLockEnabled

* Don't modify knob proxy_use_resolver_private_mutations in range lock tests

---------

Co-authored-by: Dan Lambright <hlambright@apple.com>
2024-11-05 12:13:18 -05:00
Dan Lambright
317956ee14 disable version vector with range lock tests (#11746)
* disable version vector with range lock tests

* turn off rangeLock if versionvector is on (#11747)

---------

Co-authored-by: Dan Lambright <hlambright@apple.com>
Co-authored-by: Zhe Wang <zhe.wang@wustl.edu>
2024-10-31 21:07:01 -04:00
Zhe Wang
42e17d8bd1 BulkLoading Use RangeLock (#11741)
* use range lock in bulk load

* refactor BulkLoading workload and nits

* add background traffic

* nits

* address comments
2024-10-31 12:58:13 -07:00
Zhe Wang
43446204ed Database Per-Range Lock (#11693)
* range lock framework

* improve the framework

* persist to txnStateStore

* fix bugs

* code clean

* code clean

* bug fix

* address comments

* add complex test workload and fix bugs found by the workload

* add workload correctness check and fix bugs

* code clean up

* add random range lock injection

* fix bugs in RandomRangeLock.actor.cpp

* enable random range lock injection in general workloads

* add rangelockcycle test

* disable random range lock in backup workloads

* nits

* add range lock ownership concept

* enable lock ownership to rangeLock

* api deal with tenant

* fix CI

* add test for multiple rangeLock owners

* nits

* address comments and renaming

* address comments
2024-10-23 16:25:56 -07:00
Syed Paymaan Raza
5f480947ad [fdbserver] Gray failure and simulator improvements related to remote processes (#11717)
* [fdbserver][simulator] Add remoteDesiredTLogCount option

* [fdbserver][simulator] Allow explicitly specifying number of stateless classes in each DC

* [fdbserver][gray_failure] RemoteTLog lagging SS simulation test

* [fdbserver][gray_failure] Consider remote processes + CC inter/intra latency awareness

* [fdbserver][cc] Make processInSameDC O(1)
2024-10-23 13:15:29 -07:00
Zhe Wang
7d95b87483 improve the probability that sharded rocksdb is selected in simulation tests 2024-10-10 09:48:23 -07:00
Jingyu Zhou
a315ff1840 Remove ctest from prev3 version
Only test as early as previous 2 versions.
2024-09-18 16:09:36 -07:00
Jingyu Zhou
fc18d73502 Revert a typo 2024-09-17 16:51:07 -07:00
Jingyu Zhou
cc0cb656ec Remove 6.3 and 7.0 upgrade tests
This fixes the issue of the old tlog format in 6.3, which is no longer supported in 7.4

As a side effect, 7.4 will not support upgrades from versions lower than 7.1

Remove version vector upgrade tests before 7.3, since there are many changes
or fixes only after 7.3.
2024-09-17 14:50:33 -07:00
Jingyu Zhou
80ca71833b Make xxhash checksum the default for TLog
Update downgrade tests to use the xxhash.
2024-09-17 12:46:42 -07:00
Jingyu Zhou
18788f814c Drop upgrade tests before 6.3 (#11660)
This is not supported or recommended.
2024-09-13 11:34:57 -07:00
He Liu
274ae7dacd Remove storage engine type from DataLossRecovery test. (#11655) 2024-09-12 13:31:03 -07:00
Dan Lambright
eaf08c2414 Disable version vector for tenant test. (#11644)
Co-authored-by: Dan Lambright <hlambright@apple.com>
2024-09-10 14:02:05 -04:00
Zhe Wang
5ee0db13e6 Fix external timeout with ShardedRocksDB and re-enable ShardedRocksDB in simulation tests (#11638)
* speedup sharded rocksdb in simulation

* re-enable shardedrocksdb and disable physical shard move
2024-09-08 10:57:55 -07:00
Vishesh Yadav
a84026cca5 [testing] Automatically discover unit-test and register as ctest (#11612)
* [testing] Automatically discover unit-test and register as ctest

This patch adds `collect_unit_tests()` to CMake, which searches the
codebase for all unit tests written using Flow's TEST_CASE macro and
registers them as ctest tests.

The tests can then be run using the ctest command or directly via Test
Explorer in VSCode.

* Update CMakeLists.txt

* Check failed tests

* Update TestDirectory.py to create more unique directory

* Put the feature behind flag
2024-09-04 11:31:49 -07:00
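The discovery step described in this commit can be illustrated outside CMake. The following is a minimal Python sketch of the idea only, not the actual `collect_unit_tests()` implementation: the regular expression, the `find_test_cases` and `discover_unit_tests` helper names, and the `*.cpp` file pattern are all assumptions made for illustration.

```python
import re
from pathlib import Path

# Assumed shape of a Flow unit-test declaration: TEST_CASE("/some/name").
# The real macro usage may differ; this pattern is illustrative only.
TEST_CASE_RE = re.compile(r'TEST_CASE\(\s*"([^"]+)"\s*\)')


def find_test_cases(source_text):
    """Return the unit-test names declared in one file's text."""
    return TEST_CASE_RE.findall(source_text)


def discover_unit_tests(root, pattern="*.cpp"):
    """Map each test name found under root to the file that declares it."""
    found = {}
    for path in sorted(Path(root).rglob(pattern)):
        text = path.read_text(errors="ignore")
        for name in find_test_cases(text):
            found[name] = str(path)
    return found


# Hypothetical source line, used only to exercise the pattern.
sample = 'TEST_CASE("/flow/example/basic") { return Void(); }'
print(find_test_cases(sample))
```

Each discovered name could then be registered as its own test, which is what lets a runner such as ctest list and run the unit tests individually.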
Jingyu Zhou
8475ad87bb Ignore FileNotFoundError when tearing down a temp cluster (#11594)
* Ignore FileNotFoundError when tearing down a temp cluster

ctest can fail with errors like:

[2024-08-22 20:37:15,780] - fdbcli_tests.py:231 - DEBUG - killall - Old generation: 8, New generation: 10
log-dir: /codebuild/output/src3881629096/src/github.com/apple/foundationdb/build_output/tmp/ZIpl2Hm2zri6nlpu/log
etc-dir: /codebuild/output/src3881629096/src/github.com/apple/foundationdb/build_output/tmp/ZIpl2Hm2zri6nlpu/etc
data-dir: /codebuild/output/src3881629096/src/github.com/apple/foundationdb/build_output/tmp/ZIpl2Hm2zri6nlpu/data
cluster-file: /codebuild/output/src3881629096/src/github.com/apple/foundationdb/build_output/tmp/ZIpl2Hm2zri6nlpu/etc/fdb.cluster
Traceback (most recent call last):
  File "/codebuild/output/src3881629096/src/github.com/apple/foundationdb/tests/TestRunner/tmp_cluster.py", line 146, in <module>
    print(f.read())
  File "/codebuild/output/src3881629096/src/github.com/apple/foundationdb/tests/TestRunner/tmp_cluster.py", line 45, in __exit__
    shutil.rmtree(self.tmp_dir)
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/shutil.py", line 718, in rmtree
    _rmtree_safe_fd(fd, path, onerror)
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/shutil.py", line 655, in _rmtree_safe_fd
    _rmtree_safe_fd(dirfd, fullname, onerror)
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/shutil.py", line 675, in _rmtree_safe_fd
    onerror(os.unlink, fullname, sys.exc_info())
  File "/opt/rh/rh-python38/root/usr/lib64/python3.8/shutil.py", line 673, in _rmtree_safe_fd
    os.unlink(entry.name, dir_fd=topfd)
FileNotFoundError: [Errno 2] No such file or directory: 'fdbmonitor.lock'

* Explicitly ignore FileNotFoundError
2024-08-26 11:20:38 -07:00
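The traceback above shows `shutil.rmtree` failing inside `__exit__` because a file (`fdbmonitor.lock`) disappeared while the tree was being removed. A minimal sketch of the "explicitly ignore FileNotFoundError" approach follows; the `TempDir` class is a hypothetical stand-in, not the code in `tmp_cluster.py`.

```python
import shutil
import tempfile
from pathlib import Path


class TempDir:
    """Hypothetical temp-directory context manager with tolerant teardown."""

    def __enter__(self):
        self.tmp_dir = Path(tempfile.mkdtemp())
        return self

    def __exit__(self, exc_type, exc, tb):
        try:
            shutil.rmtree(self.tmp_dir)
        except FileNotFoundError:
            # Another process removed a file (or the whole tree) while we
            # were deleting; teardown should not fail the test for that.
            pass
        return False  # never swallow exceptions raised inside the with-body


with TempDir() as t:
    (t.tmp_dir / "fdbmonitor.lock").write_text("")
    # Simulate a racing cleanup that removes the directory first.
    shutil.rmtree(t.tmp_dir)
```

One trade-off of catching the exception around the whole call is that removal stops at the first missing file, so other files may be left behind in the temporary directory.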
Zhe Wang
6c502e9707 Solve RocksDB external timeout error and re-enable RocksDB simulation tests (#11577)
* init knob tune

* include rocksdb in tests

* probably reuse rocksdb iterator in simulation

* clear unnecessary knob change
2024-08-16 12:37:18 -07:00
Jingyu Zhou
bd2e108531 Merge pull request #11555 from jzhou77/fix
Reduce chance of running rare tests
2024-08-06 10:13:03 -07:00
Syed Paymaan Raza
392bad2bd3 More copyright end year updates (#11556) 2024-08-05 14:00:32 -07:00
Jingyu Zhou
3ff6c01a2c Move BlobGranule and Metacluster tests to rare directory
To reduce the number of runs for these experimental features in Joshua.
2024-08-03 15:57:29 -07:00
Jingyu Zhou
6d580e16b2 Make rare tests run less in Joshua
By increasing the priority value
2024-08-03 15:42:12 -07:00
Jingyu Zhou
5d2deddb7d Reduce the chance to run some rare tests
E.g., StatusBuilderPerf and TLogVersionMessagesOverheadFactor are more like
performance tests, which shouldn't be running so many times.

Without the change, a 100k-run has this many for these tests:

   1318 tests/rare/CycleWithKills.toml
   1591 tests/rare/TLogVersionMessagesOverheadFactor.toml
   1647 tests/rare/ConfigDBUnitTest.toml
   1839 tests/rare/StatusBuilderPerf.toml

After the change, a 100k-run has:

    129 tests/rare/TLogVersionMessagesOverheadFactor.toml
    151 tests/rare/CycleWithKills.toml
    160 tests/rare/StatusBuilderPerf.toml
    375 tests/rare/ConfigDBUnitTest.toml
2024-08-02 17:24:30 -07:00
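The run counts quoted in this commit message can be turned into per-test reduction factors with a few lines of Python. The numbers below are copied from the message; the variable names are illustrative.

```python
# Runs per 100k-run, as listed in the commit message.
before = {
    "CycleWithKills": 1318,
    "TLogVersionMessagesOverheadFactor": 1591,
    "ConfigDBUnitTest": 1647,
    "StatusBuilderPerf": 1839,
}
after = {
    "TLogVersionMessagesOverheadFactor": 129,
    "CycleWithKills": 151,
    "StatusBuilderPerf": 160,
    "ConfigDBUnitTest": 375,
}
RUNS = 100_000

for name in sorted(before):
    factor = before[name] / after[name]
    print(f"{name}: {before[name]} -> {after[name]} ({factor:.1f}x fewer)")

total_before = sum(before.values())
total_after = sum(after.values())
print(f"combined share of runs: {total_before / RUNS:.2%} -> {total_after / RUNS:.2%}")
```

Taken together, the four tests drop from 6,395 to 815 runs per 100k.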
Syed Paymaan Raza
c3e7542cda Update end year in copyright header 2024-08-02 09:40:11 -07:00
Zhe Wang
a245b9622c Fix a couple of simulation failures (#11543)
* Add usable region check per shard for encode shard location metadata

* nits

* nit

* address comments

* fix SS assertion failed for a wrong data move type generated by an old binary which does not encode the data move type in the data move id

* fix ClientTransactionProfilingCorrectness 7.3 upgrade test considering physical shard move compatibility

* code clean

* split CycleTestRestart in upgrading test from release-7.3

* address comments

* nits
2024-08-01 22:32:32 -07:00
Jingyu Zhou
bbd8900352 Add some debug output for simulation and disable a few blob tests 2024-07-30 15:52:59 -07:00
Zhe Wang
dcebf1a9bc Add extraStorageMachineCountPerDC config to simulation (#11529) 2024-07-26 09:05:41 -07:00
Zhe Wang
74990e44bd Bulk Loading Framework (#11369) 2024-07-23 14:57:28 -07:00
hao fu
67ea901d96 lower the sample rate of client metric in simulation
This change also disables waitForQuiescenceEnd in the clientmetric test.
There are other transactions happening, so there will be lots of
conflicts in the fdbClientInfo/client_latency prefix.

This change lowers the sample rate of the client metric to avoid
such conflicts. It also increases the number of keys to write correspondingly
to make sure client latency is being written.
2024-07-22 21:44:41 +08:00
hao fu
5f1c0b658c Disable tenant mode 2024-07-13 21:39:18 +08:00
hao fu
c0ce3b4fae Add 2 separate restarting tests, for 7.1 and 7.3 2024-07-13 21:39:18 +08:00
hao fu
539bcc56fc Add client metric test
This restarting test starts with version 7.1 and sets up the sample rate
for the transaction log, then runs with version 7.3 and verifies that the
transaction log is still being written.

This change can only be merged after knowing which release in 7.1 has
the ClientMetric workload, and the first phase of the restarting test needs to
run with at least that version.
2024-07-13 21:39:18 +08:00
dependabot[bot]
fe946da0fb Bump authlib from 1.0.1 to 1.3.1 in /tests/authorization
Bumps [authlib](https://github.com/lepture/authlib) from 1.0.1 to 1.3.1.
- [Release notes](https://github.com/lepture/authlib/releases)
- [Changelog](https://github.com/lepture/authlib/blob/master/docs/changelog.rst)
- [Commits](https://github.com/lepture/authlib/compare/v1.0.1...v1.3.1)

---
updated-dependencies:
- dependency-name: authlib
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-06-10 16:03:40 +00:00
neethuhaneesha
a7498b50ad Excluding some sharded rocksdb tests in simulation 2024-05-29 13:28:31 -07:00
Zhe Wang
ad646daf12 disable tenant in downgrade test (#11432) 2024-05-24 17:52:01 -07:00
Zhe Wang
62c2f8fe6d split from_6.3.13 into from_6.3.13_until_7.3.0 and from_7.3.0 and the latter one disables tenant in the first test 2024-05-24 00:06:43 -07:00
Zhe Wang
67b8c2448a Fix compatibility issue by GroupTenant in restarting tests 2024-05-23 23:14:48 -07:00
Zhe Wang
96f246c491 address comments 2024-05-23 16:57:52 -07:00
Zhe Wang
0dbb343da5 fix checksum in downgrade test 2024-05-23 14:01:53 -07:00
Jingyu Zhou
2e4968a55b Disable tests that have unknown knob deterministic_blob_metadata 2024-05-22 13:45:34 -07:00
Jingyu Zhou
e598cf7136 Remove unknown knob enable_configurable_encryption
This causes test failures.
2024-05-22 13:36:23 -07:00
Jingyu Zhou
31b66c502a Ignore BlobRestoreLarge test for long running time
Sometimes the test times out and causes Joshua errors.
2024-05-21 11:16:21 -07:00
Xiaoge Su
2091f8dae7 fixup! Fix not found issue caused by abuse of Python3_EXECUTABLE variable 2024-04-29 14:15:25 -07:00
Xiaoge Su
38adabf8df fixup! Fix the tests related to python 2024-04-22 18:44:32 -07:00
Sreenath Bodagala
a4430b9169 Compare storage replicas on reads (#11235)
* - Compare storage replicas on reads (in "loadBalance()")

* - Do consistency check on reads in loadbalance

* - Do replica consistency check in the case where loadBalance issues
requests to multiple storage servers

* - Address a state variable related bug

* - Code formatting

* - API simplification

* - Simplify code

* - Code formatting

* - Address a review comment
2024-04-11 16:08:54 -04:00
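The idea of comparing storage replicas on reads can be sketched in a few lines. This is a simplified Python illustration under stated assumptions, not the `loadBalance()` logic from the commit: replicas are modelled as plain dictionaries, and `ReplicaMismatch` and `read_with_replica_check` are hypothetical names.

```python
class ReplicaMismatch(Exception):
    """Raised when replicas return different values for the same key."""


def read_with_replica_check(key, replicas):
    """Read key from every replica and fail if the replies disagree.

    replicas maps a replica name to a dict-like store (an assumption made
    for this sketch; real storage servers are remote processes).
    """
    replies = {name: store.get(key) for name, store in replicas.items()}
    if len(set(replies.values())) > 1:
        raise ReplicaMismatch(f"key {key!r}: {replies}")
    # All replies agree, so any one of them is the answer.
    return next(iter(replies.values()))


healthy = {"ss1": {b"k": b"v"}, "ss2": {b"k": b"v"}}
print(read_with_replica_check(b"k", healthy))

diverged = {"ss1": {b"k": b"v"}, "ss2": {b"k": b"stale"}}
try:
    read_with_replica_check(b"k", diverged)
except ReplicaMismatch as e:
    print("mismatch detected:", e)
```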
Zhe Wang
33eecd0775 Real-time corruption detection with accumulative checksum (#11255)
* acs framework

* code refactor and fix bugs

* add ss crash loop protector

* use sharedptr instead of raw pointer

* fixed critical bugs and add private mutation acs to the framework

* enable ACS for all mutations except for clear serverTag mutation and fix bugs

* fix restarting tests

* refactor code and fix bugs

* fix AccumulativeChecksumState toString

* fix bugs

* allow all mutations in acs and fixed bugs

* fix bugs and code cleanup

* code clean up for adding recovery support

* simplify code and support recovery

* clear acs state at ss

* fix bug

* terminate validator if ss will be removed in the current batch

* simplify code

* add trace

* address comments

* optimize code

* deep copy when adding mutation to acs validator

* wrap encode and decode persist acs key

* make acstable private

* remove useless func

* remove useless func

* remove epoch in ACS validator

* add acs mutation counter in SS metrics

* code cleanup and make knob check better

* make mutation buffer global

* simplify code

* add comments

* make knob randomly set

* address comments

* ss reboot after acs mismatch found
2024-04-04 15:03:44 -07:00
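An accumulative checksum, in general terms, is a running checksum that both the sender and the receiver of a mutation stream maintain, so that a corrupted mutation shows up as a mismatch. The sketch below illustrates that general idea with CRC32 from Python's standard library. It is an assumption-laden illustration, not the ACS framework from this commit: the class name, the byte encoding of mutations, and the choice of CRC32 are all hypothetical.

```python
import zlib


class RunningChecksum:
    """Fold each mutation into a single running CRC32 value."""

    def __init__(self):
        self.value = 0

    def add(self, mutation: bytes) -> int:
        # zlib.crc32 accepts the previous value as its starting point, so
        # the result covers every mutation seen so far, in order.
        self.value = zlib.crc32(mutation, self.value)
        return self.value


# Hypothetical mutation stream; the encoding is invented for illustration.
mutations = [b"set k1=v1", b"set k2=v2", b"clear k1"]

sender = RunningChecksum()
for m in mutations:
    sender.add(m)

# The receiver sees the same stream, except one mutation is corrupted.
receiver = RunningChecksum()
for m in [b"set k1=v1", b"set k2=vX", b"clear k1"]:
    receiver.add(m)

if sender.value != receiver.value:
    print("corruption detected: checksums differ")
```

Because the checksum accumulates, a single comparison at the end covers the whole stream rather than one mutation at a time.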
Jingyu Zhou
1ed42ee658 Merge pull request #11236 from apple/fanout
Fix bug in which private mutations were detected incorrectly.
2024-03-13 10:59:32 -07:00
Dan Lambright
fc17d76825 IGNORE vv upgrade tests 2024-03-13 11:32:46 -04:00