These features have been previously marked for deletion per PR #12400.
This change necessarily affects a lot of files. In general I found it preferable to cut along the FDB <-> tenant boundary, rather than try to cut tenant into multiple pieces, stitch the Frankenstein tenant implementation back together with FDB, and generally remove the limbs one by one. So it is a single big deletion.
Note that some tenant-related metadata has been written in a non-flag-controlled manner by prior releases and probably must be ignored indefinitely. Fortunately this is isolated to include/fdbclient/ClientLogEvents.h. (Details: deleting an Optional from a serialized struct results in deserialization of garbage in upgrade tests. The serialized nullopt to indicate "no Tenant" is formally part of FDB persistent metadata even in FDB clusters that never would have enabled the tenant feature.)
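To illustrate the failure mode, here is a minimal sketch using a toy positional format. This is not FDB's real serializer; `Buffer`, `writeOldEvent`, and both readers are hypothetical names invented for this example. It assumes fields are written back to back with no tags, so a reader that drops the optional starts reading the next field one byte early.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <string>
#include <vector>

// Hypothetical toy format (NOT FDB's serializer): fields are written back to
// back with no tags, so reader and writer must agree on the exact layout.
using Buffer = std::vector<uint8_t>;

void putU32(Buffer& b, uint32_t v) {
    uint8_t raw[4];
    std::memcpy(raw, &v, 4);
    b.insert(b.end(), raw, raw + 4);
}

uint32_t getU32(const Buffer& b, std::size_t& pos) {
    uint32_t v;
    std::memcpy(&v, b.data() + pos, 4);
    pos += 4;
    return v;
}

// Old writer: an optional is one presence byte, then the payload if present.
Buffer writeOldEvent(uint32_t id, const std::optional<std::string>& tenant, uint32_t latency) {
    Buffer b;
    putU32(b, id);
    b.push_back(tenant ? 1 : 0);
    if (tenant) {
        putU32(b, static_cast<uint32_t>(tenant->size()));
        b.insert(b.end(), tenant->begin(), tenant->end());
    }
    putU32(b, latency);
    return b;
}

// Reader that keeps a placeholder for the optional and skips over it.
uint32_t readLatencyKeepingPlaceholder(const Buffer& b) {
    std::size_t pos = 0;
    getU32(b, pos); // id
    uint8_t present = b[pos++];
    if (present) {
        uint32_t len = getU32(b, pos);
        pos += len; // skip the payload
    }
    return getU32(b, pos);
}

// Reader with the optional deleted: it reads the presence byte as the first
// byte of the latency field and returns garbage.
uint32_t readLatencyWithFieldDeleted(const Buffer& b) {
    std::size_t pos = 0;
    getU32(b, pos); // id
    return getU32(b, pos);
}
```

Even an event written with "no Tenant" carries the presence byte, which is why the placeholder has to stay in the struct.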
While testing these changes I hit many interesting bugs, which I won't detail here. Causes ranged from outright damage (by me) to production code while removing tenant-related bits (mainly in NativeAPI.actor.cpp and CommitProxyServer.actor.cpp), to damage to various workload files (particularly FuzzApiCorrectness.actor.cpp, which is very sensitive to changes), to many toml files needing updated test flags/options.
More testing details: https://quip-apple.com/Zr6VAycxoli9
20251209-012852-gglass-8ff850b772d868f2 compressed=True data_size=35311687 duration=21671404 ended=500000 fail_fast=1000 max_runs=500000 pass=500000 priority=100 remaining=0 runtime=2:31:30 sanity=False started=500000 stopped=20251209-040022 submitted=20251209-012852 timeout=5400 username=gglass
* remove some unneeded tests, and remove mentions of deleted tests from tests/CMakeLists.txt
* Initiate removal of metacluster. NOTE: this seems to also want removal of tenant. Consider removing them together.
* work on removing metacluster
* delete files with `Tenant` in the name, having reviewed them to ensure that they basically contain what the name implies
* fdb_c.h: remove prototypes for C API methods which have been deleted (blob granule) or which are so long deprecated that they are outside any reasonable/documented support window
* Surgical removal of tenant references from files in bindings/ top level directory. Compilation not yet attempted.
* Surgical removal of tenant related stuff from fdbcli/ top level directory. Compilation not yet attempted.
* Misc tenant code removal, and other stuff which I think may not be needed. Compilation still not attempted.
* Remove more tenant or tenant-adjacent or blob-granule-adjacent stuff. Or at least stuff that looks adjacent to that stuff. Not compiled or tested.
* Start removing Tenant stuff from fdbclient/. Far from complete. Compilation not attempted.
* Remove tenant references from many source files. There are still about 7 principal fdbclient/ and fdbserver/ files with a lot of tenant logic left to delete. Also, all of fdbserver/workloads needs to be looked at. Still have not attempted compilation.
* Remove tenant entanglement from watch functionality
* Remove tenant stuff from fdbserver/tester.actor.cpp
* Delete metacluster workloads
* Remove tenant related stuff from workloads. Also taken the liberty of removing some functionality that appears unused or untestable by Apple.
* Checkpoint tenant removal from FuzzApiCorrectness.actor.cpp
* NativeAPI.actor.cpp: `Tenant` has left the building.
* SimulatedCluster.actor.cpp: `Tenant` has left the building
* DDShardTracker.actor.cpp: Tenant evicted
* storageserver.actor.cpp: `tenant` has left the building.
* fdbserver/workloads/FuzzApiCorrectness.actor.cpp: remove tenant references, but some lingering cleanup needed in `loadAndRun`
* FileBackupAgent.actor.cpp: tenant has left the building
* CommitProxyServer.actor.cpp: remove tenant
* Remove more tenant references from misc files such as bindings tests, documentation, and some fdbserver headers I left earlier
* Fix missing-file errors in CMakeLists.txt files. This is the first attempt to compile this stuff.
* checkpoint misc changes to fix compile errors
* checkpoint more compile fixes
* StorageServerInterface.h: put back more verify() calls
* More misc compile fixes
* whole bunch of misc fixups including some code put-backs to address compile errors
* More compile fixes
* More compile fixes. Still does not compile.
* incremental compile fixing
* ...
* ...
* Checkpoint a bunch of compile fixes. Not quite there but getting closer
* More compile fixes. There seem to be about 10 files left, mainly CommitProxyServer.actor.cpp and storageserver.actor.cpp
* IT COMPILES NOW. THIS IS STILL ALL UNTESTED. Unsurprisingly, CommitProxyServer.actor.cpp and storageserver.actor.cpp took the most tweaking.
The updates in CMakeLists.txt and workloads/UnitTests.actor.cpp are basically trivial and mainly reflect
the ordering of dependencies -- that stuff didn't get attempted until all of fdbserver compiled.
* Put back one block relating to encryption at rest mode. Simplify some TODO(gglass) instances.
* Put back some encryption related knobs
* remove `enable_tenants` from local_cluster.py to maybe fix some ctests
* Remove tenant related options from toml files.
* feature-status.md: add a line for encryption at rest, which seems to have been added for multi-tenant; status is now in doubt
* Fix a pretty bad bug introduced in tenant deletion; ensure we don't attempt to construct a std::string of negative length
* workloads/FuzzApiCorrectness.actor.cpp: avoid division by zero
* flow/Platform.actor.cpp: add a try/catch wrapper around side threads; emit a better addr2line type command
* NativeAPI.actor.cpp: fix a bug introduced in tenant removal relating to reporting conflicting keys under conflictingKeysRange
* ReportConflictingKeys.actor.cpp: separate an ANDed assert into two asserts
* SpecialKeySpaceCorrectness.actor.cpp: put back some logic removed with tenant removal. This test was failing due to a bug with conflict key range reporting. Fixed separately in NativeAPI.actor.cpp.
* remove QuotaCommand.actor.cpp
* Force disable tenant and encryption on disk in upgrade tests
* Add back file I guess I deleted? who knows
* put back another file
* design/feature-status.md: update the new row for encryption at rest to firm up the claim that it is experimental, unowned, and scheduled for deletion
* Remove EncryptKeyProxyTest since we do not use it
* new file tests/slow/BulkDumpingS3WithChaos.toml: remove tenantModes setting
* Undo damage to pushToBackupMutations() from removing tenant feature. This caused inverted_range errors and failed commits in backup related simulations.
* tests/restarting/from_7.4.0/Snap*-1: ensure that tenantModes = disabled
* Try again on workloads/FuzzApiCorrectness.actor.cpp
* simplify tenant-free (mostly) FuzzApiCorrectness workload code
* try harder to remove lingering tenant-related brokenness from FuzzApiCorrectness.actor.cpp
* Explicitly specify tenantModes = ['disabled'] in all the -1 restart files
* Remove tenantModes from 7.1-based upgrade tests as it's an unknown option. Hopefully the code doesn't actually turn on tenant stuff
* do not specify tenantModes in downgrade tests
* Downgrade test to_7.4.5: don't say tenantModes
* more tenantModes updates
* Remove a legacy allowDefaultTenant that no longer is meaningful in downgrade to 8.0
* Put back empty Optional<TenantName> turdlets into serialized log events to avoid breaking ClientTransactionProfilingCorrectness upgrade tests (even with tenantMode = disabled)
* disable encryption on a few more upgrade related test cases. That feature is slated for removal anyway
* Remove unneeded workload files that have been subject to #if 0 for a while. Remove commented out block in ClusterRecovery
* disable encryption in more upgrade tests
* Remove choice four-letter words from commentary
* Format 42 files
* Try to fix a doc bug failing the CI build
* More doc compilation error fixes
* Delete more tenant junk from documentation
* fix spelling mistake in comment
* Remove deleted cross-references from documentation. This necessitated editing release 3.0.0 release notes, which is insane.
* Remove more tenant stuff from bindings tests
* Remove more tenant bits from design/ files
* Remove more tenant related stuff
* Delete more tenant references. Put back ten-ant spellings as tenant now that grep output is substantially reduced.
* Put back some tenant stuff into apitester; its deletion seems to have introduced bugs. Also whine about comments some more, because, really, the comments deserve it.
* Updates to workload files and one other thing based on review comments
* de-actorify decodeKVPairs
* format one source file
* Restore transaction tagging doc
* Restore throttle doc details in administration.rst
* Restore fdbserver/workloads/GetEstimatedRangeSize.actor.cpp and associated toml file, minus tenant stuff
* bindings/c/test/{shim related}: update comments and disable functionality that no longer works post-tenant
* put the cli-throttle tag back in
* bindingtester: fix python syntax errors
* remove useless comment
* Remove comment about useless comments, and remove the useless comments
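For illustration, the negative-length std::string fix listed above comes down to a guard like the following. This is a sketch, not the actual patch: `stripPrefix` and its parameters are hypothetical. The hazard is that a negative signed length, once converted to `size_t`, becomes a very large value when passed to the `std::string(const char*, size_t)` constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the FDB source): return the part of `key`
// after a prefix of `prefixLen` bytes, or an empty string if the prefix is
// negative or covers the whole key.
std::string stripPrefix(const std::string& key, std::ptrdiff_t prefixLen) {
    // Do the subtraction in a signed type so a too-long prefix shows up as a
    // non-positive number instead of wrapping around.
    std::ptrdiff_t remaining = static_cast<std::ptrdiff_t>(key.size()) - prefixLen;
    if (prefixLen < 0 || remaining <= 0) {
        return std::string();
    }
    return std::string(key.data() + prefixLen, static_cast<std::size_t>(remaining));
}
```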
This is the first experimental feature to be deleted in the list published at PR #12400.
There is more code here than I anticipated. It is about 40,000 lines total, of which about three quarters are in dedicated files which I am deleting, and about one quarter is in shared files. That means about 10k lines in shared files, which is the stuff we tend to notice day to day (that plus the test failures on heretofore not-yet-disabled test cases, which I am now deleting).
I ran 3 million simulations, mostly against 692df86 or very similar code (differing by one TraceEvent). This was prior to syncing with upstream/main, which had no conflicts and from which I don't expect problems. The number of failures in these runs was about 8. We looked at them and believe there is a high likelihood that these are existing issues not related to the changes in this PR. More details on these failures can be found in docs linked from here: https://quip-apple.com/MN7gAyXLjgyn
* change Long Term status for unowned features to "scheduled for deletion" where applicable
* Relax wording about scheduled for deletion features
* Delete blob granule feature. WIP. Does not compile.
* more incremental hacking to remove / comment out blob granule related code
* more hacking to remove blob granule related code, e.g. blob manager and blob migrator roles
* delete more blob granule stuff
* more hacking
* more hacking
* more hacking
* More changes to remove blob granule related code. IT COMPILES NOW
* don't try to run AuthzSecurity tests as we have deleted that workload as part of this effort
* delete more stuff that matches, abbreviates, or smells like blob granule related
* EncryptKeyProxy: don't do blobMetadata stuff, because that is not used and support is being removed
* delete more references to blob granule stuff
* SimulationConfig::setEncryptionAtRestMode: always use DISABLED; also disable EncryptKeyProxyTest.toml
* format code
* manual update to bindings/java/src/tests.cmake to remove a deleted file
* fix compile errors. I guess by default I don't build Java bindings
* remove unneeded blob granule functions rather than #if..#endif them out
* remove more code in #if..#endif
* remove more code in #if 0..#endif
* revert changes to fdb_c.h in preparation for marking removed API calls as removed
* rework C API declarations in preparation for marking blob granule APIs as removed
* deprecate removed blob granule related API functions as of version 740 (and add a comment to request a justification of this convention)
* make progress on broken ctests. E.g. 1) python does not need to do blob granule stuff. 2) authz tests seemingly not needed
* remove blob granule stuff from Java and Python APIs and fix test runner stuff so that ctests pass
* reformat comments to fix compile error. FIXME: why is this error not happening on the default compile commands we use
* hacks all the way down to try to fix the Mac build
* add pointed comment about the perceived pointlessness of the API deprecation scheme embodied in this source file
* really serious about the C++ style comments, aren't we
* remove commented-out code from prior iterative efforts
* put back undeleted code in original order
* delete commented-out code
* update feature-status.md to say blob granule is mostly deleted
* upgrade `mostly deleted` to `has been deleted`
Right now this allows only one server address to be excluded. This is useful
when the database is unavailable but we want recruitment to skip some
particular processes.
Manually tested the concept works with a loopback cluster.
* Encryption data at-rest db-config
Description
diff-1: Handle 'force' updates to encryption_at_rest db-config
Major changes proposed:
1. Introduce the 'encryption_data_at_rest_mode' option for 'configure new'
to enable Encryption data at-rest. The feature is disabled
by default.
2. The configuration is meant to be set at the time of database
creation; additional checks to prevent updating the config afterwards
will be added in a subsequent PR.
3. Extend the DatabaseConfiguration validity check to require "tenant_mode"
to be set to `required` if Encryption data at-rest is selected, given that
EncryptionDomain matches Tenant boundaries.
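
The validity rule in item 3 can be sketched as follows. This is a simplified stand-in, not the actual DatabaseConfiguration code: the enum names, `ConfigSketch`, and `isValidSketch` are all hypothetical and only capture the stated rule that enabling encryption at rest requires the tenant mode to be `required`.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real configuration types.
enum class TenantModeSketch { Disabled, OptionalExperimental, Required };
enum class EncryptionAtRestModeSketch { Disabled, Enabled };

struct ConfigSketch {
    TenantModeSketch tenantMode = TenantModeSketch::Disabled;
    // Encryption at rest is disabled by default, as described in item 1.
    EncryptionAtRestModeSketch encryptionAtRestMode = EncryptionAtRestModeSketch::Disabled;
};

// Encryption domains follow tenant boundaries, so encryption at rest is only
// accepted when tenants are required. With encryption disabled, any tenant
// mode passes this particular check.
bool isValidSketch(const ConfigSketch& c) {
    if (c.encryptionAtRestMode == EncryptionAtRestModeSketch::Disabled) {
        return true;
    }
    return c.tenantMode == TenantModeSketch::Required;
}
```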
Testing
devCorrectness - 100K