* - Compare storage replicas on reads (in "loadBalance()")
* - Do consistency check on reads in loadbalance
* - Do replica consistency check in the case where loadBalance issues
requests to multiple storage servers
* - Address a state variable related bug
* - Code formatting
* - API simplification
* - Simplify code
* - Code formatting
* - Address a review comment
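The replica comparison on reads described above can be pictured with a minimal sketch. This is not the `loadBalance()` implementation: `read_with_replica_check`, `ReplicaMismatch`, and the use of plain callables as storage servers are all hypothetical names chosen for illustration.

```python
from typing import Callable, Optional, Sequence


class ReplicaMismatch(Exception):
    """Raised when two replicas return different values for the same read."""


def read_with_replica_check(
    key: bytes,
    replicas: Sequence[Callable[[bytes], Optional[bytes]]],
) -> Optional[bytes]:
    """Send the same read to every replica and compare the replies.

    `replicas` is a hypothetical list of callables standing in for storage
    servers. The first reply is the reference the others are compared to.
    """
    if not replicas:
        raise ValueError("at least one replica is required")
    replies = [read(key) for read in replicas]
    reference = replies[0]
    for index, reply in enumerate(replies[1:], start=1):
        if reply != reference:
            raise ReplicaMismatch(
                f"replica {index} disagrees with replica 0 for key {key!r}"
            )
    return reference
```

The sketch reads from all replicas synchronously for simplicity; a real check would issue the requests concurrently and would only apply when more than one storage server is contacted.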
* EaR: reduce metrics logging
BlobCipherMetrics used to break its counters down by usage type (whether the usage is for tlog, redwood, backup, etc.), and these counters were printed to the trace log even when encryption was not enabled or when the specific usage was not happening on a node (e.g. a node with only stateless roles would also print blob cipher counters for redwood). We are reducing BlobCipherMetrics logging by:
1. Defaulting to not breaking the metrics down by usage type; the behavior is controlled by the knob `ENCRYPT_KEY_CACHE_ENABLE_DETAIL_LOGGING`.
2. When the detailed breakdown is enabled, the counters are lazily initialized.
3. Even if the counters are initialized, they are not logged when the count is 0 (for example, if a node was recruited as tlog but drops the tlog role later on, the tlog counter inside BlobCipherMetrics is no longer logged).
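The three rules above can be sketched roughly as follows. This is an illustrative model only, not the BlobCipherMetrics code: the class `CipherMetrics`, its methods, the always-logged `Total` field, and the reset-per-interval behavior are assumptions made for the example.

```python
from typing import Dict


class CipherMetrics:
    """Hypothetical stand-in for a metrics object with optional breakdown."""

    def __init__(self, detail_logging: bool = False) -> None:
        # Mirrors the role of ENCRYPT_KEY_CACHE_ENABLE_DETAIL_LOGGING.
        self.detail_logging = detail_logging
        self.total = 0
        # Per-usage counters are created lazily, on first use only.
        self._by_usage: Dict[str, int] = {}

    def record(self, usage: str, count: int = 1) -> None:
        self.total += count
        if self.detail_logging:
            self._by_usage[usage] = self._by_usage.get(usage, 0) + count

    def log_and_reset(self) -> Dict[str, int]:
        """Return the fields to log for this interval, then reset counts."""
        fields = {"Total": self.total}
        for usage, count in self._by_usage.items():
            if count != 0:  # initialized but idle counters are skipped
                fields[usage] = count
        self.total = 0
        for usage in self._by_usage:
            self._by_usage[usage] = 0
        return fields
```

With detail logging enabled, a usage that was active in one interval and idle in the next simply disappears from the second interval's output, which matches the tlog example in item 3.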
* buggify BlobCipherMetrics detail logging knob
* format
* EaR: REST KMS fixes - encryption integration testing
Description
Major changes:
1. Multiple fixes for issues observed during end-to-end integration
testing of the Encryption at-rest feature.
2. Improve REST module logging. Introduced FLOW_KNOBS->REST_LOG_LEVEL
to give more granular control of REST logging, independent of
the cluster log level.
Testing
Integration testbed:
1. Run fdbserver standalone
2. Run external KMS http-server to serve encryption key fetch requests
* Define API for unsuppressable TraceEvent types
Add trace checking tests for authz trace events
* Revert temporary configurations used for debugging
* Simplify/Modernize flow audit logging API
- Do event type whitelist checks at compile time
- Use ""_audit literal API instead of a tag struct
- Replace int with a lightweight struct for tracking/modifying TraceEvent enablement
* Revert installing signal handler for SIGTERM and refactor test script
Move trace checker to local_cluster.py
* Lengthen public key refresh interval and add more audited events
* Try to make MSVC and Mac builds happy
* consteval > constexpr
'inline consteval' still causes link errors in Mac builds
Previously, with EaR we always enabled authentication (e.g. when we encrypt Redwood pages). Authentication acts as a form of checksum, so a dedicated page checksum was not needed. This PR adds back the xxhash page checksum when authentication is disabled. It also changes the knob so that authentication is disabled by default.
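The trade-off between an authentication tag and a plain checksum can be illustrated with a small sketch. It is not the Redwood page format: `seal_page` and `open_page` are hypothetical helpers, HMAC-SHA256 stands in for the authentication scheme, and CRC32 from the standard library stands in for xxhash.

```python
import hashlib
import hmac
import zlib
from typing import Optional

CHECKSUM_LEN = 4   # CRC32 trailer (stand-in for the xxhash checksum)
AUTH_TAG_LEN = 32  # HMAC-SHA256 trailer (stand-in for the auth token)


def seal_page(payload: bytes, auth_key: Optional[bytes]) -> bytes:
    """Append an integrity trailer to a page payload.

    With a key, the authentication tag doubles as the checksum.
    Without a key, a dedicated checksum is appended instead.
    """
    if auth_key is not None:
        trailer = hmac.new(auth_key, payload, hashlib.sha256).digest()
    else:
        trailer = zlib.crc32(payload).to_bytes(CHECKSUM_LEN, "little")
    return payload + trailer


def open_page(page: bytes, auth_key: Optional[bytes]) -> bytes:
    """Verify the trailer and return the payload, or raise ValueError."""
    trailer_len = AUTH_TAG_LEN if auth_key is not None else CHECKSUM_LEN
    if len(page) < trailer_len:
        raise ValueError("page is shorter than its trailer")
    payload, trailer = page[:-trailer_len], page[-trailer_len:]
    expected = seal_page(payload, auth_key)[-trailer_len:]
    if not hmac.compare_digest(trailer, expected):
        raise ValueError("page failed integrity check")
    return payload
```

Either way every page carries exactly one integrity trailer, so disabling authentication does not leave pages without corruption detection.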
Several fixes/improvements related to distributed traces.
Remove "key" attributes and the TRACING_SPAN_ATTRIBUTES_ENABLED knob: we almost never want to log actual keys (as they can contain private data); however, we do want to use other span attributes.
In Transaction::setTransactionID, properly propagate spanContext flags and set all copies of spanContext.
- setknob <knob_name> <knob_value> [config_class]
- getknob <knob_name> [config_class]
- Added a new option to `begin` to specify whether it's a configuration transaction. Syntax is `begin [config-txn]`
- Added utility function for converting tuples to string
- Added knobmanagment test in fdbcli_tests.py
Adding the following metrics:
* BlobCipherKeyCache hit/miss
* EKP: KMS requests latencies
* Each component that uses encryption now needs to pass a UsageType enum to the encryption helper methods (GetEncryptCipherKeys/GetLatestEncryptCipherKey/encrypt/decrypt); those methods then log cipher key fetch latency samples and encryption/decryption CPU times accordingly.
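
As a rough illustration of passing a usage type into an encryption helper so that timings are attributed per usage, consider the sketch below. It is not the actual helper API: the `UsageType` members, the `encrypt` signature, and the `cpu_time_samples` store are all hypothetical, and the cipher is supplied by the caller as a plain callable.

```python
import time
from collections import defaultdict
from enum import Enum
from typing import Callable, DefaultDict, List


class UsageType(Enum):
    # Hypothetical members, named after the usages mentioned in the text.
    TLOG = "tlog"
    REDWOOD = "redwood"
    BACKUP = "backup"


# Usage type -> CPU seconds spent in each encrypt call.
cpu_time_samples: DefaultDict[UsageType, List[float]] = defaultdict(list)


def encrypt(
    plaintext: bytes,
    usage: UsageType,
    cipher: Callable[[bytes], bytes],
) -> bytes:
    """Run `cipher` and record its CPU time under the caller's usage type."""
    start = time.process_time()
    try:
        return cipher(plaintext)
    finally:
        # Recorded even if the cipher raises, so failures are still timed.
        cpu_time_samples[usage].append(time.process_time() - start)
```

Making the usage type a required argument means a caller cannot encrypt without saying which component the cost belongs to.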