Commit Graph

139 Commits

Author SHA1 Message Date
Syed Paymaan Raza
c3e7542cda Update end year in copyright header 2024-08-02 09:40:11 -07:00
A.J. Beamon
64b6a5d257 Allow boolean parameters to be nested inside of namespaces or classes 2023-03-30 15:09:59 -07:00
Sreenath Bodagala
fe4a839600 - Do not add fdbserver processes to the client list. (#9225)
Note: Server processes started getting reported as clients since 7.1.0
(not sure if this change in behavior was intentional or not), and this
breaks the operator upgrade logic.
2023-01-25 14:39:32 -05:00
Xiaoge Su
50de69c897 Extract IConnection and NetworkAddress out from network.h 2023-01-24 14:48:31 -08:00
FoundationDB CI
86d6106dc1 format source code after switch to clang 15 2022-12-08 17:26:45 +00:00
Trevor Clinkenbeard
6d09553e67 Update fdbclient/MonitorLeader.actor.cpp
Co-authored-by: Jingyu Zhou <jingyuzhou@gmail.com>
2022-12-01 09:59:08 -08:00
sfc-gh-tclinkenbeard
994fd4aa41 Remove rare annotation from some code probes 2022-12-01 09:59:08 -08:00
sfc-gh-tclinkenbeard
c03f60c618 Update rare code probe annotations 2022-11-15 13:21:25 -08:00
A.J. Beamon
e1fe28b78b Switch some usages of LiteralStringRef to use the _sr suffix 2022-09-30 16:04:16 -07:00
A.J. Beamon
4fd64630e8 Convert literal string ref instances to use _sr suffix 2022-09-19 11:35:58 -07:00
Lukas Joswiak
2fe3fc5379 Fix issue with pointer dereference after actor cancellation 2022-09-13 16:53:54 -07:00
sfc-gh-tclinkenbeard
7a2554c73e Remove code probe for \"Coordinator hostname resolving failure\"
This is not intended to be hit in simulation, and we have visibility
into these events with the MonitorProxiesConnectFailed trace event.
2022-09-12 10:09:08 -07:00
Vaidas Gasiunas
81fff640bd Testing with invalid cluster files, fixing update from changed cluster file (#8126)
* ApiTester: test with invalid cluster files

* More asserts in monitorProxies

* ApiTester: Test tampering the cluster file

* Fix update of connection string from the cluster file to use the new connection string only if it valid

* ApiTester: add linker dependency on std++fs

* upgrade_test: no-cleanup-on-error option

* ApiTester: use atomic operations to change and access the transaction handle
2022-09-10 09:23:00 +02:00
Xiaoge Su
751cf6a721 fixup! typo 2022-08-15 15:25:20 -07:00
Xiaoge Su
c4d1231914 Remove ItemWithExamples, replace by Samples
This is a fully cleanup of ItemWithExamples with a new Samples class.
2022-08-15 15:25:20 -07:00
Xiaoge Su
68dc99ea0f Refactor ClientStatusStats and ClientStats
MonitorLeader.actor.cpp:ClientStatusStats and
Status.actor.cpp:ClientStats shares the same structure, thus refactored.
2022-08-15 15:25:20 -07:00
Markus Pilman
1de37afd52 Make TEST macros C++ only (#7558)
* proof of concept

* use code-probe instead of test

* code probe working on gcc

* code probe implemented

* renamed TestProbe to CodeProbe

* fixed refactoring typo

* support filtered output

* print probes at end of simulation

* fix missed probes print

* fix deduplication

* Fix refactoring issues

* revert bad refactor

* make sure file paths are relative

* fix more wrong refactor changes
2022-07-19 13:15:51 -07:00
Vaidas Gasiunas
e28a8401fb Update coordinator list from cluster file (#7382)
* Log failed connection attempts in monitorProxies

* Update coordinator list from the cluster file after failing to connect to all coordinators

* Wiggle and upgrade test with legacy version monitoring; updating tests to use 7.1.9

* Update coordinator list from the cluster file: addressing review comments

* Update coordinator list from the cluster file: addressing review comments

* Wait on future for all setAndPersistConnectionString calls
2022-06-23 09:22:09 +02:00
Renxuan Wang
839af5701e Fix bug in resolveTCPEndpoint() when hostname resolving fails. (#7375)
* Close trace file when error happens in runNetwork().

* Improve the bestCount algorithm in getLeader().

In the current implementation, if the nominees are [0,1], the chosen leader will be 1, which is an exception to other cases and our expectation that if 2 nominees have the same frequency, the one with lower id will be the leader.

* Remove unnecessary new statement.

stream will never be a nullptr.

* Move self->dnsCache out of lambda capture.

Member variables are not capture by default, thus, `host` and `service` are not captured. This somehow successfully compile, but throws std::bad_alloc or basic_string::_S_create exceptions when we call `host+":"+service` in dnsCache.remove().

* Revert unintended change.

* Address comments.
2022-06-13 20:24:30 -07:00
Renxuan Wang
df4e0deb4d coordinatorsKey should not always store IP addresses. (#7204)
* coordinatorsKey should not storing IP addresses.

Currently, when we do a commit of coordinator change, we are always converting hostnames to IP addresses and store the converted results in coordinatorsKey (\xff/coordinators). This result in ForwardRequest also sending IP addresses, and receivers will update their cluster files with IPs, then we lose the dynamic IP feature.

* Remove the legacy coordinators() function.

* Update async_resolve().

ip::basic_resolver::async_resolve(const query & q, ResolveHandler && handler) is deprecated.

* Clean code format.

* Fix typo.

* Remove SpecifiedQuorumChange and NoQuorumChange.
2022-05-23 11:42:56 -07:00
A.J. Beamon
993a79b42c Restart protocol monitoring logic when we cycle the coordinator, even if its the same coordinator. Add some defensive code in case we fetch the protocol version for an absent peer. (#6397) 2022-05-19 10:00:23 +02:00
Renxuan Wang
30e124c09b Remove HostnameStatus and resolve trigger.
They are no longer needed since we have coordinators DNS cache; and they are introducing complex crashes.
2022-05-12 10:13:55 -07:00
Renxuan Wang
154de018ff One place in PaxosConfigConsumer was missed out in #6926. (#7006)
* One place in PaxosConfigConsumer was missed out.

* Minor improvements.
2022-04-28 18:32:55 -07:00
Renxuan Wang
c69a07a858 Check in the new Hostname logic. (#6926)
* Revert #6655.

20220407-031010-renxuan-c101052c21da8346           compressed=True data_size=31004844 duration=4310801 ended=100000 fail_fast=10 max_runs=100000 pass=100000 priority=100 remaining=0 runtime=1:04:15 sanity=False started=100047 stopped=20220407-041425 submitted=20220407-031010 timeout=5400 username=renxuan

* Revert #6271.

20220407-051532-renxuan-470f0fe6aac1c217           compressed=True data_size=30982370 duration=3491067 ended=100002 fail_fast=10 max_runs=100000 pass=100002 priority=100 remaining=0 runtime=0:59:57 sanity=False started=100141 stopped=20220407-061529 submitted=20220407-051532 timeout=5400 username=renxuan

* Revert #6266.

Remove resolving-related functionalities in connection string. Connection string will be used for storing purpose only, and non-mutable.

20220407-175119-renxuan-55d30ee1a4b42c2f           compressed=True data_size=30970443 duration=5437659 ended=100000 fail_fast=10 max_runs=100000 pass=100000 priority=100 remaining=0 runtime=0:59:31 sanity=False started=100154 stopped=20220407-185050 submitted=20220407-175119 timeout=5400 username=renxuan

* Add hostname to coordinator interfaces.

* Turn on the new hostname logic.

* Add the corresponding change in config txns.

The most notable change is before calling basicLoadBalance(), we need to call tryInitializeRequestStream() to initialize request streams first.

Passed correctness tests.

* Return error when hostnames cannot be resolved in coordinators command.

* Minor fixes.
2022-04-27 21:54:13 -07:00
Renxuan Wang
e40cc8722c A few hostname improvements. (#6825)
* Add tryResolveHostnames() in connection string.

* Add missing hostname to related interfaces.

* Do not pass RequestStream into *GetReplyFromHostname() functions.

Because we are using new RequestStream for each request anyways. Also, the passed in pointer could be nullptr, which results in seg faults.

* Add dynamic hostname resolve and reconnect intervals.

* Address comments.
2022-04-20 13:42:46 -07:00
Jingyu Zhou
17dc1a61f3 ClientDBInfo may be unintentionally not set
The ClientDBInfo's comparison is through an internal UID and shrinkProxyList()
can change proxies inside ClientDBInfo. Since the UID is not changed by that
function, subsequent set can be unintentionally skipped.

This was not a big issue before. However, VV introduces a change that the
client side compares the returned proxy ID with its known set of GRV proxies
and will retry GRV if the returned proxy ID is not in the set. Due the above
bug, GRV returned by a proxy is not within the client set, and results in
indefinite retrying GRVs.
2022-04-18 09:09:14 -07:00
Renxuan Wang
938e8ed996 Do not throw lookup_failed when resolving fails.
Instead, return an empty Optional<NetworkAddress>. For resolveWithRetry(), still return NetworkAddress because it retries until succeed.
2022-04-08 14:21:49 -07:00
Renxuan Wang
465ff712b6 Move Hostname to its own files. (#6759)
* Change DNS cache to use std::map.

Revert commit 90c259d84e, because if we use unordered_map, toString() can be inconsistent.

* Move ClientKnob::COORDINATOR_HOSTNAME_RESOLVE_DELAY to FlowKnob::HOSTNAME_RESOLVE_DELAY.

* Move Hostname to its own files.

Also, add resolve-related variables and functions in Hostname.
2022-04-04 19:04:51 -07:00
Renxuan Wang
2a59c5fd4e Workers should monitor coordinators in submitCandidacy(). (#6655)
* Workers should monitor coordinators in submitCandidacy().

* Change re-resolve delay to a knob.
2022-03-24 19:20:42 -07:00
sfc-gh-tclinkenbeard
a71099471b Update copyright header dates 2022-03-21 13:36:23 -07:00
Renxuan Wang
3e761aef3b Add resetConnectionString() function to ClusterConnectionString. 2022-02-24 23:02:29 -08:00
A.J. Beamon
250a88e682 Enforce that trace event suppression calls happen first when using trace event call chaining. Fix various instances where we weren't following this requirement. 2022-02-24 12:25:52 -08:00
Renxuan Wang
3c1394578b Address comments. 2022-02-22 16:29:59 -08:00
Renxuan Wang
622d89b552 Rebase on main.
Since we changed ClusterConnectionString's status flag from boolean to enum in #6422, we need to update this PR correspondingly.
2022-02-22 16:29:59 -08:00
Renxuan Wang
8eb7a10404 Address comments. 2022-02-22 16:29:59 -08:00
Renxuan Wang
481587a8c6 Turn on hostname logic. 2022-02-22 16:29:59 -08:00
Renxuan Wang
c29c522d3e Change ClusterConnectionString's status flag from boolean to enum.
So that when multiple threads try to resolve hostnames concurrently, we should have only one resolving and others wait on the result. Otherwise, they would add duplicated resolved results to coordinators.
2022-02-21 10:53:03 -08:00
Renxuan Wang
f9f3735f73 Add resolveHostnamesBlocking() in ConnectionString and IClusterConnectionRecord.
Also, combine IClusterConnectionRecord::getConnectionString() and IClusterConnectionRecord::getMutableConnectionString() to IClusterConnectionRecord::getConnectionString(), and rename setConnectionString() to setAndPersistConnectionString().
2022-01-28 12:20:41 -08:00
Renxuan Wang
8c49391c51 Address comments. 2022-01-20 19:18:15 -08:00
Renxuan Wang
02f72dd377 Address comments. 2022-01-20 19:18:15 -08:00
Renxuan Wang
a6c482ee91 Add the support of using hostname in ConnectionString.
This PR comes without simulation tests except some unit tests. The simulation tests will be in the PR that uses hostname in code logic.
2022-01-20 19:18:15 -08:00
Renxuan Wang
9f70e84a7b Remove trimFromHostname(). 2021-12-07 15:39:51 -08:00
Renxuan Wang
85dff214a4 Address comments. 2021-11-04 11:42:28 -07:00
Renxuan Wang
f15ceb5489 Add Hostname struct, and fromHostname in NetworkAddress struct. 2021-11-04 11:42:28 -07:00
Renxuan Wang
f503ba6a7c Move getConnectionString() to IClusterConnectionRecord.
ClusterConnectionString was moved to IClusterConnectionRecord in PR #5853.
2021-10-27 12:59:52 -07:00
A.J. Beamon
9358adcf49 Address some review comments. 2021-10-22 11:05:18 -07:00
A.J. Beamon
e882eb33fc Abstract the cluster file into a cluster connection record that can be backed by something other than the filesystem. 2021-10-22 11:05:18 -07:00
Vaidas Gasiunas
16171d8252 Refactoring well-known endpoint registration
- List all well-known endpoints of FDB in a single enum
- Identify well-known endpoints by plain IDs
2021-09-21 11:05:31 -06:00
Xiaoge Su
abf73047ca Enforce std:: specifier rather than using namespace 2021-09-16 19:40:28 -07:00
Renxuan Wang
96fcde45c2 Minor leader election code improvements.
1. Rename monitorLeaderRemotely* functions to monitorLeaderWithDelayedCandidacy*. "Remote" is not clearly describing what the functions are doing;
2. Rename monitorLeaderForProxies() to monitorLeaderAndGetClientInfo() to better describe the function;
3. Remove monitorLeaderRemotelyInternal() and monitorLeaderRemotely() in MonitorLeader.actor.cpp, to eliminate code duplication. They already exist in worker.actor.cpp;
4. Move the declaration of getLeader() from LeaderElection.actor.cpp to MonitorLeader.h;
5. Update a few comments.
2021-09-16 15:34:45 -07:00