By continuing to serve the configuration database interface used by
clients even when disabled, clients no longer need to specify a special
`--no-config-db` option when running commands which talk to the
configuration database.
An inconsistent coordinator disk queue could cause repeated reboots of
the coordinator process. In simulation, this could cause stuck runs if
another role was running on the coordinator process, as the other role
would also repeatedly reboot (specifically stateful roles like tlogs).
See the comment included in this commit for much more detail.
Note: Server processes started getting reported as clients since 7.1.0
(not sure if this change in behavior was intentional or not), and this
breaks the operator upgrade logic.
The `--no-config-db` flag, passed to `fdbserver`, will disable the
configuration database. When this flag is specified, no `ConfigNode`s
will be started, the `ConfigBroadcaster` will not be started, and on a
coordinator change no attempt will be made to lock `ConfigNode`s.
Configuration database data lives on the coordinators. When a change
coordinators command is issued, the data must be sent to the new
coordinators to keep the database consistent.
* Debug version of fdbserver -r changeclusterkey tool
* Remove debug symbols; move function to Coordinaton.actor.cpp
* Format and add traces
* fix comments
* coordinatorsKey should not storing IP addresses.
Currently, when we do a commit of coordinator change, we are always converting hostnames to IP addresses and store the converted results in coordinatorsKey (\xff/coordinators). This result in ForwardRequest also sending IP addresses, and receivers will update their cluster files with IPs, then we lose the dynamic IP feature.
* Remove the legacy coordinators() function.
* Update async_resolve().
ip::basic_resolver::async_resolve(const query & q, ResolveHandler && handler) is deprecated.
* Clean code format.
* Fix typo.
* Remove SpecifiedQuorumChange and NoQuorumChange.
* Add tryResolveHostnames() in connection string.
* Add missing hostname to related interfaces.
* Do not pass RequestStream into *GetReplyFromHostname() functions.
Because we are using new RequestStream for each request anyways. Also, the passed in pointer could be nullptr, which results in seg faults.
* Add dynamic hostname resolve and reconnect intervals.
* Address comments.
Because when a coordinator restarts or newly joins a cluster, a client trying to connect to it may already have client data, while the coordinator doesn't. In this case, the coordinator should not reply empty client data.
1. Rename monitorLeaderRemotely* functions to monitorLeaderWithDelayedCandidacy*. "Remote" is not clearly describing what the functions are doing;
2. Rename monitorLeaderForProxies() to monitorLeaderAndGetClientInfo() to better describe the function;
3. Remove monitorLeaderRemotelyInternal() and monitorLeaderRemotely() in MonitorLeader.actor.cpp, to eliminate code duplication. They already exist in worker.actor.cpp;
4. Move the declaration of getLeader() from LeaderElection.actor.cpp to MonitorLeader.h;
5. Update a few comments.