Look for the `GIT_SSH_COMMAND` environment variable and prefer it to
`GIT_SSH`. The `GIT_SSH_COMMAND` will execute via the shell, which is
useful to provide additional arguments.
When sending paths to the remote server, escape them properly.
Escape them with a single quote, followed by the escaped character,
followed by another single quote. This prevents misparsing on the
remote side and potential command injection.
Allow `git_str_puts_escaped` to take an escaping prefix and an escaping
suffix; this allows for more options, including the ability to better
support escaping executed paths.
Construct the arguments for the ssh exec as an explicit array, instead
of trying to create a command-line for sh. The latter may use user input
(the remote path) so this may be vulnerable to command injection.
When using `git_process_new` on win32, resolve the path to the
application in the same way that we do on POSIX.
Search `PATH` for command to execute (unless the given executable is
fully qualified). In addition, better match Windows executable lookup
behavior itself (allowing the command to be `foo`, and looking for a
matching `foo.exe` or `foo.cmd`.)
By default, `git_process_new` will no longer try to prepare a single
string to execute with the shell. Instead, by default, arguments remain
parameterized and the command to execute is located within the `PATH`.
The shell can also still optionally be used (so that additional
arguments can be included and variables handled appropriately) but this
is done by keeping arguments parameterized for safety.
This new behavior prevents accidental misuse and potential command-line
injection.
Ensure that when we look for an executable on Windows that we add
executable suffixes (`.exe`, `.cmd`). Without this, we would not support
looking for (eg) `ssh`, since we actually need to identify a file named
`ssh.exe` (or `ssh.cmd`) in `PATH`.
* Do not search `PATH` for fully- or partially-qualified filenames
(eg, `foo/bar`)
* Ensure that a file in the `PATH` is executable before returning it
Ensure that our `find_executable` behaves as expected:
* When the executable contains a fully- or partially-qualified filename
component (eg, `foo/bar`) that `PATH` is not searched; these paths are
relative to the current working directory.
* An empty segment in `PATH` (on POSIX systems) is treated as the
current directory; this is for compatibility with Bourne shells.
* When a file exists in `PATH`, it is actually executable (on POSIX)
It's certainly possible for the root filesystem to be case-sensitive
while /tmp is not, or vice versa. One example where this might happen
is when running Docker containers (like ci/docker/fedora) on macOS with
the repository checkout on AppleFS (not case sensitive) while the
container's /tmp is case sensitive.
This fix allows the test to pass under those circumstances as well.
The `ssh_custom_free()` function calls `strlen()` on the `publickey`
field, which stores binary data, not a null-terminated string. This
causes a heap buffer overflow when the public key data is not
null-terminated or contains embedded null bytes.
The `publickey` field stores binary data, as required by the underlying
`libssh2_userauth_publickey()` function, which accepts a public key
parameter of the type `const unsigned char*`.
Use the stored `publickey_len` instead of `strlen()` to determine the
correct buffer size.
opt_usage.c:214:59: warning: 'required' may be used uninitialized [-Wmaybe-uninitialized]
214 | ((spec->usage & CLI_OPT_USAGE_CHOICE) && required));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~
diff_driver.c:343:17: warning: 'drv' may be used uninitialized [-Wmaybe-uninitialized]
343 | if (drv && drv != *out)
| ~~~~^~~~~~~~~~~~~~
Prior to this patch the code correctly barfed on
control characters with values lower than \040 (space),
but failed to account for DEL.
This patch fixes the behavior to be consistent with git [1]:
> They cannot have ASCII control characters (i.e. bytes whose values are
> lower than \040, or \177 DEL)
[1]: https://git-scm.com/docs/git-check-ref-format#_description
This removes the phrase "if cert verification fails" because the
certificate callback is *always* called whether it fails or not. This
was changed in
17491f6e56,
but presumably this piece of documentation was not updated.
The initialization of the on-disk state of refdbs is currently not
handled by the actual refdb backend, but it's implemented ad-hoc where
needed. This is problematic once we have multiple different refdbs as
the filesystem structure is of course not the same.
Introduce a new callback function `git_refdb_backend::init()`. If set,
this callback can be invoked via `git_refdb_init()` to initialize the
on-disk state of a refdb. Like this, each backend can decide for itself
how exactly to do this.
Note that the initialization of the refdb is a bit intricate. A
repository is only recognized as such when it has a "HEAD" file as well
as a "refs/" directory. Consequently, regardless of which refdb format
we use, those files must always be present. This also proves to be
problematic for us, as we cannot access the repository and thus don't
have access to the refdb if those files didn't exist.
To work around the issue we thus handle the creation of those files
outside of the refdb-specific logic. We actually use the same strategy
as Git does, and write the invalid reference "ref: refs/heads/.invalid"
into "HEAD". This looks almost like a ref, but the name of that ref
is not valid and should thus trip up Git clients that try to read that
ref in a repository that really uses a different format.
So while that invalid "HEAD" reference will of course get rewritten by
the "files" backend, other backends should just retain it as-is.
In our tests for "onbranch" config conditionals we set HEAD to point to
various different branches via `git_repository_create_head()`. This
function circumvents the refdb though and directly writes to the "HEAD"
file. While this works now, it will create problems once we have
multiple refdb backends.
Furthermore, the function is about to go away in the next commit. So
let's prepare for that and use `git_reference_symbolic_create()`
instead.
While we only support initializing repositories with the "files"
reference backend right now, we are in the process of implementing a
second backend with the "reftable" format. And while we already have the
infrastructure to decide which format a repository should use when we
open it, we do not have infrastructure yet to create new repositories
with a different reference format.
Introduce a new field `git_repository_init_options::refdb_type`. If
unset, we'll default to the "files" backend. Otherwise though, if set to
a valid `git_refdb_t`, we will use that new format to initialize the
repostiory.
Note that for now the only thing we do is to write the "refStorage"
extension accordingly. What we explicitly don't yet do is to also handle
the backend-specific logic to initialize the refdb on disk. This will be
implemented in subsequent commits.
To support multiple different reference backend implementations,
Git introduced a "refStorage" extension that stores the reference
storage format a Git client should try to use.
Wire up the logic to read this new extension when we open a repository
from disk. For now, only the "files" backend is supported by us. When
trying to open a repository that has a refstorage format that we don't
understand we now error out.
There are two functions that create a new repository that doesn't really
have references. While those are mostly non-functional when it comes to
references, we do expect that you can access the refdb, even if it's not
yielding any refs. For now we mark those to use the "files" backend, so
that the status quo is retained. Eventually though it might not be the
worst idea to introduce an explicit "in-memory" reference database. But
that is outside the scope of this patch series.
When we read the repository format information we do so by using the
full configuration of that repository. This configuration not only
includes the repository-level configuration though, but it also includes
the global- and system-level configuration. These configurations should
in practice never contain information about which format a specific
repository uses.
Despite this obvious conceptual error there's also a more subtle issue:
reading the full configuration may require us to evaluate conditional
includes. Those conditional includes may themselves require that the
repository format is already populated though. This is for example the
case with the "onbranch" condition: we need to populate the refdb to
evaluate that condition, but to populate the refdb we need to first know
about the repository format.
Fix this by using the repository-level configuration, only, to determine
the repository's format.
With a recent upgrade to a newer version of MSVC we now get a bunch of
warnings when two operands use different enum types. While sensible in
theory, in practice we have a couple of non-public enums that extend
public enums, like for example with `GIT_SUBMODULE_STATUS`.
Let's for now disable this warning to unblock our builds. The
alternative would be to add casts all over the place, but that feels
rather cumbersome.