The initialization of the on-disk state of refdbs is currently not
handled by the actual refdb backend, but it's implemented ad-hoc where
needed. This is problematic once we have multiple different refdbs as
the filesystem structure is of course not the same.
Introduce a new callback function `git_refdb_backend::init()`. If set,
this callback can be invoked via `git_refdb_init()` to initialize the
on-disk state of a refdb. Like this, each backend can decide for itself
how exactly to do this.
Note that the initialization of the refdb is a bit intricate. A
repository is only recognized as such when it has a "HEAD" file as well
as a "refs/" directory. Consequently, regardless of which refdb format
we use, those files must always be present. This also proves to be
problematic for us, as we cannot access the repository and thus don't
have access to the refdb if those files didn't exist.
To work around the issue we thus handle the creation of those files
outside of the refdb-specific logic. We actually use the same strategy
as Git does, and write the invalid reference "ref: refs/heads/.invalid"
into "HEAD". This looks almost like a ref, but the name of that ref
is not valid and should thus trip up Git clients that try to read that
ref in a repository that really uses a different format.
So while that invalid "HEAD" reference will of course get rewritten by
the "files" backend, other backends should just retain it as-is.
There are several places where users may want to specify the type of
object IDs (sha1 or sha256) that should be used, for example, when
dealing with repositories, indexes, etc.
However, given that sha256 support remains disappointingly uncommon in
the wild, we should avoid hard API breaks when possible. Instead, update
these APIs to have an "extended" format (eg, `git_odb_open_ext`) that
provides an options structure with oid type information.
This allows callers who do care about sha256 to use it, and callers who
do not to avoid gratuitous API breakage.
Instead of making the commit and dump functions take individual options
structures; provide the options structure to the writer creator. This
allows us to add additional information (like OID type) during
generation.
The `git_reflog_entry__alloc` function is not actually defined, nor
used. Remove references to it in the headers. It is not clear why the
corresponding `__free` is, or should be, exported. Make it internal to
the library.
Unlike existing functions, this produces a _thin_ packfile by
making use of the fact that only new objects appear in the
mempack Object Database.
A thin packfile only contains certain objects, but not its whole
closure of references. This makes it suitable for efficiently
writing sets of new objects to a local repository, by avoiding
many small I/O operations.
This relies on write_pack (e.g. git_packbuilder_write_buf) to
implement the "recency order" optimization step.
Basic measurements comparing against the writing of individual
objects show a speedup during when writing large amounts of
content on machines with comparatively slow I/O operations,
and little to no change on machines with fast I/O operations.
This is a leftover leaky abstraction. If consumers aren't meant to
_call_ the `free` function then they shouldn't _see_ the free function.
Move it out into a `git_config_backend_entry` that is, well, produced by
the config backends.
This makes our code messier but is an improvement for consumers.
Update our headers so that they can include the necessary definitions.
Docs generators (in particular, `clang -Xclang -ast-dump`) were unable
to see the necessary definitions.
Most callers only need to _get_ error messages. Only callers implemented
more complicated functions (like a custom ODB for example) need to set
them.
(Callback users should likely ferry their own error information in their
callback payload.)
Provide two memory-backed configuration backends -- one that takes a
string in config file format `[section] key=value` and one that takes a
list of strings in `section.key=value` format.
Remove the number of functions that custom allocator users need to
provide; nobody should need to implement `substrdup`. Keep it to the
basics that are actually _needed_ for allocation (malloc, realloc,
free) and reimplement the rest ourselves.
In addition, move the failure check and error setting _out_ of the
custom allocators and into a wrapper so that users don't need to deal
with this. This also allows us to call our allocator (without the
wrapper) early so that it does not try to set an error on failure, which
may be important for bootstrapping.
Make socket I/O non-blocking and add optional timeouts.
Users may now set `GIT_OPT_SET_SERVER_CONNECT_TIMEOUT` to set a shorter
connection timeout. (The connect timeout cannot be longer than the
operating system default.) Users may also now configure the socket read
and write timeouts with `GIT_OPT_SET_SERVER_TIMEOUT`.
By default, connects still timeout based on the operating system
defaults (typically 75 seconds) and socket read and writes block.
Add a test against our custom testing git server that ensures that we
can timeout reads against a slow server.
Users should provide us an array of object ids; we don't need a separate
type. And especially, we should not be mutating user-providing values.
Instead, use `git_oid *` in the shallow code.
Push options are an optional capability of a git server. If they are
listed in the servers capabilities, when the client lists push-options
in it's list of capabilities, then the client sends it's list of push
options followed by a flush-pkt.
So, If we have any declared push options, then we will list it as a
client capability, and send the options.
If the request contains push options but the server has no push
options capability, then error.
6fc6eeb66c removed
`git_transport_smart_proxy_option`, and there was nothing added to
replace it. That made it hard for custom transports / smart
subtransports to know what remote connect options to use (e.g. proxy
options).
This change introduces `git_transport_smart_remote_connect_options` to
replace it.
The existing mechanism for providing options to remote fetch/push calls,
and subsequently to transports, is unsatisfactory. It requires an
options structure to avoid breaking the API and callback signatures.
1. Introduce `git_remote_connect_options` to satisfy those needs.
2. Add a new remote connection API, `git_remote_connect_ext` that will
take this new options structure. Existing `git_remote_connect` calls
will proxy to that. `git_remote_fetch` and `git_remote_push` will
proxy their fetch/push options to that as well.
3. Define the interaction between `git_remote_connect` and fetch/push.
Connect _may_ be called before fetch/push, but _need not_ be. The
semantics of which options would be used for these operations was
not specified if you specify options for both connect _and_ fetch.
Now these are defined that the fetch or push options will be used
_if_ they were specified. Otherwise, the connect options will be
used if they were specified. Otherwise, the library's defaults will
be used.
4. Update the transports to understand `git_remote_connect_options`.
This is a breaking change to the systems API.
Commit b1a6c316a6 moved auto-refresh into
the pack backend, and added a comment accordingly. Commit
43820f204e moved auto-refresh back *out*
of backends into the ODB layer, but didn't update the comment.
Introduce a function to create an email from a diff and multiple inputs
about the source of the diff.
Creating an email from a diff requires many more inputs, and should be
discouraged in favor of building directly from a commit, and is thus in
the `sys` namespace.