21 Commits

Author SHA1 Message Date
Edward Thomson
56e2a85643 sha256: simplify API changes for sha256 support
There are several places where users may want to specify the type of
object IDs (sha1 or sha256) that should be used, for example, when
dealing with repositories, indexes, etc.

However, given that sha256 support remains disappointingly uncommon in
the wild, we should avoid hard API breaks when possible. Instead, update
these APIs to have an "extended" format (eg, `git_odb_open_ext`) that
provides an options structure with oid type information.

This allows callers who do care about sha256 to use it, and callers who
do not to avoid gratuitous API breakage.
2025-01-02 13:13:59 +00:00
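The "extended" function shape described in this commit can be sketched roughly as follows. This is a minimal illustration only: every name in it (`demo_oid_t`, `demo_odb`, `demo_odb_options`, `demo_odb_open`, `demo_odb_open_ext`) is hypothetical and is not a libgit2 declaration. The idea is that the original entry point keeps its signature and forwards to an `_ext` variant that takes an options structure carrying the oid type.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are NOT libgit2's types. */
typedef enum { DEMO_OID_SHA1 = 1, DEMO_OID_SHA256 = 2 } demo_oid_t;

typedef struct {
    unsigned int version;
    demo_oid_t oid_type;   /* 0 means "use the default" */
} demo_odb_options;

typedef struct {
    demo_oid_t oid_type;
} demo_odb;

#define DEMO_ODB_OPTIONS_INIT { 1, DEMO_OID_SHA1 }

/* Extended variant: callers that care about sha256 pass options. */
int demo_odb_open_ext(demo_odb *out, const demo_odb_options *opts)
{
    if (out == NULL)
        return -1;

    /* Missing options or an unset oid type fall back to SHA1. */
    out->oid_type = (opts && opts->oid_type) ? opts->oid_type : DEMO_OID_SHA1;
    return 0;
}

/* Original variant: signature unchanged, so existing callers keep compiling. */
int demo_odb_open(demo_odb *out)
{
    return demo_odb_open_ext(out, NULL);
}
```

Under these assumptions, a caller that never mentions sha256 keeps calling `demo_odb_open`, while a sha256-aware caller fills in `oid_type` and calls `demo_odb_open_ext`.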
Edward Thomson
9aa5faa38b indexer: move oid_type into the opts structure
Object ID type should be an option within the options structure; move it
there.
2024-12-18 16:27:46 +00:00
Edward Thomson
fe2ee3a018 object: lookup sha256 objects
This is much of the plumbing for the object database to support SHA256,
and for objects to be able to parse SHA256 versions of themselves.
2023-02-12 22:02:00 +00:00
Edward Thomson
b43567d655 sha256: indirection for experimental functions
The experimental function signature is only available when
`GIT_EXPERIMENTAL_SHA256` is enabled.
2022-07-13 22:50:33 -04:00
Edward Thomson
3eba9181cf odb: add git_odb_options
Users will need to be able to specify the object id type for the given
object database; add a new `git_odb_options` with that option.
2022-06-20 17:05:30 -04:00
Edward Thomson
8444b6dce7 odb_hash*: accept the oid type to hash into
The git_odb_hash helper functions should not assume SHA1, and instead
should be given the oid type that they're producing.
2022-06-20 17:05:29 -04:00
Edward Thomson
dbc4ac1c76 oid: GIT_OID_*SZ is now GIT_OID_SHA1_*SIZE
In preparation for SHA256 support, `GIT_OID_RAWSZ` and `GIT_OID_HEXSZ`
need to indicate that they're the size of _SHA1_ OIDs.
2022-06-14 22:29:57 -04:00
Edward Thomson
f882140577 fuzzer: use raw oid data
The indexer expects raw oid data, provide it.
2022-04-10 16:14:19 -04:00
Edward Thomson
70d9bfa47c packbuilder: use the packfile name instead of hash
Deprecate the `git_packfile_hash` function.  Callers should use the new
`git_packfile_name` function which provides a unique packfile name.
2022-01-27 20:15:09 -05:00
Edward Thomson
489aec4447 fuzzers: declare standalone functions
2021-11-11 17:11:25 -05:00
Edward Thomson
f0e693b18a str: introduce git_str for internal, git_buf is external
libgit2 has two distinct requirements that were previously solved by
`git_buf`.  We require:

1. A general purpose string class that provides a number of utility APIs
   for manipulating data (eg, concatenating, truncating, etc).
2. A structure that we can use to return strings to callers that they
   can take ownership of.

By using a single class (`git_buf`) for both of these purposes, we have
confused the API to the point that refactorings are difficult and
reasoning about correctness is also difficult.

Move the utility class `git_buf` to be called `git_str`: this represents
its general purpose, as an internal string buffer class.  The name also
is an homage to Junio Hamano ("gitstr").

The public API remains `git_buf`, and has a much smaller footprint.  It
is generally only used as an "out" param with strict requirements that
follow the documentation.  (Exceptions exist for some legacy APIs to
avoid breaking callers unnecessarily.)

Utility functions exist to convert a user-specified `git_buf` to a
`git_str` so that we can call internal functions, and then to convert it
back again.
2021-10-17 09:49:01 -04:00
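The split between an internal string builder and a public "out" buffer can be pictured with the sketch below. It is an assumption-laden illustration, not libgit2 code: `demo_str` stands in for the role `git_str` plays, `demo_buf` for the role of `git_buf`, and all function names are hypothetical. It shows only one direction of the conversion (internal string handed to the caller).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical internal string builder (plays the role of git_str). */
typedef struct { char *ptr; size_t size; size_t asize; } demo_str;

/* Hypothetical public out-parameter (plays the role of git_buf). */
typedef struct { char *ptr; size_t size; } demo_buf;

/* Append a NUL-terminated string, growing the allocation as needed. */
int demo_str_puts(demo_str *s, const char *text)
{
    size_t len = strlen(text);
    size_t needed = s->size + len + 1;

    if (needed > s->asize) {
        char *p = realloc(s->ptr, needed);
        if (p == NULL)
            return -1;
        s->ptr = p;
        s->asize = needed;
    }

    memcpy(s->ptr + s->size, text, len + 1); /* copies the trailing NUL too */
    s->size += len;
    return 0;
}

/* Hand the internal string's memory to the caller-visible buffer. */
void demo_buf_from_str(demo_buf *out, demo_str *s)
{
    out->ptr = s->ptr;
    out->size = s->size;
    s->ptr = NULL;
    s->size = 0;
    s->asize = 0;
}

/* The caller owns the buffer and disposes of it when done. */
void demo_buf_dispose(demo_buf *b)
{
    free(b->ptr);
    b->ptr = NULL;
    b->size = 0;
}

/* A "public" function: builds internally, returns via the out param. */
int demo_describe(demo_buf *out, const char *name)
{
    demo_str s = { NULL, 0, 0 };

    if (demo_str_puts(&s, "hello, ") < 0 || demo_str_puts(&s, name) < 0) {
        free(s.ptr);
        return -1;
    }

    demo_buf_from_str(out, &s);
    return 0;
}
```

In this sketch the public type exposes only a pointer and a length, while the growth bookkeeping (`asize`) stays on the internal side.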
Patrick Steinhardt
3c966fb4fb fuzzers: clean up header includes
There are multiple headers included in our fuzzers that aren't required
at all. Furthermore, some of them are not available on Win32, causing
builds to fail. Remove them to fix this.
2019-07-05 11:58:33 +02:00
Patrick Steinhardt
9d43d45b21 fuzzers: use git_buf_printf instead of snprintf
The `snprintf` function does not exist on Win32, which only provides
`_snprintf_s`. Let's just avoid any cross-platform hassle and use our
own `git_buf` functionality instead.
2019-07-05 11:58:33 +02:00
Patrick Steinhardt
a6b2fffd46 fuzzers: use POSIX emulation layer to unlink files
Use `p_unlink` instead of `unlink` to remove the generated packfiles in
our packfile fuzzer. Like this, we do not have to worry about using
proper includes that are known on all platforms, especially Win32.
2019-07-05 11:58:33 +02:00
Edward Thomson
a1ef995dc0 indexer: use git_indexer_progress throughout
Update internal usage of `git_transfer_progress` to
`git_indexer_progress`.
2019-02-22 11:25:14 +00:00
Edward Thomson
115a6c50c9 errors: remove giterr usage in fuzzers
2019-01-22 22:30:37 +00:00
Edward Thomson
83151018ef object_type: convert final internal users to new names
Update some missed types that were continuing to use the old `GIT_OBJ`
names.
2019-01-17 11:03:19 +00:00
Edward Thomson
6d6bec0cc6 fuzzer: update for indexer changes
2018-08-26 11:52:21 +01:00
Patrick Steinhardt
e38ddc90bf fuzzers: limit maximum pack object count
By default, libgit2 allows up to 2^32 objects when downloading a
packfile from a remote. For each of these objects, libgit2 will allocate
up to two small structs, which in total adds up to quite a lot of
memory. As a result, our fuzzers might run out of memory rather quickly
when they receive as input a packfile with such a huge object count.

Limit the packfile object count to 10M objects. This is sufficiently big
to still work with most largish repos (linux.git has around 6M objects
as of now), but small enough to not cause the fuzzer to OOM.
2018-08-03 09:50:35 +02:00
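To make the object-count limit concrete: a packfile starts with a 12-byte header consisting of the signature "PACK", a 4-byte version, and a 4-byte object count, both in network byte order. The sketch below is a hypothetical pre-check (`demo_pack_count_ok` and `DEMO_MAX_PACK_OBJECTS` are made-up names) that reads that count and compares it with the 10M limit named above. It is not how the fuzzer itself applies the limit; it only illustrates where such a count comes from.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 10M objects, the limit named in the commit message. */
#define DEMO_MAX_PACK_OBJECTS 10000000u

/*
 * Hypothetical helper: returns 1 if the declared object count is within
 * the limit, 0 if it exceeds it, and -1 if the data does not start with
 * a complete pack header.
 */
int demo_pack_count_ok(const unsigned char *data, size_t len)
{
    uint32_t count;

    if (len < 12 || memcmp(data, "PACK", 4) != 0)
        return -1;

    /* Bytes 8..11 hold the object count in big-endian order. */
    count = ((uint32_t)data[8] << 24) |
            ((uint32_t)data[9] << 16) |
            ((uint32_t)data[10] << 8) |
             (uint32_t)data[11];

    return count <= DEMO_MAX_PACK_OBJECTS ? 1 : 0;
}
```

A header that declares 2^32 - 1 objects would be rejected by this check before any per-object allocation takes place.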
Patrick Steinhardt
de53972f65 fuzzers: avoid use of libgit2 internals in packfile_raw
The packfile_raw fuzzer is using some internal APIs from libgit2, which
makes it hard to compile it as part of the oss-fuzz project. As oss-fuzz
requires us to link against the C++ FuzzingEngine library, we cannot use
"-DBUILD_FUZZERS=ON" directly but instead have to first compile an
object from our fuzzers and then link against the C++ library. Compiling
the fuzzer objects thus requires an external invocation of CC, and we
certainly don't want to do further black magic by adding libgit2's
private source directory to the header include path.

To fix the issue, convert the code to not use any internal APIs. Besides
some headers which we now have to add, this also requires us to switch
to the hashing function of the ODB. Note that this changes the
hashing result, as we previously did not prepend the object header to
the data that is to be hashed. But this shouldn't matter in practice, as
we don't care about the hash value anyway.
2018-08-03 09:50:35 +02:00
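For context on the "object header" mentioned in this commit: Git hashes an object as its type name, a space, the decimal size, a NUL byte, and then the content. The sketch below only formats that header; `demo_object_header` is a hypothetical helper, not a libgit2 function, and it uses C99 `snprintf` for brevity.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: writes "<type> <size>" followed by a NUL byte into
 * buf. Returns the header length including the NUL, or 0 if buf is too
 * small or formatting fails.
 */
size_t demo_object_header(char *buf, size_t buflen, const char *type, size_t size)
{
    int n = snprintf(buf, buflen, "%s %zu", type, size);

    if (n < 0 || (size_t)n >= buflen)
        return 0;

    return (size_t)n + 1; /* the terminating NUL is part of the header */
}
```

Hashing the raw data alone and hashing this header followed by the data yield different digests, which is why the commit notes that the hashing result changes.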
Patrick Steinhardt
59328ed84e fuzzers: rename "fuzz" directory to match our style
Our layout uses names like "examples" or "tests" which is why the "fuzz"
directory doesn't really fit in here. Rename the directory to be called
"fuzzers" instead. Furthermore, we rename the fuzzer "fuzz_packfile_raw"
to "packfile_raw_fuzzer", which is also in line with the already
existing fuzzer at google/oss-fuzz.

While at it, rename the "packfile_raw" fuzzer to instead just be called
"packfile" fuzzer.
2018-08-03 09:50:35 +02:00