Commit Graph

173 Commits

Author SHA1 Message Date
Edward Thomson
ba3595af0f diff: deprecate diff_format_email
`git_diff_format_email` is deprecated in favor of `git_email_create`.
2021-09-18 08:32:42 -04:00
Edward Thomson
3f13d2e8a3 email: allow git_diff_commit_as_email to take 0 as patch index
Allow a `0` patch index and `0` patch count; in this case, simply don't
display these in the email.
2021-09-18 08:32:41 -04:00
Edward Thomson
1ee3c37f48 Merge branch 'pr/5853' 2021-05-19 09:31:30 +01:00
Edward Thomson
6b1f6e00bf diff: test ignore-blank-lines 2021-05-18 12:22:22 +01:00
Kartikaya Gupta
2d24690c67 Add testcase 2021-05-12 20:42:25 -04:00
Edward Thomson
d525e063ba buf: remove internal git_buf_text namespace
The `git_buf_text` namespace is unnecessary and strange.  Remove it,
just keep the functions prefixed with `git_buf`.
2021-05-11 01:29:22 +01:00
Edward Thomson
9293e165a0 Merge pull request #5494 from kevinjswinton/master
Fix binary diff showing /dev/null
2020-10-04 21:41:28 +01:00
Edward Thomson
a94fedc113 Merge pull request #5620 from dlax/parse-patch-add-delete-no-index
patch_parse: handle absence of "index" header for new/deleted cases
2020-10-04 18:04:01 +01:00
Drew DeVault
ec26b16d73 diff stats: fix segfaults with new files 2020-09-16 15:53:27 -04:00
Denis Laxalde
74293ea04a patch_parse: handle absence of "index" header for new/deleted cases
This follows up on 11de594f85 which added
support for parsing patches without extended headers (the "index
<hash>..<hash> <mode>" line); issue #5267.

We now allow transition from "file mode" state to "path" state directly
if there is no "index", which will happen for patches adding or deleting
files as demonstrated in added test case.
2020-08-29 16:54:15 +02:00
Edward Thomson
c708e5e51d Merge pull request #5541 from libgit2/ethomson/clar_tap
clar: add tap output option
2020-06-05 14:11:34 +01:00
Edward Thomson
cad7a1bad4 clar: include the function name 2020-06-05 08:49:07 +01:00
Edward Thomson
06d69dfcfd diff::parse: don't include diff.h
We don't call any internal functions in the test; we don't need to
include `../src/diff.h`.
2020-06-05 07:17:15 +01:00
Edward Thomson
3414d4707c diff::workdir: actually test the buffers
The static test data is erroneously initialized with a length of 0 for
three of the strings.  This means the tests are not actually examining
those strings.  Provide the length.
2020-05-23 16:27:56 +01:00
Kevin Swinton
e72ade87fd Fix binary diff showing /dev/null
Fixes issue where a changed binary file's content in the working
tree isn't displayed correctly, instead showing an oid of zero,
and with its path being reported incorrectly as "/dev/null".
2020-04-18 11:32:56 +01:00
Patrick Steinhardt
17670ef25c tests: diff: add test to verify behaviour with empty dir ordering
It was reported that, given a file "abc.txt", a diff will be shown if an
empty directory "abb/" is created, but not if "abd/" is created. Add a
test to verify that we do the right thing here and do not depend on any
ordering.
2020-02-07 15:05:01 +01:00
Patrick Steinhardt
b0691db32c tests: diff: verify that we are able to diff with empty subtrees
While it is not allowed for a tree to have an empty tree as child (e.g.
an empty directory), libgit2's tree builder makes it easy to create such
trees. As a result, some applications may inadvertently end up with such
an invalid tree, and we should try our best and handle them.

One such case is when diffing two trees, where one of the two trees has
such an empty subtree. It was reported that this will cause our diff
code to fail. While I wasn't able to reproduce this error, let's still
add a test that verifies we continue to handle them correctly.
2020-02-07 15:05:01 +01:00
Gregory Herrero
ece5bb5e7d diff: make patchid computation work with all types of commits.
The current implementation of patchid does not compute a correct patchid
when given a patch where, for example, a file is added or removed.
Some more corner cases need to be handled to have the same behavior as
the git patch-id command.
Add some more tests to cover those corner cases.

Signed-off-by: Gregory Herrero <gregory.herrero@oracle.com>
2019-11-28 14:17:50 +01:00
Denis Laxalde
11de594f85 patch_parse: handle patches without extended headers
Extended header lines (especially the "index <hash>..<hash> <mode>" line)
are not required by "git apply" in order to import patches. So we allow
the from-file/to-file lines (--- a/file\n+++ b/file) to directly follow
the git diff header.

This fixes #5267.
2019-10-16 22:53:29 +02:00
Denis Laxalde
b61810bf1f patch_parse: handle patches with new empty files
Patches containing additions of empty files will not contain diff data
but will end with the index header line followed by the terminating
sequence "-- ". We follow the same logic as in cc4c44a and allow "-- "
to immediately follow the index header.
2019-09-28 15:52:25 +02:00
Patrick Steinhardt
e54343a402 fileops: rename to "futils.h" to match function signatures
Our file utils functions all have a "futils" prefix, e.g.
`git_futils_touch`. One would thus naturally guess that their
definitions and implementation would live in files "futils.h" and
"futils.c", respectively, but in fact they live in "fileops.h".

Rename the files to match expectations.
2019-07-20 19:11:20 +02:00
Patrick Steinhardt
658022c41a configuration: cvar -> configmap
`cvar` is an unhelpful name.  Refactor its usage to `configmap` for more
clarity.
2019-07-18 13:53:41 +02:00
Edward Thomson
5d92e54745 oid: is_zero instead of iszero
The only function named `issomething` (without an underscore) was
`git_oid_iszero`.  Rename it to `git_oid_is_zero` for consistency with
the rest of the library.
2019-06-16 00:16:47 +01:00
Edward Thomson
0b5ba0d744 Rename opt init functions to options_init
In libgit2 nomenclature, when we need to verb a direct object, we name
a function `git_directobject_verb`.  Thus, if we need to init an options
structure named `git_foo_options`, then the name of the function that
does that should be `git_foo_options_init`.

The previous name, `git_foo_init_options`, is close - it _sounds_ as
if it's initializing the options of a `foo`, but in fact
`git_foo_options` is its own noun that should be respected.

Deprecate the old names; they'll now call directly to the new ones.
2019-06-14 09:57:00 +01:00
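The naming rule and the deprecation approach described in 0b5ba0d744 can be sketched roughly as follows. This is only an illustration: `foo_options`, `FOO_OPTIONS_VERSION`, `foo_options_init` and `foo_init_options` are hypothetical names made up for this sketch and are not part of libgit2.

```c
#include <string.h>

/* Hypothetical options structure, for illustration only. */
#define FOO_OPTIONS_VERSION 1

typedef struct {
	unsigned int version;
	unsigned int flags;
} foo_options;

/* New-style name: the noun `foo_options` followed by the verb `init`. */
int foo_options_init(foo_options *opts, unsigned int version)
{
	if (!opts || version != FOO_OPTIONS_VERSION)
		return -1;

	memset(opts, 0, sizeof(*opts));
	opts->version = version;
	return 0;
}

/* Old-style name, kept as a deprecated alias that calls the new one. */
int foo_init_options(foo_options *opts, unsigned int version)
{
	return foo_options_init(opts, version);
}
```

Keeping the old name as a thin wrapper means existing callers continue to compile while new code can adopt the consistent spelling.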
Drew DeVault
30c06b601e patch_parse.c: Handle CRLF in parse_header_start 2019-04-05 20:44:10 -04:00
Erik Aigner
9d65360b4e tests: diff: test parsing diffs with a new file with spaces in its path
Add a test that verifies that we are able to parse patches which add a
new file that has spaces in its path.
2019-03-29 12:51:49 +01:00
Edward Thomson
f673e232af git_error: use new names in internal APIs and usage
Move to the `git_error` name in the internal API for error-related
functions.
2019-01-22 22:30:35 +00:00
Edward Thomson
168fe39bea object_type: use new enumeration names
Use the new object_type enumeration names within the codebase.
2018-12-01 11:54:57 +00:00
Edward Thomson
18e71e6d59 index: use new enum and structure names
Use the new-style index names throughout our own codebase.
2018-12-01 10:46:44 +00:00
Patrick Steinhardt
e5090ee329 diff_stats: use git's formatting of renames with common directories
In cases where a file gets renamed such that the directories containing
it before and after the rename have a common prefix, git will avoid
printing this prefix twice and instead format the rename as
"prefix/{old => new}". Until now, we didn't do anything like that, but
simply printed "prefix/old -> prefix/new".

Adjust our behaviour to instead match upstream. Adjust the test for this
behaviour to expect the new format.
2018-10-04 11:26:24 +02:00
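A minimal sketch of the "prefix/{old => new}" format described in e5090ee329 might look like the following. The function name `format_rename` is hypothetical, and the sketch only factors out a common leading directory; it does not attempt to reproduce everything git does (for example, git also factors out a common trailing part), nor is it the code from the commit. Printing "old => new" when there is no shared directory is an assumption made for this sketch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: writes a rename of old_path to new_path into out.
 * Only a common leading directory is factored out.
 * Returns the value of snprintf().
 */
int format_rename(char *out, size_t outlen,
	const char *old_path, const char *new_path)
{
	size_t i, prefix = 0;

	/* Not a rename at all: print the path unchanged. */
	if (strcmp(old_path, new_path) == 0)
		return snprintf(out, outlen, "%s", old_path);

	/* Longest common prefix that ends on a '/' boundary. */
	for (i = 0; old_path[i] && old_path[i] == new_path[i]; i++) {
		if (old_path[i] == '/')
			prefix = i + 1;
	}

	/* No shared directory: fall back to "old => new" (assumed format). */
	if (prefix == 0)
		return snprintf(out, outlen, "%s => %s", old_path, new_path);

	return snprintf(out, outlen, "%.*s{%s => %s}",
		(int)prefix, old_path, old_path + prefix, new_path + prefix);
}
```

Cutting the prefix only at a '/' keeps names such as "dir/abc" and "dir/abd" from being split in the middle of a file name.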
Patrick Steinhardt
3148efd2ee tests: verify diff stats with renames in subdirectory
Until now, we didn't have any tests that verified that our format for
renames in subdirectories is correct. While our current behaviour is no
different than for renames that do not happen with a common prefix
shared between old and new file name, we intend to change the format to
instead match the format that upstream git uses.

Add a test case for this to document our current behaviour and to show
how the next commit will change that format.
2018-10-04 11:24:24 +02:00
Patrick Steinhardt
0652abaaea Merge pull request #4702 from tiennou/fix/coverity
Assorted Coverity fixes
2018-07-20 12:56:49 +02:00
Edward Thomson
6dfc8bc249 Merge pull request #4719 from pks-t/pks/delta-oob
Delta OOB access
2018-07-09 23:10:05 +01:00
Etienne Samson
8455a2709f tests: add missing cl_git_pass to tests
Reported by Coverity, CID 1393678-1393697.
2018-07-06 22:24:21 +02:00
Patrick Steinhardt
7db258706a delta: fix sign-extension of big left-shift
Our delta code was originally adapted from JGit, which itself adapted it
from git itself. Due to this heritage, we inherited a bug from git.git
in how we compute the delta offset, which was fixed upstream in
48fb7deb5 (Fix big left-shifts of unsigned char, 2009-06-17). As
explained by Linus:

    Shifting 'unsigned char' or 'unsigned short' left can result in sign
    extension errors, since the C integer promotion rules means that the
    unsigned char/short will get implicitly promoted to a signed 'int' due to
    the shift (or due to other operations).

    This normally doesn't matter, but if you shift things up sufficiently, it
    will now set the sign bit in 'int', and a subsequent cast to a bigger type
    (eg 'long' or 'unsigned long') will now sign-extend the value despite the
    original expression being unsigned.

    One example of this would be something like

            unsigned long size;
            unsigned char c;

            size += c << 24;

    where despite all the variables being unsigned, 'c << 24' ends up being a
    signed entity, and will get sign-extended when then doing the addition in
    an 'unsigned long' type.

    Since git uses 'unsigned char' pointers extensively, we actually have this
    bug in a couple of places.

In our delta code, we inherited such a bogus shift when computing the
offset at which the delta base is to be found. Due to the sign extension
we can end up with an offset where all the bits are set. This can allow
an arbitrary memory read, as the addition in `base_len < off + len` can
now overflow if `off` has all its bits set.

Fix the issue by casting the result of `*delta++ << 24UL` to an unsigned
integer again. Add a test with a crafted delta that would actually
succeed with an out-of-bounds read if the cast were missing.

Reported-by: Riccardo Schirone <rschiron@redhat.com>
Test-provided-by: Riccardo Schirone <rschiron@redhat.com>
2018-06-29 09:29:49 +02:00
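The sign-extension problem described in 7db258706a can be reproduced in isolation with a sketch like the one below. The function names `offset_bogus` and `offset_fixed` are hypothetical and this is not the libgit2 delta code; the sketch assumes a 32-bit two's-complement `int`. To avoid undefined behaviour in the sketch itself, the "bogus" variant shifts as unsigned and converts back to `int`, which on common compilers yields the same negative value that the promoted shift produces.

```c
/*
 * Hypothetical helpers for illustration; not libgit2 code.
 * Assumes a 32-bit two's-complement int.
 */

/*
 * Mimics `size += c << 24`: the promoted operand is a signed int, so a
 * byte >= 0x80 ends up as a negative int, which is sign-extended when it
 * is widened to a 64-bit unsigned type.
 */
unsigned long long offset_bogus(unsigned char byte)
{
	int shifted = (int)((unsigned int)byte << 24);
	return (unsigned long long)shifted;
}

/*
 * Converts the byte to an unsigned type before shifting, so the result
 * never carries a sign bit and widening zero-extends it.
 */
unsigned long long offset_fixed(unsigned char byte)
{
	return (unsigned long long)((unsigned int)byte << 24);
}
```

For a byte below 0x80 both variants agree; for 0x80 and above the first one sets all of the upper bits, which is how an addition such as `off + len` can wrap around.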
Etienne Samson
f9e2802675 patch_parse: populate line numbers while parsing diffs 2018-06-19 00:12:58 +02:00
Patrick Steinhardt
ecf4f33a4e Convert usage of git_buf_free to new git_buf_dispose 2018-06-10 19:34:37 +02:00
Stan Hu
9d83a2b087 Sanitize the hunk header to ensure it contains UTF-8 valid data
The diff driver truncates the hunk header text to 80 bytes, which can truncate
4-byte Unicode characters and introduce garbage characters in the diff
output. This change sanitizes the hunk header before it is displayed.

This mirrors the test in git: https://github.com/git/git/blob/master/t/t4025-hunk-header.sh

Closes https://github.com/libgit2/rugged/issues/716
2018-05-05 14:54:27 -07:00
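One possible way to avoid cutting a multi-byte character when truncating, as discussed in 9d83a2b087, is sketched below. The helper `utf8_truncate_len` is hypothetical and assumes the input is already valid UTF-8; it is not necessarily how the commit sanitizes the hunk header.

```c
#include <stddef.h>

/*
 * Hypothetical helper: returns the largest length <= max_len at which
 * `text` (len bytes of valid UTF-8) can be cut without splitting a
 * multi-byte sequence.
 */
size_t utf8_truncate_len(const char *text, size_t len, size_t max_len)
{
	size_t cut = len < max_len ? len : max_len;

	/* Nothing is cut off, so nothing can be split. */
	if (cut == len)
		return cut;

	/*
	 * text[cut] is the first byte that would be dropped. If it is a
	 * continuation byte (10xxxxxx), the cut falls inside a character,
	 * so step back to the start of that character.
	 */
	while (cut > 0 && ((unsigned char)text[cut] & 0xC0) == 0x80)
		cut--;

	return cut;
}
```

With an 80-byte limit, a 4-byte character straddling the limit would be dropped entirely rather than leaving one to three stray bytes in the output.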
Erik van Zijst
cd6a4323b7 typo: Fixed a trivial typo in test function. 2018-04-05 13:15:45 -07:00
Patrick Steinhardt
ce7080a0a3 diff_tform: fix rename detection with rewrite/delete pair
A rewritten file can be classified either as a modification of its
contents or as a deletion of the complete file followed by an addition of
the new content. This distinction becomes important when we want to
detect renames for rewrites. Given a scenario where a file "a" has been
deleted and another file "b" has been renamed to "a", this should be
detected as a deletion of "a" followed by a rename of "b" -> "a". Thus,
splitting the original rewrite into a delete/add pair is important
here.

This splitting is represented by a flag we can set on the current delta.
While the flag is already set when we want to break rewrites, we do not
set it when the `GIT_DIFF_FIND_RENAMES_FROM_REWRITES` flag is set. This
can trigger an assert when we try to match the source and target deltas.

Fix the issue by setting the `GIT_DIFF_FLAG__TO_SPLIT` flag on the delta
when it is a rename target and `GIT_DIFF_FIND_RENAMES_FROM_REWRITES` is
set.
2018-02-20 11:03:42 +00:00
Patrick Steinhardt
80e77b8704 tests: add rename-rewrite scenarios to "renames" repository
Add two more scenarios to the "renames" repository. The first scenario
has a major rewrite of a file and a delete of another file, the second
scenario has a deletion of a file and rename of another file to the
deleted file. Both scenarios will be used in the following commit.
2018-02-20 11:01:01 +00:00
Patrick Steinhardt
d91da1da06 tests: diff::rename: use defines for commit OIDs
While we frequently reuse commit OIDs throughout the file, we do not
have any constants to refer to these commits. Make this a bit easier to
read by giving the commit OIDs somewhat descriptive names of what kind
of commit they refer to.
2018-02-20 10:58:56 +00:00
Patrick Steinhardt
2388a9e2ab diff_file: properly refcount blobs when initializing file contents
When initializing a `git_diff_file_content` from a source whose data is
derived from a blob, we simply assign the blob's pointer to the
resulting struct without incrementing its refcount. Thus, the structure
can only be used as long as the blob is kept alive by the caller.

Fix the issue by using `git_blob_dup` instead of a direct assignment.
This function will increment the refcount of the blob without allocating
new memory, so it does exactly what we want. As
`git_diff_file_content__unload` already frees the blob when
`GIT_DIFF_FLAG__FREE_BLOB` is set, we don't need to add new code
handling the free but only have to set that flag correctly.
2017-12-15 10:52:13 +00:00
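The ownership problem fixed in 2388a9e2ab can be illustrated with a small reference-counting sketch. The `blob` type and the `blob_new`, `blob_dup` and `blob_free` functions below are hypothetical stand-ins written for this example; they only mirror the idea of taking a counted reference instead of copying a raw pointer and do not reproduce the libgit2 API.

```c
#include <stdlib.h>

/* Hypothetical refcounted object, for illustration only. */
typedef struct {
	int refcount;
} blob;

/* Creates an object owned by the caller (one reference). */
blob *blob_new(void)
{
	blob *b = calloc(1, sizeof(*b));
	if (b)
		b->refcount = 1;
	return b;
}

/* Takes an additional reference without allocating new memory. */
blob *blob_dup(blob *b)
{
	b->refcount++;
	return b;
}

/* Drops one reference; the memory is released with the last one. */
void blob_free(blob *b)
{
	if (b && --b->refcount == 0)
		free(b);
}
```

A structure that stores the result of `blob_dup` rather than the caller's pointer stays valid after the caller frees its own reference, as long as the structure later drops the reference it took.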
Patrick Steinhardt
cc4c44a98a patch_parse: fix parsing patches only containing exact renames
Patches which contain exact renames only will not contain an actual diff
body, but only a list of files that were renamed. Thus, the patch header
is immediately followed by the terminating sequence "-- ". We currently
do not recognize this character sequence as a possible terminating
sequence. Add it and create a test to catch the failure.
2017-09-01 09:41:12 +02:00
Patrick Steinhardt
89a3482829 diff: implement function to calculate patch ID
The upstream git project provides the ability to calculate a so-called
patch ID. Quoting from git-patch-id(1):

    A "patch ID" is nothing but a sum of SHA-1 of the file diffs
    associated with a patch, with whitespace and line numbers ignored.

Patch IDs can be used to identify two patches which are probably the
same thing, e.g. when a patch has been cherry-picked to another branch.

This commit implements a new function `git_diff_patchid`, which gets a
patch and derives an OID from the diff. Note the different terminology
here: a patch in libgit2 is the set of differences in a single file, and
a diff can contain multiple patches for different files. The
implementation matches the upstream implementation and should derive the
same OID for the same diff. In fact, some code has been directly derived
from the upstream implementation.

The upstream implementation has two different modes to calculate patch
IDs: the stable and the unstable mode. The old way of calculating
the patch IDs was unstable in the sense that a different ordering of the
diffs led to different results. This oversight was fixed in git
1.9, but as git tries hard to never break existing workflows, the old
and unstable way is still the default. The newer and stable way does not
care about the ordering of the diff hunks, and in fact it is the mode
that should probably be used today. So right now, we only implement the
stable way of generating the patch ID.
2017-06-26 15:39:26 +02:00
Patrick Steinhardt
c0eba379d1 diff_parse: correctly set options for parsed diffs
The function `diff_parsed_alloc` allocates and initializes a
`git_diff_parsed` structure. This structure also contains diff options.
While we initialize its flags, we fail to do a real initialization of
its values. This bites us when we want to actually use the generated
diff as we do not set the options' version field, which is required to
operate correctly.

Fix the issue by executing `git_diff_init_options` on the embedded
struct.
2017-03-14 13:09:35 +01:00
Patrick Steinhardt
ad5a909cfb patch_parse: fix parsing minimal trailing diff line
In a diff, the shortest possible hunk with a modification (that is, no
deletion) results from a file with only one line with a single character
which is removed. Thus the following hunk

    @@ -1 +1 @@
    -a
    +

is the shortest valid hunk modifying a line. The function parsing the
hunk body though assumes that there must always be at least 4 bytes
present to make up a valid hunk, which is obviously wrong in this case.
The absolute minimum number of bytes required for a modification is
actually 2 bytes, that is the "+" and the following newline. Note: if
there is no trailing newline, the assumption will not be violated as the
diff will have a line "\ No trailing newline" at its end.

This patch fixes the issue by lowering the number of bytes required.
2017-03-14 13:09:13 +01:00
Patrick Steinhardt
ace3508f4c patch_generate: fix git_diff_foreach only working with generated diffs
The current logic of `git_diff_foreach` makes the assumption that all
diffs passed in are actually derived from generated diffs. With these
assumptions we try to derive the actual diff by inspecting either the
working directory files or blobs of a repository. This obviously cannot
work for diffs parsed from a file, where we do not necessarily have a
repository at hand.

Since the split of parsed and generated patches was introduced, there
are multiple functions which help us handle patches generically,
regardless of where they stem from. Use these functions and remove
the old logic specific to generated patches. This allows re-using the
same code for invoking the callbacks on the deltas.
2017-03-14 13:08:28 +01:00
Edward Thomson
610cff13a3 Merge branch 'pr/3809' 2016-10-09 16:05:48 +01:00
Sim Domingo
dc5cfdbab9 make git_diff_stats_to_buf not show 0 insertions or 0 deletions 2016-10-09 16:03:00 +01:00