Skip to content

Fix blob filter types #1459

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
Merged
Show file tree
Hide file tree
Changes from 7 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions git/index/base.py
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,7 @@
from .typ import (
BaseIndexEntry,
IndexEntry,
StageType,
)
from .util import TemporaryFileSwap, post_clear_cache, default_index, git_working_dir

Expand Down Expand Up @@ -83,13 +84,12 @@
from git.util import Actor


StageType = int
Treeish = Union[Tree, Commit, str, bytes]

# ------------------------------------------------------------------------------------


__all__ = ("IndexFile", "CheckoutError")
__all__ = ("IndexFile", "CheckoutError", "StageType")


class IndexFile(LazyMixin, git_diff.Diffable, Serializable):
Expand Down
17 changes: 11 additions & 6 deletions git/index/typ.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
"""Module with additional types used by the index"""

from binascii import b2a_hex
from pathlib import Path

from .util import pack, unpack
from git.objects import Blob
Expand All @@ -15,9 +16,11 @@
if TYPE_CHECKING:
from git.repo import Repo

StageType = int

# ---------------------------------------------------------------------------------

__all__ = ("BlobFilter", "BaseIndexEntry", "IndexEntry")
__all__ = ("BlobFilter", "BaseIndexEntry", "IndexEntry", "StageType")

# { Invariants
CE_NAMEMASK = 0x0FFF
Expand Down Expand Up @@ -48,12 +51,14 @@ def __init__(self, paths: Sequence[PathLike]) -> None:
"""
self.paths = paths

def __call__(self, stage_blob: Blob) -> bool:
path = stage_blob[1].path
for p in self.paths:
if path.startswith(p):
def __call__(self, stage_blob: Tuple[StageType, Blob]) -> bool:
blob_pathlike: PathLike = stage_blob[1].path
blob_path: Path = blob_pathlike if isinstance(blob_pathlike, Path) else Path(blob_pathlike)
for pathlike in self.paths:
path: Path = pathlike if isinstance(pathlike, Path) else Path(pathlike)
# TODO: Change to use `PosixPath.is_relative_to` once Python 3.8 is no longer supported.
if all(i == j for i, j in zip(path.parts, blob_path.parts)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not another way of writing path1 == path2, since all path components (here .parts) have to be equivalent? The benefit of doing so of course is that it ignores which path separator is used, but I am having trouble to understand how this checks that blob_path starts with one of self.paths.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should work because zip stops when the shortest of the arguments is exhausted. So it is only checking if all the path components are equivalent in the case that the length of the parts are equal. But, when one path has more parts than the other, then it is checking that the longer path starts with the shorter path.

Example:

def startswith(a, b): return all(i == j for i, j in zip(a, b))

a = [1, 2, 3]
b = [1, 2]
assert startswith(a, b)

Copy link
Contributor Author

@AustinScola AustinScola Jun 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, the implementation of PosixPath.is_relative_to looks like this:

    def is_relative_to(self, *other):
        """Return True if the path is relative to another path or False.
        """
        try:
            self.relative_to(*other)
            return True
        except ValueError:
            return False

We could do something similar to this in-place.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation, I see how this works now and why this is correct. The alternative has to remain a TODO until it becomes available, and looking at the code I am happy to suggest to not ever go there - the try:except clause seems like a deal-breaker to me. It's entirely up to you though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, let me undo what I last said. I once again think this implementation is incorrect, as these paths are not equivalent.

base = [1, 2, 3]
added_path = [1, 2]

The above would match even though added_path is not contained in base.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I pushed a change that addresses this and adds a test too.

return True
# END for each path in filter paths
return False


Expand Down