Skip to content

Safer tempfiles #1450

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

whokilleddb
Copy link
Contributor

Description

Replaced mktemp() with safer mkstemp() to prevent Race Conditions

Copy link
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes in each file seem to be leaking file handles.

The race here is that somebody else could write a file to the path we obtained which in both cases will be overwritten with another file written by git or that is moved into place by GitPython. If another processes manages to write the file after GitPython/git respectively than this is nothing this change prevents either.

For that reason I don't think this PR will go anywhere as there isn't anything to fix.
Please educate me if I am missing something.

@@ -351,7 +351,7 @@ def from_tree(cls, repo: "Repo", *treeish: Treeish, **kwargs: Any) -> "IndexFile

# tmp file created in git home directory to be sure renaming
# works - /tmp/ dirs could be on another device
tmp_index = tempfile.mktemp("", "", repo.git_dir)
tmp_index = tempfile.NamedTemporaryFile(dir=repo.git_dir, delete=False).name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this effectively opens a temporary file, extracts the name, and then forgets about the opened file.
As there are no destructors in python, it's unclear when (or if) this resource is ever freed and is effectively leaking a very limited process resource over time.

@@ -40,7 +40,8 @@ class TemporaryFileSwap(object):

def __init__(self, file_path: PathLike) -> None:
self.file_path = file_path
self.tmp_file_path = str(self.file_path) + tempfile.mktemp("", "", "")
self.tmp_file_path = str(self.file_path) + osp.basename(tempfile.mkstemp("", "", "")[1])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user is responsible for closing the file in [0] or else it will definitely be leaked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Byron, to be exact, functions that create temporary file names (such as tempfile.mktemp and os.tempnam) are fundamentally insecure, as they do not ensure exclusive access to a file with the temporary name they return. The file name returned by these functions is guaranteed to be unique on creation but the file must be opened in a separate operation. There is no guarantee that the creation and open operations will happen atomically. This provides an opportunity for an attacker to interfere with the file before it is opened.

Also, if nothing else, mktemp has been deprecated since Python 2.3 [ Source ]

@whokilleddb whokilleddb requested a review from Byron May 28, 2022 15:31
@whokilleddb
Copy link
Contributor Author

Also, I found this amazing stack overflow answer. I hope this is helpful. Thank You :D

Copy link
Member

@Byron Byron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please make a statement related to the resource leaks?

What's an actual attack vector in the two cases, respecting their context (i.e. directory the file lives in, its intended use)? Remember that in index/base.py the file is going to be written into by git, in index/util.py it's going to be the target of a move operation.

@whokilleddb
Copy link
Contributor Author

Hi @Byron, thank you for being so patient. The reason mktemp is vulnerable to the race condition is that an attacker can simply predict and replace the temporary file with a symlink to a sensitive file.

Later when the program opens the symlink for operations like read or write, they end up writing or reading from the sensitive file instead, which is enough to tamper with a system or leak information about it.

However, mkstemp and NamedTemporaryFile (they internally call the same functions) solve this issue by:

  • mkstemp returns a handle to the file as well as the filename. So, there is no race condition between creation and opening. It also employs additional checks to make sure that there are no conflicts in name while creation. Hence, an attacker can no longer get the opportunity to replace the file with a symlink, thereby effectively dismissing the race condition.

  • NamedTemporaryFile internally calls the mkstemp function as well, thereby encapsulating all it's safety features, with the added benefit that it returns a python File object and with delete=True set, the temporary files so created will be deleted once it's closed, thereby preventing any information leak.

I hope this helps with the explanation. Thank you for being so patient and kind with the issue. It's been a pleasure to contribute and learn 😄

@whokilleddb whokilleddb requested a review from Byron May 29, 2022 05:46
@Byron
Copy link
Member

Byron commented May 29, 2022

Can you please make a statement related to the resource leaks?

I don't know what to do to get an answer about this. If leaking resources is OK, the python process will crash after a while which seems a lot like denial of access if caused by an attacker. In this case, it looks like a supply-chain attack, doesn't it?

Without an answer to the question above, there doesn't seem to be a reason to understand if each of these cases actually represents a potential issue, and if so why exactly that is.

There is no security issue just because mktemp is used, in these cases, and with even more patience we will both arrive there or you manage to show exactly what has to be done to be exploited. And by that I don't mean to repeat general truisms about the matter but to actually look at what happens. I recommend to start with git/index/util.py:43.

@Byron
Copy link
Member

Byron commented Aug 30, 2022

Closed due to inactivity.

@Byron Byron closed this Aug 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants