Doesn't work dereference of sha1 to object. #2

Closed
AlexSokol opened this issue Apr 27, 2012 · 6 comments
Comments

@AlexSokol

When I try to use the program from https://bitbucket.org/eugeneai/linux-project.git, Python returns this message:

Traceback (most recent call last):
  File "/home/alex/convert.py", line 22, in <module>
    print i, comm.message.strip(), comm.committer
  File "/usr/local/lib/python2.7/dist-packages/gitdb/util.py", line 238, in __getattr__
    self._set_cache_(attr)

@Byron
Member

Byron commented Apr 27, 2012

Unfortunately this is only a partial stack trace, hence it does not help.
gitdb has been phased out as well, and merged into gitpython.

@Byron Byron closed this as completed Apr 27, 2012
@AlexSokol
Author

OK, here is the full stack trace:

Traceback (most recent call last):
  File "/home/alex/convert.py", line 22, in <module>
    print i, comm.message.strip(), comm.committer
  File "/usr/local/lib/python2.7/dist-packages/gitdb/util.py", line 238, in __getattr__
    self._set_cache_(attr)
  File "/usr/local/lib/python2.7/dist-packages/GitPython-0.3.2.RC1-py2.7.egg/git/objects/commit.py", line 131, in _set_cache_
    binsha, typename, self.size, stream = self.repo.odb.stream(self.binsha)
  File "/usr/local/lib/python2.7/dist-packages/gitdb/db/base.py", line 259, in stream
    return self._db_query(sha).stream(sha)
  File "/usr/local/lib/python2.7/dist-packages/gitdb/db/base.py", line 239, in _db_query
    if db.has_object(sha):
  File "/usr/local/lib/python2.7/dist-packages/gitdb/db/pack.py", line 88, in has_object
    self._pack_info(sha)
  File "/usr/local/lib/python2.7/dist-packages/gitdb/db/pack.py", line 71, in _pack_info
    index = item[2](sha)
  File "/usr/local/lib/python2.7/dist-packages/gitdb/pack.py", line 419, in sha_to_index
    get_sha = self.sha
  File "/usr/local/lib/python2.7/dist-packages/gitdb/util.py", line 238, in __getattr__
    self._set_cache_(attr)
  File "/usr/local/lib/python2.7/dist-packages/gitdb/pack.py", line 281, in _set_cache_
    mmap = self._cursor.map()
  File "/usr/local/lib/python2.7/dist-packages/gitdb/util.py", line 238, in __getattr__
    self._set_cache_(attr)
  File "/usr/local/lib/python2.7/dist-packages/gitdb/pack.py", line 274, in _set_cache_
    raise AssertionError("The index file at %s is too large to fit into a mapped window (%i > %i). This is a limitation of the implementation" % (self._indexpath, self._cursor.file_size(), mman.window_size()))
AssertionError: The index file at /home/alex/linux/.git/objects/pack/pack-5e55a3f93ec5edf76bbe7206d6fc3814e13c5b58.idx is too large to fit into a mapped window (65351336 > 33554432). This is a limitation of the implementation

I use GitPython-0.3.2.RC1, Python 2.7 and Ubuntu 12.04.
I can get the list of commits (headcommit.hexsha), but when I try to use headcommit.parents or anything else, this problem arises.

@Byron
Member

Byron commented Apr 27, 2012

You have a rather big index (~65 MB) but are using gitpython on a 32-bit system. This means it will map only 32 MB chunks into memory at once. On a 64-bit system, it would use up to 1024 MB per mapping.
The error you ran into is a sanity check: the implementation assumes the whole index file fits into a single mapped window, which keeps things simpler on the implementer's side.
The best workaround would be to use a 64-bit system, or to adjust the code to set the window size to something more suitable for your index file.
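
The check described above comes down to a size comparison, which can be reproduced in a few lines. This is only a sketch, not gitdb's code: the function names are hypothetical, and the window sizes (32 MB and 1024 MB) are the figures quoted in this comment, not values read from the library.

```python
import struct

MB = 1024 * 1024

# Window sizes as quoted in this thread (assumption: not read from gitdb itself).
WINDOW_SIZE_32BIT = 32 * MB
WINDOW_SIZE_64BIT = 1024 * MB


def pointer_bits():
    """Return the pointer width of the running interpreter (32 or 64)."""
    return struct.calcsize("P") * 8


def window_size_for(bits):
    """Pick the mapping window size for a given pointer width."""
    return WINDOW_SIZE_64BIT if bits >= 64 else WINDOW_SIZE_32BIT


def index_fits(index_size, bits):
    """True if an index file of index_size bytes fits into a single window."""
    return index_size <= window_size_for(bits)


INDEX_SIZE = 65351336  # size of the .idx file from the traceback, in bytes

print(index_fits(INDEX_SIZE, 32))  # the 32-bit case that raised the AssertionError
print(index_fits(INDEX_SIZE, 64))  # the 64-bit case
```

The 32 MB window is exactly the 33554432 bytes that appear in the assertion message, so the index from the traceback cannot fit on a 32-bit interpreter but does fit on a 64-bit one.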

@AlexSokol
Author

Thanks for your answer.

@AlexSokol
Author

When I use a 64-bit system, my program works up to the 15th commit, but then fails:

Traceback (most recent call last):
  File "/home/alex/convert.py", line 17, in <module>
    b = comm.parents
  File "/usr/local/lib/python2.7/dist-packages/gitdb/util.py", line 238, in __getattr__
    self._set_cache_(attr)
  File "/usr/local/lib/python2.7/dist-packages/GitPython-0.3.2.RC1-py2.7.egg/git/objects/commit.py", line 132, in _set_cache_
    self._deserialize(StringIO(stream.read()))
  File "/usr/local/lib/python2.7/dist-packages/GitPython-0.3.2.RC1-py2.7.egg/git/objects/commit.py", line 443, in _deserialize
    self.author.name = self.author.name.decode(self.encoding)
LookupError: unknown encoding: object 320cfa6ce0b3dc794fedfa4bae54c0f65077234d

Also, what does "window size" mean, and how can it be edited?

@Byron
Member

Byron commented Apr 30, 2012

It seems that the commit's encoding was parsed incorrectly, as it contains a hash string ("object 320cfa6c...") in a place where something like UTF-8 would be expected. Therefore the author's name cannot be decoded.
Apparently gitdb fails to parse the commit object correctly.
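
The LookupError itself is easy to reproduce outside of GitPython: Python raises it whenever the string passed as an encoding name is not a registered codec. The sketch below shows that, plus one possible defensive fallback. The helper names and the placeholder author bytes are hypothetical, and falling back to UTF-8 is an assumption made for illustration, not what GitPython does.

```python
import codecs


def is_known_encoding(name):
    """Return True if Python has a codec registered under this name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


def decode_field(raw, encoding, fallback="utf-8"):
    """Decode raw bytes, using fallback when the encoding name is not a codec."""
    if not is_known_encoding(encoding):
        encoding = fallback
    # "replace" keeps going on undecodable bytes instead of raising.
    return raw.decode(encoding, "replace")


# The value that ended up in the encoding slot, taken from the traceback.
bad_encoding = "object 320cfa6ce0b3dc794fedfa4bae54c0f65077234d"

print(is_known_encoding(bad_encoding))
print(decode_field(b"Some Author", bad_encoding))  # placeholder bytes
```

A fallback like this only hides the symptom. The mis-parsed header would still need to be fixed in the parser.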

As a general workaround, you can just use a different Repository backend. It defaults to gitdb, but you may use GitCmd as well. Please have a look at the git.repo module on how to do that.
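
A minimal sketch of that workaround follows. It assumes the installed GitPython exports `GitCmdObjectDB` and that `Repo` accepts an `odbt` keyword for choosing the object database type. The helper names, the `limit` parameter and the repository path in the usage note are hypothetical.

```python
def open_repo_with_git_cmd_backend(path):
    """Open the repository at path with the git-command object database.

    GitPython is imported lazily so this module can be loaded without it.
    """
    from git import GitCmdObjectDB, Repo  # requires GitPython

    # odbt selects the object database type used to read objects;
    # GitCmdObjectDB asks the git executable instead of reading packs directly.
    return Repo(path, odbt=GitCmdObjectDB)


def print_history(path, limit=20):
    """Print index, hexsha, message and committer for the newest commits."""
    repo = open_repo_with_git_cmd_backend(path)
    for i, commit in enumerate(repo.iter_commits(max_count=limit)):
        print(i, commit.hexsha, commit.message.strip(), commit.committer)
```

Calling `print_history("/home/alex/linux")` would then walk the history through the git command backend rather than through gitdb's pack reader.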
