Handle encodings better; make the sum type "public"
Windows does not have a direct analogue of LANG=C or LC_ALL=C. Some
programs give them special treatment, but they do not affect how
the localized behavior of the Windows API operates. In particular,
the bash.exe WSL wrapper, as well as wsl.exe and wslconfig.exe, do
not produce their own localized messages (for errors not
originating in a running distribution) when they are set. Windows
also provides significant localization through localized versions
of Windows, so changing language settings in Windows, even
system-wide, does not always produce the same effect that many or
most Windows users who use a particular language would experience.
Various encodings may appear when bash.exe is WSL-related but gives
its own error message. Such a message is often in UTF-16LE, which
is what Windows uses internally, and preserves any localization.
That is the better-behaved scenario and was already detected;
this commit moves, but does not change, the code for that.
The situation where it is not UTF-16LE was previously handled by
treating it as UTF-8. Because the default strict error treatment
was used, this would error out in test discovery in some localized
setups, preventing all tests in test_index from running, including
the majority of them that are not related to hooks. This fixes that
by doing better detection that should decode the messages correctly
most of the time, that should in practice decode them well enough
to tell (by the aka.ms URL) if the message is complaining about
there being no installed distribution all(?) of the time, and that
should avoid breaking unrelated tests even if that can't be done.
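As a rough illustration of the kind of best-effort decoding described
above, here is a minimal sketch. It is not necessarily the code in this
commit: the function names, the UTF-16LE line-ending heuristic, the
fallback order, and the `ansi_encoding` parameter (standing in for the
system's ANSI code page) are all assumptions made for the example.

```python
WSL_STORE_URL = "https://aka.ms/wslstore"


def decode_best_effort(raw: bytes, ansi_encoding: str = "cp1252") -> str:
    """Decode subprocess output without ever raising (hypothetical helper)."""
    # Assumed heuristic: the UTF-16LE encoding of "\r\n" rarely shows up
    # in output that uses a narrow encoding, so treat it as a marker.
    if b"\r\x00\n\x00" in raw:
        return raw.decode("utf-16le", errors="replace")

    # Otherwise try a narrow code page first. The default here is only a
    # placeholder for whatever the system's ANSI code page really is.
    try:
        return raw.decode(ansi_encoding)
    except (UnicodeDecodeError, LookupError):
        pass

    # Last resort: UTF-8 with replacement characters instead of the
    # default strict handling, so unrelated tests are not broken.
    return raw.decode("utf-8", errors="replace")


def reports_no_distribution(raw: bytes) -> bool:
    """Tell, by the aka.ms URL, whether the message is about no distro."""
    return WSL_STORE_URL in decode_best_effort(raw)
```

The point of the final fallback is that decoding never raises, so even
a wrongly guessed encoding only garbles the message rather than
erroring out during test discovery.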
An English non-UTF-16LE message appears on GitHub Actions CI when
no distribution is installed. Testing of this situation on other
languages was performed in a virtual machine on a development
machine.
That the message is output in a narrow character set some of the
time when bash.exe produces it appears to be a limitation of the
bash.exe wrapper. In particular, with a zh-CN version of Windows
(and with the language not changed to anything else), a localized
message in Simplified Chinese was correctly printed when running
wsl.exe, but running bash.exe produced literal "?" characters in
place of Chinese characters (it was not a display or font issue,
and they were "?" and not Unicode replacement characters). The
change here does not overcome that; the literal "?" characters will
be included. But "https://aka.ms/wslstore" is still present if it
is an error about not having any distributions, so the correct
status is still inferred.
For more information on code pages in Windows, see:
https://serverfault.com/a/836221
The following alternatives to all or part of the approach taken
here were considered but, at least for now, not done, because they
would not clearly be simpler or more correct:
- Abandoning this "run bash.exe and see what it shows us" approach
altogether and instead reimplementing the rules CreateProcessW
uses, to find if the bash.exe the system finds is the one in
System32, and then, if so, checking the metadata in that
executable to determine if it's the WSL wrapper. I believe that
would be even more complex to do correctly than it seems; the
behavior noted in the WinBashStatus docstring and recent commit
messages is not the whole story. The documented search order for
CreateProcessW seems not to be followed in some circumstances.
One is that the Windows Store version of Python seems always to
forgo the usual System32 search that precedes searching directories
in PATH. It looks like there may also be some subtleties in which
directories 32-bit builds search in.
- Using chardet. Although the chardet library is excellent, it is
not clear that the code needed to bridge this highly specific use
case to it would be simpler than the code currently in use. Some
of the work might still need to be done by other means; when I
tested it out for this, it did not detect the UTF-16LE messages
as such for English. (They are often also valid UTF-8, because
interleaving null characters is, while strange, permitted.)
- Also calling wsl.exe and/or wslconfig.exe. It's still necessary
to call bash.exe, because it may not be the WSL bash, even on a
system with WSL fully set up. Furthermore, those tools' output
seems to vary in some complex ways, too. Using only one subprocess
for the detection seemed the simplest. Even using "wsl --list"
would introduce significant additional logic. Sometimes its
output is a list of distributions, sometimes it is an error
message, and if WSL is not set up it may be a help message.
- Using the Windows API to check for WSL systems.
https://learn.microsoft.com/en-us/windows/win32/api/wslapi/ does
not currently include functions to list registered distributions.
- Attempting to get wsl.exe to produce an English message using
Windows API techniques like those used in Locale Emulator. This
would be complicated, somewhat unintuitive and error prone to do
in Python, and I am not sure how well it would work on a system
that does not have an English language pack installed.
- Checking on disk for WSL distributions in the places they are
most often expected to be. This would intertwine WinBashStatus
with deep details of how WSL actually operates, and this seems
like the sort of thing that is likely to change in the future.
However, there may be a more straightforward way to do this (that
is at least as correct and that remains transparent to debug).
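The remark in the chardet item, that English UTF-16LE messages are
often also valid UTF-8, can be checked directly. The sketch below uses
a hypothetical helper name and made-up sample strings; it only shows
why validity as UTF-8 cannot rule out UTF-16LE for ASCII-range text.

```python
def utf16le_bytes_are_valid_utf8(text: str) -> bool:
    """Report whether the UTF-16LE encoding of text also decodes as UTF-8."""
    raw = text.encode("utf-16le")
    try:
        raw.decode("utf-8")  # Strict decoding; null characters are allowed.
    except UnicodeDecodeError:
        return False
    return True


# ASCII-only text becomes ASCII bytes interleaved with null bytes, which
# is strange but valid UTF-8.
english = "No installed distributions.\r\n"
interleaved = english.encode("utf-16le").decode("utf-8")
```

Here `interleaved` is the original text with a null character after
each character. Text outside the ASCII range usually does not survive
this, which is part of why the ambiguity shows up for English in
particular.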
Especially if the WinBashStatus class remains in test_index for
long (rather than just being used to aid in debugging existing test
failures and possible subsequent design decisions for making commit
hooks work more robustly on Windows in GitPython), then this may
be worth revisiting.
Thus it is *not* with the intention of treating WinBashStatus as a
"stable" part of the test suite that it is renamed from
_WinBashStatus. This is instead done because:
- Like HOOKS_SHEBANG, it makes sense to import it when figuring out
how the tests work or debugging them.
- When debugging, it is intended that it be imported to call
check() and examine the resulting `process` and `message`
information, at least in the CheckError case.