NoneType handling for str.format() with specifiers #18952

Open
wants to merge 10 commits into base: master

ChristinaTrinh
Fixes #18800

This PR addresses the problem described in the issue attached above.

What is changed?
To handle a None argument passed to str.format() with an alignment specifier, we added a check on whether the argument's type is NoneType; if it is, we report an error.

How was it tested?
A total of 6 test cases for str.format() with the specifiers <, >, ^.

  • The first three tests check that passing None to str.format() with any of the specifiers <, >, ^ produces the expected error message.

  • The last three tests check that passing a valid string produces no error message.

  • Ran locally with command pytest -n0 -k testFormatCallNoneAlignment and the test passed
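
For reference, the runtime behavior that these tests mirror can be reproduced with a minimal sketch. The helper name try_format is hypothetical and is not part of the PR; the sketch only exercises plain str.format().

```python
def try_format(template, value):
    """Return the formatted string, or a short description of the TypeError raised."""
    try:
        return template.format(value)
    except TypeError as exc:
        return f"TypeError: {exc}"


for align in "<>^":
    template = "{:" + align + "5}"
    # A str argument accepts alignment specifiers.
    print(repr(try_format(template, "ab")))
    # None does not: NoneType inherits object.__format__, which rejects
    # any non-empty format spec.
    print(repr(try_format(template, None)))

# Without a format spec, None formats as the string "None".
print(repr(try_format("{}", None)))
```

The static check in this PR aims to flag the None cases before they fail at runtime.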

Another bug discovered
We also found a bug: when an instance of a user-defined class that does not define __format__ is passed to str.format() with a format spec, mypy should report an error, but currently it does not.

For example:
In this example, since GoodFormat defines __format__, the call "{:*^15}".format(GoodFormat()) should succeed.
class GoodFormat:
    def __format__(self, format_spec):
        return f"<Formatted:{format_spec}>"

However, in this example Foo does not define __format__, so the call "{:*^15}".format(Foo()) should produce an error message: str.format() does not know how to apply the format spec, and the call raises a TypeError when the code is run with Python. Currently, mypy accepts this case.
class Foo:
    def __str__(self):
        return "hello"
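
The two classes can be compared side by side at runtime with the sketch below. The helper describe is hypothetical and added only for illustration; the classes are the ones from the description.

```python
class GoodFormat:
    def __format__(self, format_spec):
        return f"<Formatted:{format_spec}>"


class Foo:
    def __str__(self):
        return "hello"


def describe(value):
    """Format value with the spec from the description, reporting a TypeError if raised."""
    try:
        return "{:*^15}".format(value)
    except TypeError as exc:
        return f"TypeError: {exc}"


print(describe(GoodFormat()))  # uses the custom __format__
print(describe(Foo()))         # falls back to object.__format__, which rejects the spec

# An explicit !s conversion formats the resulting str instead, so the spec is accepted.
print("{!s:*^15}".format(Foo()))
```

Note that Foo still inherits object.__format__, which works for an empty spec; the failure appears only once a non-empty spec is given.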

We tried to address this bug, but it was harder than expected: beyond built-in types such as str, int, and float, there are other types that we do not know how to detect. This is the PR that we had up that attempted to fix it, but it does not pass the workflow.

Contributor

Diff from mypy_primer, showing the effect of this PR on open source code:

vision (https://github.com/pytorch/vision)
+ torchvision/utils.py:271: error: Unused "type: ignore" comment  [unused-ignore]

Collaborator

@A5rocks A5rocks left a comment

This makes some amount of sense, though why i in range(len("<>^")) and not i in range(len(spec.format_spec))?

(I think this will crash when typechecking f"{None:^}"?)


Honestly I don't really like this approach of ad-hoc checking that format specifiers have some specific character in them, though I guess what I would prefer (actually parsing the format specification) wouldn't help that much.

Comment on lines +456 to +458
for i in range(len("<>^")):
    if spec.format_spec[i] in "<>^":
        specifierIndex = i
Collaborator

This logic seems confused -- why are we mixing i (refers to position in <>^) with indexing into spec.format_spec?

Collaborator

This could be just

specifier_char = next(c for c in spec.format_spec if c in "<>^")

and then specifier_char instead of spec.format_spec[specifierIndex] (and the next if is redundant, obviously, since you already check that such a symbol exists).

Aside, please stick to the snake_case naming convention: it's both used in this codebase and recommended by PEP 8; only very old stdlib parts still have some camelCase identifiers.
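
To make the two approaches concrete, here is a minimal standalone sketch. It assumes the format spec is available as a plain string; the function names are hypothetical, and the real spec.format_spec in mypy may carry extra characters, so this only illustrates the indexing concern raised in the review.

```python
ALIGN_CHARS = "<>^"


def find_alignment_original(format_spec):
    """Mirror of the loop under review: i ranges over len("<>^") but indexes format_spec."""
    specifier_index = None
    for i in range(len(ALIGN_CHARS)):
        # Raises IndexError whenever format_spec has fewer than 3 characters.
        if format_spec[i] in ALIGN_CHARS:
            specifier_index = i
    return None if specifier_index is None else format_spec[specifier_index]


def find_alignment_next(format_spec):
    """The reviewer's suggestion, with a None default added here for specs without a match."""
    return next((c for c in format_spec if c in ALIGN_CHARS), None)


print(find_alignment_next("^"))         # works on a one-character spec
print(find_alignment_next("*^15"))      # fill "*", alignment "^"
print(find_alignment_original("*^15"))  # works only because the spec is long enough
```

Neither helper parses the spec: in a spec such as "^<5" the "^" is the fill character and "<" is the alignment, so a plain character scan can report the fill instead of the alignment.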

Comment on lines +451 to +452
a_type = get_proper_type(actual_type)
if isinstance(a_type, NoneType):
Collaborator

This could be cleaner:

Suggested change
- a_type = get_proper_type(actual_type)
- if isinstance(a_type, NoneType):
+ if isinstance(get_proper_type(actual_type), NoneType):

@A5rocks
Collaborator

A5rocks commented Apr 24, 2025

Reading the linked issue, I think this also needs to handle the case where the type passed into the format string is a str | None, rather than just None as is done here.

@sterliakov
Collaborator

> str | None, rather than just None as is done here.

@A5rocks Nope, perform_special_format_checks is called for every union member - see how it's invoked in check_specs_in_format_call.

[case testFormatCallNoneAlignment]
'{:<1}'.format(None) # E: Alignment format specifier "<" is not supported for None
'{:>1}'.format(None) # E: Alignment format specifier ">" is not supported for None
'{:^1}'.format(None) # E: Alignment format specifier "^" is not supported for None
Collaborator

Please add a few more tests - at least with f-strings, with conversion (e.g. "{!s:>5}".format(None) is allowed) and with dynamic spec (f"{None:{foo}}" is also valid, foo may be an empty string).
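
The runtime behavior behind the conversion and f-string cases can be checked with a short sketch. The names below are hypothetical, and the sketch shows plain Python behavior rather than mypy test-case syntax.

```python
value = None

# A conversion turns None into the string "None" before the spec is applied,
# so the alignment specifier is accepted.
with_conversion = "{!s:>5}".format(value)
fstring_with_conversion = f"{value!s:>5}"


def fstring_without_conversion(v):
    """Format v with an alignment spec in an f-string, reporting a TypeError if raised."""
    try:
        return f"{v:>5}"
    except TypeError as exc:
        return f"TypeError: {exc}"


print(repr(with_conversion))
print(repr(fstring_with_conversion))
print(fstring_without_conversion(value))  # fails the same way as str.format()
print(fstring_without_conversion("ab"))   # a str accepts the spec
```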


@VallinZ VallinZ Apr 24, 2025

I'm looking at the case of

foo = "^"
test2 = f"{None:{foo}}"

for which Python raises an error, but the current implementation does not catch it.
I looked into the CallExpr for this case; it is:

CallExpr:9(
  MemberExpr:9(
    StrExpr({:{}})
    format)
  Args(
    NameExpr(None [builtins.None])
    CallExpr:9(
      MemberExpr:9(
        StrExpr({:{}})
        format)
      Args(
        NameExpr(foo [mypy.LocalTest.test.foo])
        StrExpr()))))

Is there a way to see what value foo represents in mypy? My understanding is that mypy is a static tool, so it doesn't have the value of foo, and technically there is no way to check this case. Would it be an option to just raise a warning that there might be an error, rather than saying that this is an error?
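
Reading the dump above as nested str.format() calls, the f-string can be written out by hand as in the sketch below. The function name render is hypothetical; the point is that the final spec is itself the result of a call, so its value is only known once foo has a value.

```python
def render(value, dynamic_spec):
    """Hand-written equivalent of f"{value:{dynamic_spec}}", following the dump's shape."""
    # Inner call from the dump: "{:{}}".format(foo, "") formats foo with an empty spec.
    inner = "{:{}}".format(dynamic_spec, "")
    # Outer call from the dump: the inner result becomes the spec applied to value.
    return "{:{}}".format(value, inner)


print(repr(render(None, "")))     # an empty dynamic spec is fine for None
print(repr(render("ab", ">4")))   # a str accepts an alignment spec

try:
    render(None, "^")             # same failure as f"{None:{foo}}" with foo = "^"
except TypeError as exc:
    print(f"TypeError: {exc}")
```

Whether the call fails therefore depends on the runtime value of the spec, which matches the concern in the comment above.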

Successfully merging this pull request may close these issues.

str-format for fill and align specifiers
5 participants