The text is released under the CC-BY-NC-ND license, and code is released under the MIT license. If you find this content useful, please consider supporting the work by buying the book!
If you read no other section in this chapter, read this one: I find the tools discussed here to be the most transformative contributions of IPython to my daily workflow.
When a technologically minded person is asked to help a friend, family member, or colleague with a computer problem, most of the time it's less a matter of knowing the answer than of knowing how to quickly find an unknown answer. In data science it's the same: searchable web resources such as online documentation, mailing-list threads, and StackOverflow answers contain a wealth of information, even (especially?) if it is a topic you've found yourself searching before. Being an effective practitioner of data science is less about memorizing the tool or command you should use for every possible situation, and more about learning to effectively find the information you don't know, whether through a web search engine or another means.
One of the most useful functions of IPython/Jupyter is to shorten the gap between the user and the type of documentation and search that will help them do their work effectively. While web searches still play a role in answering complicated questions, an amazing amount of information can be found through IPython alone, often in just a few keystrokes.
Here we'll discuss IPython's tools to quickly access this information, namely the ? character to explore documentation, the ?? characters to explore source code, and the Tab key for auto-completion.
Accessing Documentation with ?
The Python language and its data science ecosystem are built with the user in mind, and one big part of that is access to documentation. Every Python object contains a reference to a string, known as a docstring, which in most cases will contain a concise summary of the object and how to use it. Python has a built-in help() function that can access this information and print the results. For example, to see the documentation of the built-in len function, you can do the following:
In [1]: help(len)
Help on built-in function len in module builtins:

len(...)
    len(object) -> integer

    Return the number of items of a sequence or mapping.
Depending on your interpreter, this information may be displayed as inline text or in a separate pop-up window.
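The text that help() prints comes from the object's docstring, which can also be read directly. A minimal sketch using only the standard library (the variable names are illustrative, and the exact docstring wording varies between Python versions):

```python
import inspect

# Every object exposes its docstring through the __doc__ attribute.
raw_doc = len.__doc__
print(raw_doc)

# inspect.getdoc returns the same text with indentation cleaned up.
clean_doc = inspect.getdoc(len)
print(clean_doc)
```

Reading the attribute directly is handy in scripts, while help() is usually more convenient at an interactive prompt.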
Because finding help on an object is so common and useful, IPython introduces the ? character as a shorthand for accessing this documentation and other relevant information:
In [2]: len?
Type: builtin_function_or_method
String form: <built-in function len>
Namespace: Python builtin
Docstring:
len(object) -> integer
Return the number of items of a sequence or mapping.
This notation works for just about anything, including object methods:
In [3]: L = [1, 2, 3]
In [4]: L.insert?
Type: builtin_function_or_method
String form: <built-in method insert of list object at 0x1024b8ea8>
Docstring: L.insert(index, object) -- insert object before index
It also works for objects themselves, showing the documentation from their type:
In [5]: L?
Type: list
String form: [1, 2, 3]
Length: 3
Docstring:
list() -> new empty list
list(iterable) -> new list initialized from iterable's items
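Much of what ? reports can be gathered with plain Python. The sketch below is only an approximation of those fields, not IPython's actual implementation, and describe is a hypothetical helper name:

```python
import inspect

def describe(obj):
    """Collect a few of the fields that ? displays (hypothetical helper)."""
    return {
        "Type": type(obj).__name__,
        "String form": repr(obj),
        "Docstring": inspect.getdoc(obj),
    }

L = [1, 2, 3]
list_info = describe(L)
method_info = describe(L.insert)

print(list_info["Type"], list_info["String form"])
print(method_info["Type"])
```

For an instance such as L, the docstring is inherited from its type, which is why L? shows the documentation for list.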
Importantly, this will even work for functions or other objects you create yourself! Here we'll define a small function with a docstring:
In [6]: def square(a):
   ....:     """Return the square of a."""
   ....:     return a ** 2
   ....:
Note that to create a docstring for our function, we simply placed a string literal as the first statement of the function body. Because docstrings are usually multiple lines, by convention we used Python's triple-quote notation for multi-line strings.
Now we'll use the ? mark to find this docstring:
In [7]: square?
Type: function
String form: <function square at 0x103713cb0>
Definition: square(a)
Docstring: Return the square of a.
This quick access to documentation via docstrings is one reason you should get in the habit of always adding such inline documentation to the code you write!
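Outside IPython, the same docstring and call signature can be recovered with the standard inspect module. A small sketch reusing the square function from above:

```python
import inspect

def square(a):
    """Return the square of a."""
    return a ** 2

# The docstring written in the function body is what ? displays.
docstring = inspect.getdoc(square)

# The "Definition" field corresponds to the function's signature.
signature = str(inspect.signature(square))

print("square" + signature)
print(docstring)
```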
Accessing Source Code with ??
Because the Python language is so easily readable, another level of insight can usually be gained by reading the source code of the object you're curious about. IPython provides a shortcut to the source code with the double question mark (??):
In [8]: square??
Type: function
String form: <function square at 0x103713cb0>
Definition: square(a)
Source:
def square(a):
    """Return the square of a."""
    return a ** 2
For simple functions like this, the double question mark can give quick insight into the under-the-hood details.
If you play with this much, you'll notice that sometimes the ?? suffix doesn't display any source code: this is generally because the object in question is not implemented in Python, but in C or some other compiled extension language. If this is the case, the ?? suffix gives the same output as the ? suffix. You'll find this particularly with many of Python's built-in objects and types, for example len from above:
In [9]: len??
Type: builtin_function_or_method
String form: <built-in function len>
Namespace: Python builtin
Docstring:
len(object) -> integer
Return the number of items of a sequence or mapping.
Using ? and/or ?? gives a powerful and quick interface for finding information about what any Python function or module does.
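The difference between objects written in Python and those written in C can be reproduced with inspect.getsource from the standard library. This is a sketch of the idea rather than how IPython implements ??, and source_or_none is a hypothetical helper:

```python
import inspect
import textwrap

def source_or_none(obj):
    """Return the Python source of obj, or None when it is unavailable."""
    try:
        return inspect.getsource(obj)
    except (TypeError, OSError):
        # TypeError: built-in objects implemented in C have no Python source.
        # OSError: the source file could not be located.
        return None

# textwrap.dedent is implemented in Python, so its source can be read.
python_source = source_or_none(textwrap.dedent)

# len is implemented in C, so no source is available.
builtin_source = source_or_none(len)

print(python_source is not None)
print(builtin_source)
```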
IPython's other useful interface is the use of the Tab key for auto-completion and exploration of the contents of objects, modules, and namespaces. In the examples that follow, we'll use <TAB> to indicate when the Tab key should be pressed.
Every Python object has various attributes and methods associated with it. As with the help function discussed before, Python has a built-in dir function that returns a list of these, but the tab-completion interface is much easier to use in practice. To see a list of all available attributes of an object, you can type the name of the object followed by a period (".") character and the Tab key:
In [10]: L.<TAB>
L.append L.copy L.extend L.insert L.remove L.sort
L.clear L.count L.index L.pop L.reverse
To narrow down the list, you can type the first character or several characters of the name, and the Tab key will find the matching attributes and methods:
In [10]: L.c<TAB>
L.clear L.copy L.count
In [10]: L.co<TAB>
L.copy L.count
If there is only a single option, pressing the Tab key will complete the line for you. For example, the following will instantly be replaced with L.count:
In [10]: L.cou<TAB>
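The same narrowing can be approximated in plain Python with the built-in dir function. The complete helper below is a hypothetical illustration of prefix matching, not IPython's completer:

```python
def complete(obj, prefix=""):
    """Return public attribute names of obj that start with prefix.

    Hypothetical helper for illustration; not IPython's completer.
    """
    return [
        name for name in dir(obj)
        if name.startswith(prefix) and not name.startswith("_")
    ]

L = [1, 2, 3]
all_public = complete(L)       # every public attribute and method
narrowed = complete(L, "co")   # names starting with "co"
unique = complete(L, "cou")    # a single match, which Tab would fill in

print(all_public)
print(narrowed)
print(unique)
```

In an interactive session Tab is far quicker, but a filter like this can be useful inside scripts or tests.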
Though Python has no strictly enforced distinction between public/external attributes and private/internal attributes, by convention a leading underscore is used to mark an attribute or method as private. For clarity, these private methods and special methods are omitted from the list by default, but it's possible to list them by explicitly typing the underscore:
In [10]: L._<TAB>
L.__add__ L.__gt__ L.__reduce__
L.__class__ L.__hash__ L.__reduce_ex__
For brevity, we've only shown the first couple of lines of the output. Most of these are Python's special double-underscore methods (often nicknamed "dunder" methods).
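To see why these names exist, it can help to list them with dir and call one directly. A short sketch (the variable names are illustrative):

```python
L = [1, 2, 3]

# Names hidden from the default completion list all start with an underscore.
underscore_names = [name for name in dir(L) if name.startswith("_")]
print(underscore_names[:5])

# Special methods back Python syntax: len(L) calls L.__len__() under the hood.
same_length = len(L) == L.__len__()
print(same_length)
```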
Tab completion is also useful when importing objects from packages. Here we'll use it to find all possible imports in the itertools package that start with co:
In [10]: from itertools import co<TAB>
combinations compress
combinations_with_replacement count
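A rough equivalent without Tab completion is to filter the module's names by prefix. This sketch assumes only the standard library, and the list may grow in future Python versions:

```python
import itertools

# Names exported by itertools that begin with "co".
co_names = [name for name in dir(itertools) if name.startswith("co")]
print(co_names)
```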
Similarly, you can use tab completion to see which imports are available on your system (this will change depending on which third-party scripts and modules are visible to your Python session):
In [10]: import <TAB>
Display all 399 possibilities? (y or n)
Crypto dis py_compile
Cython distutils pyclbr
... ... ...
difflib pwd zmq
In [10]: import h<TAB>
hashlib hmac http
heapq html husl
(Note that for brevity, I did not print here all 399 importable packages and modules on my system.)
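The set of importable top-level modules can also be inspected programmatically with pkgutil.iter_modules. This is a sketch of one way to do it, not what IPython does internally, and the result depends entirely on your environment; modules compiled into the interpreter itself (listed in sys.builtin_module_names) are not included:

```python
import pkgutil

# Top-level modules and packages found on sys.path whose names start with "h".
h_modules = sorted(
    {info.name for info in pkgutil.iter_modules() if info.name.startswith("h")}
)
print(h_modules)
```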
Tab completion is useful if you know the first few characters of the object or attribute you're looking for, but is little help if you'd like to match characters in the middle or at the end of the word. For this use case, IPython provides a means of wildcard matching for names using the * character.
For example, we can use this to list every object in the namespace whose name ends with Warning:
In [10]: *Warning?
BytesWarning RuntimeWarning
DeprecationWarning SyntaxWarning
FutureWarning UnicodeWarning
ImportWarning UserWarning
PendingDeprecationWarning Warning
ResourceWarning
Notice that the * character matches any string, including the empty string.
Similarly, suppose we are looking for a string method that contains the word find somewhere in its name. We can search for it this way:
In [10]: str.*find*?
str.find
str.rfind
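Both wildcard searches can be imitated in plain Python with the standard fnmatch module. This is a sketch of the matching idea only, with a hypothetical helper name; IPython's own wildcard search is implemented separately, and the list of warning classes varies by Python version:

```python
import builtins
import fnmatch

def wildcard_names(namespace, pattern):
    """Return names in namespace matching a shell-style pattern (hypothetical helper)."""
    # fnmatchcase keeps the comparison case-sensitive on every platform.
    return [name for name in dir(namespace) if fnmatch.fnmatchcase(name, pattern)]

warning_names = wildcard_names(builtins, "*Warning")
find_methods = wildcard_names(str, "*find*")

print(warning_names)
print(find_methods)
```

Because * also matches the empty string, the first search includes the bare name Warning and the second includes find itself.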
I find this type of flexible wildcard search can be very useful for finding a particular command when getting to know a new package or reacquainting myself with a familiar one.