The Most Authoritative and Detailed Guide to the Import Mechanism in Python

2022-10-11 09:06:37

5. The import system

Python code in one module gains access to the code in another module by the process of importing it. The import statement is the most common way of invoking the import machinery, but it is not the only way. Functions such as importlib.import_module() and built-in __import__() can also be used to invoke the import machinery.

The import statement combines two operations; it searches for the named module, then it binds the results of that search to a name in the local scope. The search operation of the import statement is defined as a call to the __import__() function, with the appropriate arguments. The return value of __import__() is used to perform the name binding operation of the import statement. See the import statement for the exact details of that name binding operation.

A direct call to __import__() performs only the module search and, if found, the module creation operation. While certain side-effects may occur, such as the importing of parent packages, and the updating of various caches (including sys.modules), only the import statement performs a name binding operation.
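As a small illustration of the two function-style entry points, the sketch below imports the same dotted name both ways. The choice of json.decoder is arbitrary; any standard-library submodule would serve.

```python
import importlib
import sys

# importlib.import_module() returns the module named by the full dotted path.
leaf = importlib.import_module("json.decoder")

# A direct __import__() call with a dotted name (and no fromlist) returns the
# top-level package; binding the result to a name is left to the caller.
top = __import__("json.decoder")

print(leaf.__name__)                        # json.decoder
print(top.__name__)                         # json
print(sys.modules["json.decoder"] is leaf)  # True
```

Both calls update sys.modules as a side effect, but neither creates a name in the calling scope unless the result is assigned.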

When an import statement is executed, the standard builtin __import__() function is called. Other mechanisms for invoking the import system (such as importlib.import_module()) may choose to bypass __import__() and use their own solutions to implement import semantics.

When a module is first imported, Python searches for the module and if found, it creates a module object [1], initializing it. If the named module cannot be found, a ModuleNotFoundError is raised. Python implements various strategies to search for the named module when the import machinery is invoked. These strategies can be modified and extended by using various hooks described in the sections below.

Changed in version 3.3: The import system has been updated to fully implement the second phase of PEP 302. There is no longer any implicit import machinery; the full import system is exposed through sys.meta_path. In addition, native namespace package support has been implemented (see PEP 420).

5.1. importlib

The importlib module provides a rich API for interacting with the import system. For example, importlib.import_module() provides a recommended, simpler API than built-in __import__() for invoking the import machinery. Refer to the importlib library documentation for additional detail.

5.2. Packages

Python has only one type of module object, and all modules are of this type, regardless of whether the module is implemented in Python, C, or something else. To help organize modules and provide a naming hierarchy, Python has a concept of packages.

You can think of packages as the directories on a file system and modules as files within directories, but don’t take this analogy too literally since packages and modules need not originate from the file system. For the purposes of this documentation, we’ll use this convenient analogy of directories and files. Like file system directories, packages are organized hierarchically, and packages may themselves contain subpackages, as well as regular modules.

It’s important to keep in mind that all packages are modules, but not all modules are packages. Or put another way, packages are just a special kind of module. Specifically, any module that contains a __path__ attribute is considered a package.
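The rule that a package is simply a module with a __path__ attribute can be checked directly. The helper name is_package below is made up for this sketch and is not part of any library.

```python
import email
import email.mime.text
import sys


def is_package(module):
    """Return True if the module carries a __path__ attribute."""
    return hasattr(module, "__path__")


print(is_package(email))            # True
print(is_package(email.mime))       # True
print(is_package(email.mime.text))  # False
print(is_package(sys))              # False
```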

All modules have a name. Subpackage names are separated from their parent package name by a dot, akin to Python’s standard attribute access syntax. Thus you might have a module called sys and a package called email, which in turn has a subpackage called email.mime and a module within that subpackage called email.mime.text.

5.2.1. Regular packages

Python defines two types of packages, regular packages and namespace packages. Regular packages are traditional packages as they existed in Python 3.2 and earlier. A regular package is typically implemented as a directory containing an __init__.py file. When a regular package is imported, this __init__.py file is implicitly executed, and the objects it defines are bound to names in the package’s namespace. The __init__.py file can contain the same Python code that any other module can contain, and Python will add some additional attributes to the module when it is imported.

For example, the following file system layout defines a top level parent package with three subpackages:

parent/
    __init__.py
    one/
        __init__.py
    two/
        __init__.py
    three/
        __init__.py

Importing parent.one will implicitly execute parent/__init__.py and parent/one/__init__.py. Subsequent imports of parent.two or parent.three will execute parent/two/__init__.py and parent/three/__init__.py respectively.
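The behaviour can be reproduced with a throwaway layout. The sketch below builds a similar tree in a temporary directory; the package name demo_parent and the TAG variable are invented for illustration, and the temporary directory is left for the operating system to clean up.

```python
import importlib
import pathlib
import sys
import tempfile

# Build demo_parent/ with two subpackages, each holding an __init__.py.
root = pathlib.Path(tempfile.mkdtemp())
for sub in ("one", "two"):
    pkg_dir = root / "demo_parent" / sub
    pkg_dir.mkdir(parents=True)
    (pkg_dir / "__init__.py").write_text(f"TAG = 'demo_parent.{sub}'\n")
(root / "demo_parent" / "__init__.py").write_text("TAG = 'demo_parent'\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

one = importlib.import_module("demo_parent.one")

# Both __init__.py files on the way to demo_parent.one have been executed,
# while the sibling subpackage has not been touched yet.
parent_loaded = "demo_parent" in sys.modules
two_loaded = "demo_parent.two" in sys.modules
print(sys.modules["demo_parent"].TAG, one.TAG)  # demo_parent demo_parent.one
print(parent_loaded, two_loaded)                # True False
```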

5.2.2. Namespace packages

A namespace package is a composite of various portions, where each portion contributes a subpackage to the parent package. Portions may reside in different locations on the file system. Portions may also be found in zip files, on the network, or anywhere else that Python searches during import. Namespace packages may or may not correspond directly to objects on the file system; they may be virtual modules that have no concrete representation.

Namespace packages do not use an ordinary list for their __path__ attribute. They instead use a custom iterable type which will automatically perform a new search for package portions on the next import attempt within that package if the path of their parent package (or sys.path for a top level package) changes.

With namespace packages, there is no parent/__init__.py file. In fact, there may be multiple parent directories found during import search, where each one is provided by a different portion. Thus parent/one may not be physically located next to parent/two. In this case, Python will create a namespace package for the top-level parent package whenever it or one of its subpackages is imported.

See also PEP 420 for the namespace package specification.
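A minimal sketch of two portions living in unrelated directories follows. The package name demo_ns and the module names are hypothetical, and both portions are placed in temporary directories appended to sys.path.

```python
import importlib
import pathlib
import sys
import tempfile

# Create two separate roots, each contributing one module to demo_ns.
roots = []
for portion in ("one", "two"):
    root = pathlib.Path(tempfile.mkdtemp())
    pkg_dir = root / "demo_ns"  # note: no __init__.py is created
    pkg_dir.mkdir()
    (pkg_dir / f"{portion}.py").write_text(f"NAME = '{portion}'\n")
    sys.path.append(str(root))
    roots.append(root)

importlib.invalidate_caches()
one = importlib.import_module("demo_ns.one")
two = importlib.import_module("demo_ns.two")
demo_ns = sys.modules["demo_ns"]

# The namespace package's __path__ spans both directories and is a custom
# iterable rather than a plain list.
portions = list(demo_ns.__path__)
print(len(portions))                       # 2
print(isinstance(demo_ns.__path__, list))  # False
```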

5.3. Searching

To begin the search, Python needs the fully qualified name of the module (or package, but for the purposes of this discussion, the difference is immaterial) being imported. This name may come from various arguments to the import statement, or from the parameters to the importlib.import_module() or __import__() functions.

This name will be used in various phases of the import search, and it may be the dotted path to a submodule, e.g. foo.bar.baz. In this case, Python first tries to import foo, then foo.bar, and finally foo.bar.baz. If any of the intermediate imports fail, a ModuleNotFoundError is raised.
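The left-to-right order can be observed from the exception raised when an intermediate name is missing. In this sketch the subpackage name no_such_subpackage is deliberately nonexistent; the name attribute of the exception reports which import failed.

```python
import importlib

try:
    importlib.import_module("json.no_such_subpackage.leaf")
except ModuleNotFoundError as exc:
    failed_name = exc.name

# json itself imports fine, so the failure is reported for the first missing
# intermediate name, not for the full dotted path.
print(failed_name)  # json.no_such_subpackage
```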

5.3.1. The module cache

The first place checked during import search is sys.modules. This mapping serves as a cache of all modules that have been previously imported, including the intermediate paths. So if foo.bar.baz was previously imported, sys.modules will contain entries for foo, foo.bar, and foo.bar.baz. Each key will have as its value the corresponding module object.

During import, the module name is looked up in sys.modules and if present, the associated value is the module satisfying the import, and the process completes. However, if the value is None, then a ModuleNotFoundError is raised. If the module name is missing, Python will continue searching for the module.

sys.modules is writable. Deleting a key may not destroy the associated module (as other modules may hold references to it), but it will invalidate the cache entry for the named module, causing Python to search anew for the named module upon its next import. The key can also be assigned to None, forcing the next import of the module to result in a ModuleNotFoundError.

Beware though, as if you keep a reference to the module object, invalidate its cache entry in sys.modules, and then re-import the named module, the two module objects will not be the same. By contrast, importlib.reload() will reuse the same module object, and simply reinitialise the module contents by rerunning the module’s code.
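These cache rules can be exercised on a small standard-library module. The sketch below uses colorsys purely as a convenient example and restores the cache entry at the end.

```python
import importlib
import sys

first = importlib.import_module("colorsys")

# Dropping the cache entry forces a fresh search; the old object survives
# because `first` still references it.
del sys.modules["colorsys"]
second = importlib.import_module("colorsys")
print(second is first)     # False

# reload() re-executes the code inside the same module object.
reloaded = importlib.reload(second)
print(reloaded is second)  # True

# A None entry makes the next import fail immediately.
sys.modules["colorsys"] = None
try:
    importlib.import_module("colorsys")
    blocked = False
except ModuleNotFoundError:
    blocked = True
finally:
    sys.modules["colorsys"] = second  # restore the cache entry
print(blocked)             # True
```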

5.3.2. Finders and loaders

If the named module is not found in sys.modules, then Python’s import protocol is invoked to find and load the module. This protocol consists of two conceptual objects, finders and loaders. A finder’s job is to determine whether it can find the named module using whatever strategy it knows about. Objects that implement both of these interfaces are referred to as importers; they return themselves when they find that they can load the requested module.

Python includes a number of default finders and importers. The first one knows how to locate built-in modules, and the second knows how to locate frozen modules. A third default finder searches an import path for modules. The import path is a list of locations that may name file system paths or zip files. It can also be extended to search for any locatable resource, such as those identified by URLs.

The import machinery is extensible, so new finders can be added to extend the range and scope of module searching.

Finders do not actually load modules. If they can find the named module, they return a module spec, an encapsulation of the module’s import-related information, which the import machinery then uses when loading the module.
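The finding step can be run on its own with importlib.util.find_spec(), which asks the finders for a spec without executing the named module itself. A brief sketch, using json as an arbitrary target:

```python
import importlib.util

spec = importlib.util.find_spec("json")

# The spec records what the finder learned: the name, the loader that should
# be used, where the module comes from, and (for packages) where to look for
# submodules.
print(spec.name)                                    # json
print(spec.loader is not None)                      # True
print(spec.origin)
print(spec.submodule_search_locations is not None)  # True
```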

The following sections describe the protocol for finders and loaders in more detail, including how you can create and register new ones to extend the import machinery.

Changed in version 3.4: In previous versions of Python, finders returned loaders directly, whereas now they return module specs which contain loaders. Loaders are still used during import but have fewer responsibilities.

5.3.3. Import hooks

The import machinery is designed to be extensible; the primary mechanism for this are the import hooks. There are two types of import hooks: meta hooks and import path hooks.

Meta hooks are called at the start of import processing, before any other import processing has occurred, other than sys.modules cache look up. This allows meta hooks to override sys.path processing, frozen modules, or even built-in modules. Meta hooks are registered by adding new finder objects to sys.meta_path, as described below.

Import path hooks are called as part of sys.path (or package.__path__) processing, at the point where their associated path item is encountered. Import path hooks are registered by adding new callables to sys.path_hooks as described below.

5.3.4. The meta path

When the named module is not found in sys.modules, Python next searches sys.meta_path, which contains a list of meta path finder objects. These finders are queried in order to see if they know how to handle the named module. Meta path finders must implement a method called find_spec() which takes three arguments: a name, an import path, and (optionally) a target module. The meta path finder can use any strategy it wants to determine whether it can handle the named module or not.

If the meta path finder knows how to handle the named module, it returns a spec object. If it cannot handle the named module, it returns None. If sys.meta_path processing reaches the end of its list without returning a spec, then a ModuleNotFoundError is raised. Any other exceptions raised are simply propagated up, aborting the import process.

The find_spec() method of meta path finders is called with two or three arguments. The first is the fully qualified name of the module being imported, for example foo.bar.baz. The second argument is the path entries to use for the module search. For top-level modules, the second argument is None, but for submodules or subpackages, the second argument is the value of the parent package’s __path__ attribute. If the appropriate __path__ attribute cannot be accessed, a ModuleNotFoundError is raised. The third argument is an existing module object that will be the target of loading later. The import system passes in a target module only during reload.

The meta path may be traversed multiple times for a single import request. For example, assuming none of the modules involved has already been cached, importing foo.bar.baz will first perform a top level import, calling mpf.find_spec("foo", None, None) on each meta path finder (mpf). After foo has been imported, foo.bar will be imported by traversing the meta path a second time, calling mpf.find_spec("foo.bar", foo.__path__, None). Once foo.bar has been imported, the final traversal will call mpf.find_spec("foo.bar.baz", foo.bar.__path__, None).
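The repeated traversal can be made visible with a meta path finder that only records what it is asked and then declines. This is a sketch under stated assumptions: the class RecordingFinder and the package demo_foo are invented for the example, and the package is created in a temporary directory.

```python
import importlib
import importlib.abc
import pathlib
import sys
import tempfile

# Lay out demo_foo/bar/baz.py so that a three-level import is possible.
root = pathlib.Path(tempfile.mkdtemp())
leaf_dir = root / "demo_foo" / "bar"
leaf_dir.mkdir(parents=True)
(root / "demo_foo" / "__init__.py").write_text("")
(leaf_dir / "__init__.py").write_text("")
(leaf_dir / "baz.py").write_text("")


class RecordingFinder(importlib.abc.MetaPathFinder):
    """Record each find_spec() request for demo_foo, then decline it."""

    def __init__(self):
        self.calls = []

    def find_spec(self, fullname, path, target=None):
        if fullname.startswith("demo_foo"):
            self.calls.append((fullname, path is None))
        return None  # decline, so the next finder on the meta path is asked


recorder = RecordingFinder()
sys.meta_path.insert(0, recorder)
sys.path.insert(0, str(root))
importlib.invalidate_caches()
try:
    importlib.import_module("demo_foo.bar.baz")
finally:
    sys.meta_path.remove(recorder)

print(recorder.calls)
# [('demo_foo', True), ('demo_foo.bar', False), ('demo_foo.bar.baz', False)]
```

The second element of each tuple shows that only the top-level request receives None as its path argument; the later requests receive the parent package's __path__.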

Some meta path finders only support top level imports. These importers will always return None when anything other than None is passed as the second argument.

Python’s default sys.meta_path has three meta path finders, one that knows how to import built-in modules, one that knows how to import frozen modules, and one that knows how to import modules from an import path (i.e. the path based finder).

Changed in version 3.4: The find_spec() method of meta path finders replaced find_module(), which is now deprecated. While it will continue to work without change, the import machinery will try it only if the finder does not implement find_spec().

5.4. Loading

If and when a module spec is found, the import machinery will use it (and the loader it contains) when loading the module. Here is an approximation of what happens during the loading portion of import:

import sys
from types import ModuleType
# Private CPython helper named by the approximation; not a public API.
from importlib._bootstrap import _init_module_attrs


def load(spec):
    # 'load' is only a wrapper name chosen here so that the final 'return' is valid.
    module = None
    if spec.loader is not None and hasattr(spec.loader, 'create_module'):
        # It is assumed 'exec_module' will also be defined on the loader.
        module = spec.loader.create_module(spec)
    if module is None:
        module = ModuleType(spec.name)
    # The import-related module attributes get set here:
    _init_module_attrs(spec, module)

    if spec.loader is None:
        # unsupported
        raise ImportError
    if spec.origin is None and spec.submodule_search_locations is not None:
        # namespace package
        sys.modules[spec.name] = module
    elif not hasattr(spec.loader, 'exec_module'):
        module = spec.loader.load_module(spec.name)
        # Set __loader__ and __package__ if missing.
    else:
        # Cache the module before running its code, and undo that on failure.
        sys.modules[spec.name] = module
        try:
            spec.loader.exec_module(module)
        except BaseException:
            try:
                del sys.modules[spec.name]
            except KeyError:
                pass
            raise
    return sys.modules[spec.name]

Note the following details:

  • If there is an existing module object with the given name in sys.modules, import will have already returned it.

  • The module will exist in sys.modules before the loader executes the module code. This is crucial because the module code may (directly or indirectly) import itself; adding it to sys.modules beforehand prevents unbounded recursion in the worst case and multiple loading in the best.

  • If loading fails, the failing module (and only the failing module) gets removed from sys.modules. Any module already in the sys.modules cache, and any module that was successfully loaded as a side-effect, must remain in the cache. This contrasts with reloading where even the failing module is left in sys.modules.

  • After the module is created but before execution, the import machinery sets the import-related module attributes (“_init_module_attrs” in the pseudo-code example above), as summarized in a later section.

  • Module execution is the key moment of loading in which the module’s namespace gets populated. Execution is entirely delegated to the loader, which gets to decide what gets populated and how.

  • The module created during loading and passed to exec_module() may not be the one returned at the end of import [2].
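Two of these details, the early sys.modules entry and the cleanup after a failed load, can be checked with a pair of throwaway modules. The module names demo_selfcheck and demo_broken are hypothetical and are written to a temporary directory.

```python
import importlib
import pathlib
import sys
import tempfile

root = pathlib.Path(tempfile.mkdtemp())
# This module inspects sys.modules while its own code is still running.
(root / "demo_selfcheck.py").write_text(
    "import sys\nPRESENT_DURING_EXEC = __name__ in sys.modules\n"
)
# This module fails part-way through execution.
(root / "demo_broken.py").write_text("raise RuntimeError('boom')\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

ok = importlib.import_module("demo_selfcheck")
print(ok.PRESENT_DURING_EXEC)  # True

try:
    importlib.import_module("demo_broken")
except RuntimeError:
    pass  # the original exception propagates out of the import
broken_cached = "demo_broken" in sys.modules
print(broken_cached)           # False
```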

Changed in version 3.4: The import system has taken over the boilerplate responsibilities of loaders. These were previously performed by the importlib.abc.Loader.load_module() method.

5.4.1. Loaders

Module loaders provide the critical function of loading: module execution. The import machinery calls the importlib.abc.Loader.exec_module() method with a single argument, the module object to execute. Any value returned from exec_module() is ignored.

Loaders must satisfy the following requirements:

  • If the module is a Python module (as opposed to a built-in module or a dynamically loaded extension), the loader should execute the module’s code in the module’s global name space (module.__dict__).

  • If the loader cannot execute the module, it should raise an ImportError, although any other exception raised during exec_module() will be propagated.

In many cases, the finder and loader can be the same object; in such cases the find_spec() method would just return a spec with the loader set to self.

Module loaders may opt in to creating the module object during loading by implementing a create_module() method. It takes one argument, the module spec, and returns the new module object to use during loading. create_module() does not need to set any attributes on the module object. If the method returns None, the import machinery will create the new module itself.
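To make the finder-plus-loader arrangement concrete, here is a minimal sketch of an importer that serves module source from a dictionary. The class name InMemoryImporter and the module name demo_inmem are invented for this example; importlib.util.spec_from_loader() is used to build the spec with the loader set to the importer itself.

```python
import importlib
import importlib.abc
import importlib.util
import sys


class InMemoryImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finder and loader in one object, serving source code from a dict."""

    def __init__(self, sources):
        self.sources = sources

    def find_spec(self, fullname, path, target=None):
        if fullname not in self.sources:
            return None  # not ours; let the other finders try
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # let the import machinery create the module object

    def exec_module(self, module):
        # Run the stored source in the module's own namespace.
        exec(self.sources[module.__name__], module.__dict__)


importer = InMemoryImporter({"demo_inmem": "ANSWER = 6 * 7\n"})
sys.meta_path.insert(0, importer)
try:
    demo = importlib.import_module("demo_inmem")
finally:
    sys.meta_path.remove(importer)

print(demo.ANSWER)                       # 42
print(demo.__spec__.loader is importer)  # True
```

Registering the importer at the front of sys.meta_path is a choice made for the demonstration; appending it instead would let the default finders answer first.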

New in version 3.4: The create_module() method of loaders.

Changed in version 3.4: The load_module() method was replaced by exec_module() and the import machinery assumed all the boilerplate responsibilities of loading.

For compatibility with existing loaders, the import machinery will use the load_module() method of loaders if it exists and the loader does not also implement exec_module(). However, load_module() has been deprecated and loaders should implement exec_module() instead.

The load_module() method must implement all the boilerplate loading functionality described above in addition to executing the module. All the same constraints apply, with some additional clarification:

  • If there is an existing module object with the given name in sys.modules, the loader must use that existing module. (Otherwise, importlib.reload() will not work correctly.) If the named module does not exist in sys.modules, the loader must create a new module object and add it to sys.modules.

  • The module must exist in sys.modules before the loader executes the module code, to prevent unbounded recursion or multiple loading.

  • If loading fails, the loader must remove any modules it has inserted into sys.modules, but it must remove only the failing module(s), and only if the loader itself has loaded the module(s) explicitly.

Changed in version 3.5: A DeprecationWarning is raised when exec_module() is defined but create_module() is not.

Changed in version 3.6: An ImportError is raised when exec_module() is defined but create_module() is not.

5.4.2. Submodules

When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object. For example, if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule. Let’s say you have the following directory structure:

spam/
    __init__.py
    foo.py
    bar.py

and spam/__init__.py has the following lines in it:

from .foo import Foo
from .bar import Bar

then executing the following puts a name binding to foo and bar in the spam module:

>>> import spam
>>> spam.foo
<module 'spam.foo' from '/tmp/imports/spam/foo.py'>
>>> spam.bar
<module 'spam.bar' from '/tmp/imports/spam/bar.py'>

Given Python’s familiar name binding rules this might seem surprising, but it’s actually a fundamental feature of the import system. The invariant holding is that if you have sys.modules['spam'] and sys.modules['spam.foo'] (as you would after the above import), the latter must appear as the foo attribute of the former.
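The invariant can be checked on a scratch package. In the sketch below the package is named demo_spam (a stand-in for spam, chosen to avoid clashing with anything already importable) and is created in a temporary directory.

```python
import importlib
import pathlib
import sys
import tempfile

root = pathlib.Path(tempfile.mkdtemp())
pkg_dir = root / "demo_spam"
pkg_dir.mkdir()
# The package only asks for the class Foo, never for the submodule itself.
(pkg_dir / "__init__.py").write_text("from .foo import Foo\n")
(pkg_dir / "foo.py").write_text("class Foo:\n    pass\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()
demo_spam = importlib.import_module("demo_spam")

# Loading the submodule still bound it as an attribute of its parent.
print(demo_spam.foo is sys.modules["demo_spam.foo"])  # True
print(demo_spam.Foo is demo_spam.foo.Foo)             # True
```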

5.4.3. Module spec

The import machinery uses a variety of information about each module during import, especially before loading. Most of the information is common to all modules. The purpose of a module’s spec is to encapsulate this import-related information on a per-module basis.

Using a spec during import allows state to be transferred between import system components, e.g. between the finder that creates the module spec and the loader that executes it. Most importantly, it allows the import machinery to perform the boilerplate operations of loading, whereas without a module spec the loader had that responsibility.

The module’s spec is exposed as the __spec__ attribute on a module object. See ModuleSpec for details on the contents of the module spec.

New in version 3.4.
5.4.4. Import-related module attributes

The import machinery fills in these attributes on each module object during loading, based on the module’s spec, before the loader executes the module.

__name__

The __name__ attribute must be set to the fully-qualified name of the module. This name is used to uniquely identify the module in the import system.

__loader__

The __loader__ attribute must be set to the loader object that the import machinery used when loading the module. This is mostly for introspection, but can be used for additional loader-specific functionality, for example getting data associated with a loader.

__package__

The module’s __package__ attribute must be set. Its value must be a string, but it can be the same value as its __name__. When the module is a package, its __package__ value should be set to its __name__. When the module is not a package, __package__ should be set to the empty string for top-level modules, or for submodules, to the parent package’s name. See PEP 366 for further details.

This attribute is used instead of __name__ to calculate explicit relative imports for main modules, as defined in PEP 366. It is expected to have the same value as __spec__.parent.

Changed in version 3.6: The value of __package__ is expected to be the same as __spec__.parent.

__spec__

The __spec__ attribute must be set to the module spec that was used when importing the module. Setting __spec__ appropriately applies equally to modules initialized during interpreter startup. The one exception is __main__, where __spec__ is set to None in some cases.

When __package__ is not defined, __spec__.parent is used as a fallback.

New in version 3.4.

Changed in version 3.6: __spec__.parent is used as a fallback when __package__ is not defined.

__path__

If the module is a package (either regular or namespace), the module object’s __path__ attribute must be set. The value must be iterable, but may be empty if __path__ has no further significance. If __path__ is not empty, it must produce strings when iterated over. More details on the semantics of __path__ are given below.

Non-package modules should not have a __path__ attribute.

__file__

__cached__

__file__ is optional. If set, this attribute’s value must be a string. The import system may opt to leave __file__ unset if it has no semantic meaning (e.g. a module loaded from a database).

If __file__ is set, it may also be appropriate to set the __cached__ attribute which is the path to any compiled version of the code (e.g. byte-compiled file). The file does not need to exist to set this attribute; the path can simply point to where the compiled file would exist (see PEP 3147).

It is also appropriate to set __cached__ when __file__ is not set. However, that scenario is quite atypical. Ultimately, the loader is what makes use of __file__ and/or __cached__. So if a loader can load from a cached module but otherwise does not load from a file, that atypical scenario may be appropriate.
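The attributes described in this section can be inspected on any imported module. The sketch below compares a package (email.mime) with a plain module inside it (email.mime.text); getattr() with a default is used for the optional attributes, since not every module or Python version sets all of them.

```python
import email.mime as mime_pkg
import email.mime.text as text_mod

for mod in (mime_pkg, text_mod):
    print(mod.__name__)
    print("  spec parent:", mod.__spec__.parent)
    print("  __package__:", getattr(mod, "__package__", None))
    print("  has __path__:", hasattr(mod, "__path__"))
    print("  __file__:", getattr(mod, "__file__", None))
    print("  __cached__:", getattr(mod, "__cached__", None))

# For a package the spec's parent is the package itself; for a submodule it
# is the enclosing package.
package_parent = mime_pkg.__spec__.parent
module_parent = text_mod.__spec__.parent
```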

5.4.5. module.__path__

By definition, if a module has a__path__ attribute, it is a package.

A package’s__path__ attribute is used during imports of its subpackages. Within the import machinery, it functions much the same assys.path, i.e. providing a list of locations to search for modules during import. However,__path__ is typically much more constrained thansys.path.

__path__ must be an iterable of strings, but it may be empty. The same rules used for sys.path also apply to a package’s __path__, and sys.path_hooks (described below) are consulted when traversing a package’s __path__.

A package’s __init__.py file may set or alter the package’s __path__ attribute, and this was typically the way namespace packages were implemented prior to PEP 420. With the adoption of PEP 420, namespace packages no longer need to supply __init__.py files containing only __path__ manipulation code; the import machinery automatically sets __path__ correctly for the namespace package.
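The automatic __path__ handling for PEP 420 namespace packages can be observed with two throwaway directories. This is a minimal sketch: the package name demo_ns_pkg, the portion directories, and the submodules alpha and beta are all hypothetical and are created in a temporary directory purely for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())

# Create two separate "portions" of the same namespace package.
# Neither directory contains an __init__.py file.
for portion, submodule in (("portion_a", "alpha"), ("portion_b", "beta")):
    pkg_dir = root / portion / "demo_ns_pkg"
    pkg_dir.mkdir(parents=True)
    (pkg_dir / f"{submodule}.py").write_text(f"NAME = {submodule!r}\n")
    sys.path.append(str(root / portion))

# Make sure the path finders notice the newly created directories.
importlib.invalidate_caches()

import demo_ns_pkg.alpha
import demo_ns_pkg.beta

# The import machinery built __path__ from both portions.
ns_locations = list(demo_ns_pkg.__path__)
print(ns_locations)
```

Both submodules import successfully even though they live in different directories, because the package's __path__ lists each portion.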

5.4.6. Module reprs

By default, all modules have a usable repr. However, depending on the attributes set above and in the module’s spec, you can more explicitly control the repr of module objects.

If the module has a spec (__spec__), the import machinery will try to generate a repr from it. If that fails or there is no spec, the import system will craft a default repr using whatever information is available on the module. It will try to use the module.__name__, module.__file__, and module.__loader__ as input into the repr, with defaults for whatever information is missing.

Here are the exact rules used:

  • If the module has a __spec__ attribute, the information in the spec is used to generate the repr. The “name”, “loader”, “origin”, and “has_location” attributes are consulted.

  • If the module has a __file__ attribute, this is used as part of the module’s repr.

  • If the module has no __file__ but does have a __loader__ that is not None, then the loader’s repr is used as part of the module’s repr.

  • Otherwise, just use the module’s __name__ in the repr.

Changed in version 3.4: Use of loader.module_repr() has been deprecated and the module spec is now used by the import machinery to generate a module repr.

For backward compatibility with Python 3.3, the module repr will be generated by calling the loader’s module_repr() method, if defined, before trying either approach described above. However, the method is deprecated.
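The repr rules can be tried out on hand-built module objects. The sketch below assumes a recent CPython 3 release; the module names, the file path, and the "database" origin are hypothetical values chosen only to trigger each rule.

```python
import importlib.machinery
import importlib.util
import types

# No spec, no __file__, no loader: only the name is available.
bare = types.ModuleType("bare_demo")
bare_repr = repr(bare)

# No spec, but __file__ is set: the file path becomes part of the repr.
with_file = types.ModuleType("file_demo")
with_file.__file__ = "/hypothetical/file_demo.py"
file_repr = repr(with_file)

# A spec with an origin but no location: the repr is generated from the spec.
spec = importlib.machinery.ModuleSpec("spec_demo", loader=None, origin="database")
from_spec = importlib.util.module_from_spec(spec)
spec_repr = repr(from_spec)

print(bare_repr)
print(file_repr)
print(spec_repr)
```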

5.4.7. Cached bytecode invalidation

Before Python loads cached bytecode from a .pyc file, it checks whether the cache is up-to-date with the source .py file. By default, Python does this by storing the source’s last-modified timestamp and size in the cache file when writing it. At runtime, the import system then validates the cache file by checking the stored metadata in the cache file against the source’s metadata.

Python also supports “hash-based” cache files, which store a hash of the source file’s contents rather than its metadata. There are two variants of hash-based .pyc files: checked and unchecked. For checked hash-based .pyc files, Python validates the cache file by hashing the source file and comparing the resulting hash with the hash in the cache file. If a checked hash-based cache file is found to be invalid, Python regenerates it and writes a new checked hash-based cache file. For unchecked hash-based .pyc files, Python simply assumes the cache file is valid if it exists. The validation behavior of hash-based .pyc files may be overridden with the --check-hash-based-pycs flag.

Changed in version 3.7: Added hash-based .pyc files. Previously, Python only supported timestamp-based invalidation of bytecode caches.
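The three invalidation modes can be compared by compiling the same source with py_compile and reading the flags field of each .pyc header. This is a sketch for Python 3.7 or later; it assumes the PEP 552 header layout (4-byte magic number followed by a 4-byte little-endian flags word), and the file names and the helper pyc_flags are made up for the example.

```python
import importlib.util
import py_compile
import tempfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())
source = workdir / "demo_mod.py"
source.write_text("VALUE = 1\n")


def pyc_flags(mode):
    """Compile `source` with the given invalidation mode and return the header flags."""
    target = workdir / f"demo_mod.{mode.name.lower()}.pyc"
    pyc_path = py_compile.compile(
        str(source), cfile=str(target), invalidation_mode=mode, doraise=True
    )
    header = Path(pyc_path).read_bytes()[:16]
    if header[:4] != importlib.util.MAGIC_NUMBER:
        raise ValueError("unexpected magic number in .pyc header")
    # Bit 0 set: hash-based file. Bit 1 set: the hash is checked against the source.
    return int.from_bytes(header[4:8], "little")


flags = {mode.name: pyc_flags(mode) for mode in py_compile.PycInvalidationMode}
print(flags)
```

A flags value of 0 marks a timestamp-based file, while hash-based files set the lowest bit and checked ones additionally set the second bit.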

namespace: a place where variables are stored. In Python, a namespace is represented by a dictionary.

module:

An object that serves as an organizational unit of Python code. Modules have a namespace containing arbitrary Python objects. Modules are loaded into Python by the process of importing.
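The relationship between a module and its namespace dictionary can be illustrated with a module object created by hand. This is a small sketch; the module name demo_namespace and the attributes stored in it are hypothetical.

```python
import types

demo = types.ModuleType("demo_namespace")

# Setting an attribute on the module stores it in the module's namespace.
demo.answer = 42

# vars() returns the namespace dictionary itself, not a copy.
namespace = vars(demo)
namespace["greeting"] = "hello"

same_object = namespace is demo.__dict__
print(demo.greeting, namespace["answer"], same_object)
```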

package:

A package is a special kind of module: any module that has a __path__ attribute is a package. Packages can contain other packages.

Python has two kinds of packages: regular packages and namespace packages.

regular package:

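Regular packages already existed in Python 3.2 and earlier and are typically implemented as a directory containing an __init__.py file. When a regular package is imported, its __init__.py file is implicitly executed, and the objects it defines are bound to names in the package’s namespace.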
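The implicit execution of __init__.py can be demonstrated with a package created on the fly. This is a minimal sketch; the package name demo_regular_pkg and the GREETING constant are hypothetical and live in a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_regular_pkg"
pkg_dir.mkdir()

# The __init__.py file is what makes this directory a regular package.
(pkg_dir / "__init__.py").write_text("GREETING = 'set by __init__.py'\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Importing the package runs __init__.py and binds its names on the package.
import demo_regular_pkg

print(demo_regular_pkg.GREETING)
print(list(demo_regular_pkg.__path__))
```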

namespace package:

A namespace package is a composite of various portions

  • Author: steveinchina
  • Original link: https://blog.csdn.net/u011239746/article/details/119903181
    Updated: 2022-10-11 09:06:37