Skip to content

Conversation

@loic-simon
Copy link
Contributor

@loic-simon loic-simon commented Sep 30, 2025

This PR handles an edge-case when suggesting module imports in PyREPL (see this comment):

If a module is already imported, the import machinery will only get submodules from it, even if a module of the same name is higher in the path. This can happen if eg. sys.path has been updated since the module was originaly imported.

The easiest way to reproduce this issue (and how I stumbled on it) is with stdlib modules imported during interpreter startup (before site customisation, so ignoring potential shadowing):

% mkdir collections
% touch collections/__init__.py collections/foo.py
% python
Python 3.14.0rc2 [...]
>>> from collections import <TAB>
>>> from collections import foo
Traceback (most recent call last):
  File "<python-input-0>", line 1, in <module>
    from collections import foo
ImportError: cannot import name 'foo' from 'collections' (/Users/loic/cpython/Lib/collections/__init__.py)
>>>

This PR add a special case in _pyrepl.ModuleCompleter._find_modules that

  • check if the module to complete imports from is already imported
  • if yes, search for imports from the imported module location only, instead of using the global cache

It was a little tricky to get right, I'm not sure about everything 😅 but it pass all tests I could think off!

cc @tomasr8

@loic-simon
Copy link
Contributor Author

Ok, one of the new test fails on Windows x64/32 (but pass on arm)

I just setup and built Python on a Win 11 x64 to try to reproduce, the test pass 😵‍💫 (and the fix works)
If anyone has an idea of what may interfere? (Python 3.15.0a0 (main, Sep 30 2025, 23:36:53) [MSC v.1944 64 bit (AMD64)] on win32)

@python-cla-bot
Copy link

python-cla-bot bot commented Oct 1, 2025

All commit authors signed the Contributor License Agreement.

CLA signed

@picnixz picnixz marked this pull request as draft October 3, 2025 13:21
@picnixz
Copy link
Member

picnixz commented Oct 3, 2025

I'm going to convert it into a draft until you deem this PR ready. That will also reduce the probability of someone clicking on the PR to review it (like me...) while it's not yet ready for.

@loic-simon
Copy link
Contributor Author

loic-simon commented Oct 3, 2025

I'm going to convert it into a draft until you deem this PR ready. That will also reduce the probability of someone clicking on the PR to review it (like me...) while it's not yet ready for.

Thanks, I just fixed the failing test, this should be ready for review!

Sorry again for the noise, I was dealing with what turned out to be a importer cache issue, which I couldn't reproduce outside of the CI buildbots 😓

[edit] I just noticed I can convert to draft / mark ready myself, will do it next times!

@loic-simon loic-simon marked this pull request as ready for review October 5, 2025 13:38
@tomasr8 tomasr8 self-requested a review December 28, 2025 15:29
if os.path.basename(imported_path) == "__init__.py": # package
imported_path = os.path.dirname(imported_path)
import_location = os.path.dirname(imported_path)
modules = list(pkgutil.iter_modules([import_location]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if this is correct, it's going to find submodules but we actually want this to be the top-level modules

Copy link
Contributor Author

@loic-simon loic-simon Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's OK, because os.path.dirname make us search in the folder containing the module origin, not on the origin itself (that's why we need to do it twice for a package):

>>> import os, pkgutil, typing, concurrent
>>>
>>> typing.__spec__.origin  # single-file module
'<venv>/lib/python3.14/typing.py'
>>> concurrent.__spec__.origin  # package
'<venv>/lib/python3.14/concurrent/__init__.py'
>>>
>>> loc = os.path.dirname(typing.__spec__.origin)  # or do it twice for concurrent
>>> [mod for mod in pkgutil.iter_modules([loc]) if mod.name in ("typing", "concurrent")]
[ModuleInfo(module_finder=FileFinder('<venv>/lib/python3.14'), name='concurrent', ispkg=True), 
ModuleInfo(module_finder=FileFinder('<venv>/lib/python3.14'), name='typing', ispkg=False)]

While refactoring this into a separate function I ended up rewriting the whole thing, it should be more explicit now (and it checks module.__package__ instead of the os.path.basename(imported_path) == "__init__.py" hack)

Copy link
Contributor Author

@loic-simon loic-simon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also realized we will need the same logic for my other PR about attributes completion, so I factored the logic into a private method, I suggest we merge this one first so I can rebase merge main into the other and call it!

EDIT: ok that's not true, since attributes completion checks directly the module object, I forgot 😅 but the refactor is a good thing anyway I think

if os.path.basename(imported_path) == "__init__.py": # package
imported_path = os.path.dirname(imported_path)
import_location = os.path.dirname(imported_path)
modules = list(pkgutil.iter_modules([import_location]))
Copy link
Contributor Author

@loic-simon loic-simon Jan 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe it's OK, because os.path.dirname make us search in the folder containing the module origin, not on the origin itself (that's why we need to do it twice for a package):

>>> import os, pkgutil, typing, concurrent
>>>
>>> typing.__spec__.origin  # single-file module
'<venv>/lib/python3.14/typing.py'
>>> concurrent.__spec__.origin  # package
'<venv>/lib/python3.14/concurrent/__init__.py'
>>>
>>> loc = os.path.dirname(typing.__spec__.origin)  # or do it twice for concurrent
>>> [mod for mod in pkgutil.iter_modules([loc]) if mod.name in ("typing", "concurrent")]
[ModuleInfo(module_finder=FileFinder('<venv>/lib/python3.14'), name='concurrent', ispkg=True), 
ModuleInfo(module_finder=FileFinder('<venv>/lib/python3.14'), name='typing', ispkg=False)]

While refactoring this into a separate function I ended up rewriting the whole thing, it should be more explicit now (and it checks module.__package__ instead of the os.path.basename(imported_path) == "__init__.py" hack)

@loic-simon loic-simon requested a review from tomasr8 January 1, 2026 18:14
return None
if not spec.has_location:
if spec.origin == "frozen": # See Tools/build/freeze_modules.py
return os.path.join(self._stdlib_path, f"{spec.name}.py")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not always going to be a real path right? e.g. _frozen_importlib.. in any case it does not seem that useful to create the full path when we strip it away with os.path.dirname right after.

Perhaps we can have a smaller diff if instead of reverse-engineering the location, we simply look for the right ModuleInfo in the global cache. After all if the module has already been imported, it should be visible to pkgutil.iter_modules. Something like this:

modules: Iterable[pkgutil.ModuleInfo] = self.global_cache
imported_module = sys.modules.get(path.split('.')[0])
if imported_module is not None:
    # Filter modules to those whose name and spec matches the imported module
    modules = ...

We can compare the module name and origin (maybe with something like mod_info.module_finder.find_spec(mod_info.name)) to make sure we have the right one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can compare the module name and origin (maybe with something like mod_info.module_finder.find_spec(mod_info.name)) to make sure we have the right one.

Clever! That will however provide no completions if the imported module exists, but is not the one returned by pkgutil.iter_modules anymore. But it's a quite degenerate case, and already half-supported because of the global_cache not updating... So I think having no suggestions in this case is fine, and the diff is much smaller indeed! I was probably a bit over zealous 😅

(frozen packages are also left out (__phello__.__spec__.origin == 'frozen' vs mod_info.find_spec('__phello__').origin == '<venv>/Lib/__phello__/__init__.py'), but that's an even more niche issue -- I'm not ever sure there is frozen packages except __phello__!)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will however provide no completions if the imported module exists, but is not the one returned by pkgutil.iter_modules anymore

I considered it, but I think that's an edge case not worth trying to work around..

frozen packages are also left out

damn, that's unfortunate, we might need some special handling for frozen modules then. Note that there's also another special origin for built-in modules:

>>> import builtins
>>> builtins.__spec__.origin
'built-in'

I'm not ever sure there is frozen packages except phello

There are quite a bit. Here's an example of those that are imported at startup:

>>> frozen = {name for name, m in sys.modules.items() if m.__spec__ and m.__spec__.origin == 'frozen'}
>>> frozen
{'posixpath', 'site', 'importlib._bootstrap_external', 'importlib._bootstrap', 'io', 'zipimport', '_sitebuiltins', '__phello__', '_frozen_importlib', 'collections.abc', '_collections_abc', 'os', 'stat', 'genericpath', 'abc', 'os.path', '_frozen_importlib_external', 'codecs', 'importlib.util', 'importlib.machinery'}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants