There is quite a lot of redundancy in code, function, and generator objects.
Intra-object redundancy
Code objects have four fields: co_nlocalsplus, co_nplaincellvars, co_nlocals, and co_nfreevars. Any one of these fields can be computed from the other three.
Code objects have a qualified name and a name. The name is always the suffix of the qualified name. Storing a qualifying prefix and a name instead would save space and allow sharing.
The defaults and keyword defaults for a function are separate and the keyword defaults are a dict. They could be combined into a single array.
Generator objects have a gi_code field, which is redundant, as the frame contains a reference to the code object.
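As a rough illustration of the points above, here is a minimal Python sketch. The counts are C struct fields and, apart from co_nlocals, are not exposed as Python attributes, so the helper field_counts (a hypothetical name) rebuilds them from co_varnames, co_cellvars and co_freevars. It assumes that a "plain" cell variable is a cell that is not also a local (that is, not a captured argument). The functions outer, inner, gen and f are made up for the example.

```python
import types


def field_counts(code: types.CodeType) -> dict:
    """Approximate the four C-level counts from public code attributes."""
    nlocals = code.co_nlocals
    # Assumption: "plain" cells are cell variables that are not also locals.
    nplaincellvars = len(set(code.co_cellvars) - set(code.co_varnames))
    nfreevars = len(code.co_freevars)
    return {
        "nlocals": nlocals,
        "nplaincellvars": nplaincellvars,
        "nfreevars": nfreevars,
        # The fourth count is the sum of the other three.
        "nlocalsplus": nlocals + nplaincellvars + nfreevars,
    }


def outer(a, b):
    c = a + b  # ordinary local
    d = a * b  # captured by inner, so a plain cell

    def inner():
        return b + d  # b is a captured argument, d a captured local

    return inner


counts = field_counts(outer.__code__)
# Any one count follows from the other three.
derived_plain = counts["nlocalsplus"] - counts["nlocals"] - counts["nfreevars"]

# The name is a suffix of the qualified name, so a prefix plus the name suffices.
inner = outer(1, 2)
prefix = inner.__qualname__[: -len(inner.__name__)]


def gen():
    yield 1


g = gen()
# While the generator is not finished, its frame already refers to the code object.
same_code = g.gi_code is g.gi_frame.f_code


def f(x, y=1, *, z=2):
    return x + y + z


# Positional defaults are a tuple; keyword-only defaults are a separate dict.
positional_defaults = f.__defaults__
keyword_defaults = f.__kwdefaults__
```

Note that gi_frame becomes None once a generator is exhausted, so the identity check only applies to a live generator.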
Inter-object redundancy
Functions and generators have qualified name and name fields, which are almost always the same as those of the underlying code object. These should be lazily initialized.
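One way to picture lazy initialization is the sketch below. LazyNames is a hypothetical wrapper, not how CPython implements functions: it stores a name only when one is explicitly assigned and otherwise falls back to the code object. The qualname property assumes Python 3.11 or later, where code objects expose co_qualname.

```python
class LazyNames:
    """Hypothetical holder that defers to the code object until overridden."""

    __slots__ = ("code", "_name", "_qualname")

    def __init__(self, code):
        self.code = code
        self._name = None      # None means "same as the code object"
        self._qualname = None

    @property
    def name(self):
        return self.code.co_name if self._name is None else self._name

    @name.setter
    def name(self, value):
        self._name = value

    @property
    def qualname(self):
        # Assumes Python 3.11+, where co_qualname exists on code objects.
        return self.code.co_qualname if self._qualname is None else self._qualname

    @qualname.setter
    def qualname(self, value):
        self._qualname = value


def make():
    def helper():
        return None

    return helper


func = make()
lazy = LazyNames(func.__code__)

# In the common case nothing extra is stored and the names match the function's.
default_name = lazy.name

# Only an explicit rename needs its own storage.
renamed = LazyNames(func.__code__)
renamed.name = "custom"
```

In the common case the two per-object slots stay empty, which is where the space saving would come from.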
I have no opinion on whether these changes are good or bad, but they are surely breaking changes. Code objects aren't well documented, but they're not private implementation details. How do you plan to manage these changes?
markshannon commented Jan 3, 2023
Linked PRs
co_nplaincellvars field from code objects. #100721
gi_code field from generator object. #100749