-
Notifications
You must be signed in to change notification settings - Fork 327
Moving to a new masking approach #13341
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
✨ GUI Checks ResultsSummary
See individual check results for more details. |
Currently just a single continuous block.
…tem`. Java format.
Make `BuilderForType` be more type specific on `getType` and `seal`.
Fix in DataQualityMetrics which was toInt for no reason.
Update joins to long indexes. Workaround for the visitor - will revisit in next PR.
71957e1
to
d215ce8
Compare
First set of IndexMapper tests.
builder.append(item); | ||
yield size == 1 | ||
? builder.seal() | ||
: MaskOperation.getSlicedStorage(builder.seal(), new IndexMapper.Constant(size)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love it.
|
||
IndexMapper merge(IndexMapper other) { | ||
if (_mergeCache == null) { | ||
_mergeCache = new LeastRecentlyUsedCache<>(1000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might want a WeakLeastRecentlyUsedCache
derived from WeakHashMap
so collected elements don't take up slots that other elements count use.
std-bits/table/src/main/java/org/enso/table/data/column/operation/masks/IndexMapper.java
Show resolved
Hide resolved
*/ | ||
public static Column slice(Column column, long start, long length) { | ||
long currentSize = column.getSize(); | ||
long newSize = Math.max(0, Math.min(currentSize - start, length)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we shouldn't consider it an error to ask for a slice that is outside the original range -- we might tolerate this in the Enso code, but at this level it perhaps should be considered a logic error.
No significant decrease in performance on this PR but the last one does appear to have. Will look into what I can get back in later PRs. |
Pull Request Description
LongArrayList
allowing for initial capacity and copying to an array.Storage
(slice
,applyMask
andapplyFilter
).newInstance
andnewUnderlyingArray
fromSpecialisedStorage
as only used by these.iteratorWithIndex
from baseColumnStorage<?>
as unused. Specialised versions are still kept.Builder
interface return the builder allowing for a more fluid use.BuilderForType<T>
to use the typed argument more.makeEmpty
into theBuilder
class as a static method producing a masked single value storage.BoolBuilder.makeConstant
, use theBuilder
method instead.DoubleStorageFacade
insideNumericColumnAdapter
and privatised.StorageIterators.forEachOverBooleanStorage
as this was missing.ColumnStorageIterators
for Boolean, Long and Double storages.ComputedNullableLongStorage
.getData
methods onDoubleStorage
,LongStorage
andBoolStorage
togetArray
.IndexMapper
(similar to the oldOrderMask
) keeps a single mask layer.IndexMapper
.NOT_FOUND_INDEX
thatOrderMask
had.MaskOperation
:MaskedStorage
with anIndexMapper
.MaskedStorage
has a parent storage and index mapper.ColumnStorage
.SliceRange
to be long-based and to have a static method to produce along[]
array.Column
now usesColumnStorage<T>
notStorage<T>
.Column.applyFilter
removed - only used byTable.filter
and replaced byColumn.mask
.Column.applyMask
,Table.applyMask
andOrderMask
.slice
andmask
to use theMaskOperation
.Table.filter
to use theMaskOperation
.MultiValueKey
andMultiValueIndex
to have long row indexes.List<Integer>
will fix in next PR.JoinResult
to useIndexMapper
instead ofOrderMask
.CrossJoin
to have a progress handler.SortJoin
,CompoundHashJoin
andSimpleHashJoin
to be long-indexed.LookupJoin
to be long indexed.Column.reverse
andTable.reverse
usingIndexMapper.Reversed
.OrderBuilder.buildMask
to take a singleOrderRule
instead of aVector
.long[]
and then lets you callmask
on the column.Fan_Out
to use mask approach.Checklist
Please ensure that the following checklist has been satisfied before submitting the PR:
Scala,
Java,
TypeScript,
and
Rust
style guides. In case you are using a language not listed above, follow the Rust style guide.
or the Snowflake database integration, a run of the Extra Tests has been scheduled.