The Wayback Machine - https://web.archive.org/web/20220503141909/https://github.com/msgpack/msgpack-ruby/pull/244
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support serializing bigint as extension types #244

Merged
merged 1 commit into from Feb 22, 2022

Conversation

Copy link

@casperisfine casperisfine commented Jan 27, 2022

Fix: #243

factory = MessagePack::Factory.new
factory.register_type(
  0x42,
  Integer,
  packer: ->(int) { int.to_s },
  unpacker: -> (payload) { Integer(payload) }
)

factory.dump(42) # Use the native integer type
factory.dump(2**400) # Use the registered packer

@tagomoris
Copy link

@tagomoris tagomoris commented Jan 28, 2022

The motivation about BigInt is acceptable for me.
But it's not clear to me why Integer is registered as an ext type but 42 should not be encoded into an ext type binary. Can you explain it clearly to users of msgpack-ruby?

In other words, 2 ** 400 is an Integer, (2**400).is_a?(Integer) is true and (2**400).class is Integer. How can we know whether it should be encoded as int or ext 0x42 (in your sample code)?

@casperisfine
Copy link
Author

@casperisfine casperisfine commented Jan 28, 2022

But it's not clear to me why Integer is registered as an ext type but 42 should not be encoded into an ext type binary. Can you explain it clearly to users of msgpack-ruby?

Of course. The rationale is that using the native type when the integer fits in it is:

  • more performant.
  • backward compatible.

IMO extension types only make sense if the data doesn't fit a native type, and I see no benefit into being able to replace native types.

We could however have it has an option, but again, I'd rather not do it unless someone see a use case for it (I don't).

@casperisfine casperisfine force-pushed the max-big-num branch 3 times, most recently from 7cdfe71 to 79e4fdc Compare Jan 28, 2022
@tagomoris
Copy link

@tagomoris tagomoris commented Feb 8, 2022

@casperisfine Could you rebase this change too?
(Sorry, I'll see other pull-requests tomorrow or later)

@casperisfine
Copy link
Author

@casperisfine casperisfine commented Feb 8, 2022

Rebased.

(Sorry, I'll see other pull-requests tomorrow or later)

No need to apologize, thanks for all the reviews!

Copy link
Author

@casperisfine casperisfine left a comment

@tagomoris I updated this PR with your comments from #243

if (hasBigintExtType && tryAppendWithExtTypeLookup(object)) {
return;
}
throw runtime.newRangeError(String.format("Cannot pack big integer: %s", value));
Copy link
Author

@casperisfine casperisfine Feb 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The MRI extension raises RangeError on bigint, but the JRuby one ArgumentError. I assume it is a bug.

@@ -209,6 +211,10 @@ static VALUE Factory_register_type(int argc, VALUE* argv, VALUE self)
}
}

if (ext_module == rb_cInteger && RB_TEST(rb_hash_aref(options, ID2SYM(rb_intern("bigint"))))) {
Copy link
Author

@casperisfine casperisfine Feb 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No idea what's the best key to pass, so I went with bigint: true, but I'm open to suggestions.

# frozen_string_literal: true

module MessagePack
module Bigint
Copy link
Author

@casperisfine casperisfine Feb 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to provide a decent BigInt packer

@casperisfine
Copy link
Author

@casperisfine casperisfine commented Feb 15, 2022

Option renamed as oversized_integer_extension: true

Copy link
Member

@tagomoris tagomoris left a comment

Could you change to ... ?:

  • raise ArgumentError if oversized_integer_extension: true is specified with non-Integer class
  • not to require msgpack/bigint in default because its serialization format is not any agreed format - it should be required by users explicitly

if(required_size > 8 && pk->has_bigint_ext_type) {
if(msgpack_packer_try_write_with_ext_type_lookup(pk, v)) {
return;
}
Copy link
Member

@tagomoris tagomoris Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this code let msgpack_packer_write_u64 raise RangeError unless pk->has_bigint_ext_type, right?
If so, I want a comment to explain it here.

Copy link
Author

@casperisfine casperisfine Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I wrote a spec for it, that's how I noticed JRuby was inconsistent.

Copy link
Member

@tagomoris tagomoris Feb 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, sorry, I want a code comment at just after L482

@casperisfine
Copy link
Author

@casperisfine casperisfine commented Feb 15, 2022

@tagomoris it should be good now.

Copy link
Member

@tagomoris tagomoris left a comment

LGTM!

I've created a commit to minimize if-clause for older rubies as small as possible.
WDYT? 7f586b4

@casperisfine
Copy link
Author

@casperisfine casperisfine commented Feb 22, 2022

WDYT?

It cause two methods to be defined instead of one. Not the end of the world, but seems unnecessary. But if you really don't like def inside if, it's up to you.

@tagomoris
Copy link

@tagomoris tagomoris commented Feb 22, 2022

Using define_method with Proc is alternative. Hmm.

@casperisfine
Copy link
Author

@casperisfine casperisfine commented Feb 22, 2022

I'd rather not, define_method create different ISeq that are slower than regular ones defined with def.

@casperisfine
Copy link
Author

@casperisfine casperisfine commented Feb 22, 2022

Is this only a stylistic issue? Because this pattern is fairly common.

@casperisfine
Copy link
Author

@casperisfine casperisfine commented Feb 22, 2022

define_method create different ISeq that are slower than regular ones defined with def.

You can search for the bmethod term (as in block methods)

@tagomoris
Copy link

@tagomoris tagomoris commented Feb 22, 2022

I want to decrease points to confuse myself in the future. def in if makes me confused when I check code without memory of this change (for code about Ruby 2.6, will be EoL next month!)

@casperisfine
Copy link
Author

@casperisfine casperisfine commented Feb 22, 2022

Then maybe some extra commenting could help?

@tagomoris
Copy link

@tagomoris tagomoris commented Feb 22, 2022

Could you write the implementation for the current Ruby versions at first (with appropriate code comments)?
Then, the compatibility follows.
I think it should decrease my confusion.

@casperisfine
Copy link
Author

@casperisfine casperisfine commented Feb 22, 2022

Of course.

Fix: msgpack#243

```ruby
factory = MessagePack::Factory.new
factory.register_type(
  0x42,
  Integer,
  packer: ->(int) { int.to_s },
  unpacker: -> (payload) { Integer(payload) }
)

factory.dump(42) # Use the native integer type
factory.dump(2**400) # Use the registered packer
```
@casperisfine
Copy link
Author

@casperisfine casperisfine commented Feb 22, 2022

Done.

Copy link
Member

@tagomoris tagomoris left a comment

LGTM. Thanks!

@tagomoris tagomoris merged commit 9aa71da into msgpack:master Feb 22, 2022
23 checks passed
@casperisfine
Copy link
Author

@casperisfine casperisfine commented Feb 22, 2022

Thanks for the merge!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants