
Comparing changes

base repository: MinishLab/model2vec
base: v0.5.0
...
head repository: MinishLab/model2vec
compare: v0.6.0
  • 16 commits
  • 25 files changed
  • 4 contributors

Commits on May 3, 2025

  1. aef3c42

Commits on May 8, 2025

  1. 647fa80

Commits on May 14, 2025

  1. 562c14d

Commits on May 21, 2025

  1. 15207a8
  2. 3ac27c6

Commits on May 22, 2025

  1. feat: update lock (#246)

    * feat: update lock
    
    * pin transformers
    stephantul authored May 22, 2025
    eb3ac19
  2. feat: allow passing validation set explicitly (#245)

    * feat: allow passing validation set explicitly
    
    * fix: typing
    
    * fix: input validation
    
    * add tests, extra check, pre-commit
    
    ---------
    
    Co-authored-by: stephantul <[email protected]>
    JarbasAl and stephantul authored May 22, 2025
    af6f67c

Commits on May 23, 2025

  1. docs: Added multilingual results (#247)

    * Added multilingual results
    
    * Updated readme
    
    * Updated readme
    
    * Updated readme
    
    * Updated readme
    
    * Updated readme
    Pringled authored May 23, 2025
    a8e0edc

Commits on May 25, 2025

  1. 86d5378

Commits on May 26, 2025

  1. feat: add supertokenizers (#236)

    * remove multiword warning
    
    * add superbpe tokenizers
    
    * fix issue with mwe
    
    * form
    
    * working version
    
    * first pass
    
    * small fixes, many comments
    
    * fix e5 bug
    
    * Adjust arcane formulae
    
    * fix: logging
    
    * wip
    
    * wip
    
    * wip
    
    * lower complexity
    
    * add lock file
    
    * fix: metaspace pretokenizer
    
    * fix: bug in vocab
    
    * feat: spaces/commas etc.
    
    * turn tokenizer into package
    
    * add annotations
    
    * feat: turn tokenizer into package
    
    * fix: future
    
    * add tokenizer function
    
    * update lockfile
    
    * feat: improve segmentation of unigram
    
    * fix: broken merge
    
    * fix interpunct tokens
    
    * fix tests, make tokenizer changes better
    
    * update lock file
    
    * fix comment, add additional check for pad token
    
    * tests: add a lot of tests
    
    * fix: 3.9 error
    stephantul authored May 26, 2025
    80338f2

Commits on May 27, 2025

  1. e1bf00d
  2. 76985bd
  3. docs: Added new logo (#252)

    * Added new logo
    
    * Added new logo
    
    * Added new logo
    
    * Added new logo
    
    * Added new logo
    Pringled authored May 27, 2025
    22011b7

Commits on May 28, 2025

  1. fix: missing unk, fix bug (#251)

    * wip
    
    * fix: tokenizer bug
    
    * add correct scaling for byte
    stephantul authored May 28, 2025
    c4b8254

Commits on Jun 2, 2025

  1. bump version (#258)

    stephantul authored Jun 2, 2025
    a3c42a0

Commits on Jun 3, 2025

  1. 06a478c