
problems found during the ws package upgrade that make ci error #7067

Open
@kryon-7

Description


To be honest, the CI failure is not related to ws itself; rather, randomDelayStorage has a problem handling concurrency. The issue existed before, but upgrading the ws library changed message-transmission timing and amplified the problem, making the failure far more likely to occur. There are many ways for it to surface. When I delete the other cases and run only the `doing insert after subscribe should end with the correct results` case, the probability of the problem occurring decreases... I don't know why.

    const c = await humansCollection.create(1); // inserts document 0 (awaited)
    let result = [];
    c.insert(schemaObjects.humanData()); // document 1 - do not await here!
    c.find().$.subscribe((r) => {
        console.log(`subscribe got ${JSON.stringify(r, null, 2)}`);
        result = r;
    });

    await c.insert(schemaObjects.humanData()); // document 2 (awaited)
    console.log('wait');
    await waitUntil(() => result.length === 3);

Label the three documents inserted by this code 0, 1, and 2 in insertion order. Document 0 is inserted inside the awaited create call, so it causes no problem.

The issue involves the inserts of documents 1 and 2 and the find operation with its subscription.

randomDelayStorage waits a random amount of time before actually touching the underlying memory storage. So even though the network data for the bulkWrite call is sent before the query call's network data, the server may still process the query first and read the current documents as [0]. The insert then executes very quickly, rapidly returning the changeStream event and the bulkWrite response to the client; only after sending those does the server send the query result back over the websocket. In this situation the client first receives the insert result, discards a changeStream event it should have kept, and then receives the query result [0]. The outcome depends on the relative processing speed of the insert and the query. It is even possible to end up with only [0, 1], with document 2 lost because its data was returned too quickly. These extreme interleavings can occur.
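The interleaving described above can be reduced to a deterministic sketch. Everything here is my own simplification (the `Msg` type and `applyMessages` are hypothetical, not RxDB internals): a client that only applies changeStream events after its initial query result has arrived will drop any event that outran that result on the websocket.

```typescript
// Hypothetical message shapes; real RxDB internals differ.
type Msg =
    | { kind: 'changeStream'; doc: number }
    | { kind: 'queryResult'; docs: number[] };

// Replay messages in arrival order and return the subscriber's final state.
function applyMessages(messages: Msg[]): number[] {
    let state: number[] | null = null; // null until the initial query result lands
    for (const m of messages) {
        if (m.kind === 'queryResult') {
            state = [...m.docs];
        } else if (state !== null) {
            state.push(m.doc); // events arriving before the query result are lost
        }
    }
    return state ?? [];
}

// Arrival order from the issue: the insert's packets outrun the query result.
const racy = applyMessages([
    { kind: 'changeStream', doc: 1 },   // dropped: no query result applied yet
    { kind: 'queryResult', docs: [0] }, // server snapshot was taken before the insert
]);
console.log(racy); // [0] - document 1 never reaches the subscriber
```

With the messages in the intended order (query result first, then the event), the same replay yields [0, 1], which is why the arrival order alone decides whether data goes missing.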

Solutions:

  1. Add read-write lock control to the storage so that read and write operations are atomic with respect to each other, guaranteeing data consistency. Performance will generally decrease, but consistency requires it.
  2. Accept in advance that this method still has extreme cases (maybe; I just want to share two feasible directions). When querying, first obtain a checkpoint, start reading the changeStream from it and cache the events, then execute the query; the query result carries the checkpoint data, and every changeStream event after the checkpoint is applied on top of the result. Issues remain because fetching the query result and fetching the checkpoint are not atomic, so concurrency problems are still possible. However, since create, update, and delete are essentially overwrites of the doc, making the apply step idempotent is not difficult, and redundant data is much simpler to handle than missing data.
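A minimal sketch of solution 1, assuming a promise-chain mutex wrapped around every storage call (the `StorageMutex` class and `demo` function are my own illustration, not RxDB's API): operations execute strictly in submission order, so a query can never be reordered ahead of an earlier bulkWrite, regardless of each call's random internal delay.

```typescript
// Hypothetical promise-chain mutex; not part of RxDB.
class StorageMutex {
    private tail: Promise<unknown> = Promise.resolve();

    // Queue fn behind every previously submitted task and return its result.
    run<T>(fn: () => Promise<T>): Promise<T> {
        const next = this.tail.then(fn);
        this.tail = next.catch(() => undefined); // keep the chain alive on errors
        return next;
    }
}

const delay = (ms: number) => new Promise<void>(r => setTimeout(r, ms));

// The write has a larger random delay than the read, yet the mutex
// still forces the bulkWrite to finish before the query runs.
async function demo(): Promise<string[]> {
    const mutex = new StorageMutex();
    const log: string[] = [];
    const write = mutex.run(async () => { await delay(30); log.push('bulkWrite'); });
    const read = mutex.run(async () => { await delay(1); log.push('query'); });
    await Promise.all([write, read]);
    return log; // ['bulkWrite', 'query'] regardless of the delays
}
```

A single mutex serializes everything, which is the performance cost mentioned above; a real read-write lock would additionally let concurrent reads run in parallel.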

The first solution is simple and should not cause problems. For the second, I'm not sure whether it is feasible or which extreme cases need to be considered. Also, since I'm new to rxdb and have only read a small portion of the source code, I don't know how much of it would need to change. I'd appreciate some advice on this.
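To illustrate the idempotency argument behind solution 2, here is a sketch under assumed shapes (the `DocEvent` interface with its `rev` field and `applyEvent` are mine; RxDB's real event format differs): because each event carries the doc's full state, applying a duplicate or stale event is a no-op, so redundant changeStream data after the checkpoint is harmless.

```typescript
// Hypothetical event shape: full doc state plus a monotonically increasing revision.
interface DocEvent {
    id: string;
    rev: number;
    data: { name: string };
}

// Last-revision-wins overwrite: duplicate and out-of-date events are no-ops.
function applyEvent(state: Map<string, DocEvent>, ev: DocEvent): void {
    const current = state.get(ev.id);
    if (!current || ev.rev >= current.rev) {
        state.set(ev.id, { ...ev });
    }
}

const state = new Map<string, DocEvent>();
const ev = { id: 'doc-1', rev: 1, data: { name: 'a' } };
applyEvent(state, ev);
applyEvent(state, ev); // duplicate delivery: state is unchanged
console.log(state.size); // 1
```

This is why redundant data is easier to handle than missing data: a second delivery converges to the same state, while a dropped event can never be recovered locally.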
