Skip to content

Fix race conditions involving this.options being overwritten during execution in async mode #489

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Feb 15, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 3 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -177,10 +177,10 @@ To customize the notion of token equality used, use the `comparator` option to `

For even more customisation of the diffing behavior, you can create a `new Diff.Diff()` object, overwrite its `castInput`, `tokenize`, `removeEmpty`, `equals`, and `join` properties with your own functions, then call its `diff(oldString, newString[, options])` method. The methods you can overwrite are used as follows:

* `castInput(value)`: used to transform the `oldString` and `newString` before any other steps in the diffing algorithm happen. For instance, `diffJson` uses `castInput` to serialize the objects being diffed to JSON. Defaults to a no-op.
* `tokenize(value)`: used to convert each of `oldString` and `newString` (after they've gone through `castInput`) to an array of tokens. Defaults to returning `value.split('')` (returning an array of individual characters).
* `castInput(value, options)`: used to transform the `oldString` and `newString` before any other steps in the diffing algorithm happen. For instance, `diffJson` uses `castInput` to serialize the objects being diffed to JSON. Defaults to a no-op.
* `tokenize(value, options)`: used to convert each of `oldString` and `newString` (after they've gone through `castInput`) to an array of tokens. Defaults to returning `value.split('')` (returning an array of individual characters).
* `removeEmpty(array)`: called on the arrays of tokens returned by `tokenize` and can be used to modify them. Defaults to stripping out falsey tokens, such as empty strings. `diffArrays` overrides this to simply return the `array`, which means that falsey values like empty strings can be handled like any other token by `diffArrays`.
* `equals(left, right)`: called to determine if two tokens (one from the old string, one from the new string) should be considered equal. Defaults to comparing them with `===`.
* `equals(left, right, options)`: called to determine if two tokens (one from the old string, one from the new string) should be considered equal. Defaults to comparing them with `===`.
* `join(tokens)`: gets called with an array of consecutive tokens that have either all been added, all been removed, or are all common. Needs to join them into a single value that can be used as the `value` property of the [change object](#change-objects) for these tokens. Defaults to simply returning `tokens.join('')`.

### Change Objects
Expand Down
1 change: 1 addition & 0 deletions release-notes.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@
- [#480](https://github.com/kpdecker/jsdiff/pull/480) Passing `maxEditLength` to `createPatch` & `createTwoFilesPatch` now works properly (i.e. returns undefined if the max edit distance is exceeded; previous behavior was to crash with a `TypeError` if the edit distance was exceeded).
- [#486](https://github.com/kpdecker/jsdiff/pull/486) The `ignoreWhitespace` option of `diffLines` behaves more sensibly now. `value`s in returned change objects now include leading/trailing whitespace even when `ignoreWhitespace` is used, just like how with `ignoreCase` the `value`s still reflect the case of one of the original texts instead of being all-lowercase. `ignoreWhitespace` is also now compatible with `newlineIsToken`. Finally, `diffTrimmedLines` is deprecated (and removed from the docs) in favour of using `diffLines` with `ignoreWhitespace: true`; the two are, and always have been, equivalent.
- [#490](https://github.com/kpdecker/jsdiff/pull/490) When calling diffing functions in async mode by passing a `callback` option, the diff result will now be passed as the *first* argument to the callback instead of the second. (Previously, the first argument was never used at all and would always have value `undefined`.)
- [#489](github.com/kpdecker/jsdiff/pull/489) `this.options` no longer exists on `Diff` objects. Instead, `options` is now passed as an argument to methods that rely on options, like `equals(left, right, options)`. This fixes a race condition in async mode, where diffing behaviour could be changed mid-execution if a concurrent usage of the same `Diff` instances overwrote its `options`.

## v5.2.0

Expand Down
47 changes: 23 additions & 24 deletions src/diff/base.js
Original file line number Diff line number Diff line change
Expand Up @@ -7,7 +7,6 @@ Diff.prototype = {
callback = options;
options = {};
}
this.options = options;

let self = this;

Expand All @@ -21,11 +20,11 @@ Diff.prototype = {
}

// Allow subclasses to massage the input prior to running
oldString = this.castInput(oldString);
newString = this.castInput(newString);
oldString = this.castInput(oldString, options);
newString = this.castInput(newString, options);

oldString = this.removeEmpty(this.tokenize(oldString));
newString = this.removeEmpty(this.tokenize(newString));
oldString = this.removeEmpty(this.tokenize(oldString, options));
newString = this.removeEmpty(this.tokenize(newString, options));

let newLen = newString.length, oldLen = oldString.length;
let editLength = 1;
Expand All @@ -39,10 +38,10 @@ Diff.prototype = {
let bestPath = [{ oldPos: -1, lastComponent: undefined }];

// Seed editLength = 0, i.e. the content starts with the same values
let newPos = this.extractCommon(bestPath[0], newString, oldString, 0);
let newPos = this.extractCommon(bestPath[0], newString, oldString, 0, options);
if (bestPath[0].oldPos + 1 >= oldLen && newPos + 1 >= newLen) {
// Identity per the equality and tokenizer
return done(buildValues(self, bestPath[0].lastComponent, newString, oldString, self.useLongestToken));
return done(buildValues(self, bestPath[0].lastComponent, newString, oldString, self.useLongestToken, options));
}

// Once we hit the right edge of the edit graph on some diagonal k, we can
Expand Down Expand Up @@ -97,16 +96,16 @@ Diff.prototype = {
// path whose position in the old string is the farthest from the origin
// and does not pass the bounds of the diff graph
if (!canRemove || (canAdd && removePath.oldPos < addPath.oldPos)) {
basePath = self.addToPath(addPath, true, false, 0);
basePath = self.addToPath(addPath, true, false, 0, options);
} else {
basePath = self.addToPath(removePath, false, true, 1);
basePath = self.addToPath(removePath, false, true, 1, options);
}

newPos = self.extractCommon(basePath, newString, oldString, diagonalPath);
newPos = self.extractCommon(basePath, newString, oldString, diagonalPath, options);

if (basePath.oldPos + 1 >= oldLen && newPos + 1 >= newLen) {
// If we have hit the end of both strings, then we are done
return done(buildValues(self, basePath.lastComponent, newString, oldString, self.useLongestToken));
return done(buildValues(self, basePath.lastComponent, newString, oldString, self.useLongestToken, options));
} else {
bestPath[diagonalPath] = basePath;
if (basePath.oldPos + 1 >= oldLen) {
Expand Down Expand Up @@ -147,9 +146,9 @@ Diff.prototype = {
}
},

addToPath(path, added, removed, oldPosInc) {
addToPath(path, added, removed, oldPosInc, options) {
let last = path.lastComponent;
if (last && !this.options.oneChangePerToken && last.added === added && last.removed === removed) {
if (last && !options.oneChangePerToken && last.added === added && last.removed === removed) {
return {
oldPos: path.oldPos + oldPosInc,
lastComponent: {count: last.count + 1, added: added, removed: removed, previousComponent: last.previousComponent }
Expand All @@ -161,36 +160,36 @@ Diff.prototype = {
};
}
},
extractCommon(basePath, newString, oldString, diagonalPath) {
extractCommon(basePath, newString, oldString, diagonalPath, options) {
let newLen = newString.length,
oldLen = oldString.length,
oldPos = basePath.oldPos,
newPos = oldPos - diagonalPath,

commonCount = 0;
while (newPos + 1 < newLen && oldPos + 1 < oldLen && this.equals(oldString[oldPos + 1], newString[newPos + 1])) {
while (newPos + 1 < newLen && oldPos + 1 < oldLen && this.equals(oldString[oldPos + 1], newString[newPos + 1], options)) {
newPos++;
oldPos++;
commonCount++;
if (this.options.oneChangePerToken) {
if (options.oneChangePerToken) {
basePath.lastComponent = {count: 1, previousComponent: basePath.lastComponent, added: false, removed: false};
}
}

if (commonCount && !this.options.oneChangePerToken) {
if (commonCount && !options.oneChangePerToken) {
basePath.lastComponent = {count: commonCount, previousComponent: basePath.lastComponent, added: false, removed: false};
}

basePath.oldPos = oldPos;
return newPos;
},

equals(left, right) {
if (this.options.comparator) {
return this.options.comparator(left, right);
equals(left, right, options) {
if (options.comparator) {
return options.comparator(left, right);
} else {
return left === right
|| (this.options.ignoreCase && left.toLowerCase() === right.toLowerCase());
|| (options.ignoreCase && left.toLowerCase() === right.toLowerCase());
}
},
removeEmpty(array) {
Expand All @@ -213,7 +212,7 @@ Diff.prototype = {
}
};

function buildValues(diff, lastComponent, newString, oldString, useLongestToken) {
function buildValues(diff, lastComponent, newString, oldString, useLongestToken, options) {
// First we convert our linked list of components in reverse order to an
// array in the right order:
const components = [];
Expand Down Expand Up @@ -265,9 +264,9 @@ function buildValues(diff, lastComponent, newString, oldString, useLongestToken)
componentLen > 1
&& typeof finalComponent.value === 'string'
&& (
(finalComponent.added && diff.equals('', finalComponent.value))
(finalComponent.added && diff.equals('', finalComponent.value, options))
||
(finalComponent.removed && diff.equals(finalComponent.value, ''))
(finalComponent.removed && diff.equals(finalComponent.value, '', options))
)
) {
components[componentLen - 2].value += finalComponent.value;
Expand Down
8 changes: 4 additions & 4 deletions src/diff/json.js
Original file line number Diff line number Diff line change
Expand Up @@ -10,13 +10,13 @@ export const jsonDiff = new Diff();
jsonDiff.useLongestToken = true;

jsonDiff.tokenize = lineDiff.tokenize;
jsonDiff.castInput = function(value) {
const {undefinedReplacement, stringifyReplacer = (k, v) => typeof v === 'undefined' ? undefinedReplacement : v} = this.options;
jsonDiff.castInput = function(value, options) {
const {undefinedReplacement, stringifyReplacer = (k, v) => typeof v === 'undefined' ? undefinedReplacement : v} = options;

return typeof value === 'string' ? value : JSON.stringify(canonicalize(value, null, null, stringifyReplacer), stringifyReplacer, ' ');
};
jsonDiff.equals = function(left, right) {
return Diff.prototype.equals.call(jsonDiff, left.replace(/,([\r\n])/g, '$1'), right.replace(/,([\r\n])/g, '$1'));
jsonDiff.equals = function(left, right, options) {
return Diff.prototype.equals.call(jsonDiff, left.replace(/,([\r\n])/g, '$1'), right.replace(/,([\r\n])/g, '$1'), options);
};

export function diffJson(oldObj, newObj, options) { return jsonDiff.diff(oldObj, newObj, options); }
Expand Down
16 changes: 8 additions & 8 deletions src/diff/line.js
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,8 @@ import Diff from './base';
import {generateOptions} from '../util/params';

export const lineDiff = new Diff();
lineDiff.tokenize = function(value) {
if(this.options.stripTrailingCr) {
lineDiff.tokenize = function(value, options) {
if(options.stripTrailingCr) {
// remove one \r before \n to match GNU diff's --strip-trailing-cr behavior
value = value.replace(/\r\n/g, '\n');
}
Expand All @@ -20,7 +20,7 @@ lineDiff.tokenize = function(value) {
for (let i = 0; i < linesAndNewlines.length; i++) {
let line = linesAndNewlines[i];

if (i % 2 && !this.options.newlineIsToken) {
if (i % 2 && !options.newlineIsToken) {
retLines[retLines.length - 1] += line;
} else {
retLines.push(line);
Expand All @@ -30,23 +30,23 @@ lineDiff.tokenize = function(value) {
return retLines;
};

lineDiff.equals = function(left, right) {
lineDiff.equals = function(left, right, options) {
// If we're ignoring whitespace, we need to normalise lines by stripping
// whitespace before checking equality. (This has an annoying interaction
// with newlineIsToken that requires special handling: if newlines get their
// own token, then we DON'T want to trim the *newline* tokens down to empty
// strings, since this would cause us to treat whitespace-only line content
// as equal to a separator between lines, which would be weird and
// inconsistent with the documented behavior of the options.)
if (this.options.ignoreWhitespace) {
if (!this.options.newlineIsToken || !left.includes('\n')) {
if (options.ignoreWhitespace) {
if (!options.newlineIsToken || !left.includes('\n')) {
left = left.trim();
}
if (!this.options.newlineIsToken || !right.includes('\n')) {
if (!options.newlineIsToken || !right.includes('\n')) {
right = right.trim();
}
}
return Diff.prototype.equals.call(this, left, right);
return Diff.prototype.equals.call(this, left, right, options);
};

export function diffLines(oldStr, newStr, callback) { return lineDiff.diff(oldStr, newStr, callback); }
Expand Down
6 changes: 3 additions & 3 deletions src/diff/word.js
Original file line number Diff line number Diff line change
Expand Up @@ -24,12 +24,12 @@ const extendedWordChars = /^[a-zA-Z\u{C0}-\u{FF}\u{D8}-\u{F6}\u{F8}-\u{2C6}\u{2C
const reWhitespace = /\S/;

export const wordDiff = new Diff();
wordDiff.equals = function(left, right) {
if (this.options.ignoreCase) {
wordDiff.equals = function(left, right, options) {
if (options.ignoreCase) {
left = left.toLowerCase();
right = right.toLowerCase();
}
return left === right || (this.options.ignoreWhitespace && !reWhitespace.test(left) && !reWhitespace.test(right));
return left === right || (options.ignoreWhitespace && !reWhitespace.test(left) && !reWhitespace.test(right));
};
wordDiff.tokenize = function(value) {
// All whitespace symbols except newline group into one token, each newline - in separate token
Expand Down
14 changes: 14 additions & 0 deletions test/diff/character.js
Original file line number Diff line number Diff line change
Expand Up @@ -38,5 +38,19 @@ describe('diff/character', function() {
expect(convertChangesToXML(diffResult)).to.equal('New value<del>s</del>.');
});
});

it('should not be susceptible to race conditions in async mode when called with different options', function(done) {
// (regression test for https://github.com/kpdecker/jsdiff/issues/477)
diffChars('wibblywobbly', 'WIBBLYWOBBLY', {ignoreCase: false, callback: (diffResult) => {
expect(convertChangesToXML(diffResult)).to.equal('<del>wibblywobbly</del><ins>WIBBLYWOBBLY</ins>');
done();
}});

// Historically, doing this while async execution of the previous
// diffChars call was ongoing would overwrite this.options and make the
// ongoing diff become case-insensitive partway through execution.
diffChars('whatever', 'whatever', {ignoreCase: true});
diffChars('whatever', 'whatever', {ignoreCase: true, callback: () => {}});
});
});
});