Bug 203898 - Add FuzzerAgent that reads predictions from a file
Summary: Add FuzzerAgent that reads predictions from a file
Status: RESOLVED FIXED
Alias: None
Product: WebKit
Classification: Unclassified
Component: JavaScriptCore (show other bugs)
Version: Other
Hardware: Unspecified Unspecified
: P2 Normal
Assignee: Nobody
URL:
Keywords: InRadar
Depends on:
Blocks:
 
Reported: 2019-11-06 06:23 PST by Tuomas Karkkainen
Modified: 2019-12-01 22:45 PST (History)
9 users (show)

See Also:


Attachments
proposed patch (26.88 KB, patch)
2019-11-06 06:28 PST, Tuomas Karkkainen
sbarati: review-
sbarati: commit-queue-
Details | Formatted Diff | Diff
update patch (28.50 KB, patch)
2019-11-13 06:28 PST, Tuomas Karkkainen
no flags Details | Formatted Diff | Diff
proposed patch (42.33 KB, patch)
2019-11-13 23:52 PST, Tuomas Karkkainen
sbarati: review+
Details | Formatted Diff | Diff
proposed patch (42.89 KB, patch)
2019-11-19 07:04 PST, Tuomas Karkkainen
no flags Details | Formatted Diff | Diff
proposed patch (42.61 KB, patch)
2019-11-26 04:57 PST, Tuomas Karkkainen
sbarati: review+
sbarati: commit-queue-
Details | Formatted Diff | Diff
proposed patch (41.99 KB, patch)
2019-12-01 21:22 PST, Tuomas Karkkainen
no flags Details | Formatted Diff | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Tuomas Karkkainen 2019-11-06 06:23:21 PST
Add FuzzerAgent that reads predictions from a file
Comment 1 Tuomas Karkkainen 2019-11-06 06:28:53 PST
Created attachment 382922 [details]
proposed patch
Comment 2 Saam Barati 2019-11-07 14:11:06 PST
Comment on attachment 382922 [details]
proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=382922&action=review

> Source/JavaScriptCore/runtime/FileBasedFuzzerAgent.cpp:59
> +    BytecodeIndex bytecodeIndex = codeOrigin.bytecodeIndex();
> +    codeBlock->expressionRangeForBytecodeIndex(bytecodeIndex, divot, startOffset, endOffset, line, column);

I think you need to handle inline call frames here, right?
Comment 3 Saam Barati 2019-11-07 14:21:28 PST
Comment on attachment 382922 [details]
proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=382922&action=review

> Source/JavaScriptCore/ChangeLog:7
> +

you should describe the details of what this patch is doing here.

> Source/JavaScriptCore/runtime/FileBasedFuzzerAgent.cpp:167
> +String FileBasedFuzzerAgent::createLookupKey(const String& sourceFilename, OpcodeID opcodeId, int startLocation, int endLocation)
> +{
> +    StringBuilder lookupKey;
> +    lookupKey.append(sourceFilename);
> +    lookupKey.append("|");
> +    lookupKey.append(toString(opcodeAliasForLookupKey(opcodeId)));
> +    lookupKey.append("|");
> +    lookupKey.append(startLocation);
> +    lookupKey.append("|");
> +    lookupKey.append(endLocation);
> +    return lookupKey.toString();
> +}

nit: This is slightly hokey. We tend to define a struct for hash map keys, and use a custom hash and equality function. Turning it into a String is convenient, but slightly weird. I guess since we don't care about perf, maybe this is ok.

> Source/JavaScriptCore/runtime/FuzzerPredictions.cpp:50
> +    while ((line = fgets(buffer, sizeof(buffer), f))) {
> +        SpeculatedType prediction;
> +        char lookupKey[BUFSIZ];
> +        int numberOfItemsFound = sscanf(line, "%[^:]:%llx\n", lookupKey, &prediction);
> +        RELEASE_ASSERT(numberOfItemsFound == 2);
> +        const String& functionNameString = String(lookupKey, strlen(lookupKey));
> +        m_predictions.set(functionNameString, prediction);
> +    }

can you comment on the format of this file?

Also, how do you produce this file?

> Source/JavaScriptCore/runtime/FuzzerPredictions.cpp:61
> +        RELEASE_ASSERT((predicted & SpecFullTop) != SpecNone);

why can't we sometimes produce SpecNone?

> Source/JavaScriptCore/runtime/FuzzerPredictions.h:32
> +#define NO_PREDICTION_FOUND 1ull << 55

Why does this work? This seems slightly wrong.

Maybe we should just fall back to random fuzzer when we don't have the key?
Comment 4 Tuomas Karkkainen 2019-11-07 21:57:14 PST
(In reply to Saam Barati from comment #3)
> Comment on attachment 382922 [details]
> proposed patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=382922&action=review
> > Source/JavaScriptCore/runtime/FuzzerPredictions.cpp:61
> > +        RELEASE_ASSERT((predicted & SpecFullTop) != SpecNone);
> 
> why can't we sometimes produce SpecNone?

I had some issues with the fuzzer producing a predictions file that was parsed wrong which resulted in a predictions of SpecNone when I hadn't intended it. And currently the fuzzer never produces SpecNone. I should rework this. And if I keep this check I should move it to where the file is read in.

> > Source/JavaScriptCore/runtime/FuzzerPredictions.h:32
> > +#define NO_PREDICTION_FOUND 1ull << 55
> 
> Why does this work? This seems slightly wrong.

I just picked something larger than SpecDataViewObject (1<<42) and I made it a lot larger in case there are other speculated types added later on. I would like to detect the case of no prediction being found.

What would be a good way to let the FuzzerAgent know that no prediction was found?

> Maybe we should just fall back to random fuzzer when we don't have the key?

I want to detect when I have a missing key for sanity checking purposes, and if we were to fall back to random predictions in this case I would to it in the FuzzerAgent rather than in FuzzerPredictions.

One idea behind reading predictions from a file is to make the predictions stabler, that's why it's returning the original prediction when none is found rather than a random one.
Comment 5 Tuomas Karkkainen 2019-11-08 06:00:30 PST
(In reply to Saam Barati from comment #3)
> Comment on attachment 382922 [details]
> proposed patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=382922&action=review
> 
> > Source/JavaScriptCore/ChangeLog:7
> > +
> 
> you should describe the details of what this patch is doing here.

I'll add a description for this. 

> > Source/JavaScriptCore/runtime/FuzzerPredictions.cpp:50
> > +    while ((line = fgets(buffer, sizeof(buffer), f))) {
> > +        SpeculatedType prediction;
> > +        char lookupKey[BUFSIZ];
> > +        int numberOfItemsFound = sscanf(line, "%[^:]:%llx\n", lookupKey, &prediction);
> > +        RELEASE_ASSERT(numberOfItemsFound == 2);
> > +        const String& functionNameString = String(lookupKey, strlen(lookupKey));
> > +        m_predictions.set(functionNameString, prediction);
> > +    }
> 
> can you comment on the format of this file?
> 
> Also, how do you produce this file?

I'll add a comment describing the format.

The file is produced by the fuzzer, which is not part of the WebKit repo.

> > Source/JavaScriptCore/runtime/FileBasedFuzzerAgent.cpp:167
> > +String FileBasedFuzzerAgent::createLookupKey(const String& sourceFilename, OpcodeID opcodeId, int startLocation, int endLocation)
> > +{
> > +    StringBuilder lookupKey;
> > +    lookupKey.append(sourceFilename);
> > +    lookupKey.append("|");
> > +    lookupKey.append(toString(opcodeAliasForLookupKey(opcodeId)));
> > +    lookupKey.append("|");
> > +    lookupKey.append(startLocation);
> > +    lookupKey.append("|");
> > +    lookupKey.append(endLocation);
> > +    return lookupKey.toString();
> > +}
> 
> nit: This is slightly hokey. We tend to define a struct for hash map keys,
> and use a custom hash and equality function. Turning it into a String is
> convenient, but slightly weird. I guess since we don't care about perf,
> maybe this is ok.
 
I figured since this is only for fuzzing I wouldn't spend too much effort on parsing the config file. The string here matches the format of the file containing the predictions, which I will document.
Comment 6 Tuomas Karkkainen 2019-11-08 06:01:14 PST
(In reply to Saam Barati from comment #2)
> Comment on attachment 382922 [details]
> proposed patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=382922&action=review
> 
> > Source/JavaScriptCore/runtime/FileBasedFuzzerAgent.cpp:59
> > +    BytecodeIndex bytecodeIndex = codeOrigin.bytecodeIndex();
> > +    codeBlock->expressionRangeForBytecodeIndex(bytecodeIndex, divot, startOffset, endOffset, line, column);
> 
> I think you need to handle inline call frames here, right?

Yeah. I have to study the whole concept and I will post an updated patch once I understand it better.
Comment 7 Tuomas Karkkainen 2019-11-13 06:28:56 PST
Created attachment 383452 [details]
update patch

updated patch that addresses the concerns from the review.
Comment 8 Tuomas Karkkainen 2019-11-13 06:30:43 PST
(In reply to Saam Barati from comment #2)
> Comment on attachment 382922 [details]
> proposed patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=382922&action=review
> 
> > Source/JavaScriptCore/runtime/FileBasedFuzzerAgent.cpp:59
> > +    BytecodeIndex bytecodeIndex = codeOrigin.bytecodeIndex();
> > +    codeBlock->expressionRangeForBytecodeIndex(bytecodeIndex, divot, startOffset, endOffset, line, column);
> 
> I think you need to handle inline call frames here, right?

As discussed offline, it turns out that codeBlock->expressionRangeForBytecodeIndex(codeOrigin.bytecodeIndex()) returns the correct source range even for inline call frames.
Comment 9 Tuomas Karkkainen 2019-11-13 23:52:45 PST
Created attachment 383545 [details]
proposed patch

adds a second FuzzerAgent that can be used to generate the prediction files.
Comment 10 Saam Barati 2019-11-18 11:02:48 PST
Comment on attachment 383545 [details]
proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=383545&action=review

> Source/JavaScriptCore/runtime/FileBasedFuzzerAgent.cpp:79
> +        // FIXME: these can not be targeted at all due to the bugs below

style nit: indentation should be four spaces to the left

> Source/JavaScriptCore/runtime/FileBasedFuzzerAgent.cpp:85
> +        // FIXME: the output of codeBlock->expressionRangeForBytecodeIndex() allows for some of
> +        // these opcodes to have predictions, but not all instances can be reliably targeted.

style nit: indentation should be four spaces to the left

> Source/JavaScriptCore/runtime/FuzzerPredictions.cpp:49
> +        size_t length = strlen(line);
> +        if (line[length - 1] == '\n') {
> +            line[length - 1] = '\0';
> +            length--;
> +        }

this processing might be nicer if you just read the entire file into a WTF string, which has a split function

> Source/JavaScriptCore/runtime/PredictionFileCreatingFuzzerAgent.cpp:95
> +    default:
> +        RELEASE_ASSERT_WITH_MESSAGE(false, "unhandled opcode: %s", toString(opcodeId).utf8().data());

why so few opcodes above?
Comment 11 Saam Barati 2019-11-18 11:12:20 PST
Comment on attachment 383545 [details]
proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=383545&action=review

>> Source/JavaScriptCore/runtime/PredictionFileCreatingFuzzerAgent.cpp:95
>> +        RELEASE_ASSERT_WITH_MESSAGE(false, "unhandled opcode: %s", toString(opcodeId).utf8().data());
> 
> why so few opcodes above?

never mind, these are just opcodes with heap predictions. Is still seems like we should just include all opcodes unconditionally since anytime someone adds a new opcode with a heap prediction, they'll need to fill in a case in this switch statement. However, we don't do anything special for different opcodes.
Comment 12 Saam Barati 2019-11-18 11:14:44 PST
Comment on attachment 383545 [details]
proposed patch

r=me
Comment 13 Tuomas Karkkainen 2019-11-18 11:22:21 PST
(In reply to Saam Barati from comment #11)
> Comment on attachment 383545 [details]
> proposed patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=383545&action=review
> 
> >> Source/JavaScriptCore/runtime/PredictionFileCreatingFuzzerAgent.cpp:95
> >> +        RELEASE_ASSERT_WITH_MESSAGE(false, "unhandled opcode: %s", toString(opcodeId).utf8().data());
> > 
> > why so few opcodes above?
> 
> never mind, these are just opcodes with heap predictions. Is still seems
> like we should just include all opcodes unconditionally since anytime
> someone adds a new opcode with a heap prediction, they'll need to fill in a
> case in this switch statement. However, we don't do anything special for
> different opcodes.

I want to map the JavaScript source code to the opcode and prediction. Therefore I want to know which opcodes occur, and what the source range is for those opcodes, and 
1) It turned out that there were bugs in getting the source range for a number of opcodes
2) I need code in the fuzzer to target a specific prediction location. 

If someone adds a new opcode that has a heap prediction, this FuzzerAgent will crash and that’s how I will immediately find out about the opcode and I’ll be able to add code to handle it.
Comment 14 Tuomas Karkkainen 2019-11-19 07:04:09 PST
Created attachment 383865 [details]
proposed patch

address comments from review
Comment 15 Mark Lam 2019-11-19 09:30:37 PST
Comment on attachment 383865 [details]
proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=383865&action=review

Drive by comment.

> Source/JavaScriptCore/runtime/VM.cpp:460
>          setFuzzerAgent(makeUnique<RandomizingFuzzerAgent>(*this));
>      else if (Options::useDoublePredictionFuzzerAgent())
>          setFuzzerAgent(makeUnique<DoublePredictionFuzzerAgent>(*this));
> +    else if (Options::useFileBasedFuzzerAgent())
> +        setFuzzerAgent(makeUnique<FileBasedFuzzerAgent>(*this));
> +    else if (Options::usePredictionFileCreatingFuzzerAgent())
> +        setFuzzerAgent(makeUnique<PredictionFileCreatingFuzzerAgent>(*this));

Since these fuzzer agent settings are mutually exclusive, I think it would be beneficial to let the user know if more of these options are enabled at the same time rather than failing silently.  I'm thinking something like this:

    if (UNLIKELY(Options::useRandomizingFuzzerAgent() || Options::useDoublePredictionFuzzerAgent() || Options::useFileBasedFuzzerAgent() || Options::usePredictionFileCreatingFuzzerAgent())) {
        size_t numberOfFuzzerAgentsEnabled = 0;
        if (Options::useRandomizingFuzzerAgent()) {
            setFuzzerAgent(makeUnique<RandomizingFuzzerAgent>(*this));
            numberOfFuzzerAgentsEnabled++;
        }
        if (Options::useDoublePredictionFuzzerAgent()) {
            setFuzzerAgent(makeUnique<DoublePredictionFuzzerAgent>(*this));
            numberOfFuzzerAgentsEnabled++;
        }
        if (Options::useFileBasedFuzzerAgent()) {
            setFuzzerAgent(makeUnique<FileBasedFuzzerAgent>(*this));
            numberOfFuzzerAgentsEnabled++;
        }
        if (Options::usePredictionFileCreatingFuzzerAgent()) {
            setFuzzerAgent(makeUnique<PredictionFileCreatingFuzzerAgent>(*this));
            numberOfFuzzerAgentsEnabled++;
        }
        RELEASE_ASSERT_WITH_MESSAGE(numberOfFuzzerAgentsEnabled == 1, "Cannot use more than one fuzzer agent at the same time");
    }
Comment 16 Mark Lam 2019-11-19 13:59:30 PST
Comment on attachment 383865 [details]
proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=383865&action=review

> Source/JavaScriptCore/runtime/FileBasedFuzzerAgent.cpp:74
> +    ScriptExecutable* ownerExecutable = codeBlock->ownerExecutable();
> +    auto sourceURL = ownerExecutable->sourceURL();
> +    if (sourceURL.isEmpty())
> +        return original;
> +
> +    int divot = 0;
> +    int startOffset = -1;
> +    int endOffset = -1;
> +    unsigned line = 0;
> +    unsigned column = 0;
> +    BytecodeIndex bytecodeIndex = codeOrigin.bytecodeIndex();
> +    codeBlock->expressionRangeForBytecodeIndex(bytecodeIndex, divot, startOffset, endOffset, line, column);
> +
> +    SourceProvider* provider = codeBlock->source().provider();
> +    auto sourceUpToDivot = provider->source().substring(divot - startOffset, startOffset);
> +    auto sourceAfterDivot = provider->source().substring(divot, endOffset);
> +
> +    Vector<String> urlParts = sourceURL.split('/');
> +    auto sourceFilename = urlParts.isEmpty() ? sourceURL : urlParts.last();
> +
> +    const InstructionStream& instructions = codeBlock->instructions();
> +    const Instruction* anInstruction = instructions.at(bytecodeIndex).ptr();
> +    OpcodeID opcodeId = anInstruction->opcodeID();
> +    int startLocation = divot - startOffset;
> +    int endLocation = divot + endOffset;
> +    String key = createLookupKey(sourceFilename, opcodeId, startLocation, endLocation);

nit: A lot of this is common to the version in PredictionFileCreatingFuzzerAgent::getPrediction().  Since both PredictionFileCreatingFuzzerAgent and FileBasedFuzzerAgent derives from FileBasedFuzzerAgentBase, would it be possible and make sense to factor out this code into a common utility function in FileBasedFuzzerAgentBase?

> Source/JavaScriptCore/runtime/FileBasedFuzzerAgent.cpp:150
> +#if PLATFORM(COCOA)
> +            StringBuilder missingPredictionCrashLogMessage;
> +            missingPredictionCrashLogMessage.append("missing prediction for: ", key);
> +            WTF::setCrashLogMessage(missingPredictionCrashLogMessage.toString().utf8().data());
> +#endif
> +            RELEASE_ASSERT_NOT_REACHED();

Instead of adding PLATFORM(COCOA) specific code here, how about adding a RELEASE_ASSERT_NOT_REACHED_WITH_MESSAGE() to Assertions.h, so that this can this functionality can be re-used.  For the implementation of RELEASE_ASSERT_NOT_REACHED_WITH_MESSAGE(), I think you can just define it as RELEASE_ASSERT_WITH_MESSAGE(assertionFailureDueToUnreachableCode, __VA_ARGS__), and define "constexpr bool assertionFailureDueToUnreachableCode = false;" in Assertions.h.

> Source/JavaScriptCore/runtime/FuzzerPredictions.cpp:53
> +    FILE* f = fopen(fileName, "r");
> +    if (!f)
> +        return WTF::nullopt;
> +
> +    auto readContents = [&]() -> Optional<String> {
> +        if (fseek(f, 0, SEEK_END) == -1)
> +            return WTF::nullopt;
> +
> +        long bufferCapacity = ftell(f);
> +        if (bufferCapacity == -1)
> +            return WTF::nullopt;
> +
> +        if (fseek(f, 0, SEEK_SET) == -1)
> +            return WTF::nullopt;
> +
> +        Vector<char> buffer;
> +        buffer.resize(bufferCapacity);
> +        size_t readSize = fread(buffer.data(), 1, buffer.size(), f);
> +        if (readSize == buffer.size())
> +            return String(buffer.data(), buffer.size());
> +        return WTF::nullopt;

In all of these error cases, I suggest that you print a short message about the error that occurred.  This will help the user diagnose what went wrong instead of a silent failure where the user may not realize that a prediction file was never loaded at all (and hence, hours of test runs may have been in vain).  Better yet, do a RELEASE_ASSERT_WITH_MESSAGE so that the user is guaranteed to notice the error.  I can't think of any reason why we would want to allow the test to proceed if the user asked for a predictions file to be loaded but we failed to do so.

> Source/JavaScriptCore/runtime/FuzzerPredictions.cpp:65
> +    if (!filename)
> +        return;

FuzzerPredictions is only used by FileBasedFuzzerAgent::getPrediction(), right?  I think it's a misconfiguration if the user specifies the option to use a FileBasedFuzzerAgent but fails to specify a predictions file to use.  You should just RELEASE_ASSERT_WITH_MESSAGE here and let the user know that he/she forgot to specify a predictions file.

> Source/JavaScriptCore/runtime/FuzzerPredictions.cpp:71
> +    if (!predictions) {
> +        dataLogF("Failed to read file %s\n", filename);
> +        return;
> +    }

I think you can remove this after adding the RELEASE_ASSERTs in readFileIntoString().  It's better to assert it in readFileIntoString() than here because we can get a more fine grain error message as to what went wrong there.

> Source/JavaScriptCore/runtime/FuzzerPredictions.cpp:81
> +        // The start and end offsets are 7-bit unsigned integers.
> +        // If start offset > 127, then both start and end offsets are 0.
> +        // If end offset > 127, then the end offset is 0.

Why are start and end 7-bit ints?  They came from the character position in the source string.  Given a long enough source string, they can be greater than 7-bits.  Am I missing something?

> Source/JavaScriptCore/runtime/FuzzerPredictions.cpp:109
> +    if (m_predictions.contains(key))
> +        return m_predictions.get(key);

I think you can just call get() directly, and check whether it returns the emptyValue().  That way, you don't do the look up twice for each successful case.
Comment 17 Tuomas Karkkainen 2019-11-19 22:04:55 PST
(In reply to Mark Lam from comment #16)
> Comment on attachment 383865 [details]
> proposed patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=383865&action=review
> 
> > Source/JavaScriptCore/runtime/FuzzerPredictions.cpp:81
> > +        // The start and end offsets are 7-bit unsigned integers.
> > +        // If start offset > 127, then both start and end offsets are 0.
> > +        // If end offset > 127, then the end offset is 0.
> 
> Why are start and end 7-bit ints?  They came from the character position in
> the source string.  Given a long enough source string, they can be greater
> than 7-bits.  Am I missing something?

They are defined as 7-bit unsigned integers in ExpressionRangeInfo.h. The behavior described is implemented in UnlinkedCodeBlock::addExpressionInfo(). (There's 25 bits worth of instructionOffset and divotPoint each.)
Comment 18 Mark Lam 2019-11-19 23:13:37 PST
(In reply to Tuomas Karkkainen from comment #17)
> (In reply to Mark Lam from comment #16)
> > Comment on attachment 383865 [details]
> > proposed patch
> > 
> > View in context:
> > https://bugs.webkit.org/attachment.cgi?id=383865&action=review
> > 
> > > Source/JavaScriptCore/runtime/FuzzerPredictions.cpp:81
> > > +        // The start and end offsets are 7-bit unsigned integers.
> > > +        // If start offset > 127, then both start and end offsets are 0.
> > > +        // If end offset > 127, then the end offset is 0.
> > 
> > Why are start and end 7-bit ints?  They came from the character position in
> > the source string.  Given a long enough source string, they can be greater
> > than 7-bits.  Am I missing something?
> 
> They are defined as 7-bit unsigned integers in ExpressionRangeInfo.h. The
> behavior described is implemented in UnlinkedCodeBlock::addExpressionInfo().
> (There's 25 bits worth of instructionOffset and divotPoint each.)

Ah, thanks.  I worked with this code many years ago, but I've since forgotten about it.  The comments in UnlinkedCodeBlock::addExpressionInfo() explains why 7 bits is fine:

    ...
    } else if (startOffset > ExpressionRangeInfo::MaxOffset) {
        // If the start offset is out of bounds we clear both offsets
        // so we only get the divot marker. Error message will have to be reduced
        // to line and charPosition number.
        startOffset = 0;
        endOffset = 0;
    } else if (endOffset > ExpressionRangeInfo::MaxOffset) {
        // The end offset is only used for additional context, and is much more likely
        // to overflow (eg. function call arguments) so we are willing to drop it without
        // dropping the rest of the range.
        endOffset = 0;
    }

1. startOffset and endOffset is always relative to the divot of an expression string, not absolute positions.  Hence, it is likely that they will fit within 7 bits.
2. startOffset and endOffset are only used for helping to get more intelligent error strings.  Hence, it is not critical for them to be accurate.  They are a best effort attempt, and we're willing to drop them (set to 0) if we can't record the actual offsets.

What this means is that you shouldn't be using the start and end offsets in your lookup key.  Instead, just use the line and column values that are fetched by expressionRangeForBytecodeIndex().  But maybe I'm not getting your intent.  Why are you using the start and end positions anyway?
Comment 19 Mark Lam 2019-11-19 23:39:14 PST
Comment on attachment 383865 [details]
proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=383865&action=review

>>>> Source/JavaScriptCore/runtime/FuzzerPredictions.cpp:81
>>>> +        // If end offset > 127, then the end offset is 0.
>>> 
>>> Why are start and end 7-bit ints?  They came from the character position in the source string.  Given a long enough source string, they can be greater than 7-bits.  Am I missing something?
>> 
>> They are defined as 7-bit unsigned integers in ExpressionRangeInfo.h. The behavior described is implemented in UnlinkedCodeBlock::addExpressionInfo(). (There's 25 bits worth of instructionOffset and divotPoint each.)
> 
> Ah, thanks.  I worked with this code many years ago, but I've since forgotten about it.  The comments in UnlinkedCodeBlock::addExpressionInfo() explains why 7 bits is fine:
> 
>     ...
>     } else if (startOffset > ExpressionRangeInfo::MaxOffset) {
>         // If the start offset is out of bounds we clear both offsets
>         // so we only get the divot marker. Error message will have to be reduced
>         // to line and charPosition number.
>         startOffset = 0;
>         endOffset = 0;
>     } else if (endOffset > ExpressionRangeInfo::MaxOffset) {
>         // The end offset is only used for additional context, and is much more likely
>         // to overflow (eg. function call arguments) so we are willing to drop it without
>         // dropping the rest of the range.
>         endOffset = 0;
>     }
> 
> 1. startOffset and endOffset is always relative to the divot of an expression string, not absolute positions.  Hence, it is likely that they will fit within 7 bits.
> 2. startOffset and endOffset are only used for helping to get more intelligent error strings.  Hence, it is not critical for them to be accurate.  They are a best effort attempt, and we're willing to drop them (set to 0) if we can't record the actual offsets.
> 
> What this means is that you shouldn't be using the start and end offsets in your lookup key.  Instead, just use the line and column values that are fetched by expressionRangeForBytecodeIndex().  But maybe I'm not getting your intent.  Why are you using the start and end positions anyway?

Tuomas explained to me offline that this is not an issue.  For the purpose of the predictions generator, this works adequately.  We don't need something more accurate.  Carry on.
Comment 20 Tuomas Karkkainen 2019-11-26 04:57:31 PST
Created attachment 384346 [details]
proposed patch

address concerns from review.
Comment 21 Saam Barati 2019-11-30 16:23:35 PST
Comment on attachment 384346 [details]
proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=384346&action=review

r=me with a couple nits

> Source/JavaScriptCore/runtime/FuzzerPredictions.cpp:33
> +    FILE* f = fopen(fileName, "r");

Nit: in WebKit style, we tend to avoid single letter variable names. I think a name more inline with the style guide would be “file”

> Source/JavaScriptCore/runtime/OptionsList.h:402
> +    v(OptionString, fuzzerPredictions, nullptr, Normal, "file with list of predictions for FileBasedFuzzerAgent") \

Maybe “fuzzerPredictionsPath” or “fuzzerPredictionsFile”?

> Source/WTF/ChangeLog:3
> +        add ASSERT_NOT_REACHED_WITH_MESSAGE and RELEASE_ASSERT_NOT_REACHED_WITH_MESSAGE

Delete this changelog in your diff. I think you already did this in another patch
Comment 22 Tuomas Karkkainen 2019-12-01 21:22:39 PST
Created attachment 384592 [details]
proposed patch

resolve issues raised by Saam
Comment 23 Mark Lam 2019-12-01 21:37:07 PST
Comment on attachment 384592 [details]
proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=384592&action=review

> Source/JavaScriptCore/runtime/VM.cpp:471
>          setFuzzerAgent(makeUnique<RandomizingFuzzerAgent>(*this));
>      if (Options::useDoublePredictionFuzzerAgent())
>          setFuzzerAgent(makeUnique<DoublePredictionFuzzerAgent>(*this));
> +    if (Options::useFileBasedFuzzerAgent())
> +        setFuzzerAgent(makeUnique<FileBasedFuzzerAgent>(*this));
> +    if (Options::usePredictionFileCreatingFuzzerAgent())
> +        setFuzzerAgent(makeUnique<PredictionFileCreatingFuzzerAgent>(*this));

If the user sets more than 1 fuzzer agent, only the last fuzzer agent will take effect, and we'll silently fail to use the earlier fuzzer agents.  I think we should not be silently failing, and commented about this in https://bugs.webkit.org/show_bug.cgi?id=203898#c15.  Did you disagree with that assessment?
Comment 24 Tuomas Karkkainen 2019-12-01 21:53:52 PST
(In reply to Mark Lam from comment #23)
> Comment on attachment 384592 [details]
> proposed patch
> 
> View in context:
> https://bugs.webkit.org/attachment.cgi?id=384592&action=review
> 
> > Source/JavaScriptCore/runtime/VM.cpp:471
> >          setFuzzerAgent(makeUnique<RandomizingFuzzerAgent>(*this));
> >      if (Options::useDoublePredictionFuzzerAgent())
> >          setFuzzerAgent(makeUnique<DoublePredictionFuzzerAgent>(*this));
> > +    if (Options::useFileBasedFuzzerAgent())
> > +        setFuzzerAgent(makeUnique<FileBasedFuzzerAgent>(*this));
> > +    if (Options::usePredictionFileCreatingFuzzerAgent())
> > +        setFuzzerAgent(makeUnique<PredictionFileCreatingFuzzerAgent>(*this));
> 
> If the user sets more than 1 fuzzer agent, only the last fuzzer agent will
> take effect, and we'll silently fail to use the earlier fuzzer agents.  I
> think we should not be silently failing, and commented about this in
> https://bugs.webkit.org/show_bug.cgi?id=203898#c15.  Did you disagree with
> that assessment?

I made that into a separate patch at https://bugs.webkit.org/show_bug.cgi?id=204607
Comment 25 Mark Lam 2019-12-01 22:00:18 PST
Comment on attachment 384592 [details]
proposed patch

View in context: https://bugs.webkit.org/attachment.cgi?id=384592&action=review

r=me

>>> Source/JavaScriptCore/runtime/VM.cpp:471
>>> +        setFuzzerAgent(makeUnique<PredictionFileCreatingFuzzerAgent>(*this));
>> 
>> If the user sets more than 1 fuzzer agent, only the last fuzzer agent will take effect, and we'll silently fail to use the earlier fuzzer agents.  I think we should not be silently failing, and commented about this in https://bugs.webkit.org/show_bug.cgi?id=203898#c15.  Did you disagree with that assessment?
> 
> I made that into a separate patch at https://bugs.webkit.org/show_bug.cgi?id=204607

OK.  That works.
Comment 26 WebKit Commit Bot 2019-12-01 22:44:24 PST
Comment on attachment 384592 [details]
proposed patch

Clearing flags on attachment: 384592

Committed r252978: <https://trac.webkit.org/changeset/252978>
Comment 27 WebKit Commit Bot 2019-12-01 22:44:26 PST
All reviewed patches have been landed.  Closing bug.
Comment 28 Radar WebKit Bug Importer 2019-12-01 22:45:22 PST
<rdar://problem/57550508>