Help: get a full picture of input values of a java method call systematically #14660
Unanswered
oriana19993926782
asked this question in
Q&A
Replies: 0 comments
Hello CodeQL Community,
I'm doing research on assessing the code-analysis capabilities of large language models (LLaMAs). More specifically, I give a LLaMA a method and an input value, ask it what the method returns, and compare its answer to the correct expected output.
The methods I intend for LLaMA to analyze are those that are called by assertions in unit tests. The inputs and expected outputs of such methods can take different forms, for example `JobInstance.class`, `"Hello world"`, or `8` (no need to get the full picture for this kind).

In the above unit test example, the method in question is `marshal`, the input is `actual` (a local variable), and the correct expected output is `YAML` (a global variable). However, I cannot simply provide LLaMA with `actual` as the input value for this method, nor can I directly compare its answer to `YAML`. My final objective is to extract a full picture of both the input values and the expected output here.

I have made some attempts at extracting the full picture of input variables. Here's the query I've written so far:
Taking the previous unit test as an example, this query successfully identifies `marshal` as the method being tested, its input variable `actual`, the declaration of `actual`, and only part of the subsequent changes to `actual` after its declaration (the changes that directly affect `actual`), marked by "✅". It leaves out some changes that indirectly affect `actual`, marked by "❌":

So, the 1st problem with the above query is that it captures only the direct changes to the variable (`actual`) after its declaration, not all subsequent changes.

The 2nd problem with the above query is that, even when you uncomment all the content related to `preNode`, it cannot capture statements before the declaration that could influence the state of the variable being declared. For a start, in the following unit test, the `preNode` in my query could not capture the statements, marked by "❌", before the declaration of `proxy`, which clearly determine what `proxy` is.

Unfortunately, things can get more complex:
In the complex unit test above, `successEvent` is the input variable. To extract the full picture, we need to know what `startEvent` and `executionSuccess()` are, and also what `JobExecutionEvent.ExecutionSource.NORMAL_TRIGGER` (an input to the `startEvent` instantiation) is. This brings about the 3rd problem with the above query: it cannot follow the further dependencies, if any, of the statements it captures, whether a method call like `executionSuccess()` or another variable like `JobExecutionEvent.ExecutionSource.NORMAL_TRIGGER` here. I am quite new to CodeQL, and I have no idea how to get the full picture of `startEvent` in this complex situation.

Below is another complex unit test. The method tested here has 2 layers of inputs, and it uses a global variable in the 2nd layer (this is just to show how the inputs under study can take various forms, and thus have different kinds of complexity):
new JobExecutionEvent(...)
"localhost", "127.0.0.1", "fake_task_id", "test_job", JobExecutionEvent.ExecutionSource.NORMAL_TRIGGER, 0
The Java project under study is https://github.com/apache/shardingsphere-elasticjob.git
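To make the three problems above concrete, the following is a minimal sketch, in plain Java rather than CodeQL, of the transitive dependency closure that a "full picture" of an input seems to require. Everything in it is an assumption made for illustration: the `Stmt` model (a line number plus the names a statement writes and reads), the `BackwardSlice` class, and the example statements, none of which come from the project or from the query above. It is deliberately flow-insensitive, so it over-approximates, and it is not a definitive slicing implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

final class BackwardSlice {

    /** Hypothetical, simplified statement: the names it writes (defs) and reads (uses). */
    static final class Stmt {
        final int line;
        final Set<String> defs;
        final Set<String> uses;

        Stmt(int line, Set<String> defs, Set<String> uses) {
            this.line = line;
            this.defs = defs;
            this.uses = uses;
        }
    }

    /**
     * Returns the lines of all statements before callLine that may influence
     * the given input name, following dependencies transitively.
     * Flow-insensitive: statement order before the call is ignored.
     */
    static Set<Integer> slice(List<Stmt> stmts, String input, int callLine) {
        Set<String> relevant = new HashSet<>();
        Deque<String> worklist = new ArrayDeque<>();
        Set<Integer> lines = new TreeSet<>();
        relevant.add(input);
        worklist.add(input);
        while (!worklist.isEmpty()) {
            String name = worklist.poll();
            for (Stmt s : stmts) {
                if (s.line >= callLine || !s.defs.contains(name)) {
                    continue;
                }
                lines.add(s.line);
                // Everything this statement reads becomes relevant as well.
                for (String used : s.uses) {
                    if (relevant.add(used)) {
                        worklist.add(used);
                    }
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical test body; the call under test sits on line 6.
        List<Stmt> test = List.of(
            new Stmt(1, Set.of("actual"), Set.of()),                // Foo actual = new Foo();
            new Stmt(2, Set.of("bar"), Set.of()),                   // Bar bar = new Bar();
            new Stmt(3, Set.of("actual"), Set.of("actual", "bar")), // actual.setBar(bar);   direct change
            new Stmt(4, Set.of("bar"), Set.of("bar", "NAME")),      // bar.setName(NAME);    indirect change
            new Stmt(5, Set.of("other"), Set.of()));                // unrelated statement
        System.out.println(slice(test, "actual", 6)); // prints [1, 2, 3, 4]
    }
}
```

Line 4 is included only because `bar` became relevant through line 3, which is the "indirect change" case from the 1st problem. Line 2 precedes the use of `bar` and is picked up the same way, matching the 2nd problem. Names such as `NAME` that no statement in the test defines are the leaves that would still need resolving outside the test method, as in the 3rd problem. In CodeQL, a closure of this shape is typically written as a recursive predicate or with the transitive-closure operators; how to map it onto the AST and data-flow classes is left open here.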
I am struggling and completely overwhelmed by the complexity here! I would greatly appreciate any guidance on how to modify my query to get a full picture at least for input values. And thank you for taking the time to read this far. 💗💗