
A standard way to pass around extracted execution info #265

yxliang01 opened this issue Jan 23, 2020 · 6 comments

@yxliang01
Contributor

Many repair approaches require gathering extra execution information. One example is determining the test-equivalence relationships between tests. The Darjeeling.coverage package already provides a class for gathering coverage info. I feel we could introduce a more generic package, Darjeeling.execution, containing a collector-like base class; coverage would then become a subpackage whose classes inherit from the classes in Darjeeling.execution. Essentially, Darjeeling.execution is just a generic version of the current Darjeeling.coverage. This would allow Darjeeling to support more repair approaches that require extracting extra execution info: coverage simply becomes one kind of execution info that helps the repair process, alongside several others.
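To make the idea concrete, here is a minimal sketch of what I have in mind. All names (`ExecutionInfoCollector`, `CoverageCollector`) are hypothetical and not part of Darjeeling's actual API:

```python
from abc import ABC, abstractmethod
from typing import Any, Dict


class ExecutionInfoCollector(ABC):
    """Hypothetical base class for Darjeeling.execution: gathers one
    kind of information from a program execution."""

    @abstractmethod
    def collect(self, execution: Any) -> Dict[str, Any]:
        """Extract this collector's information from a finished execution."""


class CoverageCollector(ExecutionInfoCollector):
    """Coverage becomes just one concrete kind of execution info."""

    def collect(self, execution: Any) -> Dict[str, Any]:
        # illustrative: read coverage data attached to the execution
        return {"coverage": getattr(execution, "coverage", {})}
```

Other collectors (e.g., for symbolic information or data traces) would subclass the same base.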

@ChrisTimperley Would you like to comment on this design? If it sounds good, would you be okay with making these changes? (Maybe prioritize them for our project.)

@yxliang01
Contributor Author

With this, maybe passing around symbolic information would also become possible? It feels like this would considerably enlarge Darjeeling's scope of application.

@ChrisTimperley
Collaborator

I like this idea. I suppose that learned invariants (e.g., mined by Daikon) or data traces would fall under the same umbrella. It would be great if we could extend this to symbolic information, too.

It sounds like you have a rough API design in mind. Would you mind identifying some of the interfaces (e.g., Executor and Execution?) and their most important methods?

@yxliang01
Contributor Author

I actually think we could just have a single class be responsible for one type of execution info, or for one source of it.

Two designs come to mind:

1. One class per type of execution info.

   The logic would be clear.

2. One class per source of (possibly several kinds of) execution info.

   This would be efficient to execute, and might also be easier to implement. For example, one source could be something like KLEE: it would make sense to group all the execution info obtained from a single KLEE run into one instance. In general, we need to extract several kinds of execution info that could all be generated in a single run; with the first design, it is rather difficult to reuse the same execution to extract all the info we want.

For the concrete interface in the package, I think having prepare, build, execute, and collect methods would generally be sufficient.

Take gcov as an example (I haven't used it for a few years, so I might be mistaken about some details): instrumentation would happen in prepare, the extra compiler flags go in build, execute does not need to do anything special, and collect parses the generated coverage data file.

Take KLEE as an example: it feels like only execute needs to be overridden, to run the program inside the KLEE VM.
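A rough sketch of that lifecycle, with the gcov-like and KLEE-like cases as subclasses. All class names are made up for illustration, and a plain dict stands in for the real program object:

```python
from abc import ABC


class ExecutionInfoSource(ABC):
    """Hypothetical prepare/build/execute/collect lifecycle; every hook
    defaults to a no-op so subclasses override only the stages they need."""

    def prepare(self, program): ...
    def build(self, program): ...
    def execute(self, program, test): ...
    def collect(self, program):
        return {}

    def run(self, program, test):
        self.prepare(program)
        self.build(program)
        self.execute(program, test)
        return self.collect(program)


class GcovLikeSource(ExecutionInfoSource):
    # gcov-style: instrument in prepare, add flags in build, parse in collect
    def prepare(self, program):
        program.setdefault("steps", []).append("instrument")

    def build(self, program):
        program.setdefault("flags", []).append("--coverage")

    def collect(self, program):
        return {"coverage": "parsed from coverage data files"}


class KleeLikeSource(ExecutionInfoSource):
    # KLEE-style: only execute is overridden, to run inside the KLEE VM
    def execute(self, program, test):
        program["ran_in"] = "klee-vm"
```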

I feel one software-architecture challenge is how to allow non-interfering execution-info generation processes to run concurrently while keeping the implementation elegant. E.g., while generating coverage info, we might concurrently generate test pass/fail info. @ChrisTimperley It would be great if you could comment on this.
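One possible shape for the concurrent case, assuming the collectors only read from a finished execution and so don't interfere (everything here is a sketch; the function and collector names are invented):

```python
from concurrent.futures import ThreadPoolExecutor


def run_collectors_concurrently(execution, collectors):
    """Run non-interfering collectors over the same finished execution
    in parallel and merge their results into one dict."""
    with ThreadPoolExecutor() as pool:
        results = pool.map(lambda collect: collect(execution), collectors)
    merged = {}
    for result in results:
        merged.update(result)
    return merged


# stand-in collectors: coverage and pass/fail status from one run
coverage = lambda e: {"coverage": e["lines"]}
status = lambda e: {"passed": e["exit_code"] == 0}

info = run_collectors_concurrently({"lines": [1, 2], "exit_code": 0},
                                   [coverage, status])
# info == {"coverage": [1, 2], "passed": True}
```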

I don't have a very clear view of the concrete design in mind yet; I can give more feedback next week if you wish. :) But for #268, as far as I can tell, we need to be able to trace the value flow during test case executions.

@yxliang01
Contributor Author

@ChrisTimperley Hey Chris, would you like to comment, or agree/disagree with this? As I plan to have a demo running soon, I also need to decide between this generic approach and a hard-coded fork for resolving #268, taking time constraints into consideration.

Thanks

@ChrisTimperley
Collaborator

I think that this is a good idea that would extend Darjeeling's capabilities, and that the API you sketched above is a good starting point. Based on my understanding, it seems that we need at least two new interfaces:

  • AnnotatedTestExecution: this would provide additional information about a single test execution (e.g., coverage).
  • AnnotatedTestExecutor: this would perform annotated test execution for a single, prepared container. The AnnotatedTestExecutor instance for a given container could be built either by a static build(container) method in the AnnotatedTestExecutor class, or by a build(container) method in an AnnotatedTestExecutorFactory, which would provide greater flexibility to customise and vary the creation of AnnotatedTestExecutor instances. For now, I would go with the static build(container) method; if that proves too limiting, it wouldn't be much trouble to introduce the factory.
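In code, roughly (a sketch of the two interfaces above, not Darjeeling's actual implementation; the fields and the placeholder execute body are illustrative):

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class AnnotatedTestExecution:
    """Outcome of one test execution plus extra info (e.g., coverage)."""
    passed: bool
    annotations: Dict[str, Any] = field(default_factory=dict)


class AnnotatedTestExecutor:
    """Performs annotated test execution for a single, prepared container."""

    def __init__(self, container: Any) -> None:
        self._container = container

    @staticmethod
    def build(container: Any) -> "AnnotatedTestExecutor":
        # the static constructor discussed above; a factory could replace
        # this later if more customisation is needed
        return AnnotatedTestExecutor(container)

    def execute(self, test: Any) -> AnnotatedTestExecution:
        # placeholder: a real implementation would run `test` inside the
        # container and attach the collected information
        return AnnotatedTestExecution(passed=True,
                                      annotations={"coverage": set()})
```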

Thoughts?

@yxliang01
Contributor Author

From the above, it sounds like AnnotatedTestExecution would provide one fixed set of additional information (correct me if I'm wrong); otherwise, I am not clear about its purpose.
If the changes are now on the test executor, I don't really see the point of having both an annotated version and a normal version; after all, the normal version is simply one that provides no additional information.

This proposed solution would be easier to implement and easier to extend than my previous proposals. I am fine with this.
