The purpose of this C# analyzer is to find all possible execution paths from any entry point in the code. Currently, the entry point can only be a method.
There are multiple ways to deal with this. One approach would be to rewrite the existing code so that every conditional statement is always true. This approach has numerous disadvantages.
First, it actually executes the code, so any calls to external resources such as databases are really made. Since the code has been altered, there is also no guarantee that it won't throw exceptions at runtime. And for really big code bases it will be incredibly slow, and it requires compiling the modified code, which may take even more time.
My approach is to write a C# interpreter with some special rules that goes through all possible execution paths. To go through all of them, the interpreter needs to ignore the conditions of all conditional statements and jump directly into the code that is executed when a given condition is true.
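To make the idea concrete, here is a minimal sketch of such a walk over a toy syntax tree. The types `Statement`, `CallStatement`, `IfStatement` and `PathWalker` are hypothetical and exist only for this illustration; they are not the analyzer's actual classes. The sketch also assumes that the "else" branch is entered in the same way as the "then" branch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mini-AST; none of these types come from the analyzer itself.
public abstract class Statement { }

public sealed class CallStatement : Statement
{
    public string MethodName { get; }
    public CallStatement(string methodName) { MethodName = methodName; }
}

public sealed class IfStatement : Statement
{
    public string Condition { get; }   // kept for display only, never evaluated
    public List<Statement> Then { get; } = new List<Statement>();
    public List<Statement> Else { get; } = new List<Statement>();
    public IfStatement(string condition) { Condition = condition; }
}

public static class PathWalker
{
    // Records every call reachable from the given statements, entering
    // each branch of every "if" regardless of its condition.
    public static List<string> Walk(IEnumerable<Statement> body)
    {
        var visited = new List<string>();
        Visit(body, visited);
        return visited;
    }

    private static void Visit(IEnumerable<Statement> body, List<string> visited)
    {
        foreach (var statement in body)
        {
            switch (statement)
            {
                case CallStatement call:
                    visited.Add(call.MethodName);
                    break;
                case IfStatement branch:
                    Visit(branch.Then, visited); // the condition is ignored
                    Visit(branch.Else, visited); // assumption: else is walked too
                    break;
            }
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var check = new IfStatement("user.IsAdmin");
        check.Then.Add(new CallStatement("DeleteAll"));
        check.Else.Add(new CallStatement("ShowError"));

        var body = new List<Statement> { new CallStatement("Log"), check };
        Console.WriteLine(string.Join(", ", PathWalker.Walk(body)));
        // prints "Log, DeleteAll, ShowError"
    }
}
```

Nothing is executed for real here: the walk only records which calls are reachable, so no database or other external resource is touched.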
Still, this is not enough. For example, if we have an "if" statement and each of its branches assigns a new variable to the same reference, what should happen? If we overwrite the value of that reference on every assignment, then a later method call on that reference will miss the methods of the variables that were overwritten. So we need to change both the reference assignment behavior and the references themselves. References in the code become collections of variables, and assigning a new value to a reference adds another variable to its collection.
So a variable, or better said a reference, in our interpreter will look like this:
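A minimal sketch of such a reference, assuming hypothetical `Variable` and `Reference` types (the names and members are illustrative, not necessarily those used by the analyzer):

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an interpreted object; only its type name matters here.
public sealed class Variable
{
    public string TypeName { get; }
    public Variable(string typeName) { TypeName = typeName; }
}

// A reference keeps every variable that was ever assigned to it.
public sealed class Reference
{
    private readonly List<Variable> _variables = new List<Variable>();

    public string Name { get; }
    public IReadOnlyList<Variable> Variables => _variables;

    public Reference(string name) { Name = name; }

    // Assignment appends to the collection instead of overwriting.
    public void Assign(Variable value)
    {
        _variables.Add(value);
    }
}

public static class Program
{
    public static void Main()
    {
        // Models: Shape s; if (c) s = new Circle(); else s = new Square();
        var s = new Reference("s");
        s.Assign(new Variable("Circle")); // assignment in the "then" branch
        s.Assign(new Variable("Square")); // assignment in the "else" branch
        Console.WriteLine(s.Variables.Count); // prints 2
    }
}
```

After both branches have been walked, the reference still knows about `Circle` and `Square`, so neither set of methods is lost.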
Since a reference now contains multiple variables, the way we call methods also changes.
If a reference contains multiple variables, which methods do we call? We need to call the method on every variable contained in that reference.
The call logic iterates through all the variables stored in the reference, gets the accessed method from each one, and then calls that method with the variable. By default, every method has an extra parameter: the object on which the method is called. The current variable is passed as this parameter, and inside the method call it becomes the "this" variable.
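The dispatch described above can be sketched as follows. This is an illustration under assumptions, not the analyzer's code: `InterpretedMethod`, `Variable`, `Reference` and `Call` are hypothetical names, methods are modelled as delegates that return a string, and a variable that does not define the method is simply skipped.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interpreted method: the first argument is the receiver ("this").
public delegate string InterpretedMethod(Variable self, IReadOnlyList<object> args);

public sealed class Variable
{
    public string TypeName { get; }
    public Dictionary<string, InterpretedMethod> Methods { get; } =
        new Dictionary<string, InterpretedMethod>();
    public Variable(string typeName) { TypeName = typeName; }
}

public sealed class Reference
{
    public List<Variable> Variables { get; } = new List<Variable>();

    // Calls the named method on every variable held by this reference.
    public List<string> Call(string methodName, params object[] args)
    {
        var results = new List<string>();
        foreach (var variable in Variables)
        {
            // Assumption: variables without this method are skipped.
            if (!variable.Methods.TryGetValue(methodName, out var method))
            {
                continue;
            }
            // The variable itself is passed as the extra "this" parameter.
            results.Add(method(variable, args));
        }
        return results;
    }
}

public static class Program
{
    public static void Main()
    {
        var circle = new Variable("Circle");
        circle.Methods["Draw"] = (self, args) => "Draw on " + self.TypeName;
        var square = new Variable("Square");
        square.Methods["Draw"] = (self, args) => "Draw on " + self.TypeName;

        var shape = new Reference();
        shape.Variables.Add(circle);
        shape.Variables.Add(square);

        Console.WriteLine(string.Join("; ", shape.Call("Draw")));
        // prints "Draw on Circle; Draw on Square"
    }
}
```

Passing the receiver explicitly keeps the method bodies independent of any particular variable, which is what allows the same call site to fan out over every variable in the reference.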