Feature SS-2318
2 votes

Preprocessing of symbolic data / functions' arguments

Created by Davide on 4/16/2016 11:18 AM Last Updated by Davide on 11/4/2016 2:33 AM

 Description

Hello Andrey,

I think we have found something to improve in some functions, namely those capable of processing both numeric and symbolic data (e.g. UnitsOf(...)).

Consider the following problem:

f(x):UnitsOf(x)+UnitsOf(y)

I think that ideally only UnitsOf(y) should be preprocessed, because UnitsOf(x) contains the argument of the function f, which is an unknown that is not yet defined at this point; its evaluation should therefore happen only when the function is called.

I know it is possible to use line and eval to accomplish the task, but it seems to me that it would be more consistent to have this natively. The question is: is it possible for you to teach the assignment operator to skip the preprocessing of whatever(x), where x is an argument of the LHS? Alternatively, is there a way for developers to tell whether the input argument contains the argument of a parent function (something like context.IsArgument(Term) / context.IsArgument(Entry))?
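A minimal Python sketch of the requested behaviour, not SMath Studio code. The tuple-based expression model and the names `mentions`, `preprocess` and `fake_evaluate` are assumptions made only for illustration; `fake_evaluate` stands in for whatever the existing preprocessing does.

```python
# Hypothetical expression model: a variable is a str, a number is an int/float,
# and a function call is a tuple ("name", arg1, arg2, ...).

def mentions(expr, names):
    """Return True if expr refers to any variable listed in names."""
    if isinstance(expr, str):
        return expr in names
    if isinstance(expr, tuple):
        return any(mentions(arg, names) for arg in expr[1:])
    return False

def preprocess(expr, lhs_args, evaluate):
    """Preprocess calls, except those that depend on an argument of the LHS."""
    if not isinstance(expr, tuple):
        return expr
    if mentions(expr, lhs_args):
        # Keep the call itself, but still preprocess its independent parts.
        return (expr[0],) + tuple(preprocess(a, lhs_args, evaluate) for a in expr[1:])
    return evaluate(expr)

def fake_evaluate(expr):
    """Stand-in for the existing preprocessing step (assumption)."""
    return ("done", expr)

# f(x):UnitsOf(x)+UnitsOf(y)
body = ("+", ("UnitsOf", "x"), ("UnitsOf", "y"))
result = preprocess(body, {"x"}, fake_evaluate)
```

Here only the `UnitsOf(y)` branch is handed to the evaluator; `UnitsOf(x)` is stored untouched because it mentions the LHS argument.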

 

 

    Davide (Friday, November 4, 2016 2:33 AM) #

Reopened because of this report. As visible in the last screenshot of comment #11, the user function is "too much preprocessed" on definition (since r is an LHS argument, NR shouldn't be asked to trigger a result).

    smath (Monday, September 19, 2016 1:34 AM) #

Finally fixed.

Thank you all for help!

    Davide (Tuesday, May 17, 2016 6:10 PM) #

First checks all fine! Thank you Andrey!!! :)

    smath (Tuesday, May 17, 2016 3:36 PM) #

Finally implemented! It was a really complex task... The good news is that the result didn't affect performance much and we still have a good level of backward compatibility - at least no test files failed during automated test execution.

Functionality can be tested with a new beta SMath Studio build (available via Extensions Manager or here: http://smath.info/file/XbN7z).

    smath (Tuesday, May 10, 2016 2:12 PM) #

Thank you! Reproduced and investigated.

Continue to work on it...

    Mike Kaganski (Tuesday, May 10, 2016 1:59 AM) #

    f(x):cases(1,x>0,error("!"))

gives an error. This shows that error() should always return itself in the symbolic phase, and only actually throw an error in numeric evaluation.

    f(x):if(x>0,1,error("!"))

somehow avoids the problem, but I suppose that making the mentioned change to the error() logic is a more general and consistent solution.

Even if cases() learns to avoid evaluating its arguments, the existing mode may prevent users from creating their own custom functions with similar effects.

 

And yet, I suppose that (given the example that Andrey gave on May 7) the following code

    y:8*'kg

    f(x):UnitsOf(x)+UnitsOf(y)+UnitsOf(z)='kg+UnitsOf(x)+UnitsOf(z)@#

should give

    f(6*'km)=UnitsOf(z)+'kg+'m (in symbolic mode)

    f(6*'km)=error (in numeric mode)

because it's unknown what the units of an uninitialized variable are!

 

Davide, I suppose you meant RHS <-> LHS?

    Davide (Tuesday, May 10, 2016 1:16 AM) #

Mhhh... oversimplifying, the logic of the assignment operator should act in two or three steps:

step 1) on the right side, preprocess everything except the functions and the operands that are arguments (SMath can handle that, it knows what is a function and what is an argument/operand)

step 2) on the right side as preprocessed in step 1, preprocess the functions (from the inner to the outer, only if there are no unknowns in the arguments)

step 3) [optional] if there are no unknowns, do a global preprocessing (to have a full preprocessing, just in case you have passed only dummy arguments on the LHS)

This should produce what was shown by Andrey in the screenshot below even for more complex cases.
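The steps above can be sketched in Python as follows. This is only an illustration under assumptions: expressions are modelled as nested tuples, variables as strings, and `substitute`, `has_unknowns`, `fold_calls` and the `functions` table are hypothetical names, not SMath Studio internals. Step 3 needs no separate pass here, because an expression without unknowns is folded completely by step 2.

```python
# Variables are str, numbers are int/float, calls are ("name", arg1, ...).

def substitute(expr, defined, lhs_args):
    """Step 1: replace defined variables; leave LHS arguments and calls alone."""
    if isinstance(expr, str):
        return expr if expr in lhs_args else defined.get(expr, expr)
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(substitute(a, defined, lhs_args) for a in expr[1:])
    return expr

def has_unknowns(expr):
    """True if any variable name is still present in expr."""
    if isinstance(expr, str):
        return True
    if isinstance(expr, tuple):
        return any(has_unknowns(a) for a in expr[1:])
    return False

def fold_calls(expr, functions):
    """Step 2: evaluate calls from the inside out, only when no unknowns remain."""
    if not isinstance(expr, tuple):
        return expr
    args = tuple(fold_calls(a, functions) for a in expr[1:])
    if expr[0] not in functions or any(has_unknowns(a) for a in args):
        return (expr[0],) + args  # returned as is
    return functions[expr[0]](*args)

functions = {"abs": abs, "+": lambda a, b: a + b}
defined = {"x": 100, "y": -8}  # a global x exists, but x is also the LHS argument

# f(x):abs(x)+abs(y)
stored = fold_calls(substitute(("+", ("abs", "x"), ("abs", "y")), defined, {"x"}), functions)

# f(x):abs(y)+1  (x is only a dummy argument, so everything can be folded)
stored_dummy = fold_calls(substitute(("+", ("abs", "y"), 1), defined, {"x"}), functions)
```

In the first case the call that depends on `x` survives unevaluated and so does the outer `+`; in the second case nothing depends on `x` and the whole right side collapses to a number.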

 

note that even f(x):num2str(2*x) fails (the function in the RHS contains the unknown).

 

edit: eheh, too tired yesterday, swapped RHS & LHS

    Mike Kaganski (Tuesday, May 10, 2016 12:13 AM) #

Davide, I suppose this means that if a function returns itself (or is excluded from processing) because of uninitialized arguments, then this result should also be regarded as uninitialized when used as input to other functions.

    Davide (Monday, May 9, 2016 7:46 PM) #

It seems that the issue is related to nested functions.

Consider:

f1(x):num2str(g1(x))

f2(x):vectorize(g2(x))

In both cases the old preprocessing approach is applied.

 

 

    Davide (Monday, May 9, 2016 7:00 PM) #

I get an error "Argument must be a string." in the attachment.

f(x):concat("output value: ",num2str(x))

    smath (Monday, May 9, 2016 6:37 PM) #

Please test requested functionality in http://en.smath.info/forum/yaf_postsm33219_SMath-Studio-0-98-5973--09-May-2016.aspx

Thanks in advance!

    Davide (Saturday, May 7, 2016 2:14 AM) #

Excellent!!!

    Mike Kaganski (Saturday, May 7, 2016 2:00 AM) #

Wow! You are great!

    smath (Saturday, May 7, 2016 1:30 AM) #

    Mike Kaganski (Friday, May 6, 2016 7:10 PM) #

This is a Good Thing. Yes. And may have consequences.

But postponing it (or even refusing) will continue to collect hacks and kludges designed to work around the current imperfect state, and will make the eventual change even more painful, because strange side effects will multiply and may become "features" that some documents depend on, so removing them will break those documents.

I understand what it is like. I am a developer, too (I work with LibreOffice), and I know how painful it is to break compatibility. Still, you (a) always have previous versions of the program for those who need compatibility at all costs; (b) may use a compatibility flag (a checkbox in settings) to enable the old style.

Your program is gaining popularity. It's not the most popular math program yet, and this is good for changes like this: there is a relatively small number of documents out there that will be broken, and a large portion of users are quite knowledgeable (this is always the case at early stages). Later, it will be harder.

You now have great experience that can help you redefine some design decisions. Please don't postpone this for too long! Please!

    smath (Friday, May 6, 2016 6:48 PM) #

Mike, as you mentioned earlier, what you are saying is not only about UnitsOf. So, I see it has a much larger scope.

My "impossible" was about changing the behavior of this specific function (UnitsOf). But if we agree that the current approach to handling functions on the right of the definition operator is incorrect, then serious changes are required in SMath Studio.

Currently we have functions that support:

  1. Numeric calculations only;
  2. Symbolic calculations only;
  3. Work with the expression before any calculation engine has handled it.

We are talking about the 3rd case. These functions know nothing about whether their arguments are defined or not, and this detection is always performed manually by the developer (a complex and non-standard task) if required. In most cases the developer doesn't handle it, because detection of undefined variables/functions will be done automatically by the numerical engine afterwards.

To change it, the mechanism for handling 3rd-case functions must be seriously reviewed:

  • Every function should have a property SupportsUndefinedArguments which is False by default. This means that SMath Studio will check whether all arguments are defined and will proceed with further calculations only if they are. Otherwise the function must be returned as is;
  • If the developer wants to work with arguments even if they are not defined, then (s)he should set SupportsUndefinedArguments to True.

I think it makes sense. But it is a big change which may (actually not may, but will) break backward compatibility...
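A rough Python sketch of how such a flag could gate evaluation. Only the property name SupportsUndefinedArguments comes from the proposal above; `FunctionInfo`, `call`, `resolve` and the two sample functions `double` and `is_known` are hypothetical and do not reflect the real plugin API.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class FunctionInfo:
    name: str
    handler: Callable
    supports_undefined_arguments: bool = False  # False by default, as proposed

def resolve(arg, defined):
    """Look up a variable name (str) or pass a literal number through."""
    return defined[arg] if isinstance(arg, str) else arg

def call(info, args, defined):
    """Invoke the handler only when allowed; otherwise return the call as is."""
    all_defined = all(not isinstance(a, str) or a in defined for a in args)
    if all_defined or info.supports_undefined_arguments:
        return info.handler(args, defined)
    return (info.name,) + tuple(args)

# A function that needs defined arguments (default flag).
double = FunctionInfo("double", lambda args, d: 2 * resolve(args[0], d))

# A function that opts in to receiving undefined arguments.
is_known = FunctionInfo("is_known", lambda args, d: args[0] in d,
                        supports_undefined_arguments=True)

defined = {"y": 4}
with_defined = call(double, ["y"], defined)      # handler runs
with_undefined = call(double, ["x"], defined)    # returned as is
opted_in = call(is_known, ["x"], defined)        # handler runs despite undefined x
```

With this shape the check lives in one place in the host program, and a developer only has to think about undefined input when the flag is explicitly set.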

    Mike Kaganski (Friday, May 6, 2016 6:15 PM) #

Andrey, I understand that it is designed this way. I am just asking you to describe why it is designed that way, and whether it is possible to change the design so that it produces more useful results and more consistent behaviour.

There is a function IsDefined(), anyone may use it to prevent passing undefined value to UnitsOf(). There's no value in returning fake units for undefined name! Just no sense here.

    smath (Friday, May 6, 2016 3:36 PM) #

UnitsOf(x) is not designed to detect whether variables inside are defined or not. It tries to substitute all variables and functions inside and then just tries to find units in the result. It works with anything written inside the argument.

 

    Mike Kaganski (Friday, May 6, 2016 3:30 PM) #

f(x)=UnitsOf(x)+1 will be inconsistent again, because it will behave differently than in all other cases where the function result depends on a "y" that is changed later.

    Mike Kaganski (Friday, May 6, 2016 3:29 PM) #

But why?

Just make UnitsOf always return itself when its result is undefined and optimization is symbolic - and make it return an error if its arg is undefined and optimization is numeric. That's all.
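The rule described in this comment can be sketched in a few lines of Python. It is an illustration only: `units_of`, the `units` table, the `mode` strings and `UndefinedArgumentError` are assumed names, not the actual UnitsOf implementation.

```python
class UndefinedArgumentError(Exception):
    """Raised in numeric mode when the argument has no definition."""

def units_of(name, units, mode):
    """Return the units of a defined name; otherwise behave according to mode."""
    if name in units:
        return units[name]
    if mode == "symbolic":
        # Undefined input in symbolic mode: the function returns itself.
        return ("UnitsOf", name)
    # Undefined input in numeric mode: report an error instead of guessing units.
    raise UndefinedArgumentError(f"{name} is not defined")

units = {"y": "kg"}
known = units_of("y", units, "symbolic")
deferred = units_of("x", units, "symbolic")
```

Calling `units_of("x", units, "numeric")` raises `UndefinedArgumentError`, so the mistake surfaces when the function is actually invoked numerically.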

    smath (Friday, May 6, 2016 3:18 PM) #

Mike, it is impossible.

As Davide has mentioned, the result will be

f(x)=UnitsOf(x)+1

    Mike Kaganski (Friday, May 6, 2016 3:16 PM) #

Just to be clear: I suppose that your example:

f(x):=UnitsOf(x)+UnitsOf(y)

where y isn't defined earlier, should return symbolically just that:

f(x)=UnitsOf(x)+UnitsOf(y)

 

Function arguments should only serve one purpose: to define the interface, i.e. in this case, to declare that the name "x" is internal, and has nothing to do with external (global or other function's local) "x". So, for all uses, x is just undefined at the time of the function definition's symbolic processing, regardless of whether there is an "x" defined above. Together with consistent treatment of undefined input in symbolic mode of all functions in SMath, this will give logical and predictable results.
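One way to picture this scoping rule, as a small Python sketch. The helper name `visible_definitions` and the dictionary of worksheet definitions are assumptions for illustration only.

```python
def visible_definitions(global_defs, lhs_args):
    """Hide every outer definition that shares a name with a function argument."""
    return {name: value for name, value in global_defs.items() if name not in lhs_args}

# An "x" is defined above the function, but f(x) declares x as its own argument.
global_defs = {"x": 10, "y": 5}
scope = visible_definitions(global_defs, {"x"})
```

While the body of `f(x)` is processed symbolically only `y` is visible, so `x` is treated as undefined no matter what was assigned to it earlier.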

    smath (Friday, May 6, 2016 2:54 PM) #

I found your thoughts very useful. Thank you!

And I agree now that the current behavior is inconsistent, because the list of a function's arguments must be used by the program as hints for further calculations.

Will check what I can do...

    Mike Kaganski (Thursday, May 5, 2016 3:47 PM) #

I suppose that some functions must be redefined to return themselves symbolically, and return an error numerically, when their input is undefined.

This is a related issue to SS-123, and would achieve the consistency here. I don't understand the meaning of UnitsOf(undefined)=1, this just doesn't make sense. Calculating numerically, I expect here an error - why presume some units for unknown input? But when processed symbolically - which is the typical case in function definitions - returning itself is just fine: the error will be caught later in numerical function invocation, if the input is still undefined.

The same for eval: when it is processed symbolically - which is the case inside typical function definition - it may safely return itself when its input is undefined - and everything will be OK, because numerical error will be caught on invocation.

Maybe just every function should behave this way - actually I don't see a reason why it shouldn't be that way; even IsDefined() isn't an exception - we use it numerically to actually find out if a name is defined, that is, at function invocation; no need to do this when it is used symbolically...

Of course, I may be wrong, but then please share what are drawbacks of this approach, so that we could try brain-storming to work out a better way dealing with this.

    Davide (Thursday, May 5, 2016 2:27 PM) #

Since y will be defined after the function, currently f(x)=2 (1+1) is stored;

I would hope to see f(x)=UnitsOf(x)+1 stored instead (since UnitsOf is designed to work even with symbolic data, it is OK that y is preprocessed; if one doesn't want that, the function can be defined as f(x,y). On the other hand, preprocessing of x is more like an error of parsing order: the purpose of an argument in a function is to have its value defined later, and here this is ignored).

The same goes for f(x):vectorize(x)+vectorize(y), which is stored as f(x)=x+y; IMHO vectorize(x) shouldn't be preprocessed (that is probably issue SS-2319), nor should any other function able to handle symbolic data (another example: f(x):strtoupper(num2str(x)) always returns "X").

 

I know, there are a lot of features and both consistency and backward compatibility are really important. I am just trying to understand whether this would improve more than it breaks, or the opposite; maybe my proposal is a dead end, but the issue still exists (I consider the abuse of eval() and one-argument line() a workaround to make something work that in theory should work without them).

 

    smath (Thursday, May 5, 2016 1:37 PM) #

I'm just trying to say that the proposed improvement is only a workaround for specific cases and does not take into account all behaviors of SMath Studio.

    smath (Thursday, May 5, 2016 1:35 PM) #

My example was simplified.

Please try to play with another one:

f(x):=UnitsOf(x)+UnitsOf(y)

Note: y may be defined AFTER the definition of this function.

    Davide (Thursday, May 5, 2016 1:04 PM) #

Mhhh... If I understand the point correctly, this issue cannot be addressed by acting only on the assignment operator, am I right?

If I look at the dynamic assistant I see that the argument name is stored, so it shouldn't be impossible to teach SS, on evaluation, that a variable definition placed after the function's definition should be applied to everything except x.

    smath (Thursday, May 5, 2016 12:44 PM) #

Yes, it is possible to teach assignment operator to skip the preprocessing of whatever(x), BUT...

As you know, SMath Studio supports defining a variable used by a function after the function definition, e.g.:

f(x):=x+y

y:=5

f(3)=8

y:=6

f(3)=9

This example demonstrates that we cannot just handle the function's argument. In order to have consistency here we need to agree to use the same approach for both variables.
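For comparison, the same late-binding behaviour can be reproduced in Python. This is only an analogy: the `env` dictionary plays the role of the worksheet's variable definitions, and `make_f` is a hypothetical helper.

```python
def make_f(env):
    """Build f(x) = x + y, where y is looked up in env when f is called."""
    return lambda x: x + env["y"]

env = {}
f = make_f(env)      # f(x):=x+y, with y not yet defined

env["y"] = 5         # y:=5
first = f(3)         # 8

env["y"] = 6         # y:=6
second = f(3)        # 9
```

Both `x` and `y` are resolved only at call time, which is the uniform treatment of the two variables that the comment argues for.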