How to protect arbitrary variable names
I am programming functions for large Matlab projects. Inside these functions, I sometimes use arbitrary variable names like 'response', 'val' and - of course - the infamous 'i'.
Since these functions often end up nested inside other functions and scripts, I am worried that me modifying 'i' may interfere with an outside loop, which wouldn't be that cool.
So, before I start giving IDs to every variable, is there a way to make variables unique? Is it even a problem to reassign a variable from the variable space inside a function?
Accepted Answer
dpb
on 4 May 2021
Functions have their own workspace, although nested functions share variables with their containing function and anonymous functions capture values from the workspace in which they are created. Read all about scope starting at <Scope-variables> and the related links. And, of course, read the doc for function itself and the background information provided there.
Basically, it is not a problem--the variable i inside a local function knows nothing about the base workspace variable i and vice versa.
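To make the point about separate workspaces concrete, here is a minimal sketch written as a script with a local function at the end (this layout assumes R2016b or later). The name sumTo is hypothetical and only serves the illustration.

```matlab
% Script part: this i lives in the workspace of the script
i = 5;
total = sumTo(3);   % sumTo loops over its own i internally
disp(i)             % prints 5, the loop inside sumTo did not touch it
disp(total)         % prints 6

function s = sumTo(n)
    % Local function (hypothetical name) with its own workspace:
    % this i is unrelated to the i of the caller
    s = 0;
    for i = 1:n
        s = s + i;
    end
end
```

The same holds when sumTo lives in its own file sumTo.m; only the loop counter inside the function changes.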
2 Comments
dpb
on 5 May 2021
A C function has access to file scope variables, but that's the fault of the coder in defining them at that level and not as local variables.
MATLAB avoids this in function files by not allowing code outside the function block itself and variables with global scope must be declared as global explicitly.
The exception is, of course, the "nested" function that is encapsulated in another function that does have access to the variables in the containing function.
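The nested-function exception can be sketched as follows, assuming the code is saved as a function file outerDemo.m (outerDemo and bump are hypothetical names). Because i is used in both the containing function and the nested function, the two share it.

```matlab
function result = outerDemo()
    % Containing function: i is defined here
    i = 1;
    bump();         % the nested function overwrites the outer i
    result = i;     % 99, not 1

    function bump()
        % Nested function: this i is the same variable as in outerDemo
        i = 99;
    end
end
```

If bump were instead a local function placed after the end of outerDemo, it would get its own workspace and result would stay 1.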
More Answers (2)
Matt J
on 4 May 2021
Edited: Matt J
on 4 May 2021
Since these functions often end up nested inside other functions
I can't tell here if you genuinely mean the functions are "nested" in the sense of a function defined inside the body of another function, or if you just mean that you have functions calling other functions.
It's never a problem to reassign a variable name inside a function which doesn't share its workspace with other functions. If it does share a workspace, as true nested functions do, then yes that can be a problem. It would be peculiar if you were doing that very often, though. The main point of having a function is normally so that you can do computations in the protected isolation of a separate workspace.
Jan
on 4 May 2021
If you work on a larger project (any code with > 20 lines), avoid scripts. With them you shoot yourself in the foot. When you use functions, the problem of variables being redefined by external code vanishes automatically.
Keep the functions small to avoid confusing the variables inside them. There are some rules of thumb to limit the size:
- 1000 lines is the maximum
- If you cannot explain what the function does in 1 sentence, it is too long.
There are no automatic ways to make variables unique. The programmer has to keep track of them.
The shadowing of built-in functions is a more serious problem, because you cannot know which toolboxes the users of your code have installed. The rule is then to append user-defined folders at the bottom of the path, so that at least Matlab's functions are preferred. For collisions with user-defined functions see: https://www.mathworks.com/matlabcentral/fileexchange/27861-uniquefuncnames
In case of conflicts, using packages should help.
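As a rough sketch of appending a user folder at the bottom of the path and checking for shadowing: the folder name below is hypothetical, and isfolder assumes R2017b or later (older releases can use exist(userDir, 'dir')).

```matlab
% Hypothetical folder with user-defined functions
userDir = fullfile(tempdir, 'my_project_tools');
if ~isfolder(userDir)
    mkdir(userDir);
end

% Append at the END of the search path, so Matlab's own functions win
addpath(userDir, '-end');

% List every definition of a name; the first entry is the one Matlab calls
hits = which('sum', '-all');
disp(hits)
```

A user file named sum.m in that folder would then appear further down in hits, marked as shadowed. Use rmpath(userDir) to undo the path change.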
2 Comments
Jan
on 5 May 2021
Such "brownfield projects" are a serious problem. Code that is not controlled by unit tests is not trustworthy, and combining such code cannot produce trustworthy software. In consequence a scientist should not trust the results. For large projects (here "large" means > 100'000 lines of code), it is usual to start a refactoring at a certain level of complexity: once it has been proven that the software is working and useful, it is time to rewrite it in a stable, well documented and trustworthy way.
Object-oriented methods help to control the access to private properties. This can be done with a functional programming style also. I've written a tool for clinical decision making, starting in 1999 with Matlab 4.1. It has more than 300'000 lines of code now and an exhaustive self test. Many working parts have been rewritten completely, and members of different labs have contributed their own tools. Because the OOP methods of Matlab 4.1 were very limited, the code provides structs to all subfunctions, but the self test controls which subfunctions are allowed to modify which fields. Through this access restriction users can insert their own plug-ins without having the power to overwrite internal data. They can even modify internal code to adjust it to the needs of their labs, and then the software registers all modified lines, documents them in a version history and starts the self test automatically. This runs for about 24 hours, and it has found a lot of undocumented changes and bugs in Matlab's toolboxes.
I've seen too many brownfield projects fail. In Germany it was the new database of the health insurance AOK, which used self tests containing 1000 members. Although it worked properly with these, expanding it to a million members failed tremendously, because the linear search methods let the runtime explode to hours for each interaction. So after investing 1e9 €, they decided to stop the project.
IBM offered a database for the complete control of staff and goods to the supermarket Lidl. After years of work, the project was stopped because of a fundamental incompatibility of the systems: IBM's database defines the value of a piece by the price of buying it, Lidl defines it by the price of selling it. It took years of work and billions of Euro until they gave up.
I've seen too much code written for dissertations that works with the examples used for the thesis but fails for any other application, because it is an undocumented bunch of scripts without any software engineering, without version history, and without defined interfaces and unit tests. The invested time, money and know-how are lost for the team when the member leaves the institute. The scientific institutes then own a pile of unusable software, which only wastes space in the backups. The problem is missing project management and a lack of knowledge of software engineering.
Therefore you hit a very important point: "some projects aren't all rainbows and unicorns". The reliable solution is clear and easy: decide on a refactoring, or delete the code.
The risk management of a project should tell you when the time for a refactoring has come. See e.g. the agile Kanban methods.