I recently had a challenge: how do you delimit an array of character data when the delimiter value itself may appear in the data? Furthermore, that array had to fit in a SAS macro variable.
I was aware of nonprintables (hidden gremlins that can sometimes lurk in SAS program files), and attempted to use one as a delimiter. To my surprise this didn’t work, and it turned out that SAS does not fully support nonprintables in Macro. So, I turned to SAS-L for help.
There were two interesting things that came of that conversation, both from FriedEgg. Firstly, the nonprintables ARE stored in the symbol table – they just fail to resolve. However, they can be viewed by reading directly from sashelp.vmacro.
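As a rough sketch of that second point, the stored value can be inspected by reading sashelp.vmacro and printing it in hex. The macro variable name demo_np and the choice of '01'x as the nonprintable are assumptions made purely for illustration; they do not come from the SAS-L thread.

```sas
/* Hypothetical example: store a value containing a nonprintable byte */
data _null_;
  call symputx('demo_np', 'a' !! '01'x !! 'b', 'G');
run;

/* Read the raw value straight from the macro symbol table view.      */
/* Names are stored in upper case; long values span several rows,     */
/* which is what the OFFSET column tracks.                            */
data _null_;
  set sashelp.vmacro;
  where scope = 'GLOBAL' and name = 'DEMO_NP';
  put name= offset= value= $hex12.;
run;
```

Printing with the $HEX format avoids relying on how the log renders a byte that has no visible glyph.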
Secondly, a defined range of 6,400 Unicode code points already exists for 'private use' such as this! They are known as "Private Use Characters" (or "User Defined Characters") and occupy the range U+E000 to U+F8FF. The full spec is available here. To implement something like this in SAS, one uses hexadecimal notation: a quoted string with an even number of hex characters, followed by the letter x.
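A minimal sketch of the hex notation follows. A hex character constant supplies raw bytes, so 'E042'x is the two bytes E0 and 42. How those bytes are displayed, or interpreted as a character, depends on the session encoding, which is not something this post has tested across encodings.

```sas
data _null_;
  dlm    = 'E042'x;       /* two bytes: E0 and 42              */
  nbytes = length(dlm);   /* LENGTH counts bytes               */
  put nbytes= dlm= $hex4.; /* show the stored bytes in hex     */
run;
```

The log should report a length of two bytes and echo the same hex digits that were typed in the constant.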
Here be dragons..
Watch out! As pointed out by Roger DeAngelis on the same thread, when dealing with hex data in a data step one has to watch out for a common binary delimiter: '00'x. This can cause unexpected results when delimiting missing values. So you should basically avoid any hex value that contains a double 0 (such as 'E000'x). Use delimiters such as 'E042'x or 'E999'x instead.
So, here is how one might go about the delimiting task:
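/* build the delimited value and store it in a macro variable */
data _null_;
  dlm='E042'x;
  call symputx('macarray','special!!'!!dlm!!'*,xx');
run;
%put &macarray;

/* write the value out to a temporary file */
filename tmp temp;
data _null_;
  file tmp;
  length myvar $32767;
  myvar=symget('macarray');
  put myvar;
run;

/* read it back in and print it to the log */
data _null_;
  infile tmp;
  input;
  put _infile_;
run;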
Which printed (in both instances):
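To get the individual items back out of the macro variable, one option is to search for the delimiter as a whole string. This is only a sketch under the same assumptions as the code above (a two-byte 'E042'x delimiter and a macro variable named macarray); the variable names rest, item and pos are made up for the example.

```sas
/* Rebuild the example value so this block runs on its own */
data _null_;
  call symputx('macarray', 'special!!' !! 'E042'x !! '*,xx');
run;

data _null_;
  length rest item $32767;
  dlm  = 'E042'x;
  rest = symget('macarray');
  do i = 1 by 1 until (pos = 0);
    pos = index(rest, dlm);               /* match both bytes together   */
    if pos > 0 then do;
      item = substrn(rest, 1, pos - 1);   /* SUBSTRN allows empty items  */
      rest = substrn(rest, pos + 2);      /* skip the 2-byte delimiter   */
    end;
    else item = rest;                     /* last (or only) item         */
    put i= item=;
  end;
run;
```

INDEX is used rather than SCAN because SCAN treats every character in its delimiter argument as a separate delimiter, so 'E042'x would also split on a lone '42'x (the letter B in ASCII) wherever it appears in the data.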
One of the great things about the vastness of SAS is the never-ending supply of problems to solve! How do you think you compare? Find out at sasensei.com