07/03/2017
I have a variable node which uses the String data type. According to the specification, string values are “encoded as a sequence of UTF-8 characters”. The thing is that my application only supports the Latin-1 character set. Is it acceptable to support only a subset of all Unicode characters and to reject write requests that include Unicode characters outside that subset? Would Bad_WriteNotSupported be the preferred result code in that case?
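To make the kind of check I have in mind concrete, here is a minimal sketch (Python purely for illustration; it is not tied to any particular OPC UA SDK, the function names are made up, and the status code is just the candidate I am asking about):

```python
# Hypothetical write handler: accept the value only if every character can be
# represented in Latin-1 (ISO 8859-1), otherwise reject the write.

stored_values = {}

def is_latin1_representable(value: str) -> bool:
    """True if every code point fits in the Latin-1 range (U+0000..U+00FF)."""
    return all(ord(ch) <= 0xFF for ch in value)

def handle_string_write(node_id: str, value: str) -> str:
    if not is_latin1_representable(value):
        # The result code is the open question; Bad_WriteNotSupported is only
        # the candidate mentioned above.
        return "Bad_WriteNotSupported"
    stored_values[node_id] = value
    return "Good"

print(handle_string_write("ns=2;s=Demo", "café"))  # Good
print(handle_string_write("ns=2;s=Demo", "水"))     # Bad_WriteNotSupported
```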
05/30/2017
You can treat every string as an opaque sequence of bytes, which allows you to return whatever is written.
UTF-8 is also an 8-bit encoding, so the byte sequence can be handled as if it were a Latin-1 string even if it is not.
Supporting UTF-8 is not optional for servers. Can you explain more about why this is an issue?
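As a rough sketch of what I mean by treating the value as opaque bytes (Python just for illustration, not any specific SDK): the server stores whatever byte sequence the client wrote and hands it back unchanged, without ever decoding it.

```python
# Sketch of the "opaque sequence of bytes" approach: store the raw UTF-8
# bytes as written and return them verbatim on read, with no decoding.

storage = {}

def write_string(node_id: str, utf8_bytes: bytes) -> None:
    storage[node_id] = utf8_bytes          # no decoding, no validation

def read_string(node_id: str) -> bytes:
    return storage[node_id]                # byte-for-byte what was written

write_string("ns=2;s=Demo", "水".encode("utf-8"))    # b'\xe6\xb0\xb4'
assert read_string("ns=2;s=Demo") == b"\xe6\xb0\xb4"
```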
07/03/2017
Storing and returning a sequence of bytes is not an issue. The problem is that the application might end up interpreting/displaying some characters incorrectly because it doesn’t support UTF-8. The character “水” (UTF-8 bytes \xe6\xb0\xb4) would, for example, be displayed as something else entirely if interpreted as Latin-1 (or, more realistically, the decoding would fail).
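To show the misinterpretation concretely (Python purely for illustration):

```python
# The same three bytes mean different things depending on the assumed
# character set, and the character cannot be represented in Latin-1 at all.

raw = b"\xe6\xb0\xb4"

print(raw.decode("utf-8"))      # 水  (the intended character)
print(raw.decode("latin-1"))    # æ°´ (what a Latin-1 interpretation yields)

try:
    "水".encode("latin-1")
except UnicodeEncodeError as err:
    print(err)                  # 'latin-1' codec can't encode character '\u6c34' ...
```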
I mean, it must be up to the application to perform range checks, right? For integer values the application must be allowed to reject out-of-range values, and the same must apply to strings, no?
05/30/2017
This has not come up before. Filing a Mantis issue would be best.
A write that unexpectedly fails for unknown reasons is not good for interoperability. Failures due to range checks usually have metadata that explains the behavior, like EnumStrings. My inclination is to say you should figure out some way to handle it within your server.