Handle unsupported unicode characters|OPC UA Standard|Forum|OPC Foundation

Handle unsupported unicode characters
Christoffer Lind
11/16/2020 - 05:07

I have a variable node which uses the String data type. According to the specification, string values are "encoded as a sequence of UTF-8 characters". The thing is that my application only supports the Latin-1 character set. Is it acceptable to support only a subset of all Unicode characters and reject write requests that include Unicode characters outside that subset? Would Bad_WriteNotSupported be the preferred result code in that case?

Randy Armstrong
11/16/2020 - 22:54

You can treat every string as an opaque sequence of bytes which will allow you to return whatever is written.

UTF-8 is also an 8-bit encoding, so its bytes can be handled as if they were a Latin-1 string even when they are not.

Supporting UTF-8 is not optional for servers. Can you explain more about why this is an issue?
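The opaque-bytes approach described above can be sketched as follows. This is only an illustration, not code from any OPC UA stack: `OpaqueStringNode` and its `write`/`read` methods are hypothetical names standing in for however a server stores a String variable's value.

```python
class OpaqueStringNode:
    """Hypothetical stand-in for a String variable node (not a real OPC UA API)."""

    def __init__(self) -> None:
        self._raw = b""

    def write(self, wire_bytes: bytes) -> None:
        # Store the bytes exactly as received; no decoding or conversion.
        self._raw = bytes(wire_bytes)

    def read(self) -> bytes:
        # Return exactly what was written.
        return self._raw


original = "水 Grüße".encode("utf-8")

node = OpaqueStringNode()
node.write(original)
round_tripped = node.read()

# A UTF-8 aware client gets back the same text it wrote.
print(round_tripped.decode("utf-8"))
```

Because the server never interprets the bytes, the round trip is lossless regardless of which characters the client wrote. It does not help if the application itself has to display or process the text.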

Christoffer Lind
11/17/2020 - 02:01

Storing and returning a sequence of bytes is not an issue. The problem is that the application might end up interpreting or displaying some characters incorrectly, as it doesn't support the UTF-8 character set. The UTF-8 character "水" (\xe6\xb0\xb4), for example, would be displayed as something completely different if interpreted as Latin-1 (or, more realistically, the decoding would fail).
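A small sketch of the misinterpretation described above, using Python's built-in codecs as a stand-in for the application's Latin-1 handling (the variable names are illustrative only). Note that Python's `latin-1` codec maps every byte value to a character, so this particular decode does not raise; other decoders may behave differently.

```python
# The three UTF-8 bytes of "水", as quoted in the post above.
utf8_bytes = "水".encode("utf-8")
assert utf8_bytes == b"\xe6\xb0\xb4"

# Interpreting those bytes as Latin-1 yields three unrelated characters.
as_latin1 = utf8_bytes.decode("latin-1")
print(as_latin1)

# Going the other way, the character has no Latin-1 representation at all.
try:
    "水".encode("latin-1")
    representable = True
except UnicodeEncodeError:
    representable = False
print(representable)
```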

I mean, surely it must be up to the application to perform range checks? For integer values the application must be allowed to reject out-of-range values, and the same must apply to strings, no?

Randy Armstrong
11/17/2020 - 06:56

This has not come up before. Filing a Mantis issue would be best.

A write that unexpectedly fails for unknown reasons is not good for interoperability (IOP). Failures for range checks usually have metadata that explains the behavior, like EnumStrings. My inclination is to say you should figure out some way to handle it within your server.

Randy Armstrong
11/17/2020 - 10:21

The UA WG discussed this issue and agreed that a failed UTF-8 to Latin-1 conversion is no different from failing because a particular variable does not allow spaces or punctuation. The error code to return is Bad_OutOfRange.
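A minimal sketch of how a server-side write check might apply that decision. Everything here is an assumption for illustration: `check_latin1_write` is a hypothetical helper, the status codes are represented as plain strings rather than a real stack's StatusCode type, and the incoming value is assumed to have already been decoded from valid UTF-8 into a Python `str`.

```python
from typing import Optional, Tuple

# Symbolic status names only; a real stack would use its own StatusCode type.
GOOD = "Good"
BAD_OUT_OF_RANGE = "Bad_OutOfRange"


def check_latin1_write(text: str) -> Tuple[str, Optional[bytes]]:
    """Hypothetical write check: accept only text representable in Latin-1."""
    try:
        latin1_value = text.encode("latin-1")
    except UnicodeEncodeError:
        # At least one character falls outside the Latin-1 subset.
        return BAD_OUT_OF_RANGE, None
    return GOOD, latin1_value


accepted = check_latin1_write("Grüße")
rejected = check_latin1_write("水")
print(accepted)
print(rejected)
```

The check treats the supported character set like any other value range, so a client sees the same result code it would get for an out-of-range number.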

Forum Timezone: America/Phoenix