As part of trying to track down a real answer to this issue: https://forum.copperspice.com/viewtopic.php?f=11&t=1755
Unraveling this
Code:
for (unsigned int i = 0; i < commitStrLen;) {
    const unsigned int ucWidth = commitStr.at(i).isHighSurrogate() ? 2 : 1;
    const QString oneCharUTF16 = commitStr.mid(i, ucWidth);
    const QByteArray oneChar = sqt->BytesForDocument(oneCharUTF16);
    sqt->InsertCharacter(std::string_view(oneChar.data(), oneChar.length()), EditModel::CharacterSource::directInput);
    i += ucWidth;
}
Under Qt
=====
Detailed Description. QString stores a string of 16-bit QChars, where each QChar corresponds to one UTF-16 code unit. (Unicode characters with code values above 65535 are stored using surrogate pairs, i.e., two consecutive QChars.)
=====
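For anyone following along without Qt handy, here is a rough sketch of what that loop is relying on, written in plain standard C++. The names isHighSurrogate() and splitCharacters() are made up for illustration and are not Qt or CopperSpice API; std::u16string stands in for a UTF-16 QString.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if the UTF-16 code unit is a high (leading)
// surrogate, i.e. in the range 0xD800..0xDBFF.
constexpr bool isHighSurrogate(char16_t unit) {
    return unit >= 0xD800 && unit <= 0xDBFF;
}

// Hypothetical helper: split a UTF-16 string into one-character pieces,
// taking two code units whenever a piece starts with a high surrogate.
// This mirrors the stepping logic of the loop quoted above.
std::vector<std::u16string> splitCharacters(const std::u16string &text) {
    std::vector<std::u16string> pieces;
    for (std::size_t i = 0; i < text.size();) {
        const std::size_t width = isHighSurrogate(text[i]) ? 2 : 1;
        pieces.push_back(text.substr(i, width)); // substr clamps at the end
        i += width;
    }
    return pieces;
}
```

Feeding it u"a\U0001F600b" gives three pieces, the middle one two code units long. The stepping only makes sense if the storage really is UTF-16; with 32-bit characters there would be no surrogates to look for.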
So, whoever did this
Code:
QByteArray ScintillaQt::BytesForDocument(const QString &text) const
{
    if (IsUnicodeMode()) {
        return text.toUtf8();
    } else {
        QTextCodec *codec = QTextCodec::codecForName(
            CharacterSetID(CharacterSetOfDocument()));
        return codec->fromUnicode(text);
    }
}
No problem under Qt. The hapless coder then goes looking to see what fromUnicode() returns in its byte array under CopperSpice, and finds only this:
Converts str from Unicode to the encoding of this codec and returns the result in a QByteArray. This method updates the state.
Is the content of this byte array QChar32, so that everything is 4 bytes wide and there is no need for high surrogate logic?
Is this still returning a QByteArray of UTF-16?
There is a similar issue for QTextCodec::convertFromUnicode.
Basically, everywhere the documentation shows a QByteArray being returned from something that could have been some form of a string, it needs to spell out what form the content is really in.
There is no safe assumption here.
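To show why it matters, here is a small Qt-free sketch that encodes a single code point three ways. The function names are hypothetical, the encoders assume a valid non-surrogate code point, and the little-endian byte order is just my choice for the example. It says nothing about what CopperSpice actually puts in the QByteArray, which is exactly the open question.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: encode one code point as UTF-8 (1 to 4 bytes).
std::string encodeUtf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: encode one code point as UTF-16, little-endian
// (2 bytes, or 4 bytes when a surrogate pair is needed).
std::string encodeUtf16LE(char32_t cp) {
    std::string out;
    auto put = [&out](std::uint32_t unit) {
        out += static_cast<char>(unit & 0xFF);
        out += static_cast<char>((unit >> 8) & 0xFF);
    };
    if (cp < 0x10000) {
        put(cp);
    } else {
        const std::uint32_t v = cp - 0x10000;
        put(0xD800 | (v >> 10));   // high surrogate
        put(0xDC00 | (v & 0x3FF)); // low surrogate
    }
    return out;
}

// Hypothetical helper: encode one code point as UTF-32, little-endian
// (always 4 bytes).
std::string encodeUtf32LE(char32_t cp) {
    std::string out;
    for (int shift = 0; shift < 32; shift += 8) {
        out += static_cast<char>((cp >> shift) & 0xFF);
    }
    return out;
}
```

The letter A comes out as 1, 2, or 4 bytes depending on which encoder is used, and U+1F600 comes out as 4 bytes in all three but with different byte values each time. A caller holding only the bytes cannot tell which one it was given.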