std::string_view

seasoned_geek
Posts: 246
Joined: Thu Jun 11 2020 12:18 pm

Post by seasoned_geek »

All,

I'm porting Scintilla to CopperSpice for use in RedDiamond. Scintilla is really robust and many editors like Notepadqq and Juffed use it.

I'm down to the last source file and two compilation errors in the input method, both with code like this.

Code: Select all

    if ( !event->commitString().isEmpty() )
    {
        const QString commitStr = event->commitString();
        const unsigned int commitStrLen = commitStr.length();

        for ( unsigned int i = 0; i < commitStrLen; )
        {
            const unsigned int ucWidth = commitStr.at( i ).isHighSurrogate() ? 2 : 1;
            const QString oneCharUTF16 = commitStr.mid( i, ucWidth );
            const QByteArray oneChar = sqt->BytesForDocument( oneCharUTF16 );

            sqt->InsertCharacter( std::string_view( oneChar.data(), oneChar.length() ), EditModel::CharacterSource::directInput );
            i += ucWidth;
        }

    }
I know the intent here. InsertCharacter() is ultimately handing this off to some Python stuff. There is no changing that. We should be able to do something like this:

Code: Select all

for ( QChar32 oneChar : commitStr)
{
    sqt->InsertCharacter( oneChar.toStdStringView(), EditModel::CharacterSource::directInput );
}
I've got a strong suspicion this code doesn't work with Unicode sets where a single UTF-16 code unit doesn't hold the character value.
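
To make that concern concrete, here is a minimal sketch in plain standard C++ (no CopperSpice or Scintilla types; `decodeUtf16` is a made-up helper name) of how UTF-16 stores a character that does not fit in one 16-bit unit. Such a character takes two units, a high and a low surrogate, which is what the isHighSurrogate() test in the original loop appears to be checking for. A string that iterates by whole code points would not need that test at all.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Decode a UTF-16 sequence into UTF-32 code points.
// A high surrogate (0xD800..0xDBFF) followed by a low surrogate
// (0xDC00..0xDFFF) combines into one code point above U+FFFF.
std::vector<char32_t> decodeUtf16(const std::u16string &text)
{
    std::vector<char32_t> out;

    for (std::size_t i = 0; i < text.size(); ++i) {
        const char16_t unit = text[i];
        const bool isHigh = unit >= 0xD800 && unit <= 0xDBFF;

        if (isHigh && i + 1 < text.size()
                && text[i + 1] >= 0xDC00 && text[i + 1] <= 0xDFFF) {
            const char32_t high = unit - 0xD800;
            const char32_t low  = text[i + 1] - 0xDC00;
            out.push_back(0x10000 + (high << 10) + low);
            ++i;   // the pair consumed two code units
        } else {
            // BMP character (an unpaired surrogate is passed through as-is)
            out.push_back(unit);
        }
    }

    return out;
}
```

For example, the UTF-16 string u"a\U0001F600" holds three code units but decodes to two code points.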

So, given my complete inability to focus right now, what is the solution? It __has__ to be something simple that I'm walking right past.
barbara
Posts: 443
Joined: Sat Apr 04 2015 2:32 am

Re: std::string_view

Post by barbara »

Please provide the error messages you are seeing, hopefully just a few lines. If this is a long error message please email and we will review and then post something here.

Thanks,

Barbara
seasoned_geek
Posts: 246
Joined: Thu Jun 11 2020 12:18 pm

Re: std::string_view

Post by seasoned_geek »

barbara wrote: Wed Mar 31 2021 2:11 am Please provide the error messages you are seeing, hopefully just a few lines. If this is a long error message please email and we will review and then post something here.
I did not include the error because I didn't want to go down that rabbit hole. Here it is, per request.

Code: Select all

/home/roland/sf_projects/roland_hughes-scintilla/copperspice/ScintillaEditBase/ScintillaEditBase.cpp:596:49: error: ‘class QChar32’ has no member named ‘isHighSurrogate’
  596 |    const unsigned int ucWidth = commitStr.at(i).isHighSurrogate() ? 2 : 1;
      |                                                 ^~~~~~~~~~~~~~~
/home/roland/sf_projects/roland_hughes-scintilla/copperspice/ScintillaEditBase/ScintillaEditBase.cpp:621:50: error: ‘class QChar32’ has no member named ‘isHighSurrogate’
  621 |    const unsigned int ucWidth = preeditStr.at(i).isHighSurrogate() ? 2 : 1;
      |                                                  ^~~~~~~~~~~~~~~
ninja: build stopped: subcommand failed.
Now please ignore it.

The algorithm they are using is bad.

I was hoping that, given CopperSpice is wrapping standard C++ for many things, we had a direct method like QChar32::toStringView().

After some sleep, I poked around this morning.

https://docs.microsoft.com/en-us/cpp/standard-library/string-view?view=msvc-160

Type name        Description
string_view      A specialization of the class template basic_string_view with elements of type char.
wstring_view     A specialization of the class template basic_string_view with elements of type wchar_t.
u16string_view   A specialization of the class template basic_string_view with elements of type char16_t.
u32string_view   A specialization of the class template basic_string_view with elements of type char32_t.

I don't pay close attention to the C++ standard. Been using Qt so long I've not had to care (not to be confused with char).

I didn't think the C++ standard was still trapped in 1989 like that code. What worries me about the above snippet (and perhaps this is just the MS implementation) is that string_view doesn't appear to be a generic morphable entity, one where anything written to receive a string_view could receive a u32string_view just the same with nary a hiccup.
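
That worry can be checked at compile time. The sketch below uses only the standard library; `byteCount` is a hypothetical stand-in for a function such as InsertCharacter() that takes a std::string_view. As far as the standard goes, the view types are separate specializations of basic_string_view with no conversion between them, so this is not specific to the MS implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <type_traits>

// The view types are separate specializations of std::basic_string_view.
static_assert(std::is_same_v<std::string_view, std::basic_string_view<char>>);
static_assert(std::is_same_v<std::u32string_view, std::basic_string_view<char32_t>>);

// Neither converts to the other, so a function declared to take a
// std::string_view cannot be called with a std::u32string_view.
static_assert(!std::is_convertible_v<std::u32string_view, std::string_view>);
static_assert(!std::is_convertible_v<std::string_view, std::u32string_view>);

// Hypothetical stand-in for a function that receives raw bytes.
std::size_t byteCount(std::string_view bytes)
{
    return bytes.size();
}
```

So a std::string_view does not "morph" based on content. It is a pointer and a length over char elements, and whatever encoding those bytes carry is up to the caller and callee to agree on.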

The semi-direct replacement hack would be to make a QString out of each character, then call
std::wstring QString8::toStdWString ( ) const
and std::string_view() that. It still couldn't handle Unicode sets requiring more than two bytes, which is the current code's limitation.

What I don't know and am still poking around on and pondering a bit is if CopperSpice has something along these lines.

https://docs.microsoft.com/en-us/cpp/standard-library/string-view-typedefs?view=msvc-160
=====
u32string_view

A type that describes a specialization of the class template basic_string_view with elements of type char32_t.

typedef basic_string_view<char32_t, char_traits<char32_t>> u32string_view;
=====

Such that

Code: Select all

for ( QChar32 oneChar : commitStr)
{
    sqt->InsertCharacter( std::u32string_view(oneChar), EditModel::CharacterSource::directInput );
}
would/should actually work. This eliminates all issues with Unicode up to 4 bytes wide. I can live with screwing the 5+ byte wide Unicode sets (if there are any) because they need to go on a diet.

To summarize:

It's not the compiler error I'm trying to fix, it's the busted algorithm. I wanted to ask someone before "just throwing code" at this. Either my understanding of string_view is horribly wrong or this algorithm never worked beyond ASCII anyway. Via the MS documentation (first in the search results) it should have only accepted a char unless it can morph based on content.

If it can morph based on content I just need a way that I cannot find in the documentation of going from QChar32 to u32string_view and life is good.
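
A minimal sketch of both directions, in plain standard C++ so it does not depend on any CopperSpice API. It assumes the code point is already available as a char32_t (getting it out of a QChar32, for instance through a unicode()-style accessor, is an assumption not verified here), and that the receiving function wants UTF-8 bytes. The original code delegates the encoding choice to BytesForDocument(), so UTF-8 is only an illustration. `encodeUtf8` and `viewOf` are made-up helper names.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Encode one Unicode code point as UTF-8 (1 to 4 bytes).
std::string encodeUtf8(char32_t cp)
{
    // Precondition: a valid scalar value (in range, not a surrogate).
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

    std::string out;

    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }

    return out;
}

// A u32string_view over a single code point: a pointer plus a length of 1.
// The view does not own the character; the referenced object must outlive it.
std::u32string_view viewOf(const char32_t &cp)
{
    return std::u32string_view(&cp, 1);
}
```

With something like this, the loop body would build the bytes first and then pass std::string_view( bytes.data(), bytes.size() ), keeping the std::string alive for the duration of the call, since the view only points at the bytes.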