I feel this will be overlooked if I don't explain a few things first.
Information about RTL languages, the case for RTL support, its importance and why it shouldn't be overlooked
Here is a quick note on RTL writing systems from Wikipedia:
In a right-to-left, top-to-bottom script (commonly shortened to right to left or abbreviated RTL), writing starts from the right of the page and continues to the left. This can be contrasted against left-to-right writing systems, where writing starts from the left of the page and continues to the right.
Arabic, Hebrew, Persian, Urdu, and Sindhi are the most widespread RTL writing systems in modern times.
Right-to-left can also refer to top-to-bottom, right-to-left (TB-RL or TBRL) scripts such as Chinese, Japanese, and Korean, though they are also commonly written left to right. Books designed for predominantly TBRL vertical text open in the same direction as those for RTL horizontal text: the spine is on the right and pages are numbered from right to left. (1)
The Arabic script is the writing system used for writing Arabic and several other languages of Asia and Africa, such as Persian, Kurdish, Azerbaijani, Sindhi, Pashto, Lurish, Urdu, Mandinka, and others... It is the second-most widely used writing system in the world by the number of countries using it and the third by the number of users, after Latin and Chinese characters. (2)
Here is a map to illustrate this info:
- All varieties of Arabic combined are spoken by perhaps as many as 422 million speakers (native and non-native) in the Arab world. (3)
- There are approximately 110 million Persian speakers worldwide, with the language holding official status in Iran, Afghanistan, and Tajikistan. (4)
- According to Nationalencyklopedin's 2010 estimates, Urdu is the 21st most spoken first language in the world, with approximately 66 million speakers. (5)
- Hebrew is a Northwest Semitic language native to Israel; the modern version of which is spoken by over 9 million people worldwide. As a foreign language, it is studied mostly by Jews and students of Judaism and Israel, and by archaeologists and linguists specializing in the Middle East and its civilizations, as well as by theologians in Christian seminaries. (6)
My take on the issue (AKA the case for RTL support, its importance and why it shouldn't be overlooked)
Hundreds of millions of people (or even more than a billion, if we take into account TBRL in languages such as Chinese, Japanese and Korean) live with, speak, and interact with RTL language systems.
Language is one of the basic means of communication and interaction. When users can't do simple things with our software because we don't respect their language, we reduce our reach to other communities. I believe the goal we set for ourselves (KDE Usability & Productivity, AKA "Top-notch usability and productivity for basic software") means exactly that: we should attend to problems that undermine _Usability & Productivity_, not only for the current users of KDE software, but for the billions of potential users out there. Not allowing hundreds of millions or billions of potential users to interact with our software is a serious problem. Fixing it should be a top priority.
If someone is missing a shortcut button in Kate, that might be annoying to some users of Kate, but if Kate (or Okular) doesn't support RTL all the way, then this is going to be annoying to billions of potential users.
Be honest: would you continue using Kate if the bugs I describe here were happening to you? If you couldn't delete or add a letter in Kate because every time you tried, it was deleted or added somewhere else in the text? Imagine this: would you continue using Okular if you couldn't use the search function, because Okular (contrary to Atril and other gtk3 PDF readers) treats RTL text in a messy way? For you, a PDF file would then be like an image file: you can't interact with it (can't search, can't copy and paste). That defeats the purpose of using a PDF file.
Maybe you're a power user who can find lots of workarounds: use Atril instead of Okular, use Pluma instead of Kate, and patch together a working system that can handle your native RTL language. But if you're sincere, would you recommend a KDE system like Kubuntu to your beginner friends and compatriots, people you care about and who count on you when it comes to computers?
I'm not a professional critic like Igor Ljubuncic / Dedoimedo, but my critique is sincere and I believe not-fixing-issues-like-this undermines the adoption of KDE software that we hope to achieve.
I believe that if open source software, and especially KDE software, is to grow in markets outside LTR language systems, it cannot happen without extending support to RTL language systems as well. Microsoft did it during the 1990s... I mean, fixing RTL issues for things like Notepad, the office suite, etc. There have been no major RTL issues in their software in the last 20 years. Basic things must work first. As @ngraham said in his proposal that was adopted:
We will need to focus on adding productivity-related features, fixing bugs, and addressing quality-of-life issues in our software's GUI interfaces, particularly the most basic and commonly-used KDE software and the frameworks that power them: ..., Kate, Okular,.. (7)
We could say "Let those other people use other basic applications that we didn't write", we could say "We can't solve this problem until x, y, z is solved first, and that is not in our hands", we could say "We don't care for people from communities of non-European languages" or "We don't care for scholars/students that interact with the history/culture of those communities". I'm not saying that we should, but it's perfectly OK to decide so. If we don't see those hundreds of millions as potential KDE users, though, that decision should be debated, thought through thoroughly, understood and agreed upon.
Finally, Kate's bug/s
The first bug is the most serious one I found in Kate. (I believe the other ones are related to it, and I'm mentioning them because they might help figure out how to solve this.)
- If a line written in an RTL language is long enough to slide (wrap) to a second line, Kate's marker will not be able to understand the location of the letters. It will show itself between two letters, but in fact it is a _mirage_, since there's no correlation between the position we see and the position it actually has.
This is an example given with Latin characters, just to help someone who's not familiar with RTL text understand what I'm about to show. Imagine you have a long sentence like this:
am am am am am am am am am am am am am am am am am am am am am
and it's long enough that the line continues and slides to a second line. If you place the marker in the middle of the word and add a character, for example "r", you expect to see the word "arm". What happens in Kate (with an RTL language) is that the letter is added one character after the position where the marker stands, so you get the word "amr" (which means nothing...). The same thing happens when you try to delete. For example, placing the marker at the end of a word, "am", and deleting the last character (with backspace), you would expect the "m" to be deleted, but Kate deletes the next character instead, in this example the space (" "). If the line is short (not sliding to a second line), everything works as it should.
Now I'll show an example with Hebrew (an RTL language) characters. In this example I will write the word "אם" a few times, until the sentence slides to the next line, and then I'll try to add a character between two letters, and it will be added after the next letter instead of where the marker is. Then I'll try to delete the last character "ם". Instead of it being deleted, the space after it will get deleted, joining the two words. Afterwards, I will shorten the line so it's not sliding, and then make the same deletion, and you'll see that everything works as expected.
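To make the Latin-character example concrete, here is a minimal Python sketch of the symptom. It is only a model of what I see on screen, not Kate's actual code: the helper names `insert_at` and `backspace_at` are hypothetical, and the "+ 1" offset is my assumption based on the observed behavior, not a diagnosis of the cause.

```python
def insert_at(text: str, cursor: int, ch: str) -> str:
    """Insert ch at the given logical cursor index."""
    return text[:cursor] + ch + text[cursor:]


def backspace_at(text: str, cursor: int) -> str:
    """Delete the character just before the given logical cursor index."""
    if cursor == 0:
        return text
    return text[:cursor - 1] + text[cursor:]


line = "am am am"

# Marker shown between "a" and "m" of the first word.
shown_cursor = 1
expected_insert = insert_at(line, shown_cursor, "r")      # what the user expects
observed_insert = insert_at(line, shown_cursor + 1, "r")  # assumed off-by-one model

# Marker shown at the end of the first word, right after "m".
shown_cursor = 2
expected_delete = backspace_at(line, shown_cursor)
observed_delete = backspace_at(line, shown_cursor + 1)    # assumed off-by-one model

print(expected_insert)  # arm am am
print(observed_insert)  # amr am am
print(expected_delete)  # a am am
print(observed_delete)  # amam am
```

In other words, the position the marker is drawn at and the position the edit is applied to appear to differ by one character on wrapped RTL lines.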
You can copy this line and try it for yourself:
אם אם אם אם אם אם אם אם אם אם אם אם אם אם אם אם אם אם אם אם אם אם אם אם אם אם אם
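If the line above does not wrap in your window, a longer one can be generated. This is just a convenience sketch: the function name `make_rtl_line` and the word count of 40 are arbitrary choices of mine, so adjust the count until the line slides to a second line in your Kate window (with dynamic word wrap enabled).

```python
import unicodedata

# The Hebrew word used in this report: alef (U+05D0) followed by final mem (U+05DD).
WORD = "\u05d0\u05dd"


def make_rtl_line(words: int) -> str:
    """Return a single line made of WORD repeated, separated by spaces."""
    return " ".join([WORD] * words)


# Both letters carry the Unicode bidirectional class "R" (right-to-left).
assert all(unicodedata.bidirectional(c) == "R" for c in WORD)

line = make_rtl_line(40)  # 40 is an arbitrary length; increase it if the line does not wrap
print(line)
```

Paste the printed line into an empty Kate document to reproduce the behavior.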
This bug is extremely annoying if you're a user of Kate. Imagine that every correction you wish to make in your text, every word you want to add, just creates more mess. In the end you start highlighting whole words just to fix one character, or deleting a whole word just to retype the same word along with others.
- In this kind of problematic line, the marker will sometimes "leave" a trace - a dot on top of where it was. Maybe this can help someone figure out why this is happening...
- When you hit the "Home" button to get to the beginning of the line, and then hit the left-arrow-key, it does "nothing". The expected behavior would be for the marker to get behind the first character. Only when hitting the left-arrow-key a second time does the marker move and get placed after the first character. It seems like there's an invisible character in "weird" long lines like this.
- Another problem with Kate is upward movement with the up-arrow-key, which sometimes stops working when the marker reaches a line of that sort (a long line that slides). The marker moves up until it gets to this kind of problematic line. Then it jumps to the beginning of the line (even if the movement didn't start at the beginning of a line) and stays there. Only pressing the right-arrow-key (which moves the marker to the last letter in the line above) frees the marker. The down-arrow-key always works.
Thank you for your patience, I hope this can help someone in fixing this. It took me around 4 hours to compile this bug report with all the videos and links. So I also hope this is appreciated.