src/core/guesslanguage.cpp
… | 566 | { | |||
---|---|---|---|---|---|
578 | 578 | | |||
579 | // Load the model on demand | 579 | // Load the model on demand | ||
580 | if (d->s_knownModels.isEmpty()) { | 580 | if (d->s_knownModels.isEmpty()) { | ||
581 | d->loadModels(); | 581 | d->loadModels(); | ||
582 | } | 582 | } | ||
583 | 583 | | |||
584 | QStringList candidateLanguages = d->identify(text, d->findRuns(text)); | 584 | QStringList candidateLanguages = d->identify(text, d->findRuns(text)); | ||
585 | 585 | | |||
586 | // Hack for some bad dictionary names | 586 | // Hack for some bad dictionary names | ||
apol: Shouldn't `identify()` be taking care of the scripts already? | |||||
waqar: `identify()` fails if a certain language is not present in the trigrams. | |||||
apol: Why did you handle it here rather than in `identify()` then? Is it a problem doing it there? I see that `identify()` is being used elsewhere too, so it will be wrong there. | |||||
waqar: Initially I handled it in `identify()`, but then @mludwig suggested that I do it here instead. | |||||
587 | for (int i = 0; i < candidateLanguages.count(); i++) { | 587 | for (int i = 0; i < candidateLanguages.count(); i++) { | ||
588 | if (d->s_dictionaryNameMap.contains(candidateLanguages[i])) { | 588 | if (d->s_dictionaryNameMap.contains(candidateLanguages[i])) { | ||
589 | candidateLanguages[i] = d->s_dictionaryNameMap.value(candidateLanguages[i]); | 589 | candidateLanguages[i] = d->s_dictionaryNameMap.value(candidateLanguages[i]); | ||
mludwig: Couldn't this if-statement be dropped? I guess one can argue that sometimes there may be a language without trigrams that would be an even better language guess. | |||||
waqar: Yeah, I think so too | |||||
590 | } | 590 | } | ||
591 | } | 591 | } | ||
592 | 592 | | |||
593 | if (candidateLanguages.count() == 1) { | 593 | if (candidateLanguages.count() == 1) { | ||
594 | return candidateLanguages.first(); | 594 | return candidateLanguages.first(); | ||
595 | } | 595 | } | ||
596 | 596 | | |||
597 | // Wasn't able to get a good guess with the trigrams, try checking all | 597 | // Wasn't able to get a good guess with the trigrams, try checking all | ||
… | 704 | if (sample.size() < MIN_LENGTH) { | |||
705 | return QStringList(); | 705 | return QStringList(); | ||
706 | } | 706 | } | ||
707 | 707 | | |||
708 | QStringList guesses; | 708 | QStringList guesses; | ||
709 | for (const QChar::Script script : scripts) { | 709 | for (const QChar::Script script : scripts) { | ||
710 | guesses.append(guessFromTrigrams(sample, s_scriptLanguages.values(script))); | 710 | guesses.append(guessFromTrigrams(sample, s_scriptLanguages.values(script))); | ||
711 | } | 711 | } | ||
712 | 712 | | |||
713 | //if guesses are empty, we just append the languages of the scripts | ||||
714 | if (guesses.isEmpty() && !scripts.isEmpty()) { | ||||
715 | for (const QChar::Script script : scripts) { | ||||
716 | guesses.append(s_scriptLanguages.values(script)); | ||||
717 | } | ||||
718 | } | ||||
719 | | ||||
713 | return guesses; | 720 | return guesses; | ||
714 | } | 721 | } | ||
715 | 722 | | |||
716 | QStringList GuessLanguagePrivate::guessFromTrigrams(const QString &sample, | 723 | QStringList GuessLanguagePrivate::guessFromTrigrams(const QString &sample, | ||
717 | const QStringList &languages) | 724 | const QStringList &languages) | ||
718 | { | 725 | { | ||
719 | QStringList ret; | 726 | QStringList ret; | ||
720 | 727 | | |||
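For readers following the thread, here is a minimal standalone sketch of the fallback added to `identify()` in this revision: when the trigram pass yields no guesses, the languages registered for the detected scripts are returned instead. It uses standard-library containers rather than Qt so that it compiles on its own. The names `Script`, `ScriptLanguages`, `languagesFor`, `guessFromModels` and `identifySketch` are hypothetical stand-ins, and the trigram scoring is reduced to a plain "has a model" check, which is an assumption and not what `guessFromTrigrams()` actually does.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for QChar::Script and s_scriptLanguages.
enum class Script { Latin, Cyrillic, Arabic };
using ScriptLanguages = std::multimap<Script, std::string>;

// Collect every language registered for one script.
std::vector<std::string> languagesFor(const ScriptLanguages &table, Script script)
{
    std::vector<std::string> out;
    const auto range = table.equal_range(script);
    for (auto it = range.first; it != range.second; ++it) {
        out.push_back(it->second);
    }
    return out;
}

// Simplified stand-in for guessFromTrigrams(): keep only the candidates that
// have a trigram model. The real function scores the sample text instead.
std::vector<std::string> guessFromModels(const std::vector<std::string> &candidates,
                                         const std::vector<std::string> &languagesWithModels)
{
    std::vector<std::string> result;
    for (const std::string &lang : candidates) {
        if (std::find(languagesWithModels.begin(), languagesWithModels.end(), lang)
            != languagesWithModels.end()) {
            result.push_back(lang);
        }
    }
    return result;
}

// Mirrors the shape of the patched identify(): trigram pass first, then the
// script-based fallback only if that pass produced nothing at all.
std::vector<std::string> identifySketch(const std::vector<Script> &scripts,
                                        const ScriptLanguages &table,
                                        const std::vector<std::string> &languagesWithModels)
{
    std::vector<std::string> guesses;
    for (Script script : scripts) {
        const auto found = guessFromModels(languagesFor(table, script), languagesWithModels);
        guesses.insert(guesses.end(), found.begin(), found.end());
    }

    // If guesses are empty, append the languages of the detected scripts.
    if (guesses.empty()) {
        for (Script script : scripts) {
            const auto all = languagesFor(table, script);
            guesses.insert(guesses.end(), all.begin(), all.end());
        }
    }
    return guesses;
}
```

Note that in this sketch, as in the diff, the fallback fires only when no script produced any guess. If one script yields trigram matches and another has no models, the second script's languages are not added.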