out of the box.
sentences or
structures) exist.
correct phonemes when adding the words automatically.
normal installation and will assume that you use a local Simond server that will be started automatically and stopped with Simon.
One One One to quickly ramp up the
recognition rate property. Such recordings often decrease recognition performance because the pronunciation differs greatly from saying the word in isolation.
clip (
microphone boost option in the system mixer.
[filename] [content]. Filenames are without file extensions and the content has to be uppercase. For example:
Demo.
This is a test and nothing else. Numbers and special characters (
.,
-,...) in the filename are ignored and stripped.
voice activity detection.
silence(background noise).
filters one could define commands to suppress background noise in the training data or normalize the recordings.
apply filters to recordings recorded with Simon enables the postprocessing chains for samples recorded during training (including the initial training while adding the word). If you don't select this switch, the postprocessing commands are only applied to imported samples (through the import training data wizard).
connect on start on a very slow machine, you may want to increase this value if you keep getting timeout errors and can resolve them by trying again repeatedly.
:) or use the dialog that appears when you select the blue arrow next to the input field.
synchronization. The changes take effect, and a new restore point is set, only after the speech model is synchronized. This is why, by default, Simon always synchronizes the model with the server when it changes. This is called
source files (the vocabulary, grammar, &etc;) -
template (the system wide configuration) can be changed without affecting other users.
word. In contrast to the common use of the word
word, in Simon
word means one unique combination of the following:
Noun,
Verb, &etc;)
words to Simon. This is an important design decision to allow more control when using a sophisticated grammar.
shadow dictionary. This shadow dictionary is not created by the user but can be imported from various sources.
Computer, Internet to open a browser
Computer, Mail
Computer, close
Noun Noun for sentences like
Computer Internet
Noun Verb for sentences like
Computer close
Computer Computer,
Internet Computer,
Internet Internet, &etc; which are obviously bogus. To improve the recognition accuracy, we can try to create a grammar that better reflects what we are trying to do with Simon.
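The effect of overly general structures can be illustrated with a short sketch that enumerates every sentence a structure allows. The `vocabulary` mapping and the `expand` helper are hypothetical names used only for illustration; they are not part of Simon.

```python
from itertools import product

# Hypothetical vocabulary (category -> words), taken from the example sentences.
vocabulary = {
    "Noun": ["Computer", "Internet", "Mail"],
    "Verb": ["close"],
}

def expand(structure, vocabulary):
    """Return every sentence that a structure such as 'Noun Noun' allows."""
    word_lists = [vocabulary[category] for category in structure.split()]
    return [" ".join(words) for words in product(*word_lists)]

noun_noun = expand("Noun Noun", vocabulary)
noun_verb = expand("Noun Verb", vocabulary)

for sentence in noun_noun:
    print(sentence)
```

Under these assumptions the list contains the intended sentences next to unintended ones, since every noun may follow every other noun.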
language when using Simon. That means you are not bound to the grammar rules of whatever language you use Simon with. For a simple command-and-control use case, for example, it is advisable to invent new grammatical rules that eliminate differences between commands that stem from grammatical information irrelevant to this use case.
close is a verb or that
Computer and
Internet are nouns. Instead, why not define them as something that better reflects what we want them to be:
Trigger Command
close consists of the following sounds:
phonemes.
clothes to the language model, your acoustic model already has an idea how the
clo part is going to sound as they share the same phonemes (
k,
l,
ow) at the beginning.
train words from your language model. That means that Simon displays a word which you read out loud. Because the word is listed in your vocabulary, Simon already knows what phonemes it contains and can thus
learn from your pronunciation of the word.
[<language>/<base model>] <name>. If, for example, you create a scenario in English that works with the Voxforge base model and controls Mozilla Firefox, this becomes:
[EN/VF] Firefox. If your scenario is not specifically tailored to one phoneme set (base model), just omit the second tag like this:
[EN] Firefox.
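The naming convention can be expressed as a small helper. This is only a sketch; the function name `scenario_name` and its parameters are illustrative assumptions and do not exist in Simon.

```python
def scenario_name(name, language, base_model=None):
    """Build a scenario name of the form '[<language>/<base model>] <name>'.

    When the scenario is not tied to one phoneme set, the base model
    tag is omitted.
    """
    tag = language if base_model is None else f"{language}/{base_model}"
    return f"[{tag}] {name}"

print(scenario_name("Firefox", "EN", "VF"))
print(scenario_name("Firefox", "EN"))
```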
- denoting that you are not willing to divulge this information.
recognition rate which at the moment is just a counter of how often the word has been recorded (alone or together with other words).
Firefox (to launch Firefox) which is of course not listed in our shadow dictionary.
Firefox is not listed in our shadow dictionary so we do not get any suggestion at all.
fire and
fox put together. So let's just open the vocabulary (you can keep the wizard open) by selecting
Fire:
Fire is transcribed as
f ay r. Now filter for
fox instead of
Fire and we can see that
Fox is transcribed as
f ao k s. We can assume that Firefox should be transcribed as
f ay r f ao k s.
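The manual lookup above amounts to joining the transcriptions of the parts. A minimal sketch, assuming a hypothetical `shadow_dictionary` mapping that holds only the two entries from this example:

```python
# Hypothetical excerpt of a shadow dictionary: word -> phoneme transcription.
shadow_dictionary = {
    "fire": "f ay r",
    "fox": "f ao k s",
}

def transcribe_compound(parts, dictionary):
    """Join the transcriptions of the parts of a compound word."""
    return " ".join(dictionary[part.lower()] for part in parts)

transcription = transcribe_compound(["Fire", "Fox"], shadow_dictionary)
print(transcription)
```

Simple concatenation is only a starting point; a compound may be pronounced differently from its parts, so the result should be checked by ear.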
Trigger Command and have the word
Computer of the category
Trigger in your vocabulary. You then add a new word
Firefox of the category
Command. Simon will now automatically prompt you for
Computer Firefox as it is - according to your grammar - a valid sentence.
new word with the same name the values of the moved word will be suggested to you. Therefore, no data will be lost.
Computer Internet!. So we either enter the text using the
Computer Internet! (any punctuation mark would work) and save it as
Computer Internet. Simon would find out that
Computer is of the category
Internet of the category
learn that
Trigger Command is a valid sentence and add it to its grammar.
.,
-,
!, &etc;) so any natural text should work. The importer will automatically merge duplicate sentence structures (even across different files) and add multiple sentences (all possible combinations) when a word has multiple categories assigned to it.
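The behavior described for the importer can be sketched as follows. This is not Simon's implementation: `import_structures` and the `categories` mapping are hypothetical, and skipping sentences that contain unknown words is an assumption of the sketch.

```python
import re
from itertools import product

def import_structures(text, categories):
    """Derive sentence structures from plain text.

    categories maps each known word to the list of its categories.
    A set is used so that duplicate structures are merged.
    """
    structures = set()
    for sentence in re.split(r"[.!?\n]+", text):
        words = re.findall(r"\w+", sentence)  # punctuation is dropped here
        if not words or any(word not in categories for word in words):
            continue  # assumption: sentences with unknown words are skipped
        # A word with several categories yields every possible combination.
        for combination in product(*(categories[word] for word in words)):
            structures.add(" ".join(combination))
    return structures

categories = {
    "Computer": ["Trigger"],
    "Internet": ["Command"],
    "close": ["Command", "Verb"],
}
found = import_structures(
    "Computer Internet! Computer close. Computer Internet", categories
)
print(sorted(found))
```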
pages the text consists of. Each page represents one recording.
pages (recordings). The algorithm treats text between
normal punctuation (
.,
!,
?,
...,
",...) and line breaks as
sentences. Each
sentence will be on its own page.
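The page splitting described above can be approximated with a regular expression. This is a minimal sketch, not the algorithm Simon uses; `split_into_pages` is a hypothetical helper, and dropping the punctuation from the resulting pages is an assumption.

```python
import re

def split_into_pages(text):
    """Split text into pages at punctuation marks and line breaks."""
    # Runs of '.', '!', '?', '"' and line breaks act as separators,
    # so an ellipsis ('...') counts as a single break.
    pieces = re.split(r'[.!?"\n]+', text)
    return [piece.strip() for piece in pieces if piece.strip()]

pages = split_into_pages("Hello world. How are you?\nFine!")
for number, page in enumerate(pages, start=1):
    print(number, page)
```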
Karl knows how to start a program and
Joe knows how to open a folder, &etc;). Whenever Simon recognizes something it is given to
Karl who then checks if this instruction is meant for him. If he doesn't know what to do with it, it is handed over to
Joe and so on. If none of the loaded plugins knows how to process the input, it is ignored. The order in which the recognition result is given to the individual commands (people) is configurable in the command options (
trigger. Using triggers, the responsibility of each plugin can easily be divided.
Open my home folder you say
Joe, open my home folder and
Joe (the plugin responsible for opening folders) will instantly know that the request is meant for him.
Firefox to open the popular browser and the place command
Start to the executable plugin and the trigger
Open to the place command you would have to say
Start Firefox (instead of just
Firefox if you don't use a trigger for the executable plugin) and
Open Google to open the search engine (instead of just
Computer set which you would have to remove). But even if you use just one trigger for all your commands (like
Computer to say
Computer, Firefox and
Computer, Google) it has the advantage of greatly limiting the number of false positives.
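The hand-over between plugins and the role of triggers can be sketched as a simple chain. The `Plugin` class, the `dispatch` function, and the action strings are illustrative assumptions and do not mirror Simon's internal API.

```python
class Plugin:
    def __init__(self, name, trigger, commands):
        self.name = name
        self.trigger = trigger    # an empty string means "no trigger"
        self.commands = commands  # recognized text -> action description

    def handle(self, result):
        """Return an action if this plugin is responsible, else None."""
        if self.trigger:
            prefix = self.trigger + " "
            if not result.startswith(prefix):
                return None  # the request is not meant for this plugin
            result = result[len(prefix):]
        return self.commands.get(result)

def dispatch(result, plugins):
    """Offer the result to each plugin in order; stop at the first match."""
    for plugin in plugins:
        action = plugin.handle(result)
        if action is not None:
            return plugin.name, action
    return None  # no plugin knew what to do, so the input is ignored

plugins = [
    Plugin("executable", "Start", {"Firefox": "launch firefox"}),
    Plugin("place", "Open", {"Google": "open search engine"}),
]
print(dispatch("Start Firefox", plugins))
print(dispatch("Open Google", plugins))
print(dispatch("Firefox", plugins))
```

In this sketch a result without the expected trigger is rejected by every plugin, which is how triggers limit false positives.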
Program) which is started when the command is invoked.
remote:/ (on &Linux;/&kde;) or even &kde;'s
Web-Shortcuts are supported.
written by simulating keystrokes.
Startmenu to present a list of programs to launch. That way the specific executable commands can still retain very descriptive names (like
OpenOffice.org Writer 3.1) without the user having to include these words in his vocabulary and consider them in the grammar just to trigger them.
One to launch Mozilla Firefox).
Cancel.
Next and
Back options to the list (
Zero will be associated with
Back and
Nine with
Next).
sentences:
Zero
One
Two
Three
Four
Five
Six
Seven
Eight
Nine
Cancel
macros. The screenshot above - for example - does the following:
Mathias (Text-Macro Command) which will select Mathias in my contact list
Hi! (Text-Macro Command); the text associated with this command contains a newline at the end so that the message will be sent.
Cancel at any time to abort the process.
real or
fake transparency. If your graphical environment allows for compositing effects (
desktop effects) then you can safely use
real transparency which will make the desktop grid transparent. If your platform does not support compositing, Simon will simulate transparency by taking a screenshot of the screen before displaying the desktop grid and displaying that picture behind the desktop grid.
sentences:
One
Two
Three
Four
Five
Six
Seven
Eight
Nine
Cancel
eleven,
twelve, &etc;
fivehundredseventytwo we can easily see that it would be quite a problem to add all these words - let alone train them. What about
twothousandninehundredtwo? Where to stop?
Five (pause) Two. Because of the needed pause, the application (like the mouseless browsing plugin) would consider the input of
Five complete.
Back. It features a decimal point accessible by saying
Comma. When saying
Ok the number will be typed out. As all the voice input and the correction are handled by the plugin itself, the application that finally receives the input will only get a couple of milliseconds between the individual digits.
Cancel at any time to abort the process.
sentences:
Zero
One
Two
Three
Four
Five
Six
Seven
Eight
Nine
Back
Comma
Ok
Cancel
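How the words listed above could be combined into a number is sketched below. This is a hedged illustration rather than the plugin's code: `number_input` is a hypothetical function, and both the behavior of Back (deleting the last character) and the use of "." as the typed decimal mark are assumptions.

```python
DIGITS = {
    "Zero": "0", "One": "1", "Two": "2", "Three": "3", "Four": "4",
    "Five": "5", "Six": "6", "Seven": "7", "Eight": "8", "Nine": "9",
}

def number_input(words, decimal_mark="."):
    """Collect spoken digits and return the text to type, or None.

    None is returned when the input is cancelled or never confirmed.
    """
    buffer = ""
    for word in words:
        if word in DIGITS:
            buffer += DIGITS[word]
        elif word == "Comma":
            buffer += decimal_mark
        elif word == "Back":
            buffer = buffer[:-1]  # assumption: Back removes the last character
        elif word == "Ok":
            return buffer         # typed out in one go
        elif word == "Cancel":
            return None
    return None  # the input was never confirmed with Ok

print(number_input(["Five", "Seven", "Three", "Back", "Two", "Ok"]))
```

Because the digits are buffered until Ok is said, pauses between the spoken digits never reach the receiving application.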
processed input and thus not be relayed to other plugins. This means that if you loaded the dictation plugin and defined no trigger for it, all plugins
talk with the user.
intelligence. Most AIML sets should be supported. The popular
feel for the conversation.
eat all results that match the configured pattern. By default this means every result that Simon recognizes will be accepted by the filter and therefore not relayed to any of the plugins following the filter plugin.
sets.
select all key or a
Password key (typing your password).
Simon is the main front end for the Simon open source speech recognition solution. It is a Simond client and provides a graphical user interface for managing the speech model and the commands.
El Simon és el frontal principal per a la solució del reconeixement de la veu de codi obert. Aquest és un client del Simond i proporciona una interfície gràfica d'usuari per a gestionar el model de pronunciació i les ordres.
Simon ist die Haupt-Bedienungsoberfläche zu der Open-Source-Spracherkennungslösung „Simon“. Es ist ein Simond-Client und bietet eine grafische Schnittstelle, um das Sprachmodell und die Befehle zu verwalten.
Το Simon είναι η κύρια εφαρμογή για την ανοιχτού κώδικα λύση αναγνώρισης ομιλίας Simon. Είναι ένας πελάτης του Simond και παρέχει ένα γραφικό περιβάλλον για τη διαχείριση του μοντέλου ομιλίας και των εντολών.
Simon es la principal interfaz para la solución abierta de reconocimiento de voz Simon. Es un cliente de Simond y proporciona una interfaz gráfica de usuario para gestionar el modelo de voz y las órdenes.
Simon est l'interface principale pour la solution open source de reconnaissance vocale Simon. C'est un client à Simond, qui offre une interface graphique pour gérer le modèle vocal et les commandes.
Simon è l'interfaccia principale per la soluzione di riconoscimento vocale open source Simon. Si tratta di un client per Simond che fornisce un'interfaccia utente grafica per la gestione dei modelli vocali e dei comandi.
Simon is het hoofdfrontend voor het Simon open source oplossing voor spraakherkenning. Het is een Simond-client en levert een grafisch gebruikersinterface voor het beheren van het spraakmodel en de commando's.
O Simon é a interface principal para a solução de reconhecimento de fala em código aberto Simon. É um cliente do Simond e oferece uma interface gráfica para gerir o modelo de fala e os comandos.
Simon je hlavný frontend pre riešenie na rozpoznávanie reči Simon. Je to klient pre Simond a poskytuje grafické používateľské rozhranie pre správu modelu reči a príkazy.
Simon är huvudgränssnittet för taligenkänningslösningen Simon med öppen källkod. Den är en Simond-klient och tillhandahåller ett grafiskt användargränssnitt för att hantera talmodellen och kommandona.
Simon є основною графічною оболонкою комплексу програм з відкритим кодом для розпізнавання мовлення Simon. Програма є клієнтською частиною Simond, вона надає графічний інтерфейс користувача для керування моделлю мовлення та командами.
Features:
Característiques:
Vlastnosti:
Funktionen:
Χαρακτηριστικά:
Características:
Fonctionnalités :
Caratteristiche:
Mogelijkheden:
Funcionalidades:
Recursos:
Funkcie:
Funktioner:
Можливості: