.. _sphinx_evaluation_tool:

Sphinx4 Evaluation Tool
=======================

`Sphinx 4 `_ is an open source speech recognition toolkit written in Java.
This tool recognizes data sets and can compare the results with a keyword file.
It is very similar to :ref:`speech_recognition_tool`.

The tool needs some dependencies to work: an acoustic model, a dictionary, and either a language model or a grammar.
You can use your own dependencies or simply use the provided language models.
Using a grammar excludes using a language model.
These parameters are set via the configuration type.
You can also use a modified Sphinx configuration to adjust the recognizer.

Related resources
-----------------

Related projects:

- `Sphinx4 at sourceforge `_.
- `Sphinx4 at github `_.
- :ref:`speech_recognition_tool`

Component repository:

- Browse component repository: `microphone-evaluation `_.
- ``git clone https://projects.cit-ec.uni-bielefeld.de/git/lsp-csra.microphone-evaluation.git/``

System startup:

The start script sphinxevaluator.sh can be found at ``/vol/csra//releases/trusty/lsp-csra-nightly/opt/sphinx4-evaluation-tool/bin/sphinxevaluator``.

The program can be invoked with the following parameters:

``-d``, ``--data-path``
    The location of the sound files you want to be recognized.
    The results are stored in this location as well.

``-k``, ``--keywords``
    The path to a keyword file.
    The result will then contain only keywords.

``-e``, ``--folder-exclusion``
    Names of directories that should be ignored.

``-r``, ``--recursive-folder-crawling``
    Boolean that controls whether Sphinx looks for sound files in every directory inside the data path.

``-m``, ``--speechmodel-type``
    Speech model configuration.
    You can use the speech model ``-m verbmobil`` or ``-m cocolab``.

``-g <'NAME PATH_TO_GRAMMAR_FILE'>``, ``--grammar <'NAME PATH_TO_GRAMMAR_FILE'>``
    The grammar Sphinx should use.
    Using a grammar excludes using a language model.

``-c``, ``--sphinxconfig``
    Configuration file to adjust the recognizer's behaviour.

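As a rough illustration of how ``-d``, ``-r`` and ``-e`` interact, the following shell sketch lists the .wav files that a run would presumably consider.
It is only an approximation built on the standard ``find`` utility, not the tool's own implementation; the function name ``list_wavs`` and its argument order are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of the evaluation tool.
# Usage: list_wavs DATA_PATH RECURSIVE [EXCLUDES]
#   RECURSIVE  "true" to descend into subdirectories (compare -r true)
#   EXCLUDES   semicolon-separated directory names (compare -e 'a;b')
list_wavs() {
    data_path=$1
    recursive=$2
    excludes=${3:-}

    if [ "$recursive" = "true" ]; then
        find "$data_path" -type f -name '*.wav'
    else
        # -maxdepth is a GNU/BSD extension to find.
        find "$data_path" -maxdepth 1 -type f -name '*.wav'
    fi | while IFS= read -r file; do
        skip=false
        saved_ifs=$IFS
        IFS=';'
        # Split EXCLUDES on ';' and compare each name with the path components.
        for name in $excludes; do
            [ -n "$name" ] || continue
            case "$file" in
                */"$name"/*) skip=true ;;
            esac
        done
        IFS=$saved_ifs
        [ "$skip" = true ] || printf '%s\n' "$file"
    done
}
```

For example, ``list_wavs /root/datafolder true 'trash_folder;crypt_folder'`` prints every .wav file below ``/root/datafolder`` except those inside the two named directories.
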
Interfaces
----------

Input/Output:

.. attention::

   The sound files you want to recognize must be .wav files, 16 kHz, mono!

- Sphinx recognizes speech from sound files and stores the result in a text file.
- You will find a new folder "hypotheses" in the data folder you specified with the ``-d`` parameter.
  This folder contains the recognition results.
- The result files are named after your parameters.
- If you enable keyword output, the result file will contain *only* recognized keywords.

Language Model Information
--------------------------

The language of the language models is German.
The two speech models "Verbmobil" and "Cocolab" can be found at `speechmodel-repo `_, and are installed at ``/vol/csra/releases/trusty/lsp-csra-nightly/etc/speechconfig/``.
The default model is the Verbmobil speech model.

Examples
--------

The following scenario is the basis for all examples below.
You have a folder ``/root/datafolder/`` whose contents are shown in the table below.

+----------------------------------------------------------------------+
| /root/datafolder/                                                    |
+====================+===============+================+================+
| *set01*            | *set02*       | *set03*        | *set04*        |
+--------------------+---------------+----------------+----------------+
| *set05*            | *new_folder*  | *crypt_folder* | *fileXY.wav*   |
+--------------------+---------------+----------------+----------------+
| *log.log*          | *picture.png* | *set06*        | *trash_folder* |
+--------------------+---------------+----------------+----------------+

You have stored sound files containing speech in this folder and want to evaluate them.

Example 1:
Now you want to use Sphinx to recognize them.
To start Sphinx, you can use ``./sphinxevaluator -d /root/datafolder/``.
The ``-d`` parameter tells it where your audio data is.
Sphinx will then look in ``/root/datafolder/`` for .wav files and try to recognize them one after another.
It ignores everything that does not have the suffix .wav.
After it has finished, you will find a new folder containing the results.
The result folder is created in the same folder as the recognized data.

Example 2:
You also have some sound sets stored in the *set0X* folders.
So you could use ``./sphinxevaluator -d /root/datafolder/ -r true``.
It will then also look for .wav files in all subdirectories, so every *set0X* folder will contain a new directory called *hypothesis* where the results are saved.
If other folders contain matching .wav files, Sphinx will of course try to recognize them as well.

Example 3:
You also know that *trash_folder* and *crypt_folder* do not contain any interesting data.
So you could extend your command line to ``./sphinxevaluator -d /root/datafolder/ -r true -e 'trash_folder;crypt_folder'``.
Now the recognizer will ignore the given directories!

Example 4:
If you want to know whether the recognizer understood designated keywords, you can start the tool like this: ``./sphinxevaluator -d /root/datafolder/ -k /root/keywordfile``.
It will then recognize the files as usual, but compare its results with the keyword file and print only the spotted keywords in the result file.

.. todo:: how to set up and where to store keyword-file

Things to keep in mind
----------------------

- If Sphinx gives you no results, consider the following:

  - sound file in the wrong format
  - sound file too noisy
  - sound file too long (speech blurred)

- The recognizer takes at most ten seconds to recognize a file.
  If the limit is exceeded, the file is skipped.
- Using a grammar is notably faster than using a language model.
- Grammar usage excludes language-model usage.
- Sphinx has big problems recognizing files that contain noise.
- The sound files you want to recognize must be .wav files, 16 kHz, mono!
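
Since a wrong input format is a common reason for empty results, it can help to check the files before starting a run.
The following shell sketch reads the channel count and sample rate directly from the WAV header with ``od``.
It assumes the canonical RIFF layout in which the ``fmt`` chunk directly follows the RIFF header, and a little-endian host; the function names ``wav_format`` and ``is_16khz_mono`` are hypothetical and not part of the tool.

```shell
#!/bin/sh
# Hypothetical helpers, NOT part of the evaluation tool.

# Print "CHANNELS SAMPLE_RATE" of a WAV file.
# Assumption: canonical header, i.e. channel count at byte offset 22
# (2 bytes) and sample rate at byte offset 24 (4 bytes).
# od decodes in host byte order, so this only works on little-endian hosts.
wav_format() {
    channels=$(od -An -t u2 -j 22 -N 2 "$1" | tr -d ' \n')
    rate=$(od -An -t u4 -j 24 -N 4 "$1" | tr -d ' \n')
    printf '%s %s\n' "$channels" "$rate"
}

# Succeed only for files named *.wav that are mono and sampled at 16 kHz.
is_16khz_mono() {
    case "$1" in
        *.wav) ;;
        *) return 1 ;;
    esac
    [ "$(wav_format "$1")" = "1 16000" ]
}
```

A loop such as ``for f in /root/datafolder/*.wav; do is_16khz_mono "$f" || echo "check $f"; done`` would then point out files that probably need to be converted first.
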