Speech Transcription – PKI Electronic Intelligence

PKI 2670 Speech Transcription automatically converts speech into plain text, so the entire content of voice recordings becomes easily searchable. The technology is based on state-of-the-art acoustic modelling techniques, including neural networks. One particular focus is spontaneous telephone conversations. PKI 2670 applies channel compensation techniques to handle the widest possible range of audio sources: GSM/CDMA, 3G, VoIP, landline, satellite phone, etc. The latest model generation of PKI 2670 Speech Transcription supports adding new words to the model. New languages can be trained on request.

PKI 2670 has the following input requirements: WAV or RAW (unsigned PCM 8 or 16 bit, IEEE float 32 bit, A-law or Mu-law, ADPCM), FLAC, OPUS; sampling rate of 8 kHz or higher (other audio formats are automatically converted).
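The sampling-rate requirement above can be checked before a recording is submitted. The following is a minimal sketch using only Python's standard `wave` module; the function name `meets_sample_rate` is an illustrative assumption and not part of PKI 2670. Note that the `wave` module reads PCM WAV files only, so A-law, Mu-law, ADPCM, FLAC and OPUS inputs would need a different reader.

```python
import wave

# "8kHz+ sampling" from the input requirements above.
MIN_SAMPLE_RATE_HZ = 8000


def meets_sample_rate(path: str) -> bool:
    """Return True if the PCM WAV file at `path` is sampled at 8 kHz or higher.

    Hypothetical helper for pre-checking files; wave.open raises wave.Error
    for WAV encodings it cannot parse (for example A-law or ADPCM).
    """
    with wave.open(path, "rb") as wav:
        return wav.getframerate() >= MIN_SAMPLE_RATE_HZ
```

A call such as `meets_sample_rate("call.wav")` (with a hypothetical file name) returns `False` for a recording sampled below 8 kHz.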

PKI 2670 provides the following output formats: XML/JSON format with all results or result files.

One-best transcription
N-best transcription
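To illustrate the difference between the two output types: an N-best result lists several ranked alternative transcriptions for the same audio, and the one-best result is the top-ranked alternative. The sketch below assumes a hypothetical JSON layout; the field names `nbest`, `text` and `score`, as well as the sample sentences and scores, are invented for illustration and do not describe the actual PKI 2670 output schema.

```python
import json

# Hypothetical result payload. Field names and values are illustrative
# assumptions, not the documented PKI 2670 JSON schema.
SAMPLE_RESULT = """
{
  "nbest": [
    {"text": "call me at night", "score": -13.1},
    {"text": "call me at nine", "score": -12.4},
    {"text": "call me a nine", "score": -15.8}
  ]
}
"""


def top_n(result: dict, n: int) -> list:
    """Return the texts of the n highest-scoring hypotheses, best first."""
    ranked = sorted(result["nbest"], key=lambda h: h["score"], reverse=True)
    return [h["text"] for h in ranked[:n]]


def one_best(result: dict) -> str:
    """Return the single highest-scoring hypothesis (the one-best result)."""
    return top_n(result, 1)[0]


result = json.loads(SAMPLE_RESULT)
print(one_best(result))   # prints "call me at nine"
print(top_n(result, 2))   # prints ['call me at nine', 'call me at night']
```

Keeping the alternatives is useful for search, because a keyword missed in the best hypothesis may still appear in a lower-ranked one.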

The following languages are supported by PKI 2670:

5th generation

Russian

Polish

Dutch

English (US)

Spanish

Croatian

French

Arabic

Czech

Slovak

4th and older generations

German

English (US)

Farsi

Italian

Czech

Arabic

Russian

Polish

Chinese

Dutch

Processing speed

The 5th generation of PKI 2670 processes audio approximately 7x faster than real time on a single CPU core.
The 4th and older generations of the model are about 1.2x faster than real time.
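As a rough worked example of these figures: assuming that "Nx faster than real time" means N seconds of audio are processed per second of computation on one CPU core, the expected processing time is the audio duration divided by N. The function name `processing_seconds` below is a hypothetical helper, not part of the product.

```python
def processing_seconds(audio_seconds: float, speedup: float) -> float:
    """Estimate single-core processing time for a recording.

    Assumes `speedup` is the factor over real time, e.g. 7.0 for the
    5th generation and 1.2 for the 4th and older generations.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return audio_seconds / speedup


one_hour = 3600.0
fifth_gen = processing_seconds(one_hour, 7.0)   # about 514 seconds
older_gen = processing_seconds(one_hour, 1.2)   # about 3000 seconds
print(f"5th generation: {fifth_gen / 60:.1f} min")
print(f"4th and older:  {older_gen / 60:.1f} min")
```

Under this assumption, one hour of audio takes roughly 8.6 minutes on the 5th generation and about 50 minutes on the older generations, per CPU core.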