Self-Supervised Learning in Hebrew – Model to Practice Framework
DOI: https://doi.org/10.14738/tmlai.106.13515
Keywords: CNN, self-supervised, N-gram, Natural Language Processing
Abstract
In this paper, we present current state-of-the-art models for Automatic Speech Recognition obtained through self-supervised training applied to the Hebrew language. The motivation for using self-supervised learning is that, even though it will probably not reach the accuracy of a fully supervised approach, it can still achieve strong results with a relatively small amount of labeled data. This way of training allows us to train a model on unlabeled data (or to use a pre-trained model, which is always more accessible). Its goal in the first, unsupervised phase is to learn good representations from raw audio samples that are useful for speech recognition tasks, without using any labeled data. The model can then be fine-tuned on a particular dataset for a specific purpose, which means that our involvement is concentrated mainly in the last layers of the model. This kind of training has proved to be very powerful. We present a complete framework, from model to practice, with simulations and model training, and report impressive results on Hebrew.
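As a rough illustration of the pretrain-then-fine-tune workflow described in the abstract, the sketch below loads a publicly available speech model that was pre-trained with self-supervised learning on unlabeled audio, freezes its low-level feature encoder, and runs one fine-tuning step with a CTC head on a labeled example. The checkpoint name, the dummy waveform, and the English transcript are placeholders introduced here for illustration; they are not the paper's actual Hebrew data, checkpoint, or training setup (for Hebrew, the tokenizer would be built over a Hebrew character vocabulary).

```python
# Minimal sketch (assumptions noted above), not the paper's exact pipeline.
import torch
from transformers import Wav2Vec2Processor, Wav2Vec2ForCTC

# Load a model whose representations were learned self-supervised from raw audio.
checkpoint = "facebook/wav2vec2-base-960h"  # placeholder checkpoint, not from the paper
processor = Wav2Vec2Processor.from_pretrained(checkpoint)
model = Wav2Vec2ForCTC.from_pretrained(checkpoint)
model.train()

# Freeze the convolutional feature encoder so fine-tuning only adapts the later
# (transformer / CTC) layers, i.e. "involvement in the last layers of the model".
for p in model.wav2vec2.feature_extractor.parameters():
    p.requires_grad = False

# One fine-tuning step on a single labeled example (16 kHz waveform + transcript).
waveform = torch.randn(16000)  # placeholder: one second of audio
inputs = processor(waveform, sampling_rate=16000, return_tensors="pt")
labels = processor.tokenizer("HELLO WORLD", return_tensors="pt").input_ids

loss = model(input_values=inputs.input_values, labels=labels).loss
loss.backward()  # optimizer step omitted for brevity
```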
License
Copyright (c) 2022 Oren Gal, Rafi Michaeli, Yerach Doytsher
This work is licensed under a Creative Commons Attribution 4.0 International License.