Fine-tuning the XLSR-Wav2vec 2.0 pre-trained model for the Turkish and Hungarian languages

$30-250 USD

Closed
Posted almost 2 years ago

Paid on delivery
[login to view URL]%97_Transformers.ipynb#scrollTo=LBSYoWbi-45k

This script can be used for Turkish, but it would benefit from a few changes and visualizations, and it should be possible to upload the model output and the script to my drive. The pre-trained model will be facebook/wav2vec2-large-xlsr-53.

• The Mozilla Common Voice dataset should be used to train the models.
• The models must be trained using the wav2vec2 architecture: [login to view URL]
• 2 pre-trained models are enough to train:
o wav2vec2-xlsr-53 ([login to view URL])
o wav2vec2-xls-r-300m ([login to view URL])

3) Please pay extra attention to this subsection. You should follow this script: [login to view URL]
The script describes the dataset installation and the model training in detail.

3.1) In the part of the script where the dataset is installed, replace the "common_voice" dataset with "mozilla-foundation/common_voice_9_0" or another version (7, 8). All other cleaning and pre-processing steps should be the same as in the script.

3.2) In the same script you can define the pre-trained model that you want to fine-tune. The example there uses the "facebook/wav2vec2-large-xlsr-53" pre-trained model.

3.3) After you finish the training, the last thing you need to do is boost the final models with an n-gram language model (either 4-gram or 5-gram). Here is the script for it: [login to view URL]
This script is intended for the Swedish language. For Turkish you can use the Turkish Wikipedia dump. You can find the link below: [login to view URL]
You will follow the given script, but change the data part so that it uses the Turkish data above. Alternatively, you can generate the .arpa file directly with this extractor: [login to view URL]

To sum up, you need to run the given Colab script and boost the final models with an n-gram language model. That covers the experiments.

4) At the end, you need to write up the results of the trained models and compare them against each other using charts, graphs, or tables.
The models should be evaluated on 4 metrics:
• word error rate (WER)
• character error rate (CER)
• real-time factor: RTF = time needed to recognize the full test set / total length of the full test set
• memory requirement = peak GPU memory load (during testing)

Additionally, compare the final Turkish models with the Hungarian models (minimum 2 comparative graphs). You do not need to train the models for Hungarian; I provide already trained ones below:
[login to view URL]
[login to view URL]

Check this for getting the dataset for the Hungarian n-gram (and also a helpful script).
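
For orientation, the WER, CER, and RTF definitions above could be computed roughly as in the sketch below. It is a minimal pure-Python illustration, not the evaluation code from the referenced scripts. The function names (`edit_distance`, `error_rate`, `real_time_factor`) are hypothetical, the rates are computed at corpus level (total errors divided by total reference length), and CER here counts spaces as characters; both of those are assumptions that a given toolkit may handle differently.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum substitutions, insertions, and deletions."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(refs: List[str], hyps: List[str], unit: str = "word") -> float:
    """Corpus-level WER (unit='word') or CER (unit='char')."""
    if unit not in ("word", "char"):
        raise ValueError("unit must be 'word' or 'char'")
    split = str.split if unit == "word" else list  # assumption: spaces count in CER
    errors = 0
    total = 0
    for ref, hyp in zip(refs, hyps):
        ref_units, hyp_units = split(ref), split(hyp)
        errors += edit_distance(ref_units, hyp_units)
        total += len(ref_units)
    if total == 0:
        raise ValueError("references contain no units")
    return errors / total


def real_time_factor(decode_seconds: float, audio_seconds: float) -> float:
    """RTF = time needed to recognize the test set / total audio length."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return decode_seconds / audio_seconds


if __name__ == "__main__":
    refs = ["merhaba dünya"]   # toy reference transcript
    hyps = ["merhaba dunya"]   # toy model output with one wrong character
    print("WER:", error_rate(refs, hyps, unit="word"))
    print("CER:", error_rate(refs, hyps, unit="char"))
    print("RTF:", real_time_factor(decode_seconds=30.0, audio_seconds=120.0))
```

For the fourth metric, one option in a PyTorch setup is to call `torch.cuda.reset_peak_memory_stats()` before the test loop and read `torch.cuda.max_memory_allocated()` afterwards. This reports memory allocated by PyTorch tensors only, so it may come out lower than the figure a tool such as `nvidia-smi` shows.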
Project ID: 33751920

About the project

7 bids
Remote project
Active 2 years ago

7 freelancers bid on this job, with an average bid of $168 USD
I have done similar projects to this; please send me a message right away so we can get started. I'm a senior engineer with rich experience in Python, Data Processing, Machine Learning (ML), Data Visualization, and Deep Learning. I am a Python developer with 5+ years of experience. Skills: PyQt, PySide/PyQt, Scrapy, BeautifulSoup, Java, C++, C#, SQL, 4, Pillow, Matplotlib, Xml, PHP, Django, json, and csv modules. Expert in statistical analysis of datasets/images and in applying ML/DL algorithms with Python (Tensorflow, Keras, Pytorch) and R. I am interested in taking this task up for you and I hope I can spark your interest. Please send me a message in the chat window to discuss the project further and to adjust the price. Note that the price quoted in the bid is not final; it can be lower or higher after we discuss the project. I am always online, work hard, and pay close attention to detail. I hope to hear from you in the chat box. Best regards
$160 USD in 5 days
5.0 (7 reviews)
4.5
I am familiar with the wav2vec2 model and its applications. I'm really interested in the project and I am bidding the lowest amount to start working on it. Kindly start a chat to discuss the project.
$140 USD in 7 days
5.0 (2 reviews)
3.6
Hi @hmdv002. I read your document and looked at all the links. I have experience with this kind of project. I am a senior Python programmer with 5+ years of extensive experience. You can read my reviews to check my work. I read your job posting carefully and I want to work with you on your project. I'm familiar with agile project management tools including Slack, JIRA, Trello, Bitbucket, Github, etc. I ensure the highest quality of product and 100% satisfaction through my work. I am an innovative and strategic-thinking professional with a proven track record of consistently going above and beyond in meeting customer needs and providing more value to the product than what the customer is paying for. For this very reason, customers come back to me again and again with promising ideas and projects. I hope we can discuss more details in chat. I look forward to hearing from you soon. Thanks so much. Kind Regards.
$200 USD in 7 days
5.0 (2 reviews)
3.2
Hi, I have been an academic at a top-ranked engineering university since 2013. Currently, I am on sabbatical, residing in the UK as a stay-at-home dad. I have adequate knowledge of the breadth of ML algorithms, with the ability to evaluate and choose the best-suited algorithms, perform feature selection, and optimize machine learning models. I have hands-on experience in implementing supervised ML algorithms like linear regression, logistic regression, decision trees, naïve Bayes, and K nearest neighbors. I have hands-on experience in using machine learning frameworks and libraries like Scikit-learn, Tensorflow, Keras, and Pytorch to solve real-world problems. Lately, I have taught and applied big data management and analytics, with exposure to supervised and unsupervised machine learning for big data problems using Apache Mahout and Apache Spark.
$200 USD in 7 days
5.0 (1 review)
1.6

About the client

Budapest, Hungary
5.0
1
Payment method verified
Member since Jun 26, 2019
