Optimize and fine-tune the parameters of an existing neural network (the code is
already written) to improve frame-level speech recognition. The model takes
MFCC (Mel-Frequency Cepstral Coefficient) features as input and classifies the
phoneme spoken in each audio frame. The predictions are then submitted to
Kaggle as a .csv file. The target accuracy is 86-87%; my model currently
achieves 84%.
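For reference, frame-level accuracy is the fraction of frames whose predicted phoneme matches the reference label. The sketch below illustrates that definition only; the function name `frame_accuracy` and the example labels are hypothetical and are not taken from the starter notebook or the Kaggle scorer.

```python
def frame_accuracy(predicted, reference):
    """Return the fraction of frames where the predicted label equals the reference.

    Both arguments are equal-length sequences holding one phoneme label per frame.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same number of frames")
    if len(reference) == 0:
        raise ValueError("at least one frame is required")
    # Count the frames on which the two label sequences agree.
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Hypothetical labels for four frames; three of the four match.
example_predicted = ["SIL", "AA", "AA", "B"]
example_reference = ["SIL", "AA", "AH", "B"]
example_score = frame_accuracy(example_predicted, example_reference)
```

Moving from 84% to 86-87% on this measure means correcting roughly two to three additional frames out of every hundred.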
The model must be an MLP (Multilayer Perceptron).
The total number of parameters must not exceed 20 million.
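A quick way to check a candidate architecture against the 20 million limit is to count the weights and biases of each fully connected layer before training. The sketch below does this in plain Python; the helper `mlp_param_count`, the input width (28 features per frame with 20 context frames on each side), the hidden layer sizes, and the 42 output classes are all illustrative assumptions, not values confirmed by the assignment or the notebook.

```python
PARAM_BUDGET = 20_000_000  # limit stated in the task


def mlp_param_count(layer_sizes, batchnorm=False):
    """Count trainable parameters of an MLP with the given layer widths.

    layer_sizes lists the input width, each hidden width, and the output width.
    Each linear layer contributes fan_in * fan_out weights plus fan_out biases.
    If batchnorm is True, each hidden layer adds a scale and a shift per unit.
    """
    total = 0
    num_layers = len(layer_sizes) - 1
    for i in range(num_layers):
        fan_in, fan_out = layer_sizes[i], layer_sizes[i + 1]
        total += fan_in * fan_out + fan_out
        is_hidden = i < num_layers - 1
        if batchnorm and is_hidden:
            total += 2 * fan_out
    return total


# Assumed, hypothetical configuration (not from the source):
features_per_frame = 28   # assumed MFCC dimension
context = 20              # assumed frames of context on each side
num_classes = 42          # assumed number of phoneme classes
input_width = features_per_frame * (2 * context + 1)

candidate = [input_width, 2048, 2048, 2048, 1024, num_classes]
candidate_params = mlp_param_count(candidate, batchnorm=True)
within_budget = candidate_params <= PARAM_BUDGET
```

If the model is built in PyTorch, the same figure can be cross-checked with `sum(p.numel() for p in model.parameters() if p.requires_grad)`.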
The submission file should follow the same format as the attached submission(2).csv.
My code is in HW1P2_F24_Starter_Notebook-Copy1(1).ipynb.
The required data/corpus may be found at https://www.openslr.org/12.