wav2vec2-base-960h-librispeech-model

This model is a fine-tuned version of facebook/wav2vec2-base-960h on the LIBRI10H - ENG dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6832
  • WER: 0.5184

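The WER reported above is word error rate: the word-level edit distance between the hypothesis and the reference transcript, divided by the number of reference words. A WER of 0.5184 means roughly one substitution, insertion, or deletion per two reference words. A minimal, dependency-free sketch of the metric (the training run most likely used a library implementation such as `evaluate`/`jiwer`; this is only an illustration):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sit on mat"))  # 2 errors / 6 words
```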
Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.003
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 200
  • num_epochs: 100.0
  • mixed_precision_training: Native AMP
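With train_batch_size=8 and gradient_accumulation_steps=2, gradients are accumulated over two forward passes before each optimizer step, giving the effective total_train_batch_size of 16 listed above. The linear scheduler ramps the learning rate from 0 to 0.003 over the first 200 steps, then decays it linearly to 0 at the final step. A sketch of that schedule (total_steps here is taken from the last logged step in the table below and is approximate):

```python
def linear_schedule_lr(step: int, base_lr: float = 0.003,
                       warmup_steps: int = 200, total_steps: int = 17200) -> float:
    """Linear warmup followed by linear decay, as in lr_scheduler_type=linear."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps  # warmup ramp from 0 to base_lr
    # linear decay from base_lr at the end of warmup down to 0 at total_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)

print(linear_schedule_lr(100))    # halfway through warmup -> 0.0015
print(linear_schedule_lr(200))    # peak -> 0.003
print(linear_schedule_lr(17200))  # end of training -> 0.0
```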

Training results

| Training Loss | Epoch   | Step  | Validation Loss | WER    |
|:-------------:|:-------:|:-----:|:---------------:|:------:|
| 3.477         | 1.1565  | 200   | 2.8801          | 1.0    |
| 2.8001        | 2.3130  | 400   | 2.1488          | 0.9999 |
| 1.6185        | 3.4696  | 600   | 1.2806          | 0.9032 |
| 1.359         | 4.6261  | 800   | 1.1836          | 0.8888 |
| 1.2731        | 5.7826  | 1000  | 1.1609          | 0.8760 |
| 1.2092        | 6.9391  | 1200  | 1.1091          | 0.8629 |
| 1.1681        | 8.0928  | 1400  | 1.0761          | 0.8544 |
| 1.1297        | 9.2493  | 1600  | 1.0792          | 0.8494 |
| 1.0982        | 10.4058 | 1800  | 1.0455          | 0.8353 |
| 1.073         | 11.5623 | 2000  | 1.0372          | 0.8361 |
| 1.0436        | 12.7188 | 2200  | 1.0217          | 0.8368 |
| 1.021         | 13.8754 | 2400  | 0.9912          | 0.8129 |
| 0.9893        | 15.0290 | 2600  | 0.9936          | 0.8073 |
| 0.964         | 16.1855 | 2800  | 0.9619          | 0.7934 |
| 0.9391        | 17.3420 | 3000  | 0.9557          | 0.7898 |
| 0.9159        | 18.4986 | 3200  | 0.9378          | 0.7797 |
| 0.8927        | 19.6551 | 3400  | 0.9074          | 0.7680 |
| 0.8714        | 20.8116 | 3600  | 0.9022          | 0.7623 |
| 0.8412        | 21.9681 | 3800  | 0.8682          | 0.7360 |
| 0.8083        | 23.1217 | 4000  | 0.8436          | 0.7153 |
| 0.7838        | 24.2783 | 4200  | 0.8584          | 0.7048 |
| 0.7533        | 25.4348 | 4400  | 0.8101          | 0.6912 |
| 0.7286        | 26.5913 | 4600  | 0.7933          | 0.6707 |
| 0.6965        | 27.7478 | 4800  | 0.8056          | 0.6681 |
| 0.6859        | 28.9043 | 5000  | 0.7554          | 0.6417 |
| 0.6529        | 30.0580 | 5200  | 0.7624          | 0.6291 |
| 0.6315        | 31.2145 | 5400  | 0.7508          | 0.6123 |
| 0.6144        | 32.3710 | 5600  | 0.7255          | 0.6056 |
| 0.5933        | 33.5275 | 5800  | 0.7546          | 0.6045 |
| 0.5827        | 34.6841 | 6000  | 0.7054          | 0.5851 |
| 0.56          | 35.8406 | 6200  | 0.7157          | 0.5858 |
| 0.5431        | 36.9971 | 6400  | 0.7262          | 0.5788 |
| 0.5281        | 38.1507 | 6600  | 0.6931          | 0.5598 |
| 0.5091        | 39.3072 | 6800  | 0.7102          | 0.5660 |
| 0.5031        | 40.4638 | 7000  | 0.6894          | 0.5466 |
| 0.4867        | 41.6203 | 7200  | 0.6858          | 0.5441 |
| 0.4689        | 42.7768 | 7400  | 0.7051          | 0.5403 |
| 0.4653        | 43.9333 | 7600  | 0.6840          | 0.5335 |
| 0.4441        | 45.0870 | 7800  | 0.6987          | 0.5322 |
| 0.435         | 46.2435 | 8000  | 0.7294          | 0.5315 |
| 0.4302        | 47.4    | 8200  | 0.6831          | 0.5182 |
| 0.4173        | 48.5565 | 8400  | 0.7080          | 0.5217 |
| 0.4086        | 49.7130 | 8600  | 0.6974          | 0.5101 |
| 0.3995        | 50.8696 | 8800  | 0.6842          | 0.5059 |
| 0.3866        | 52.0232 | 9000  | 0.7347          | 0.5186 |
| 0.3779        | 53.1797 | 9200  | 0.7141          | 0.5008 |
| 0.3691        | 54.3362 | 9400  | 0.7005          | 0.4998 |
| 0.3609        | 55.4928 | 9600  | 0.7299          | 0.4964 |
| 0.3584        | 56.6493 | 9800  | 0.6965          | 0.4966 |
| 0.3461        | 57.8058 | 10000 | 0.7217          | 0.4898 |
| 0.3403        | 58.9623 | 10200 | 0.7178          | 0.4850 |
| 0.3342        | 60.1159 | 10400 | 0.7019          | 0.4832 |
| 0.3215        | 61.2725 | 10600 | 0.7528          | 0.4834 |
| 0.3182        | 62.4290 | 10800 | 0.7112          | 0.4794 |
| 0.3123        | 63.5855 | 11000 | 0.7456          | 0.4780 |
| 0.3065        | 64.7420 | 11200 | 0.7509          | 0.4729 |
| 0.302         | 65.8986 | 11400 | 0.7293          | 0.4743 |
| 0.2942        | 67.0522 | 11600 | 0.7418          | 0.4734 |
| 0.2872        | 68.2087 | 11800 | 0.7607          | 0.4643 |
| 0.2844        | 69.3652 | 12000 | 0.7360          | 0.4679 |
| 0.2775        | 70.5217 | 12200 | 0.7594          | 0.4639 |
| 0.2736        | 71.6783 | 12400 | 0.7489          | 0.4667 |
| 0.2633        | 72.8348 | 12600 | 0.7576          | 0.4670 |
| 0.2627        | 73.9913 | 12800 | 0.7881          | 0.4597 |
| 0.2592        | 75.1449 | 13000 | 0.7566          | 0.4573 |
| 0.2557        | 76.3014 | 13200 | 0.7827          | 0.4629 |
| 0.246         | 77.4580 | 13400 | 0.7816          | 0.4586 |
| 0.2455        | 78.6145 | 13600 | 0.7918          | 0.4574 |
| 0.238         | 79.7710 | 13800 | 0.7928          | 0.4519 |
| 0.2376        | 80.9275 | 14000 | 0.7769          | 0.4508 |
| 0.2319        | 82.0812 | 14200 | 0.7877          | 0.4519 |
| 0.2268        | 83.2377 | 14400 | 0.7943          | 0.4537 |
| 0.2297        | 84.3942 | 14600 | 0.7913          | 0.4500 |
| 0.2207        | 85.5507 | 14800 | 0.8011          | 0.4481 |
| 0.219         | 86.7072 | 15000 | 0.7940          | 0.4485 |
| 0.2159        | 87.8638 | 15200 | 0.8179          | 0.4470 |
| 0.2126        | 89.0174 | 15400 | 0.8171          | 0.4449 |
| 0.2097        | 90.1739 | 15600 | 0.8208          | 0.4456 |
| 0.2074        | 91.3304 | 15800 | 0.8218          | 0.4446 |
| 0.2062        | 92.4870 | 16000 | 0.8242          | 0.4439 |
| 0.206         | 93.6435 | 16200 | 0.8340          | 0.4432 |
| 0.2007        | 94.8    | 16400 | 0.8240          | 0.4420 |
| 0.1994        | 95.9565 | 16600 | 0.8306          | 0.4419 |
| 0.197         | 97.1101 | 16800 | 0.8371          | 0.4431 |
| 0.1955        | 98.2667 | 17000 | 0.8345          | 0.4421 |
| 0.1979        | 99.4232 | 17200 | 0.8382          | 0.4421 |
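Note that validation loss bottoms out around step 8200 (0.6831) and then drifts upward, while WER keeps improving to 0.4419 at step 16600, which suggests the reported 0.6832 / 0.5184 corresponds to the lowest-loss checkpoint rather than the lowest-WER one. When WER is the metric that matters, it can be worth selecting the checkpoint by WER instead. A small sketch of that selection, using a few rows copied from the table above (this helper is illustrative, not part of the training script):

```python
# (step, validation_loss, wer) rows copied from the training log above
logs = [
    (5600, 0.7255, 0.6056),
    (8200, 0.6831, 0.5182),
    (12000, 0.7360, 0.4679),
    (16600, 0.8306, 0.4419),
    (17200, 0.8382, 0.4421),
]

best_by_loss = min(logs, key=lambda row: row[1])
best_by_wer = min(logs, key=lambda row: row[2])

print(best_by_loss)  # (8200, 0.6831, 0.5182) -- matches the reported eval results
print(best_by_wer)   # (16600, 0.8306, 0.4419)
```

In the Trainer this choice is controlled by `metric_for_best_model` together with `load_best_model_at_end`.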

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0