Spaces:

thenativefox
/

RAG

Running

RAG

File size: 3,778 Bytes

939262b

Benchmarks

Hugging Face's Benchmarking tools are deprecated and it is advised to use external Benchmarking libraries to measure the speed 
and memory complexity of Transformer models.

[[open-in-colab]]
Let's take a look at how 🤗 Transformers models can be benchmarked, best practices, and already available benchmarks.
A notebook explaining in more detail how to benchmark 🤗 Transformers models can be found here.
How to benchmark 🤗 Transformers models
The classes [PyTorchBenchmark] and [TensorFlowBenchmark] allow to flexibly benchmark 🤗 Transformers models. The benchmark classes allow us to measure the peak memory usage and required time for both inference and training.

Hereby, inference is defined by a single forward pass, and training is defined by a single forward pass and
backward pass.

The benchmark classes [PyTorchBenchmark] and [TensorFlowBenchmark] expect an object of type [PyTorchBenchmarkArguments] and
[TensorFlowBenchmarkArguments], respectively, for instantiation. [PyTorchBenchmarkArguments] and [TensorFlowBenchmarkArguments] are data classes and contain all relevant configurations for their corresponding benchmark class. In the following example, it is shown how a BERT model of type bert-base-cased can be benchmarked.

from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments
args = PyTorchBenchmarkArguments(models=["google-bert/bert-base-uncased"], batch_sizes=[8], sequence_lengths=[8, 32, 128, 512])
benchmark = PyTorchBenchmark(args)
</pt>
<tf>py
from transformers import TensorFlowBenchmark, TensorFlowBenchmarkArguments
args = TensorFlowBenchmarkArguments(
     models=["google-bert/bert-base-uncased"], batch_sizes=[8], sequence_lengths=[8, 32, 128, 512]
 )
benchmark = TensorFlowBenchmark(args)

Here, three arguments are given to the benchmark argument data classes, namely models, batch_sizes, and
sequence_lengths. The argument models is required and expects a list of model identifiers from the
model hub The list arguments batch_sizes and sequence_lengths define
the size of the input_ids on which the model is benchmarked. There are many more parameters that can be configured
via the benchmark argument data classes. For more detail on these one can either directly consult the files
src/transformers/benchmark/benchmark_args_utils.py, src/transformers/benchmark/benchmark_args.py (for PyTorch)
and src/transformers/benchmark/benchmark_args_tf.py (for Tensorflow). Alternatively, running the following shell
commands from root will print out a descriptive list of all configurable parameters for PyTorch and Tensorflow
respectively.

python examples/pytorch/benchmarking/run_benchmark.py --help
An instantiated benchmark object can then simply be run by calling benchmark.run().

results = benchmark.run()
print(results)
====================       INFERENCE - SPEED - RESULT       ====================

Model Name             Batch Size     Seq Length     Time in s
google-bert/bert-base-uncased          8               8             0.006   
google-bert/bert-base-uncased          8               32            0.006   
google-bert/bert-base-uncased          8              128            0.018   
google-bert/bert-base-uncased          8              512            0.088     

====================      INFERENCE - MEMORY - RESULT       ====================
Model Name             Batch Size     Seq Length    Memory in MB
google-bert/bert-base-uncased          8               8             1227
google-bert/bert-base-uncased          8               32            1281
google-bert/bert-base-uncased          8              128            1307
google-bert/bert-base-uncased          8              512            1539

====================        ENVIRONMENT INFORMATION         ====================