Benchmarks
Hugging Face's benchmarking tools are deprecated, and it is advised to use external benchmarking libraries to measure the speed
and memory complexity of Transformer models.
[[open-in-colab]]
Let's take a look at how 🤗 Transformers models can be benchmarked, the best practices to follow, and the benchmarks that are already available.
A notebook explaining in more detail how to benchmark πŸ€— Transformers models can be found here.
How to benchmark 🤗 Transformers models
The classes [PyTorchBenchmark] and [TensorFlowBenchmark] allow you to flexibly benchmark 🤗 Transformers models. The benchmark classes allow us to measure the peak memory usage and required time for both inference and training.
Here, inference is defined as a single forward pass, and training as a single forward pass and a
backward pass.
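To make this distinction concrete, the following minimal sketch (a hypothetical illustration in plain PyTorch, not part of the benchmark classes, which additionally handle warm-up, repetition, and memory tracing) times one forward pass versus one forward plus backward pass:

import time
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical illustration of what is being measured; the benchmark classes do this more carefully.
model = AutoModelForSequenceClassification.from_pretrained("google-bert/bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")
inputs = tokenizer("Hello world", return_tensors="pt")
labels = torch.tensor([0])

# "Inference": a single forward pass
start = time.perf_counter()
with torch.no_grad():
    model(**inputs)
print(f"forward only: {time.perf_counter() - start:.4f}s")

# "Training": a single forward pass plus a backward pass
start = time.perf_counter()
loss = model(**inputs, labels=labels).loss
loss.backward()
print(f"forward + backward: {time.perf_counter() - start:.4f}s")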
The benchmark classes [PyTorchBenchmark] and [TensorFlowBenchmark] expect an object of type [PyTorchBenchmarkArguments] and
[TensorFlowBenchmarkArguments], respectively, for instantiation. [PyTorchBenchmarkArguments] and [TensorFlowBenchmarkArguments] are data classes and contain all relevant configurations for their corresponding benchmark class. The following example shows how a BERT model of type bert-base-uncased can be benchmarked.
from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments
args = PyTorchBenchmarkArguments(models=["google-bert/bert-base-uncased"], batch_sizes=[8], sequence_lengths=[8, 32, 128, 512])
benchmark = PyTorchBenchmark(args)
For TensorFlow, the equivalent classes are instantiated in the same way:
from transformers import TensorFlowBenchmark, TensorFlowBenchmarkArguments
args = TensorFlowBenchmarkArguments(
models=["google-bert/bert-base-uncased"], batch_sizes=[8], sequence_lengths=[8, 32, 128, 512]
)
benchmark = TensorFlowBenchmark(args)
Here, three arguments are given to the benchmark argument data classes, namely models, batch_sizes, and
sequence_lengths. The argument models is required and expects a list of model identifiers from the
model hub. The list arguments batch_sizes and sequence_lengths define
the size of the input_ids on which the model is benchmarked. There are many more parameters that can be configured
via the benchmark argument data classes. For more detail on these, one can either directly consult the files
src/transformers/benchmark/benchmark_args_utils.py, src/transformers/benchmark/benchmark_args.py (for PyTorch),
and src/transformers/benchmark/benchmark_args_tf.py (for TensorFlow), or run the following shell
command from the root of the repository, which prints out a descriptive list of all configurable parameters
(shown here for PyTorch).
python examples/pytorch/benchmarking/run_benchmark.py --help
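As a rough sketch of what those extra parameters look like in practice, the snippet below sets a few options exposed by the argument data class; the field names fp16, training, and save_to_csv are assumptions here and should be verified against the --help output or benchmark_args_utils.py.

from transformers import PyTorchBenchmark, PyTorchBenchmarkArguments

args = PyTorchBenchmarkArguments(
    models=["google-bert/bert-base-uncased"],
    batch_sizes=[8],
    sequence_lengths=[8, 32, 128, 512],
    fp16=True,         # assumed flag: run the benchmark in half precision (GPU only)
    training=True,     # assumed flag: also benchmark a forward + backward pass
    save_to_csv=True,  # assumed flag: write results to CSV files in addition to printing them
)
benchmark = PyTorchBenchmark(args)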
An instantiated benchmark object can then simply be run by calling benchmark.run().
results = benchmark.run()
print(results)
==================== INFERENCE - SPEED - RESULT ====================
Model Name                        Batch Size    Seq Length    Time in s
google-bert/bert-base-uncased          8              8          0.006
google-bert/bert-base-uncased          8             32          0.006
google-bert/bert-base-uncased          8            128          0.018
google-bert/bert-base-uncased          8            512          0.088
==================== INFERENCE - MEMORY - RESULT ====================
Model Name                        Batch Size    Seq Length    Memory in MB
google-bert/bert-base-uncased          8              8          1227
google-bert/bert-base-uncased          8             32          1281
google-bert/bert-base-uncased          8            128          1307
google-bert/bert-base-uncased          8            512          1539
==================== ENVIRONMENT INFORMATION ====================