You can see that we added 2 of these and now we track whether an inf or nan for forwarded_states was detected somewhere in between.

Actually, the detector already reports these because each of the calls in the example above is an `nn.Module`, but if you had some local direct calculations, this is how you'd do it.
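For example, here is a minimal sketch of guarding a local calculation with the same helper (the tensor and the intermediate calculation are illustrative, not taken from the example above):

```python
import torch
from transformers.debug_utils import detect_overflow

hidden_states = torch.randn(4, 16, dtype=torch.float16)
# a local, direct calculation that isn't wrapped in an nn.Module
forwarded_states = hidden_states * 1e4
# prints a report if any inf or nan entries are found in the tensor
detect_overflow(forwarded_states, "after manual scaling")
```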
Additionally, if you're instantiating the debugger in your own code, you can adjust the number of frames printed from its default, e.g.:

```python
from transformers.debug_utils import DebugUnderflowOverflow

debug_overflow = DebugUnderflowOverflow(model, max_frames_to_save=100)
```
Specific batch absolute min and max value tracing
The same debugging class can be used for per-batch tracing with the underflow/overflow detection feature turned off.

Let's say you want to watch the absolute min and max values for all the ingredients of each forward call of a given batch, and only do that for batches 1 and 3. Then you instantiate this class as:

```python
debug_overflow = DebugUnderflowOverflow(model, trace_batch_nums=[1, 3])
```
And now full batches 1 and 3 will be traced using the same format as the underflow/overflow detector does.

Batches are 0-indexed.

This is helpful if you know that the program starts misbehaving after a certain batch number, so you can fast-forward right to that area. Here is a sample truncated output for such a configuration:

```
                  *** Starting batch number=1 ***
abs min  abs max  metadata
                  shared Embedding
1.01e-06 7.92e+02 weight
0.00e+00 2.47e+04 input[0]
5.36e-05 7.92e+02 output
[...]
                  decoder.dropout Dropout
1.60e-07 2.27e+01 input[0]
0.00e+00 2.52e+01 output
                  decoder T5Stack
  not a tensor output
                  lm_head Linear
1.01e-06 7.92e+02 weight
0.00e+00 1.11e+00 input[0]
6.06e-02 8.39e+01 output
                  T5ForConditionalGeneration
  not a tensor output

                  *** Starting batch number=3 ***
abs min  abs max  metadata
                  shared Embedding
1.01e-06 7.92e+02 weight
0.00e+00 2.78e+04 input[0]
5.36e-05 7.92e+02 output
[...]
```
Here you will get a huge number of frames dumped - as many as there were forward calls in your model - so it may or may not be what you want, but sometimes it can be easier to use for debugging purposes than a normal debugger. For example, if a problem starts happening at batch number 150, you can dump traces for batches 149 and 150 and compare where the numbers started to diverge.
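For instance, that comparison could be set up as follows (a sketch reusing the same constructor argument shown above; `model` is whatever model you're training):

```python
from transformers.debug_utils import DebugUnderflowOverflow

# trace the last known-good batch and the first problematic one so the two
# dumps can be compared to see where the numbers start to diverge
debug_overflow = DebugUnderflowOverflow(model, trace_batch_nums=[149, 150])
```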
You can also specify the batch number after which to stop the training, with:

```python
debug_overflow = DebugUnderflowOverflow(model, trace_batch_nums=[1, 3], abort_after_batch_num=3)
```