moshew committed
Commit 7935f23 · verified · 1 Parent(s): 2aa0846

Add new SentenceTransformer model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "word_embedding_dimension": 384,
+ "pooling_mode_cls_token": true,
+ "pooling_mode_mean_tokens": false,
+ "pooling_mode_max_tokens": false,
+ "pooling_mode_mean_sqrt_len_tokens": false,
+ "pooling_mode_weightedmean_tokens": false,
+ "pooling_mode_lasttoken": false,
+ "include_prompt": true
+ }
README.md ADDED
@@ -0,0 +1,395 @@
+ ---
+ tags:
+ - sentence-transformers
+ - sentence-similarity
+ - feature-extraction
+ - generated_from_trainer
+ - dataset_size:2000
+ - loss:CoSENTLoss
+ base_model: avsolatorio/GIST-small-Embedding-v0
+ widget:
+ - source_sentence: is alexa compatible with tv?
+ sentences:
+ - Of een ei iedere dag gezond of ongezond is, hangt af van wat je verder iedere
+ dag eet. Het Voedingscentrum adviseert om te variëren in vis, peulvruchten, vlees
+ en ei. Het eten van 2-3 eieren per week past in een gezonde voeding. Vegetariërs
+ kunnen 3-4 eieren per week eten.
+ - The price was right, the size was right and as it turns out this PYLE TV has the
+ best picture quality of all 5 TVs that our family watches! The setup was super
+ easy with no hassle. I would recommend it to anyone!
+ - According to the Association of British Insurers, insurance companies will look
+ into a policyholder's medical profile if they give up smoking. They'll commonly
+ seek a report from a policyholder's family doctor. If this raises concerns, they
+ may ask a policyholder to have a chest X-ray.
+ - source_sentence: is nyada a real college?
+ sentences:
+ - The instruments have been classified as Wind instruments (aero phonic) including
+ Bansuri and Nagaswaram; String instruments (chordophonic) including Dilruba and
+ Veena; Percussion instruments (membranophonic) including Tabla, Mridangam and
+ (idiophonic) Bortal, and Ghatam.
+ - This service is currently offered free of charge by the bank. You can get the
+ last 'Available' balance of your account (by an SMS) by giving a Missed Call to
+ 18008431122. You can get the Mini Statement (by an SMS) for last 5 transactions
+ in your account by giving a Missed Call to 18008431133. 1.
+ - King Size Bed Known as a standard 5ft bed or 150cm wide by 200cm in length.
+ - source_sentence: is europe bigger than australia?
+ sentences:
+ - Although this is just five per cent of the world's land mass (149.45 million square
+ kilometres), Australia is the planet's sixth largest country after Russia, Canada,
+ China, the United States of America and Brazil. ... almost as great as that of
+ the United States of America. about 50 per cent greater than Europe, and.
+ - The recommended dose of evening primrose oil is 8 to 12 capsules a day, at a dose
+ of 500 milligrams per capsule. A range of evening primrose oil products are available
+ for purchase online.
+ - This includes a three-year law degree, a one-year LPC and finally a two-year training
+ contract with a law firm. Studying a non-law subject for your degree means you'll
+ need to take the GDL conversion course before your LPC, which adds one year to
+ the total.
+ - source_sentence: how long does money take to transfer boi?
+ sentences:
+ - 'When will it take more than one working day? It will take more than one working
+ day to reach your payee''s bank when: You make a payment online after 3.30pm in
+ the Republic of Ireland or after 4.30pm in Northern Ireland and Great Britain
+ on a working day. Your payment will begin to process on the next working day.'
+ - U.S. citizens travelling to South Korea for business or tourism do not need a
+ visa. ... Although obtaining a visa in advance can ease the entry process, as
+ long as you have a valid U.S. passport, you can enter the Republic of Korea without
+ a visa for a stay of up to 90 days if you are a tourist or on business.
+ - Structural insulated panels (SIPs) are a high performance building system for
+ residential and commercial construction. The panels consist of an insulating foam
+ core sandwiched between two structural facings, typically oriented strand board
+ (OSB). SIPs are manufactured under factory controlled conditions.
+ - source_sentence: where are bussola shoes made?
+ sentences:
+ - According to Harvard University, biking at a moderate speed of 12 to 13.9 miles
+ per hour will cause a 155-pound person to burn 298 calories in 30 minutes. At
+ a faster rate of 14 to 15.9 miles per hour, a person of the same weight will burn
+ 372 calories.
+ - If you had bought just one share of Microsoft at the IPO, you would now have 288
+ shares after all the splits. Those shares would be worth $44,505 at the current
+ stock quote of $154.53. A $5,000 investment would have purchased 238 shares at
+ the IPO price.
+ - FRAM opens the first plant devoted exclusively to the development and manufacture
+ of heavy duty air filters and cartridges, in Nevada, Missouri.
+ pipeline_tag: sentence-similarity
+ library_name: sentence-transformers
+ ---
+
+ # SentenceTransformer based on avsolatorio/GIST-small-Embedding-v0
+
+ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [avsolatorio/GIST-small-Embedding-v0](https://huggingface.co/avsolatorio/GIST-small-Embedding-v0). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** Sentence Transformer
+ - **Base model:** [avsolatorio/GIST-small-Embedding-v0](https://huggingface.co/avsolatorio/GIST-small-Embedding-v0) <!-- at revision 75e62fd210b9fde790430e0b2f040b0b00a021b1 -->
+ - **Maximum Sequence Length:** 512 tokens
+ - **Output Dimensionality:** 384 dimensions
+ - **Similarity Function:** Cosine Similarity
+ <!-- - **Training Dataset:** Unknown -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
+ - **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
+ - **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
+
+ ### Full Model Architecture
+
+ ```
+ SentenceTransformer(
+ (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
+ (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
+ (2): Normalize()
+ )
+ ```
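+
+ For reference, the CLS pooling and L2 normalization performed by the Pooling and Normalize modules above can be reproduced with plain `transformers`. The following is a minimal sketch (not part of the released code) that assumes the checkpoint loads directly with `AutoModel`/`AutoTokenizer`:
+
+ ```python
+ import torch
+ import torch.nn.functional as F
+ from transformers import AutoModel, AutoTokenizer
+
+ model_id = "moshew/gist_small_ft_gooaq_v2"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModel.from_pretrained(model_id)
+ model.eval()
+
+ inputs = tokenizer(
+     ["is europe bigger than australia?"],
+     padding=True, truncation=True, max_length=512, return_tensors="pt",
+ )
+ with torch.no_grad():
+     last_hidden = model(**inputs).last_hidden_state
+
+ # CLS pooling (pooling_mode_cls_token: true) followed by L2 normalization,
+ # matching the Pooling and Normalize modules listed above.
+ embeddings = F.normalize(last_hidden[:, 0], p=2, dim=1)
+ print(embeddings.shape)  # torch.Size([1, 384])
+ ```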
+
+ ## Usage
+
+ ### Direct Usage (Sentence Transformers)
+
+ First install the Sentence Transformers library:
+
+ ```bash
+ pip install -U sentence-transformers
+ ```
+
+ Then you can load this model and run inference.
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # Download from the 🤗 Hub
+ model = SentenceTransformer("moshew/gist_small_ft_gooaq_v2")
+ # Run inference
+ sentences = [
+ 'where are bussola shoes made?',
+ 'FRAM opens the first plant devoted exclusively to the development and manufacture of heavy duty air filters and cartridges, in Nevada, Missouri.',
+ 'According to Harvard University, biking at a moderate speed of 12 to 13.9 miles per hour will cause a 155-pound person to burn 298 calories in 30 minutes. At a faster rate of 14 to 15.9 miles per hour, a person of the same weight will burn 372 calories.',
+ ]
+ embeddings = model.encode(sentences)
+ print(embeddings.shape)
+ # [3, 384]
+
+ # Get the similarity scores for the embeddings
+ similarities = model.similarity(embeddings, embeddings)
+ print(similarities.shape)
+ # [3, 3]
+ ```
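+
+ Because the embeddings are normalized and compared with cosine similarity, they can also be used directly for semantic search. A small illustrative sketch that ranks candidate passages for a query:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ model = SentenceTransformer("moshew/gist_small_ft_gooaq_v2")
+
+ query = "where are bussola shoes made?"
+ passages = [
+     "FRAM opens the first plant devoted exclusively to the development and manufacture of heavy duty air filters and cartridges, in Nevada, Missouri.",
+     "According to Harvard University, biking at a moderate speed of 12 to 13.9 miles per hour will cause a 155-pound person to burn 298 calories in 30 minutes.",
+ ]
+
+ query_embedding = model.encode([query])
+ passage_embeddings = model.encode(passages)
+
+ # Cosine similarity (similarity_fn_name: cosine); higher scores rank first.
+ scores = model.similarity(query_embedding, passage_embeddings)[0]
+ ranked = sorted(zip(scores.tolist(), passages), key=lambda pair: pair[0], reverse=True)
+ for score, passage in ranked:
+     print(f"{score:.4f}\t{passage[:60]}")
+ ```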
+
+ <!--
+ ### Direct Usage (Transformers)
+
+ <details><summary>Click to see the direct usage in Transformers</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Downstream Usage (Sentence Transformers)
+
+ You can finetune this model on your own dataset.
+
+ <details><summary>Click to expand</summary>
+
+ </details>
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Dataset
+
+ #### Unnamed Dataset
+
+ * Size: 2,000 training samples
+ * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>label</code>
+ * Approximate statistics based on the first 1000 samples:
+ | | sentence1 | sentence2 | label |
+ |:--------|:------------------------------------------------------------------------------|:--------------------------------------------------------------------------------|:---------------------------------------------------------|
+ | type | string | string | float |
+ | details | <ul><li>min: 8 tokens</li><li>mean: 12.05 tokens</li><li>max: 23 tokens</li></ul> | <ul><li>min: 13 tokens</li><li>mean: 59.28 tokens</li><li>max: 118 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.5</li><li>max: 1.0</li></ul> |
+ * Samples:
+ | sentence1 | sentence2 | label |
+ |:------------------------------------------------------------------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
+ | <code>what is the difference between rapid rise yeast and bread machine yeast?</code> | <code>Though there are some minor differences in shape and nutrients, Rapid-Rise Yeast is (pretty much) the same as Instant Yeast and Bread Machine Yeast. ... Also, Rapid-Rise Yeast is a little more potent than Active Dry Yeast and can be mixed in with your dry ingredients directly.</code> | <code>1.0</code> |
+ | <code>what is the difference between rapid rise yeast and bread machine yeast?</code> | <code>Application. To clarify, double-acting baking powder is “regular” baking powder. Single-acting baking powder exits, but when a recipe calls for baking powder it means double-acting. And even if a recipe does call for single-acting, you can substitute double-acting without worrying about it changing the recipe.</code> | <code>0.0</code> |
+ | <code>are light kits universal for ceiling fans?</code> | <code>Not all Universal Light Kits are actually Universal. They can be universal to only that manufacturer. ... Casablanca and Hunter Ceiling Fan Light Kits are universal only to their own fans.</code> | <code>1.0</code> |
+ * Loss: [<code>CoSENTLoss</code>](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#cosentloss) with these parameters:
+ ```json
+ {
+ "scale": 20.0,
+ "similarity_fct": "pairwise_cos_sim"
+ }
+ ```
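+
+ As an illustrative sketch (the original training script is not included in this repository), the loss above would be constructed roughly as follows; both values shown are also the `CoSENTLoss` defaults:
+
+ ```python
+ from sentence_transformers import SentenceTransformer
+ from sentence_transformers.losses import CoSENTLoss
+ from sentence_transformers.util import pairwise_cos_sim
+
+ # Start from the base model; CoSENTLoss scores (sentence1, sentence2) pairs
+ # against the float labels using the scale and similarity function shown above.
+ model = SentenceTransformer("avsolatorio/GIST-small-Embedding-v0")
+ loss = CoSENTLoss(model=model, scale=20.0, similarity_fct=pairwise_cos_sim)
+ ```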
+
+ ### Training Hyperparameters
+ #### Non-Default Hyperparameters
+
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `num_train_epochs`: 1
+ - `warmup_ratio`: 0.1
+ - `seed`: 12
+ - `bf16`: True
+ - `dataloader_num_workers`: 4
+
+ #### All Hyperparameters
+ <details><summary>Click to expand</summary>
+
+ - `overwrite_output_dir`: False
+ - `do_predict`: False
+ - `eval_strategy`: no
+ - `prediction_loss_only`: True
+ - `per_device_train_batch_size`: 16
+ - `per_device_eval_batch_size`: 16
+ - `per_gpu_train_batch_size`: None
+ - `per_gpu_eval_batch_size`: None
+ - `gradient_accumulation_steps`: 1
+ - `eval_accumulation_steps`: None
+ - `torch_empty_cache_steps`: None
+ - `learning_rate`: 5e-05
+ - `weight_decay`: 0.0
+ - `adam_beta1`: 0.9
+ - `adam_beta2`: 0.999
+ - `adam_epsilon`: 1e-08
+ - `max_grad_norm`: 1.0
+ - `num_train_epochs`: 1
+ - `max_steps`: -1
+ - `lr_scheduler_type`: linear
+ - `lr_scheduler_kwargs`: {}
+ - `warmup_ratio`: 0.1
+ - `warmup_steps`: 0
+ - `log_level`: passive
+ - `log_level_replica`: warning
+ - `log_on_each_node`: True
+ - `logging_nan_inf_filter`: True
+ - `save_safetensors`: True
+ - `save_on_each_node`: False
+ - `save_only_model`: False
+ - `restore_callback_states_from_checkpoint`: False
+ - `no_cuda`: False
+ - `use_cpu`: False
+ - `use_mps_device`: False
+ - `seed`: 12
+ - `data_seed`: None
+ - `jit_mode_eval`: False
+ - `use_ipex`: False
+ - `bf16`: True
+ - `fp16`: False
+ - `fp16_opt_level`: O1
+ - `half_precision_backend`: auto
+ - `bf16_full_eval`: False
+ - `fp16_full_eval`: False
+ - `tf32`: None
+ - `local_rank`: 0
+ - `ddp_backend`: None
+ - `tpu_num_cores`: None
+ - `tpu_metrics_debug`: False
+ - `debug`: []
+ - `dataloader_drop_last`: False
+ - `dataloader_num_workers`: 4
+ - `dataloader_prefetch_factor`: None
+ - `past_index`: -1
+ - `disable_tqdm`: False
+ - `remove_unused_columns`: True
+ - `label_names`: None
+ - `load_best_model_at_end`: False
+ - `ignore_data_skip`: False
+ - `fsdp`: []
+ - `fsdp_min_num_params`: 0
+ - `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
+ - `tp_size`: 0
+ - `fsdp_transformer_layer_cls_to_wrap`: None
+ - `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
+ - `deepspeed`: None
+ - `label_smoothing_factor`: 0.0
+ - `optim`: adamw_torch
+ - `optim_args`: None
+ - `adafactor`: False
+ - `group_by_length`: False
+ - `length_column_name`: length
+ - `ddp_find_unused_parameters`: None
+ - `ddp_bucket_cap_mb`: None
+ - `ddp_broadcast_buffers`: False
+ - `dataloader_pin_memory`: True
+ - `dataloader_persistent_workers`: False
+ - `skip_memory_metrics`: True
+ - `use_legacy_prediction_loop`: False
+ - `push_to_hub`: False
+ - `resume_from_checkpoint`: None
+ - `hub_model_id`: None
+ - `hub_strategy`: every_save
+ - `hub_private_repo`: None
+ - `hub_always_push`: False
+ - `gradient_checkpointing`: False
+ - `gradient_checkpointing_kwargs`: None
+ - `include_inputs_for_metrics`: False
+ - `include_for_metrics`: []
+ - `eval_do_concat_batches`: True
+ - `fp16_backend`: auto
+ - `push_to_hub_model_id`: None
+ - `push_to_hub_organization`: None
+ - `mp_parameters`:
+ - `auto_find_batch_size`: False
+ - `full_determinism`: False
+ - `torchdynamo`: None
+ - `ray_scope`: last
+ - `ddp_timeout`: 1800
+ - `torch_compile`: False
+ - `torch_compile_backend`: None
+ - `torch_compile_mode`: None
+ - `include_tokens_per_second`: False
+ - `include_num_input_tokens_seen`: False
+ - `neftune_noise_alpha`: None
+ - `optim_target_modules`: None
+ - `batch_eval_metrics`: False
+ - `eval_on_start`: False
+ - `use_liger_kernel`: False
+ - `eval_use_gather_object`: False
+ - `average_tokens_across_devices`: False
+ - `prompts`: None
+ - `batch_sampler`: batch_sampler
+ - `multi_dataset_batch_sampler`: proportional
+
+ </details>
+
+ ### Training Logs
+ | Epoch | Step | Training Loss |
+ |:-----:|:----:|:-------------:|
+ | 0.008 | 1 | 1.9382 |
+
+
+ ### Framework Versions
+ - Python: 3.11.12
+ - Sentence Transformers: 4.1.0
+ - Transformers: 4.51.3
+ - PyTorch: 2.6.0+cu124
+ - Accelerate: 1.5.2
+ - Datasets: 3.5.0
+ - Tokenizers: 0.21.1
+
+ ## Citation
+
+ ### BibTeX
+
+ #### Sentence Transformers
+ ```bibtex
+ @inproceedings{reimers-2019-sentence-bert,
+ title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
+ author = "Reimers, Nils and Gurevych, Iryna",
+ booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
+ month = "11",
+ year = "2019",
+ publisher = "Association for Computational Linguistics",
+ url = "https://arxiv.org/abs/1908.10084",
+ }
+ ```
+
+ #### CoSENTLoss
+ ```bibtex
+ @online{kexuefm-8847,
+ title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
+ author={Su Jianlin},
+ year={2022},
+ month={Jan},
+ url={https://kexue.fm/archives/8847},
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,30 @@
+ {
+ "architectures": [
+ "BertModel"
+ ],
+ "attention_probs_dropout_prob": 0.1,
+ "classifier_dropout": null,
+ "hidden_act": "gelu",
+ "hidden_dropout_prob": 0.1,
+ "hidden_size": 384,
+ "id2label": {
+ "0": "LABEL_0"
+ },
+ "initializer_range": 0.02,
+ "intermediate_size": 1536,
+ "label2id": {
+ "LABEL_0": 0
+ },
+ "layer_norm_eps": 1e-12,
+ "max_position_embeddings": 512,
+ "model_type": "bert",
+ "num_attention_heads": 12,
+ "num_hidden_layers": 12,
+ "pad_token_id": 0,
+ "position_embedding_type": "absolute",
+ "torch_dtype": "float32",
+ "transformers_version": "4.51.3",
+ "type_vocab_size": 2,
+ "use_cache": true,
+ "vocab_size": 30522
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "__version__": {
+ "sentence_transformers": "4.1.0",
+ "transformers": "4.51.3",
+ "pytorch": "2.6.0+cu124"
+ },
+ "prompts": {},
+ "default_prompt_name": null,
+ "similarity_fn_name": "cosine"
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1b4de81eaf6cb6757657f65eeaad44454d47a964df2748f996250bc382f0f1e0
+ size 133462128
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+ {
+ "idx": 0,
+ "name": "0",
+ "path": "",
+ "type": "sentence_transformers.models.Transformer"
+ },
+ {
+ "idx": 1,
+ "name": "1",
+ "path": "1_Pooling",
+ "type": "sentence_transformers.models.Pooling"
+ },
+ {
+ "idx": 2,
+ "name": "2",
+ "path": "2_Normalize",
+ "type": "sentence_transformers.models.Normalize"
+ }
+ ]
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+ "max_seq_length": 512,
+ "do_lower_case": true
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
+ {
+ "cls_token": {
+ "content": "[CLS]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "mask_token": {
+ "content": "[MASK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "pad_token": {
+ "content": "[PAD]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "sep_token": {
+ "content": "[SEP]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ },
+ "unk_token": {
+ "content": "[UNK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false
+ }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,58 @@
+ {
+ "added_tokens_decoder": {
+ "0": {
+ "content": "[PAD]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "100": {
+ "content": "[UNK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "101": {
+ "content": "[CLS]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "102": {
+ "content": "[SEP]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ },
+ "103": {
+ "content": "[MASK]",
+ "lstrip": false,
+ "normalized": false,
+ "rstrip": false,
+ "single_word": false,
+ "special": true
+ }
+ },
+ "clean_up_tokenization_spaces": true,
+ "cls_token": "[CLS]",
+ "do_basic_tokenize": true,
+ "do_lower_case": true,
+ "extra_special_tokens": {},
+ "mask_token": "[MASK]",
+ "model_max_length": 512,
+ "never_split": null,
+ "pad_token": "[PAD]",
+ "sep_token": "[SEP]",
+ "strip_accents": null,
+ "tokenize_chinese_chars": true,
+ "tokenizer_class": "BertTokenizer",
+ "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff