metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:6000
- loss:CoSENTLoss
base_model: avsolatorio/GIST-small-Embedding-v0
widget:
- source_sentence: are paris metro tickets one way?
sentences:
- >-
The two big differences between the 2.4 GHz and 5 GHz frequencies are
speed and range. A wireless transmission at 2.4 GHz provides internet to
a larger area but sacrifices speed, while 5 GHz provides faster speeds
to a smaller area.
- >-
The State of Rhode Island has adopted the income shares model to
determine the weekly child support order. It is based upon the
philosophy that children are entitled to the standard of living based
upon both parents monthly income. ... Weekly gross income of both
parents before taxes and before any other deductions.
- >-
Insulin NPH may be administered in 2 divided doses daily (either as
equally divided doses, or as ~2/3 of the dose before the morning meal
and ~1/3 of the dose before the evening meal or at bedtime).
- source_sentence: how to pxe boot surface pro?
sentences:
- >-
The UKTV Play app, with shows from Dave, Drama, Yesterday and Really, is
available on smart TVs powered by Freeview Play and newer Samsung TVs.
... You can watch catch up and box sets from W, Alibi, Gold, Eden, Dave,
Drama and Yesterday on Sky+HD, Sky Q and Sky Go.
- >-
In a branch For cash that was deposited over the counter at another
bank, the processing and clearance time is 5 business days (not
including public holidays).
- >-
['Click "account" in the upper right corner of your Facebook page.',
'Select "privacy settings."', 'Under "block lists" at the bottom center
of the page, click "edit your lists."', 'At the top, under "block
users," add the name or e-mail address of the person you\'d like to
block.', 'Click "block."']
- source_sentence: what is long-term capital gains rate?
sentences:
- >-
You can get Social Security retirement or survivors benefits and work at
the same time. But, if you're younger than full retirement age, and earn
more than certain amounts, your benefits will be reduced. The amount
that your benefits are reduced, however, isn't truly lost.
- >-
Dreams that involve shouting can warn of impending trouble. When you are
the one shouting, this can mean you are going through a tough time in
your waking life. You may be only feeling only negative emotions. ...
Hearing someone else shouting signifies a warning of fright or anger.
- >-
A regular polygon is a flat shape whose sides are all equal and whose
angles are all equal. The formula for finding the sum of the measure of
the interior angles is (n - 2) * 180. To find the measure of one
interior angle, we take that formula and divide by the number of sides
n: (n - 2) * 180 / n.
- source_sentence: can a girl get pregnant two days after her menstruation?
sentences:
- >-
Newborn usually refers to a baby from birth to about 2 months of age.
Infants can be considered children anywhere from birth to 1 year old.
Baby can be used to refer to any child from birth to age 4 years old,
thus encompassing newborns, infants, and toddlers.
- >-
According to professional numerologists, there are three ultimately
lucky numbers for Capricorn-born people: they are 5, 6, and 8. In case
they want to increase the chance of success for anything, simply make
use of these numbers.
- >-
He's a professional dancer and model. J.C. Before entering the Big
Brother house, J.C. was a dancer who traveled the world to perform
professionally. “I do professional dancing. Not really break dancing, I
do more choreography dancing,” he said in an interview with
Entertainment Tonight Canada.
- source_sentence: how long does it take to transfer money between anz and westpac?
sentences:
- >-
This service is currently offered free of charge by the bank. You can
get the last 'Available' balance of your account (by an SMS) by giving a
Missed Call to 18008431122. You can get the Mini Statement (by an SMS)
for last 5 transactions in your account by giving a Missed Call to
18008431133. 1.
- >-
Simply put, 1 ply toilet paper is made of a single layer of paper, while
2 ply has two layers. ... In the 1950's, a manufacturer created a method
to roll and attach one-ply paper together to make a thicker “two-ply”.
For years, 2-ply toilet tissue was always thicker and usually assumed to
be better.
- >-
The main difference between unique and distinct is that UNIQUE is a
constraint that is used on the input of data and ensures data integrity.
While DISTINCT keyword is used when we want to query our results or in
other words, output the data.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
SentenceTransformer based on avsolatorio/GIST-small-Embedding-v0
This is a sentence-transformers model fine-tuned from avsolatorio/GIST-small-Embedding-v0. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: avsolatorio/GIST-small-Embedding-v0
- Maximum Sequence Length: 512 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
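The pooling configuration above selects the CLS token (`pooling_mode_cls_token: True`), and the final `Normalize()` module rescales each embedding to unit length. The following is a minimal pure-Python sketch of those two steps, using made-up token vectors rather than real BertModel outputs. The helper names `cls_pool` and `l2_normalize` are illustrative and not part of the library.

```python
import math

def cls_pool(token_embeddings):
    """Use the first token's vector (the [CLS] position) as the sentence embedding."""
    return list(token_embeddings[0])

def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]

# Toy "transformer output": 3 tokens with 4 dimensions each (the real model uses 384).
toy_tokens = [
    [3.0, 4.0, 0.0, 0.0],  # [CLS] position
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.0, 0.5, 0.0],
]

embedding = l2_normalize(cls_pool(toy_tokens))
print(embedding)
```

Only the first row influences the result here, which is the difference between CLS pooling and mean pooling.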
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("moshew/gist_small_ft_gooaq_v1")
# Run inference
sentences = [
    'how long does it take to transfer money between anz and westpac?',
    "This service is currently offered free of charge by the bank. You can get the last 'Available' balance of your account (by an SMS) by giving a Missed Call to 18008431122. You can get the Mini Statement (by an SMS) for last 5 transactions in your account by giving a Missed Call to 18008431133. 1.",
    "Simply put, 1 ply toilet paper is made of a single layer of paper, while 2 ply has two layers. ... In the 1950's, a manufacturer created a method to roll and attach one-ply paper together to make a thicker “two-ply”. For years, 2-ply toilet tissue was always thicker and usually assumed to be better.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
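A typical follow-up to the snippet above is ranking candidate passages against a query by cosine similarity. The sketch below shows that step in pure Python on toy 3-dimensional vectors, so it runs without downloading the model. The vectors and the helper `rank_by_similarity` are assumptions for illustration. With the real model, a row of the matrix returned by `model.similarity` would take the place of the toy scores.

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, candidate_vecs):
    """Return (candidate index, score) pairs sorted from most to least similar."""
    scores = [cosine(query_vec, c) for c in candidate_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order]

# Toy stand-ins for a query embedding and three passage embeddings.
toy_query = [1.0, 0.0, 0.0]
toy_candidates = [
    [0.0, 1.0, 0.0],
    [0.8, 0.6, 0.0],
    [0.6, 0.0, 0.8],
]

ranking = rank_by_similarity(toy_query, toy_candidates)
for index, score in ranking:
    print(index, round(score, 3))
```

Because the model ends with a `Normalize()` module, its embeddings already have unit length, so the dot product alone would give the same ranking.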
Training Details
Training Dataset
Unnamed Dataset
- Size: 6,000 training samples
- Columns: sentence1, sentence2, and label
- Approximate statistics based on the first 1000 samples:
Column | Type | Min | Mean | Max |
---|---|---|---|---|
sentence1 | string | 8 tokens | 11.97 tokens | 23 tokens |
sentence2 | string | 14 tokens | 58.86 tokens | 126 tokens |
label | float | 0.0 | 0.17 | 1.0 |
- Samples:
sentence1 | sentence2 | label |
---|---|---|
what is the difference between rapid rise yeast and bread machine yeast? | Though there are some minor differences in shape and nutrients, Rapid-Rise Yeast is (pretty much) the same as Instant Yeast and Bread Machine Yeast. ... Also, Rapid-Rise Yeast is a little more potent than Active Dry Yeast and can be mixed in with your dry ingredients directly. | 1.0 |
what is the difference between rapid rise yeast and bread machine yeast? | Omeprazole and esomeprazole therapy are both associated with a low rate of transient and asymptomatic serum aminotransferase elevations and are rare causes of clinically apparent liver injury. | 0.0 |
what is the difference between rapid rise yeast and bread machine yeast? | Benefits of choosing a soft starter A variable frequency drive (VFD) is a motor control device that protects and controls the speed of an AC induction motor. A VFD can control the speed of the motor during the start and stop cycle, as well as throughout the run cycle. | 0.0 |
- Loss: CoSENTLoss with these parameters: { "scale": 20.0, "similarity_fct": "pairwise_cos_sim" }
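To make the two parameters concrete, here is a pure-Python sketch of the CoSENT objective as commonly described: for every two training pairs where the first has a higher label than the second, the loss penalizes the case where the lower-labeled pair receives the higher cosine similarity. The function below is an illustrative re-implementation under that assumption, not the library's code, and the toy similarity values are invented.

```python
import math

def cosent_loss(cos_sims, labels, scale=20.0):
    """log(1 + sum(exp(scale * (s_low - s_high)))) over all (high, low) label orderings."""
    # The leading 0.0 contributes exp(0) = 1, which is the "1 +" inside the log.
    terms = [0.0]
    for s_high, y_high in zip(cos_sims, labels):
        for s_low, y_low in zip(cos_sims, labels):
            if y_high > y_low:
                terms.append(scale * (s_low - s_high))
    # Numerically stable log-sum-exp.
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

# One relevant pair (label 1.0) and one irrelevant pair (label 0.0).
well_ordered = cosent_loss([0.9, 0.1], [1.0, 0.0])  # relevant pair scores higher
mis_ordered = cosent_loss([0.1, 0.9], [1.0, 0.0])   # irrelevant pair scores higher

print(well_ordered, mis_ordered)
```

The loss depends only on the relative order of the similarities, not on their absolute values, and `scale` controls how sharply an ordering violation is penalized.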
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- num_train_epochs: 1
- warmup_ratio: 0.1
- seed: 12
- bf16: True
- dataloader_num_workers: 4
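As a rough sketch, the non-default values above could be passed to `SentenceTransformerTrainingArguments` as keyword arguments. The helper name `make_training_args` and the `output_dir` argument are assumptions for illustration, since the card does not show the original training script. Note that `bf16=True` requires hardware with bfloat16 support.

```python
# Non-default hyperparameters exactly as listed in this card.
NON_DEFAULT_ARGS = {
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "num_train_epochs": 1,
    "warmup_ratio": 0.1,
    "seed": 12,
    "bf16": True,
    "dataloader_num_workers": 4,
}

def make_training_args(output_dir):
    """Build training arguments; output_dir is a hypothetical path chosen by the caller."""
    # Imported lazily so the dictionary above can be inspected without the library installed.
    from sentence_transformers import SentenceTransformerTrainingArguments

    return SentenceTransformerTrainingArguments(output_dir=output_dir, **NON_DEFAULT_ARGS)

print(sorted(NON_DEFAULT_ARGS))
```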
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 12
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 4
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- tp_size: 0
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Epoch | Step | Training Loss |
---|---|---|
0.0027 | 1 | 0.3104 |
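The logged epoch fraction can be cross-checked against the dataset size and batch size reported earlier. The arithmetic below assumes training on a single device (the card does not state the device count) and a warmup length computed as the ceiling of `warmup_ratio` times the total step count. The `linear_lr` helper is an illustrative sketch of a linear warmup and decay schedule, not the trainer's own code.

```python
import math

dataset_size = 6000        # training samples
batch_size = 16            # per_device_train_batch_size, one device assumed
grad_accum = 1             # gradient_accumulation_steps
epochs = 1                 # num_train_epochs
warmup_ratio = 0.1
peak_lr = 5e-05            # learning_rate

steps_per_epoch = math.ceil(dataset_size / (batch_size * grad_accum))
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

def linear_lr(step):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    return peak_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

print(total_steps)                    # 375
print(warmup_steps)                   # 38
print(round(1 / steps_per_epoch, 4))  # 0.0027, the epoch value logged at step 1
```

Under these assumptions, step 1 corresponds to 1/375 of an epoch, which matches the 0.0027 shown in the log row.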
Framework Versions
- Python: 3.11.12
- Sentence Transformers: 4.1.0
- Transformers: 4.51.3
- PyTorch: 2.6.0+cu124
- Accelerate: 1.5.2
- Datasets: 3.5.0
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
CoSENTLoss
@online{kexuefm-8847,
title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
author={Su Jianlin},
year={2022},
month={Jan},
url={https://kexue.fm/archives/8847},
}