metadata
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:2000
  - loss:CoSENTLoss
base_model: avsolatorio/GIST-small-Embedding-v0
widget:
  - source_sentence: do vivid seats tickets work?
    sentences:
      - >-
        Charlotte-Mecklenburg Schools will be closed for students on Friday due
        to the forecast of severe weather. ... CMS staff members work with city
        and county leaders to receive the most up-to-date information about road
        and weather conditions.
      - >-
        Tickets are $40 per ticket and $400 for a table of ten. Tickets are
        available for purchase when you register for the show.
      - >-
        This service is currently offered free of charge by the bank. You can
        get the last 'Available' balance of your account (by an SMS) by giving a
        Missed Call to 18008431122. You can get the Mini Statement (by an SMS)
        for last 5 transactions in your account by giving a Missed Call to
        18008431133. 1.
  - source_sentence: is alexa compatible with tv?
    sentences:
      - >-
        To fix this Echo red light, start with the restart of the router and
        Amazon Echo. In case, the restart process doesn't work, check for the
        device and app update in Alexa app. If it's available, click the
        'Update' button for compatibility reason.
      - >-
        Ligament - A small band of dense, white, fibrous elastic tissue.
        Ligaments connect the ends of bones together in order to form a joint.
        Tendon - A tough, flexible band of fibrous connective tissue that
        connects muscles to bones.
      - >-
        There are 610 calories in a 1 bowl serving of El Pollo Loco Original
        Pollo Bowl.
  - source_sentence: can you play fortnite save the world on mac?
    sentences:
      - >-
        ['In the Music app on your Mac, click iTunes Store in the sidebar. ...
        ', 'Click Purchased (below Quick Links) near the top right of the iTunes
        Store window.', 'Click Music near the top right of the page that
        appears. ... ', 'To download an item, click its Download button .']
      - >-
        Essential Oils in the Second and Third Trimesters. "In the second and
        third trimesters, some essential oils are safe to use, as your baby is
        more developed," Edwards adds. These include lavender, chamomile, and
        ylang ylang—all of which calm, relax, and aid sleep.
      - >-
        ADR holders do not have to transact the trade in the foreign currency or
        worry about exchanging currency on the forex market. ... ADRs list on
        either the New York Stock Exchange (NYSE), American Stock Exchange
        (AMEX), or the Nasdaq, but they are also sold over-the-counter (OTC).
  - source_sentence: how long does money take to transfer boi?
    sentences:
      - >-
        When will it take more than one working day? It will take more than one
        working day to reach your payee's bank when: You make a payment online
        after 3.30pm in the Republic of Ireland or after 4.30pm in Northern
        Ireland and Great Britain on a working day. Your payment will begin to
        process on the next working day.
      - >-
        If you had bought just one share of Microsoft at the IPO, you would now
        have 288 shares after all the splits. Those shares would be worth
        $44,505 at the current stock quote of $154.53. A $5,000 investment would
        have purchased 238 shares at the IPO price.
      - >-
        FKM is the American standard ASTM short form name for Fluro-Elastomer.
        ... VITON™ is a registered trademark of Du Pont performance elastomers,
        the original developers of the rubber. However, the Viton is also used
        as a general name for the material, no matter who the manufacturer is.
  - source_sentence: how long is a texas vehicle inspection report good for?
    sentences:
      - >-
        ['Aerospace engineer.', 'Automotive engineer.', 'CAD technician.',
        'Contracting civil engineer.', 'Control and instrumentation engineer.',
        'Maintenance engineer.', 'Mechanical engineer.', 'Nuclear engineer.']
      - >-
        A key difference is that it's simpler to unlock a credit lock than it is
        to “thaw” a credit freeze. But a freeze may afford legal protections
        that a lock doesn't. ... The credit bureaus sometimes promote their
        credit lock services, which can carry a monthly fee, alongside their
        credit freeze options, which are free.
      - >-
        If your car fails its MOT you can only continue to drive it if the
        previous year's MOT is still valid - which might occur if you submitted
        the car for its test two weeks early. You can still drive it away from
        the testing centre or garage if no 'dangerous' problems were identified
        during the MOT.
pipeline_tag: sentence-similarity
library_name: sentence-transformers

SentenceTransformer based on avsolatorio/GIST-small-Embedding-v0

This is a sentence-transformers model fine-tuned from avsolatorio/GIST-small-Embedding-v0. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: avsolatorio/GIST-small-Embedding-v0
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
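
The last two modules listed above can be illustrated with a minimal sketch in plain Python. It assumes toy 3-dimensional token vectors instead of the model's 384-dimensional outputs; the helper names `cls_pool`, `l2_normalize`, and `dot` are hypothetical and are not part of the sentence-transformers API. CLS pooling keeps only the first token's vector, and because `Normalize()` scales every embedding to unit length, cosine similarity reduces to a plain dot product.

```python
import math
from typing import List, Sequence


def cls_pool(token_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """CLS pooling: keep only the vector of the first token."""
    return list(token_embeddings[0])


def l2_normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit L2 length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product; equals cosine similarity when both inputs have unit length."""
    return sum(x * y for x, y in zip(a, b))


# Toy per-token outputs for two inputs (2 tokens x 3 dims, made up for illustration).
tokens_a = [[3.0, 4.0, 0.0], [9.0, 9.0, 9.0]]
tokens_b = [[0.0, 4.0, 3.0], [1.0, 1.0, 1.0]]

emb_a = l2_normalize(cls_pool(tokens_a))
emb_b = l2_normalize(cls_pool(tokens_b))
cosine = dot(emb_a, emb_b)  # approximately 0.64
print(cosine)
```

Only the first token contributes to the embedding here, which matches `pooling_mode_cls_token: True` with every other pooling mode set to `False`.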

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("moshew/gist_small_ft_gooaq_v3")
# Run inference
sentences = [
    'how long is a texas vehicle inspection report good for?',
    "If your car fails its MOT you can only continue to drive it if the previous year's MOT is still valid - which might occur if you submitted the car for its test two weeks early. You can still drive it away from the testing centre or garage if no 'dangerous' problems were identified during the MOT.",
    "['Aerospace engineer.', 'Automotive engineer.', 'CAD technician.', 'Contracting civil engineer.', 'Control and instrumentation engineer.', 'Maintenance engineer.', 'Mechanical engineer.', 'Nuclear engineer.']",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
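
To turn the similarity matrix from the snippet above into a ranking of passages for a query, something like the following sketch could be used. The helper `rank_candidates` is hypothetical (not a sentence-transformers function), and the scores in the demo are made-up values, not real model output.

```python
from typing import List, Sequence, Tuple


def rank_candidates(
    query_scores: Sequence[float], candidates: Sequence[str]
) -> List[Tuple[str, float]]:
    """Pair each candidate with its score and sort from most to least similar."""
    if len(query_scores) != len(candidates):
        raise ValueError("need exactly one score per candidate")
    pairs = [(text, float(score)) for text, score in zip(candidates, query_scores)]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# With the objects from the usage snippet, the call would look like:
#   scores = similarities[0, 1:].tolist()   # query (row 0) vs. the two passages
#   ranked = rank_candidates(scores, sentences[1:])
# Toy scores, made up for illustration:
ranked = rank_candidates([0.12, 0.57], ["passage A", "passage B"])
print(ranked[0][0])
```

Row 0 of the matrix holds the query's similarity to every input, including itself, so the slice `[0, 1:]` drops the self-similarity entry before ranking.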

Training Details

Training Dataset

Unnamed Dataset

  • Size: 2,000 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
              sentence1       sentence2       label
    type      string          string          float
    min       8 tokens        13 tokens       0.0
    mean      12.05 tokens    59.84 tokens    0.5
    max       23 tokens       124 tokens      1.0
  • Samples:
    • sentence1: what is the difference between rapid rise yeast and bread machine yeast?
      sentence2: Though there are some minor differences in shape and nutrients, Rapid-Rise Yeast is (pretty much) the same as Instant Yeast and Bread Machine Yeast. ... Also, Rapid-Rise Yeast is a little more potent than Active Dry Yeast and can be mixed in with your dry ingredients directly.
      label: 1.0
    • sentence1: what is the difference between rapid rise yeast and bread machine yeast?
      sentence2: Fermentation recycles NAD+, and produces 2 ATPs. In lactic acid fermentation, pyruvate from glycolysis changes to lactic acid. ... In alcoholic fermentation, pyruvate changes to alcohol and carbon dioxide. This type of fermentation is carried out by yeasts and some bacteria.
      label: 0.0
    • sentence1: are light kits universal for ceiling fans?
      sentence2: Not all Universal Light Kits are actually Universal. They can be universal to only that manufacturer. ... Casablanca and Hunter Ceiling Fan Light Kits are universal only to their own fans.
      label: 1.0
  • Loss: CoSENTLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "pairwise_cos_sim"
    }
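
For intuition about these parameters, here is a minimal plain-Python sketch of the CoSENT objective as it is commonly described: for every two training pairs where one has a strictly higher label than the other, the loss penalizes the lower-labelled pair for having a higher cosine similarity. The function `cosent_loss` is a hypothetical illustration, not the library implementation; it takes the already-computed pairwise cosine similarity of each (sentence1, sentence2) pair as input.

```python
import math
from typing import Sequence


def cosent_loss(
    cos_sims: Sequence[float], labels: Sequence[float], scale: float = 20.0
) -> float:
    """Compute log(1 + sum(exp(scale * (cos_low - cos_high)))).

    The sum runs over all combinations of two pairs in which the "high"
    pair has a strictly larger label than the "low" pair.
    """
    terms = [0.0]  # exp(0) = 1 supplies the "1 +" inside the logarithm
    for cos_low, label_low in zip(cos_sims, labels):
        for cos_high, label_high in zip(cos_sims, labels):
            if label_low < label_high:
                terms.append(scale * (cos_low - cos_high))
    # Numerically stable log-sum-exp.
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


# Toy batch of two pairs with labels 1.0 and 0.0 (values made up for illustration).
well_ordered = cosent_loss([0.9, 0.1], [1.0, 0.0])  # positive pair scores higher
mis_ordered = cosent_loss([0.1, 0.9], [1.0, 0.0])   # positive pair scores lower
print(well_ordered < mis_ordered)
```

With `scale` at 20.0, even a modest ordering violation contributes a large term, so the loss depends mostly on the relative order of similarities rather than on their absolute values.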
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • seed: 12
  • bf16: True
  • dataloader_num_workers: 4

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 12
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 4
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
0.008 1 3.5339
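
The logged epoch value of 0.008 at step 1 is consistent with the dataset size and batch size reported above. The arithmetic below is a sketch that assumes a single training device and that warmup steps are obtained by rounding `warmup_ratio * total_steps` up; neither assumption is stated in this card.

```python
import math

# Values taken from the training details in this card.
train_samples = 2000
per_device_train_batch_size = 16
gradient_accumulation_steps = 1
num_train_epochs = 1
warmup_ratio = 0.1

# Assumption: training ran on a single device.
num_devices = 1

effective_batch = per_device_train_batch_size * num_devices * gradient_accumulation_steps
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * num_train_epochs

# Assumption: the warmup step count is rounded up.
warmup_steps = math.ceil(total_steps * warmup_ratio)

# Fraction of an epoch completed after the first optimizer step.
epoch_at_step_1 = 1 / steps_per_epoch

print(steps_per_epoch, warmup_steps, round(epoch_at_step_1, 3))
# prints: 125 13 0.008
```

Under these assumptions the run has 125 optimizer steps in total, so the single logged row reflects the loss at the very start of training.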

Framework Versions

  • Python: 3.11.12
  • Sentence Transformers: 4.1.0
  • Transformers: 4.51.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.5.2
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CoSENTLoss

@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}