Commit 529e50e by lucifertrj · verified · 1 parent: 799730b

add vLLM inference

Files changed (1)
  1. README.md +52 -0
README.md CHANGED
@@ -23,6 +23,46 @@ base_model: codellama/CodeLlama-13b-Instruct-hf
  
  ## Inference
  
+ > Hardware requirements:
+ >
+ > 30GB VRAM - A100 Preferred
+
+ ### vLLM - For Faster Inference
+
+ #### Installation
+
+ ```
+ !pip install vllm
+ ```
+
+ **Implementation**:
+
+ ```python
+ from vllm import LLM, SamplingParams
+
+ llm = LLM(model='aiplanet/panda-coder-13B',gpu_memory_utilization=0.95,max_model_len=4096)
+
+ prompts = [""" ### Instruction: Write a Java code to add 15 numbers randomly generated.
+ ### Input: [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]
+ ### Response:
+ """,
+ "### Instruction: write a neural network complete code in Keras ### Input: Use cifar dataset ### Response:"
+ ]
+
+ sampling_params = SamplingParams(temperature=0.1, top_p=0.95,repetition_penalty = 1.1,max_tokens=1000)
+
+ outputs = llm.generate(prompts, sampling_params)
+
+ for output in outputs:
+     prompt = output.prompt
+     generated_text = output.outputs[0].text
+     print(generated_text)
+     print("\n\n")
+ ```
+
+
+ ### Transformers - Basic Implementation
+
  ```python
  import torch
  import transformers
@@ -70,6 +110,18 @@ hello_constant = tf.constant('Hello, World!')
  
  # Print the value of the constant
  print(hello_constant)
+ ```
+
+ ## Prompt Template for Panda Coder 13B
+
+ ```
+ ### Instruction:
+ {<add your instruction here>}
+
+ ### Input:
+ {<can be empty>}
+
+ ### Response:
  ```
  
  ## 🔗 Key Features:
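
Review note: the prompt template added in this commit can also be filled in programmatically instead of being hand-written for every prompt. The following is a minimal sketch, not part of the commit. The helper name `build_prompt` is hypothetical, and separating the sections with one blank line is an assumption taken from the template layout shown in the README (the first vLLM example prompt omits those blank lines).

```python
def build_prompt(instruction: str, input_text: str = "") -> str:
    """Fill the Panda Coder 13B prompt template.

    Hypothetical helper: the section headers come from the README template,
    the blank-line separators are an assumption based on its layout.
    """
    return (
        "### Instruction:\n"
        f"{instruction.strip()}\n\n"
        "### Input:\n"
        f"{input_text.strip()}\n\n"  # may be empty, as the template allows
        "### Response:\n"
    )


if __name__ == "__main__":
    # Build one prompt and show it; the string can then be passed to
    # whichever inference backend is in use.
    print(build_prompt("Write a Java program that sums a list of numbers.", "[1, 2, 3]"))
```

A list of strings built this way could be passed as `prompts` in the vLLM example, which keeps the section headers consistent across requests.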