[MODELS] Discussion
What are the limits on using these? How many API calls can I send them per month?
How can I know which model I am using?
Out of all these models, Gemma, which was recently released, has the newest information about .NET. However, I don't know which one gives the most accurate answers about coding.
Gemma seems really biased. With web search on, it says that it doesn't have access to recent information when I ask it almost anything about recent events. But when I ask about recent events with Google, I get responses that cover them.
Apparently Gemma cannot code?
Gemma is just like Google's Gemini series models: it has very strong moral limits built in. Any request that may relate to file operations, or to access that might be deep, gets censored and refused.
So even if there are solutions for such things in its training data, they will just be filtered out and ignored.
But I still haven't tested its coding accuracy on tasks unrelated to these kinds of "dangerous" operations.
Hi there, if deepseek r2 gets released, it should replace the ageing deepseek-r1-distill-32b. Otherwise, we should bring back openthinker2-32b as a great addition.
We hope the devs will add new models soon!
Hi @nsarrazin, please consider replacing deepseek-r1 with the upcoming deepseek-r2 once it is fully released to the public, or adding a new reasoning model such as openthinker2-32b (https://huggingface.co./open-thoughts/OpenThinker2-32B), since qwq-32b and deepseek-r1-distill-32b can hallucinate at times. Thank you for your hard work, we appreciate you, man.
And please don't forget to add community tool support for deepseek r2 (if released), qwq, or any other reasoning model available on HuggingChat.
Requesting a new model with good reasoning that is uncensored, like Command R+.
Hermes is a bit old now; I'm waiting for their newer LLM. So which model should it be? Would Wayfarer 70B (Llama 3.3) work?
We may need to wait for the deepseek r2 release or add openthinker2-32b. They are some of the world's best reasoning models.
It's gotta be uncensored too. And by reasoning, I don't mean chain-of-thought models: without a toggle to turn off CoT, they take way too much time and aren't worth it.
Wayfarer 70B came out a few months ago, I think. It should meet the criteria, I guess. https://huggingface.co./LatitudeGames/Wayfarer-Large-70B-Llama-3.3
and better options are always welcome!