An example output from tgi-gaudi:
"To answer this question, let's review the execution history. We find two relevant pieces of information:
The Foo Fighters headlined the Reading and Leeds festivals in 2012.
The Foo Fighters have headlined the Reading and Leeds festivals at least once, in 2019.
However, we can see that the knowledge cutoff is 2019, and the question was asked in 2024. So, there might be information that is not available in the execution history.
From the"
The generation stopped at "From the" without hitting max_new_tokens = 4096.
Another example output from tgi-gaudi:
Unfortunately, the tools available do not directly provide information about the number of number one hits on the US Billboard Hot 100 chart for specific artists. To answer this question accurately, I would need to know this information. However, I can try to find a workaround.
My approach is to search the knowledge base for Michael Jackson's number one hits and Elvis Presley's number one hits and then compare the counts.
{"tool":"search_knowledge_base", "args":{"query": "Michael Jackson number one
The generation stopped partway through the JSON tool call, leaving it incomplete.
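A truncated tool call like the one above can be detected on the client side by checking whether the output contains any JSON object that parses completely. The following is only a minimal sketch using the Python standard library; the helper name `extract_tool_call` and the "completed" string are hypothetical and are not part of tgi-gaudi or the reproduction script.

```python
import json


def extract_tool_call(text: str):
    """Return the first complete JSON object found in `text`, or None.

    Hypothetical helper: it tries to decode a JSON object starting at
    every '{' in the text and returns the first one that parses fully.
    """
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _end = decoder.raw_decode(text, start)
            if isinstance(obj, dict):
                return obj
        except json.JSONDecodeError:
            # Incomplete or invalid JSON at this position; try the next '{'.
            pass
        start = text.find("{", start + 1)
    return None


# The truncated tool call from the example output above.
truncated = '{"tool":"search_knowledge_base", "args":{"query": "Michael Jackson number one'

# Assumption for illustration only: one way the call might have been completed.
completed = truncated + ' hits"}}'

print(extract_tool_call(truncated))  # no complete object, so None
print(extract_tool_call(completed))
```

A `None` result here means the model output ended before the tool call was closed, which matches the behavior reported in this issue.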
Expected behavior
Generation should continue until it reaches max_new_tokens or emits a stop token.
System Info
tgi-gaudi 2.0.4
Model served: llama3.1-70B-instruct
Generation parameters:
--top_k 10
--max_new_tokens 8192
--temperature 0.01
--top_p 0.95
--repetition_penalty 1.03
--return_full_text false
Docker compose YAML used to launch tgi-gaudi:
services:
  tgi-service:
    image: ghcr.io/huggingface/tgi-gaudi:2.0.4
    container_name: tgi-server
    ports:
      - "8085:80"
    volumes:
      - ${HF_CACHE_DIR}:/data
    environment:
      no_proxy: ${no_proxy}
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      HF_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
      HF_HUB_DISABLE_PROGRESS_BARS: 1
      HF_HUB_ENABLE_HF_TRANSFER: 0
      HABANA_VISIBLE_DEVICES: 0,1,2,3
      OMPI_MCA_btl_vader_single_copy_mechanism: none
      PT_HPU_ENABLE_LAZY_COLLECTIVES: true
    runtime: habana
    cap_add:
      - SYS_NICE
    ipc: host
    command: --model-id ${LLM_MODEL_ID} --max-input-length 8192 --max-total-tokens 16384 --sharded true --num-shard 4
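For anyone trying to check the stop reason without the full agent stack, the generation parameters listed above could be sent directly to the server's `/generate` endpoint with `details` enabled, so that the response reports a `finish_reason` and a generated token count. This is a rough sketch, not the reproduction script: it assumes the server is reachable at `localhost:8085` (the host port mapped in the compose file), and the helper names `build_payload`, `generate`, and `summarize_finish` are made up for illustration. The default of 8192 for `max_new_tokens` follows the parameter list; the example outputs in this issue mention 4096.

```python
import json
import urllib.request

# Assumption: the client runs on the same host as the container,
# so the "8085:80" port mapping makes the server available here.
TGI_URL = "http://localhost:8085/generate"


def build_payload(prompt: str, max_new_tokens: int = 8192) -> dict:
    """Build a /generate request body using the parameters from this issue."""
    return {
        "inputs": prompt,
        "parameters": {
            "top_k": 10,
            "max_new_tokens": max_new_tokens,
            "temperature": 0.01,
            "top_p": 0.95,
            "repetition_penalty": 1.03,
            "return_full_text": False,
            # Ask the server to include finish_reason and token counts.
            "details": True,
        },
    }


def generate(prompt: str, url: str = TGI_URL) -> dict:
    """POST the prompt to the server and return the parsed JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def summarize_finish(response: dict) -> tuple:
    """Return (finish_reason, generated_tokens) from a /generate response."""
    details = response.get("details") or {}
    return details.get("finish_reason"), details.get("generated_tokens")


# Example usage (requires a running server):
#   response = generate("Your prompt here")
#   print(summarize_finish(response))
```

If the reported `finish_reason` is `length` while the token count is far below `max_new_tokens`, or is an end-of-sequence reason in the middle of a sentence, that would help narrow down where the early stop comes from.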
Reproduction
Use this script to reproduce: https://github.com/minmin-intel/GenAIComps/blob/ragagent-v1.1-dev/comps/agent/langchain/test_llama.sh