# Options
To adjust these options, edit `~/.config/chat-script/chat-script.ini`.
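chat-script.ini is a plain INI file. As a rough sketch (the section names here are assumed to mirror the headings below; check your installed file for the exact names), it is organized like this:

```ini
# ~/.config/chat-script/chat-script.ini  (assumed layout)

# Gradio app settings
[App]

# Model, sampling, and retrieval settings
[Chain]

# Document loading and embedding settings
[Embeddings]

# Streaming, history, and moderation settings
[Response]
```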
## App
Option | Description | Default |
---|---|---|
share | Whether to create a publicly shareable link for the Gradio app | False |
server_name | IP address the local app is served on | 127.0.0.1 |
server_port | Port the local app is served on | 7860 |
inbrowser | Whether to automatically open the Gradio app in a new tab of the default browser | True |
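As an illustration, an [App] section using the defaults above might look like the following; the [App] section name is assumed from the heading, so adjust it to match your actual chat-script.ini:

```ini
[App]
# Keep the app local-only; set share = True to create a public Gradio link
share = False
server_name = 127.0.0.1
server_port = 7860
# Open the app in a new browser tab automatically
inbrowser = True
```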
## Chain
Option | Description | Default |
---|---|---|
embeddings_model | Name of the Ollama model used to generate embeddings | mxbai-embed-large |
chat_model | Name of the Ollama model used to generate responses | mistral |
moderation_model | Name of the Ollama model used to moderate queries | llama-guard3:1b |
embeddings_url | URL of the Ollama instance used to generate embeddings | http://localhost:11434 |
chat_url | URL of the Ollama instance used to generate responses | http://localhost:11434 |
moderation_url | URL of the Ollama instance used to moderate queries | http://localhost:11434 |
show_progress | Whether to display embeddings model batch progress | False |
keep_alive | How long the model will stay loaded into memory | 5m |
temperature | The temperature of the chat model. Increasing the temperature will make the model answer more creatively | 0.6 |
top_k | Reduces the probability of generating nonsense. A higher value gives more diverse answers, while a lower value is more conservative | 30 |
top_p | Works together with top_k. A higher value leads to more diverse text, while a lower value generates more conservative text | 0.7 |
collection_name | Name of local document collection | rag-chroma |
top_n_results | Number of documents to return | 3 |
rag_fusion | Whether to enable RAG-fusion, an advanced RAG technique that may improve semantic search relevance | True |
num_queries | Number of synthetic queries to generate for RAG-fusion | 2 |
top_n_results_fusion | Maximum number of documents to return for RAG-fusion (a maximum, since the unique union of results is taken) | 2 |
embeddings_gpu | Whether to use the GPU when generating embeddings (on devices with <8GB VRAM, setting to False can reduce latency) | True |
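A sketch of the corresponding [Chain] section with a subset of the options above (the section name is assumed; values are the documented defaults):

```ini
[Chain]
chat_model = mistral
embeddings_model = mxbai-embed-large
moderation_model = llama-guard3:1b
# All three can point at the same local Ollama instance
chat_url = http://localhost:11434
embeddings_url = http://localhost:11434
moderation_url = http://localhost:11434
temperature = 0.6
keep_alive = 5m
# RAG-fusion settings described in the table above
rag_fusion = True
num_queries = 2
top_n_results_fusion = 2
```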
## Embeddings
Option | Description | Default |
---|---|---|
embeddings_model | Name of the Ollama model used to generate embeddings | mxbai-embed-large |
embeddings_url | URL of the Ollama instance used to generate embeddings | http://localhost:11434 |
show_progress | Whether to display document loading and embeddings model batch progress | True |
collection_name | Name of local document collection | rag-chroma |
use_multithreading | Whether to enable CPU multithreading for loading documents | True |
chunk_size | Number of tokens in each split document chunk | 250 |
chunk_overlap | Number of tokens shared between consecutive split document chunks | 50 |
batch_size | Maximum number of split documents in each embeddings batch | 41666 |
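A possible [Embeddings] section with the documented defaults (section name assumed):

```ini
[Embeddings]
embeddings_model = mxbai-embed-large
embeddings_url = http://localhost:11434
show_progress = True
collection_name = rag-chroma
use_multithreading = True
# 250-token chunks, with 50 tokens shared between consecutive chunks
chunk_size = 250
chunk_overlap = 50
# Upper bound on split documents per embeddings batch
batch_size = 41666
```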
## Response
Option | Description | Default |
---|---|---|
context_stream_delay | Time in seconds to delay each streamed token of non-LLM text (sources, moderation notice) | 0.075 |
max_history | Maximum number of previous user messages to include as context | 2 |
print_state | Whether to print app state for each query. Includes: IP address, chat history, and context | True |
moderate | Whether to moderate user queries before allowing responses. The IP address and offending query are printed even if print_state is False (allowing privacy-preserving moderation) | False |
moderate_alert | Whether to display a system alert when an unsafe query is received (Linux only) | False |
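For example, a [Response] section (section name assumed) that enables the privacy-preserving moderation described above, i.e. moderation on with general state printing off:

```ini
[Response]
# 0.075 s delay per streamed token of non-LLM text (sources, moderation notice)
context_stream_delay = 0.075
max_history = 2
# Keep general state printing off for privacy...
print_state = False
# ...but still moderate queries; the IP address and offending query are
# printed even with print_state off
moderate = True
# System alerts for unsafe queries are Linux-only; leave disabled elsewhere
moderate_alert = False
```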