Ollama and Console module

esferatec

I am currently trying to write a console application for ‘Ollama’. Ollama is a server to run large language models locally. The server has a REST API that can be used to access the chat function, among other things.

I have managed to establish a connection to the server, ask my question and get an answer back. But I have to wait until I have received the full answer.

local net = require "net"
local json = require "json"

local ollamaserver = net.Http("http://localhost:11434")
-- build the JSON request body from a Lua table
local postrequest = ollamaserver:post("/api/generate", json.encode({
  model = "llama3.2",
  prompt = "Why is the sky blue?"
}))

With Ollama and ChatGPT, however, the answer starts appearing word by word immediately.

Can I achieve this with HTTP at all? Or can this be achieved via SOCKET? Has anyone already done something similar? I have tried to work with SOCKET, but can't get a connection to the server.

Samir

esferatec

Hi esferatec

This seems like an Ollama server issue. Maybe your computer is not powerful enough to run the requests?

esferatec

Thanks for your answer. Speed is not the problem: with Ollama's own console and with Python it works.

The problem is that with LuaRT I have to wait until I have received the whole response and then I can print it all at once. That works fine.

I would need to be able to print every partial response (chunk) the app receives, similar to the example ‘net / download’ where you use "request.received" to update the percentage during the download.

To be honest, I don't know exactly how this works with Python. Python has a standard library for communicating with the server.

Samir

Oh ok, I understand
Maybe you can try the same method as in the download.lua example: use download() with a temporary filename, then check this file while the Task is not terminated.

For now, the Http object doesn't provide an Http.content property. The only field updated while response data is being downloaded is the Http.received property.

Maybe for a new update 🙂

Samir

Update :

LuaRT 1.9.0 will add this Http.content property to get the current internal received buffer

esferatec

Thank you very much for the suggestion. I get the following answers from the Ollama server when I set ‘stream = true’.

{"model":"llama3.2:3b","created_at":"2025-01-02T19:22:25.4781414Z","response":"The","done":false}
{"model":"llama3.2:3b","created_at":"2025-01-02T19:22:25.5542834Z","response":" sky","done":false}
{"model":"llama3.2:3b","created_at":"2025-01-02T19:22:25.6251413Z","response":" appears","done":false}
{"model":"llama3.2:3b","created_at":"2025-01-02T19:22:25.7005726Z","response":" blue","done":false}

But I don't want you to spend a lot of time on it. I also need to do more research on how it can be done in other programming languages. Maybe something can also be done with sockets, but I'm not familiar with them and still need to read up on it.
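
The streamed lines above are newline-delimited JSON, so they can be processed as soon as each line is complete, even if the rest of the answer has not arrived yet. Below is a minimal sketch in plain Lua of that idea. It assumes the caller can repeatedly obtain the whole buffer received so far (for example from the temporary download file, or from the Http.content property announced for LuaRT 1.9.0). The names new_stream_reader, decode and on_token are hypothetical helpers invented for this illustration, not part of LuaRT or Ollama, and the wiring to an actual Http request is deliberately left out.

```lua
-- Hypothetical helper, not part of LuaRT or Ollama.
-- decode:   function that turns one JSON line into a Lua table
--           (assumption: json.decode could be passed here in LuaRT)
-- on_token: called with each "response" fragment once its line is complete
local function new_stream_reader(decode, on_token)
  local consumed = 0   -- bytes of the buffer that were already processed
  local done = false   -- becomes true once a line with "done": true was seen

  -- Call repeatedly with the WHOLE buffer received so far.
  -- Returns true once the final line has been processed.
  return function(buffer)
    while not done do
      local nl = buffer:find("\n", consumed + 1, true)
      if not nl then break end            -- last line is incomplete, wait for more data
      local line = buffer:sub(consumed + 1, nl - 1)
      consumed = nl
      if line:find("%S") then             -- ignore blank lines
        local obj = decode(line)
        if obj.response then on_token(obj.response) end
        if obj.done then done = true end
      end
    end
    return done
  end
end

-- Demo with a tiny stand-in decoder so the sketch runs without a JSON module.
local function demo_decode(line)
  return {
    response = line:match('"response":"(.-)"'),
    done = line:find('"done":true', 1, true) ~= nil,
  }
end

local answer = '{"response":"The","done":false}\n{"response":" sky","done":true}\n'
local feed = new_stream_reader(demo_decode, io.write)
feed(answer:sub(1, 10))   -- only part of the first line: nothing is printed yet
feed(answer:sub(1, 40))   -- first line is complete: its fragment is printed
feed(answer)              -- rest of the buffer: the second fragment is printed
print()                   -- prints: The sky
```

In a console application, feed would be called in a loop with the growing buffer until it returns true, so each word can be written as soon as its line has arrived.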

Samir

It's already implemented 😉