#lmstudio


Hongkiat: Running Large Language Models (LLMs) Locally with LM Studio. “Running large language models (LLMs) locally with tools like LM Studio or Ollama has many advantages, including privacy, lower costs, and offline availability. However, these models can be resource-intensive and require proper optimization to run efficiently. In this article, we will walk you through optimizing your […]”

https://rbfirehose.com/2025/03/25/hongkiat-running-large-language-models-llms-locally-with-lm-studio/

Not sure if you have noticed it: Google has released Gemma 3, a powerful model that is small enough to run on local computers.

blog.google/technology/develop

I've done some experiments on my laptop (with a GeForce 3080 Ti), and am very impressed. I tried to be happy with Llama 3, with the DeepSeek R1 distills of Llama, and with Mistral, but the models that would run on my computer were not in the same league as what you get remotely from ChatGPT, Claude, or DeepSeek.

Gemma changes this for me. So far I have let it write three smaller pieces of JavaScript and analyze a few texts, and it performed slowly but flawlessly. So finally I can move to a "use the local LLM for the 90% default case, and go for the big ones only if the local LLM fails" approach.

This way:
- I use far less CO2 for my LLM tasks
- I am in control of my data; nobody can collect my prompts and later sell my profile to ad customers
- I am sure the IP of my prompts stays with me
- I have the privacy to ask it whatever I want, and no server in the US or CN has that data.
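To make that "local LLM for the 90% default case, big hosted model only as fallback" idea concrete, here is a rough Python sketch. It assumes LM Studio (set up as described below) is serving the model on its default OpenAI-compatible endpoint at http://localhost:1234/v1; the model identifier and the remote fallback are placeholders, not anything prescribed by this post.

```python
# Rough sketch: try the local model first, fall back to a hosted LLM only on failure.
# Requires the "openai" package (pip install openai); LM Studio accepts any api_key.
from openai import OpenAI

local = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")
LOCAL_MODEL = "gemma-3-27b-it"  # placeholder: use the name LM Studio shows for your download


def ask_remote(prompt: str) -> str:
    # Placeholder for whichever hosted LLM you keep around for the hard 10 percent.
    raise NotImplementedError("plug in your hosted LLM of choice here")


def ask(prompt: str) -> str:
    """Ask the local model first; only fall back if it errors out or returns nothing."""
    try:
        reply = local.chat.completions.create(
            model=LOCAL_MODEL,
            messages=[{"role": "user", "content": prompt}],
            temperature=0.2,
        )
        text = reply.choices[0].message.content
        if text and text.strip():
            return text
    except Exception as err:
        print(f"Local model failed ({err}); falling back to the hosted one.")
    return ask_remote(prompt)


if __name__ == "__main__":
    print(ask("Write a small JavaScript debounce function."))
```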

Interested? If you have a powerful graphics card in your PC, it is totally simple:

1. Install LM Studio from LMStudio.ai
2. In LM Studio, click Discover and download the Gemma 3 27B Q4 model
3. Chat

If your graphics card is too small, you might head for the smaller 12b model, but I can't tell you how well it performs.
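Besides the built-in chat window, LM Studio can also serve the downloaded model over a local OpenAI-compatible HTTP server (started from its server/developer view, listening on http://localhost:1234 by default). A minimal sketch of calling it from a script; the model identifier is a placeholder for whatever name LM Studio lists for the Gemma 3 27B Q4 download:

```python
# Minimal sketch: one chat completion against LM Studio's local server.
import requests

resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "gemma-3-27b-it",  # placeholder identifier
        "messages": [
            {"role": "user", "content": "Summarize this text in two sentences: ..."}
        ],
        "temperature": 0.2,
        "max_tokens": 512,
    },
    timeout=600,  # a 27B model on a laptop GPU can take a while
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```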

Google · Introducing Gemma 3: The most capable model you can run on a single GPU or TPU (by Clement Farabet)

Did a few coding experiments with Gemma 3 running locally in LM Studio. So far it performs flawlessly in terms of capability; on my lowly GeForce 3080 Ti it is fairly slow, something like 5 tokens per second. But I've got time, and it is mine, running locally, and no billionaire's corporation sees my prompts.

For me (privacy nut) this is a big thing, not having to use ChatGPT for everything.

Whoohoo! #LMStudio lets me run the big Gemma 3 27b, 4-bit quantized locally on my slightly old gaming laptop with Geforce 3080 TI. It is slow, but my first tests show it is indeed fairly powerful. Not up to the reasoning models for reasoning tasks (duh!), but way above everything else I could run locally before (Llama 3).

A big step towards keeping your data private while still enjoying the services of a great LLM.

#gemma #google #ai

youtu.be/J4qwuCXyAcU

This video compares Ollama and LM Studio (GGUF), showing that their performance is quite similar, with LM Studio’s tok/sec output used for consistent benchmarking.

What’s even more impressive? The Mac Studio M3 Ultra pulls under 200W during inference with the Q4 671B R1 model. That’s quite amazing for such performance!


A new setup guide explains how to extract accurate text from PDFs while preserving reading order, handling tables, equations, and handwriting with open-source tools on standard Mac hardware.

🖥️ Download and run #LMStudio as a local inference server to host the #OlmOCR model
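As a rough illustration of that step, the sketch below sends one rendered PDF page to a model hosted on LM Studio's local OpenAI-compatible endpoint, assuming the model accepts images through the standard image_url content part. The model identifier, prompt, and file name are placeholders; the actual olmOCR toolchain has its own prompting and output conventions, so this only covers the "local inference server" half.

```python
# Rough sketch: push one page image to a locally hosted OCR-capable vision model.
import base64
import requests

# A single PDF page rendered to PNG beforehand (placeholder file name).
with open("page-001.png", "rb") as fh:
    page_b64 = base64.b64encode(fh.read()).decode("ascii")

resp = requests.post(
    "http://localhost:1234/v1/chat/completions",
    json={
        "model": "olmocr-7b",  # placeholder: the name LM Studio lists for the olmOCR download
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text",
                     "text": "Extract the text on this page in natural reading order."},
                    {"type": "image_url",
                     "image_url": {"url": f"data:image/png;base64,{page_b64}"}},
                ],
            }
        ],
        "temperature": 0.0,
    },
    timeout=600,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```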