AI on Cloudron
-
From Reddit:
Etched is working on a specialized AI chip called Sohu. Unlike general-purpose GPUs, Sohu is designed to run only transformer models, the architecture behind LLMs like ChatGPT. The company claims Sohu offers dramatically better performance than traditional GPUs while using less energy. This approach could improve AI infrastructure as the industry grapples with increasing power consumption and costs.
Key details:
* Etched raised $120 million in Series A funding to work on Sohu
* Sohu will be manufactured using TSMC's 4nm process
* The chip can deliver 500,000 tokens per second for Llama 70B
* One Sohu server allegedly replaces 160 H100 GPUs
* Etched claims Sohu is going to be 10x faster and cheaper than Nvidia's next-gen Blackwell GPUs
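For a rough sense of what those claims imply, here is a back-of-the-envelope sketch in Python. It assumes (this is not stated in the post) that the 500,000 tokens per second figure and the "replaces 160 H100 GPUs" figure describe the same Sohu system and the same Llama 70B workload. The variable names are just for illustration, and the inputs are vendor claims, not measurements.

```python
# Figures quoted in the post above (vendor claims, not measurements).
sohu_tokens_per_second = 500_000  # claimed Llama 70B throughput
h100_gpus_replaced = 160          # claimed number of H100 GPUs replaced

# Assumption: both claims refer to the same Sohu system and workload,
# so dividing one by the other gives the per-H100 throughput the claim implies.
implied_h100_tokens_per_second = sohu_tokens_per_second / h100_gpus_replaced

print(f"Implied throughput per H100: {implied_h100_tokens_per_second:,.0f} tokens/s")
```

If the two figures actually refer to different configurations, the implied per-GPU number changes accordingly, so treat it only as a sanity check on the marketing claims.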
-
Good model for coding:
https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Instruct
-
Groq is the AI infrastructure company that delivers fast AI inference.
The LPU Inference Engine by Groq is a hardware and software platform that delivers exceptional compute speed, quality, and energy efficiency.
Groq, headquartered in Silicon Valley, provides cloud and on-prem solutions at scale for AI applications. The LPU and related systems are designed, fabricated, and assembled in North America.
Since I started using this with Llama 3 70B, I don't have a need for GPT-3.5 anymore. GPT-4 is too expensive IMHO.
-
@Kubernetes Thanks. How do you actually sign up for Groq? Their Stytch servers don't seem to be working, and they seem to require a GitHub account for registration.
-
@LoudLemur I did sign up with my GitHub account...
-
Llama 3.1 405B has been released. Try it here: https://www.meta.ai
-
Step Games - a prisoner's dilemma for Large Language Models. The emergent text section is quite interesting:
https://github.com/lechmazur/step_game