AI on Cloudron
-
Amazon (AWS) recently made a big announcement: its Bedrock service has finally been released.
If you missed the news:
Amazon has decided to move into generative AI in a big way and agreed to invest up to $4 billion in Anthropic, so it can join the race and compete with popular AI services like ChatGPT, DALL-E, Midjourney, etc.
Amazon's first offering is Bedrock, a fully managed AI service that offers multiple foundation models such as Claude, Jurassic, Titan, and Stable Diffusion XL, among others.
I've started testing Bedrock myself. What's interesting is being able to test and use different FMs from different vendors from the same web UI inside your AWS account, but also through a single API!
That means AWS is creating a single API that can be used to access all the currently available FMs through its Bedrock service, plus any new ones it eventually adds. Interesting approach.
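To make the "single API" point concrete, here is a minimal sketch using boto3's `bedrock-runtime` client and its `invoke_model` call. The model IDs and the per-vendor request body shapes are assumptions based on how Bedrock looked around its launch, so check the current Bedrock documentation before relying on them. The helper names `build_request` and `invoke` are made up for this example.

```python
import json


def build_request(model_id: str, prompt: str, max_tokens: int = 300) -> dict:
    """Build keyword arguments for bedrock-runtime's invoke_model.

    The endpoint is the same for every model, but each vendor expects its own
    JSON body. The shapes below are assumptions, not an authoritative list.
    """
    if model_id.startswith("anthropic."):
        body = {
            "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
            "max_tokens_to_sample": max_tokens,
        }
    elif model_id.startswith("ai21."):
        body = {"prompt": prompt, "maxTokens": max_tokens}
    elif model_id.startswith("amazon.titan"):
        body = {
            "inputText": prompt,
            "textGenerationConfig": {"maxTokenCount": max_tokens},
        }
    else:
        raise ValueError(f"no body template for {model_id}")
    return {
        "modelId": model_id,
        "body": json.dumps(body),
        "contentType": "application/json",
        "accept": "application/json",
    }


def invoke(model_id: str, prompt: str, region: str = "us-east-1") -> dict:
    """Send the prompt to the chosen model (needs boto3 and AWS credentials)."""
    import boto3  # imported here so build_request works without boto3 installed

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(**build_request(model_id, prompt))
    return json.loads(response["body"].read())
```

Switching vendors then only means changing the model ID, e.g. `invoke("anthropic.claude-v2", "Hello")` versus `invoke("ai21.j2-mid-v1", "Hello")` (both IDs are assumptions), provided the model has been enabled in your AWS account.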
Someone has already made a video here, if you want to take a peek.
-
$1.99/hour for Nvidia H100 80GB GPU
Enjoy the amazing graphics in this paper!
https://invidious.lunar.icu/watch?v=ffarLQDQmC4
-
Nothing really to do with AI on Cloudron, but just tapping into the knowledge of the people watching this space who are contributing to this thread:
I had an idea I like which AI could probably help with/ be a good fit for:
- give an AI access to an ebook library
- ask an AI to write summaries of those books
- ask an AI to turn those summaries into scripts for 2 minute animated video summaries
- ask an AI to create those animations based on those scripts.
- ask an AI to publish those videos to a PeerTube/ TikTok/ YouTube/ Twitter/ Facebook Shorts/ Instagram channel etc etc
I'm presuming that this might already be perfectly possible with existing tools, but I've no idea how one might go about it. Can anyone here enlighten me / give me a few pointers?
I had this idea because one of my favourite books is "The Resilience Imperative: Cooperative Transitions to a Steady-state Economy" by Michael Lewis and Pat Conaty, and I've often thought that a short animated video version would be really great.
And as I'm about to start some work with Michael Lewis (helping to promote the next iteration of Synergia Institute's MOOC "Toward Co-operative Commonwealth: Transition in a Perilous Century") I reminded myself of that idea and then thought: AI could probably do it for me! And then figured if it could do it for one book why not All The Books!
-
@jdaviescoates It is a lovely idea, and sounds doable in principle. In practice, somebody talented like e.g. robi would need to do it at the moment, because the tech required is not quite there, yet. A lot of fixing would need to be accomplished.
-
eBook library - you can train an AI on a local repository of texts using e.g. GPT4All. The AIs I have managed to get working with this aren't impressive, but that is more me than the AI! It would help if you had practiced it a few times, and also had a high-spec computer to do it properly after pilot studies.
-
AI Summaries - Yes, that is possible, but the summaries are too short. The number of tokens you can generate is about 2,000, which isn't long for a summary. You might have to summarize chapter by chapter and then stitch the results together later. The chunking could also be useful for video.
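The chapter-by-chapter approach could be sketched roughly like this. It is only an illustration: the token budget is approximated from word counts (the 1.3 tokens-per-word ratio is a guess, not a property of any particular model), and `summarize` is a hypothetical stand-in for whatever local or hosted model you end up calling.

```python
def chunk_text(text: str, max_tokens: int = 2000, tokens_per_word: float = 1.3) -> list[str]:
    """Split text on paragraph boundaries into chunks under a rough token budget.

    A single paragraph longer than the budget is kept whole as its own chunk.
    """
    max_words = int(max_tokens / tokens_per_word)
    chunks, current, count = [], [], 0
    for para in text.split("\n\n"):
        if not para.strip():
            continue  # skip empty paragraphs
        words = len(para.split())
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks


def summarize_book(chapters, summarize):
    """Summarize each chapter chunk by chunk; returns one stitched summary per chapter."""
    return [
        " ".join(summarize(chunk) for chunk in chunk_text(chapter))
        for chapter in chapters
    ]
```

Keeping one summary per chapter also gives you natural segments to turn into short video scripts later.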
-
AI Scripts - The amount of output you can generate on a local llama is not gigantic and it is a bit hit or miss what sort of script format you would get. Trying to keep consistency between chapters might be troublesome. Proprietary AI would make it a bit easier, but then you might have refusals / alignment problems.
-
AI Animations - The ones I have seen are about 20 seconds or so. They are becoming better. Trying to make your first 20 second animation have a resemblance to a subsequent animation would be tricky.
-
Automating the flow - robi would be good at that, but then he has great expertise.
I think if you wait for a while, a lot of the friction will be reduced and soon "everybody will be doing it".
-
-
@LoudLemur Wow, thanks for the gracious mention(s)
While I used to be in that industry and can see how it might be done, it's a deep, multi-stage pipeline that would need lots of creative solutions.
So my recommendation would be to reach out and partner with a startup/dev/community in the space who is interested in making a splash with such a solution and get more traction behind it.
Years ago I was pitched to do a startup with similar goals but more around TV series and Movies.
In any case, great idea @jdaviescoates
-
@robi said in AI on Cloudron:
a startup with similar goals but more around TV series and Movies
Even for me, it would be interesting to imagine what you would now be doing if you had ended up following that route. For you, you must wonder sometimes. I am just glad you are here.
FaceChain
Rapid training and face generation based on a few of your uploaded images:
https://huggingface.co/spaces/modelscope/FaceChain
https://github.com/modelscope/facechain
(I haven't had success using this on Huggingface. Maybe it is overloaded...)
-
-
@jdaviescoates said in AI on Cloudron:
ask an AI to create those animations based on those scripts.
non-Free Cascadeur can help make models, but automation is not yet here:
https://invidious.projectsegfau.lt/watch?v=JrddPZmUHvE&si=FW0u3Fdpu6U7ZrTC&quality=dash
-
@LoudLemur said in AI on Cloudron:
@micmc said in AI on Cloudron:
what's interesting is being able to test and use different FMs
FMs?
Foundation models
-
Continue.dev
https://continue.dev/
The open-source autopilot for software development
Bring the power of ChatGPT to your IDE
Answer coding questions
Highlight sections of code and ask Continue for another perspective:
- "what does this forRoot() static function do in nestjs?"
- "why is the first left join in this query necessary here?"
- "how do I run a performance benchmark on this rust binary?"
Edit in natural language
Highlight a section of code and instruct Continue to refactor it:
- "/edit rewrite this to return a flattened list from a 3x3 matrix"
- "/edit refactor these into an angular flex layout on one line"
- "/edit define a type here for a list of lists of dictionaries"
Generate files from scratch
Open a blank file and let Continue start new Python scripts, React components, etc.:
- "/edit get me started with a basic supabase edge function"
- "/edit implement a c++ shortest path algo in a concise way"
- "/edit create a docker compose file with php and mysql server"
-
@LoudLemur said in AI on Cloudron:
@jdaviescoates said in AI on Cloudron:
ask an AI to create those animations based on those scripts.
non-Free Cascadeur can help make models, but automation is not yet here:
https://invidious.projectsegfau.lt/watch?v=JrddPZmUHvE&si=FW0u3Fdpu6U7ZrTC&quality=dash
Thanks but I'm thinking much more simple animation. Like those classic Whiteboard style ones (made especially famous by RSA shorts), or even just the Xtranormal type ones.
I found https://wave.video which was pretty impressive TBH. I just gave it a link to a webpage and it did a pretty good job of making a nice video out of it (but then it wanted $25 for me to download the video it auto-generated, which is why I'm not sharing it with you here).
-
@jdaviescoates said in AI on Cloudron:
Thanks but I'm thinking much more simple animation. Like those classic Whiteboard style ones (made especially famous by RSA shorts),
Those were great videos. The whiteboard sketches really helped hold the attention and were a pleasure to watch and often funny. I think I have a clear idea of the sort of thing you are looking for now. It is the sort of thing that would lend itself to text-to-image very well indeed, I would say. I hope you are eventually able to do it!
-
Leo - A more private LLM from Brave - https://brave.com/leo/
-
@robi said in AI on Cloudron:
Leo - A more private LLM from Brave
Excellent! Nice one, @robi!
This seems to be a similar tactic to the one Micro$oft used to get people to (finally) use Bing: they made GPT-4 available through Bing. If you are using Brave but have set up some other search engine as the default, you will need to undo that to try out Leo.
One thing you could try is setting normal browsing to one search engine, e.g. Brave for Leo, and setting incognito browsing to a different search engine so you can handily switch to that when needed.
If you go into Brave settings, you can see the default is Llama 2 13B. If you unlock more (by paying for Leo Pro) you can use Llama 2 70B or Claude Instant. (Claude is great by the way, but proprietary.)
Is there a shortcut key to help you select a Leo AI response rather than having to switch from keyboard to mouse-clicking on Leo?
-
48GB VRAM AI Workstation for about US$1092
https://github.com/magiccodingman/Magic-AI-Wiki/blob/main/Wiki/Budget-AI-Workstation-Build.md
| Item | Price | Qty | Total |
|---|---|---|---|
| Nvidia Tesla P40 GPU | $175 | 2 | $350 |
| P40 power adapters | $15 | 2 | $30 |
| Dell PowerEdge R730 (64GB RAM, 2x E5-2667v4 3.2GHz = 16 Cores, 8 bay) | $364 | 1 | $364 |
| 1100W Dell PSU | $21 | 2 | $42 |
| Any Cheap SSD's | $27 | 2 | $54 |
| R730 Riser 3 GPU Addition | $15 | 1 | $15 |
| Drive Caddies for SSD's | $30 | 1 | $30 |
| NVME SSD 4TB 7.3k MB/s | $189 | 1 | $189 |
| NVME PCIE Addition Card | $18 | 1 | $18 |
| Total Cost | | | $1,092 |
-
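As a quick sanity check on the budget workstation parts list above, the arithmetic can be reproduced in a few lines. The prices and quantities are copied from the list; the 24 GB per Tesla P40 figure is the card's published memory size and is the only number not taken from the post.

```python
# (item, unit price in USD, quantity), copied from the parts list above
parts = [
    ("Nvidia Tesla P40 GPU", 175, 2),
    ("P40 power adapters", 15, 2),
    ("Dell PowerEdge R730", 364, 1),
    ("1100W Dell PSU", 21, 2),
    ("Cheap SSDs", 27, 2),
    ("R730 Riser 3 GPU Addition", 15, 1),
    ("Drive caddies for SSDs", 30, 1),
    ("NVMe SSD 4TB", 189, 1),
    ("NVMe PCIe addition card", 18, 1),
]

total_usd = sum(price * qty for _, price, qty in parts)
vram_gb = 24 * 2  # two P40 cards at 24 GB each

print(f"Total: ${total_usd}, VRAM: {vram_gb} GB")  # Total: $1092, VRAM: 48 GB
```
-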
Locally Hosted Language Model with AI image support (i.e. multi-modal)
Demo: http://imagebind-llm.opengvlab.com/
Self-Host: https://github.com/Alpha-VLLM/LLaMA2-Accessory/tree/main/SPHINX#host-local-demo
Would any brave soul from here like to try this? Multi-modal means that the AI can look as well as read.
-
plugin.surf
Thousands of free as in beer ChatGPTs
https://plugin.surf/