[Video] Testing Google Gemini Audio Capabilities

Gemini with audio Video Code Mix.install([ {:req, "~> 0.4.14"}, {:kino, "~> 0.12.0"} ]) Form form = Kino.Control.form( [ prompt: Kino.Input.textarea("Prompt"), audio: Kino.Input.audio("Audio", format: :wav) ], submit: "Submit" ) frame = Kino.Frame.new(…

TIL: Sum Types With `instructor_ex`

The Instructor Elixir library lets you retrieve structured output from LLMs like the OpenAI GPT models. But I found that having it return structs that are sum types is not that straightforward. Simple Instructor responses For this post, let’s look at survey questions as an example. You can define…

TIL: Running `cargo-flamegraph` With Tauri Apps

A Tauri app is just a Rust-compiled binary. But because the Tauri workflow also involves some frontend development, you normally develop your app using something like: `pnpm tauri dev` So when I had to profile my app, it wasn’t immediately obvious how to invoke flamegraph, but it’s really…

TIL: Base64 Encoding an ImageBuffer as PNG (in Rust)

Recently, I had to base64-encode an `image::ImageBuffer` to send it as a data URL like `data:image/png;base64,iVBORw0KGgoAAAANSUhEU...`. Initially, I tried this: // Cargo.toml image = "0.24.7" base64 = "0.21.5" use base64::{engine::general_purpose, Engine as _}; use image::ImageBuffer; // WRONG:…

TIL: File Uploads Using the Req Elixir Library

The Req Elixir library doesn’t support file uploads (as of version 0.4.5). Instead, you need to use multipart to construct the HTTP request before you send it. Here is an example sending an mp3 file to the OpenAI audio transcription API (aka Whisper): def transcribe_audio(file_…

Project: Try DALLE-3

Recently, I wanted to use OpenAI’s DALLE-3 to generate some images. Surprisingly, I didn’t find an easy way to do this (I have the OpenAI API set up but no ChatGPT Plus). At least as of right now, OpenAI’s playground doesn’t let you play with DALLE-3. So I…

TIL: Creating `sentence-transformers` Embeddings From Bumblebee

One amazing thing about the Elixir machine learning ecosystem is that you can run state-of-the-art ML models right from Elixir. You can go pick your favorite Hugging Face model and get it running without any ceremony. I was recently creating text embeddings using Bumblebee, and realized that some of the…