VASA-1 – Microsoft Research

VASA is a framework for generating lifelike talking faces of virtual characters with appealing visual affective skills (VAS), given a single static image and a speech audio clip.

Michael Tsai – Blog – The Alternative Implementation Problem

What I’ve concluded, based on experience, is that positioning your project as an alternative implementation of something is a losing proposition. It doesn’t matter how smart you are. It doesn’t matter how hard you work. The problem is, when you build an alternative implementation, you’ve made yourself subject to the whims of the canonical implementation. […]

Woodworking as an escape from the absurdity of software

You know the drill, sometimes the world of software development feels so absurd that you just want to buy a hundred alpaca and sell some wool socks and forget about solving conflicts in package.json for the rest of your life.

Andrej Karpathy on X: “Congrats to @AIatMeta on Llama 3 release!! 🎉 Notes: Releasing 8B and 70B (both base and finetuned) models, strong-performing in their model class (but we’ll see when the rankings come in @ @lmsysorg :)) 400B

Scaling laws. Very notably, 15T is a very very large dataset to train with for a model as “small” as 8B parameters, and this is not normally done and is new and very welcome. The Chinchilla “compute optimal” point for an 8B model would be to train it for ~200B tokens. (if you were only interested […]
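To put the quoted figures side by side, here is a quick back-of-the-envelope sketch. It uses only the three numbers from the excerpt (8B parameters, 15T training tokens, ~200B "compute optimal" tokens); the function and constant names are made up for illustration.

```python
# Rough arithmetic on the numbers quoted in the excerpt above.
PARAMS = 8e9               # 8B-parameter model (from the excerpt)
TRAIN_TOKENS = 15e12       # 15T training tokens (from the excerpt)
CHINCHILLA_TOKENS = 200e9  # ~200B "compute optimal" tokens (from the excerpt)


def tokens_per_param(tokens: float, params: float) -> float:
    """Training tokens seen per model parameter."""
    return tokens / params


def overtrain_factor(actual_tokens: float, optimal_tokens: float) -> float:
    """How far past the quoted compute-optimal point the training run goes."""
    return actual_tokens / optimal_tokens


if __name__ == "__main__":
    print(tokens_per_param(CHINCHILLA_TOKENS, PARAMS))        # 25.0
    print(tokens_per_param(TRAIN_TOKENS, PARAMS))             # 1875.0
    print(overtrain_factor(TRAIN_TOKENS, CHINCHILLA_TOKENS))  # 75.0
```

On these figures the 8B model sees roughly 75 times more tokens than the quoted compute-optimal point.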

SLAM algorithms for 360 views : r/computervision

Because 360 cameras are not really 360 but dual cameras with both images stitched together more or less correctly, you are really talking about an extension of those methods to dual fish-eye cameras.

Gemini 1.5 and Google’s Nature – Stratechery by Ben Thompson

…the reliance on scale and an overwhelming infrastructure advantage. That, more than anything, is what defines Google, and it was encouraging to see that so explicitly put forward as an advantage.

How We Built a Custom Permissions DSL at Figma | Figma Blog

When you click on the “Share” button in a Figma file, the share modal pops up, which controls who can access the file and their corresponding level of authorship. There are two main ways to access a file: through a role and through a link. Roles are hierarchical, so you might have access to the […]
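The idea of hierarchical roles plus link-based access can be sketched roughly as follows. This is not Figma's DSL or implementation: the level names, the `effective_access` and `can` helpers, and the rule that the higher of the two grants wins are all assumptions made for illustration.

```python
from enum import IntEnum
from typing import Optional


class Access(IntEnum):
    """Hypothetical access levels; integer order encodes the hierarchy."""
    NONE = 0
    VIEWER = 1
    EDITOR = 2
    OWNER = 3


def effective_access(role: Optional[Access], link: Optional[Access]) -> Access:
    """Combine the two access paths (role and link).

    Assumption: the user gets the higher of whatever each path grants.
    """
    granted = [level for level in (role, link) if level is not None]
    return max(granted, default=Access.NONE)


def can(required: Access,
        role: Optional[Access] = None,
        link: Optional[Access] = None) -> bool:
    """Hierarchical check: a higher level implies every lower one."""
    return effective_access(role, link) >= required


if __name__ == "__main__":
    print(can(Access.VIEWER, role=Access.EDITOR))  # True: editors can also view
    print(can(Access.EDITOR, link=Access.VIEWER))  # False: a view link cannot edit
```

Encoding the levels as ordered integers keeps the hierarchy check to a single comparison, at the cost of assuming the roles form a strict linear order.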

CyberChef is “the Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis”. It is entirely client-side JavaScript, with dozens of useful tools for working with different formats and encodings.