
AI-Generated Animations Are Here (Almost…)

Last Updated on May 16, 2024 by Editorial Team

Author(s): Shahzeb Naveed

Originally published on Towards AI.

AI bot sliding on a sine curve. (Source: Author/Adobe Firefly)

Last week, as I scrolled through my Instagram feed, an animation popped up in the famous 3Blue1Brown style, explaining how Convolutional Neural Networks work. It suddenly occurred to me: what if we could automate this entire animation process, from concept to visualization down to a Manim-based Python script that generates the video automatically? A quick Google search told me that Grant Sanderson, the creator of 3Blue1Brown, had already open-sourced Manim, his Python engine for programmatic animations. At this point, everything felt sorted out.

Setting Up

I installed Manim on my Mac following the conda method. Instead of the full LaTeX distribution, I installed BasicTeX, since I couldn’t afford a 4 GB installation on my already jam-packed Mac. I also had to install some missing LaTeX packages, standalone and preview, as follows:

sudo tlmgr install standalone
sudo tlmgr install preview
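
For reference, the conda route itself is roughly the following. This is a sketch based on the conda-forge package (the environment name is my own choice), so check the official Manim Community docs for the current instructions:

# create and activate a dedicated environment, then install Manim from conda-forge
conda create -n manim-env
conda activate manim-env
conda install -c conda-forge manim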

Showtime!

Since ChatGPT has been my go-to AI for routine tasks, I went straight to it with an initial prompt (an awfully vague one, though):

“I need to create an animation explanation video using the Manim python library similar to 3Blue1Brown. First, generate a 2-minute transcript explaining the transformers architecture at a high level. Then for that transcript, create a python program using the manim package to create a scene or multiple scenes to create animation visualizing the transcript.”

It generated the following Manim-based Python script:

# Import necessary modules from Manim
from manim import *

class EmbeddingScene(Scene):
    def construct(self):
        # Create a matrix representing the embedding matrix
        embedding_matrix = Matrix([[1, 0.5, 0], [0.2, 0.8, 0.3], [0, 0.4, 1]])
        embedding_matrix.scale(0.8)
        self.play(Create(embedding_matrix))

        # Create arrows to represent the transformation process
        arrow_query = Arrow(embedding_matrix.get_center(), embedding_matrix.get_center() + RIGHT)
        arrow_key = Arrow(embedding_matrix.get_center(), embedding_matrix.get_center() + UP)
        arrow_value = Arrow(embedding_matrix.get_center(), embedding_matrix.get_center() + DOWN)

        self.play(Create(arrow_query), Create(arrow_key), Create(arrow_value))

        # Add labels for Query, Key, and Value
        label_query = Text("Query").next_to(arrow_query, RIGHT)
        label_key = Text("Key").next_to(arrow_key, UP)
        label_value = Text("Value").next_to(arrow_value, DOWN)

        self.play(Create(label_query), Create(label_key), Create(label_value))

        # Show the attention mechanism (optional)
        attention_arrow = Arrow(arrow_query.get_end(), arrow_key.get_start())
        self.play(Create(attention_arrow))

        # Show the final embedding
        final_embedding = Circle(color=BLUE).next_to(embedding_matrix, RIGHT)
        self.play(TransformFromCopy(embedding_matrix, final_embedding))

        # Fade out everything
        self.play(FadeOut(embedding_matrix), FadeOut(arrow_query), FadeOut(arrow_key),
                  FadeOut(arrow_value), FadeOut(label_query), FadeOut(label_key),
                  FadeOut(label_value), FadeOut(attention_arrow))

        self.wait(1)  # Pause for a moment

Then, I issued the following command:

manim -p transformers.py

which rendered the video and opened the file after completion:

AI-Assisted Manim animation trying to explain Transformers. (Source: Author)
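
As a side note, the ManimCE CLI can also take the scene class and a quality flag explicitly; for instance, the following should render a fast, low-quality preview of the scene above:

manim -pql transformers.py EmbeddingScene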

What a bummer! As you can see, the result was as awful as my prompt: it lacked any logical visualization, with random arrows pointing in random directions, incorrect concepts, an empty circle representing nothing, and no sense of storytelling whatsoever. This served as the first reality check and pushed me to do better at prompt engineering.

After repeated failed experiments, I arrived at the following prompt, with more explicit concepts, a simpler problem, and clearer instructions. Asking the AI to first come up with a “plan of action” yielded significant improvements.

“Need to create an animation for an explanation video to explain how curve fitting works in ML.

# Visualization Concept

1. A set of dots roughly scattered like a sine curve on a graph (not a pure sine curve but with some noise)

2. Then another solid curve (depicting the curve defined by an ML model) appears. It starts as a straight line, but as the training progresses, it fits into the set of dots and ultimately transforms into a sine curve.

3. Then create similar scenes for 2 additional curves

# Instructions

1. Be creative to add animations as needed and as possible in the Manim python library.

2. First, create a proper plan of action to think about what exactly you will be visualizing (graphic elements, formulas, etc.) to explain which concepts.

3. Then translate that action plan to a python script.

4. Do not add any full sentences on the screen. You can however use labels as text if needed.

5. Generate code for all scenes.

6. Code should be complete with all modules imported, all variables defined. It should run as is.”

AI explains the concept of curve fitting in ML (Source: Author)

This seemed pretty neat to me, although in the second half of the video the text overlaps with the graph, the arrow points to nothing, and so does the circle.
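
For anyone who wants to experiment without replaying the whole prompt, here is a minimal hand-written sketch of that dots-plus-morphing-curve idea. It is my own reconstruction against the current ManimCE API, not the LLM’s exact output, and the scene name and noise constants are illustrative:

from manim import *
import numpy as np

class CurveFittingScene(Scene):
    def construct(self):
        # Axes for the plot
        axes = Axes(x_range=[0, 2 * PI, PI / 2], y_range=[-1.5, 1.5, 0.5])
        self.play(Create(axes))

        # Dots scattered like a sine curve, with some noise
        rng = np.random.default_rng(seed=42)
        xs = np.linspace(0.3, 2 * PI - 0.3, 20)
        dots = VGroup(*[
            Dot(axes.c2p(x, np.sin(x) + rng.normal(0, 0.15)), radius=0.05)
            for x in xs
        ])
        self.play(FadeIn(dots))

        # Model curve: starts as a straight line, then "trains" into the sine fit
        model_curve = axes.plot(lambda x: 0.0, color=YELLOW)
        fitted_curve = axes.plot(np.sin, color=YELLOW)
        self.play(Create(model_curve))
        self.play(Transform(model_curve, fitted_curve), run_time=3)
        self.wait(1)

The Transform from the flat line to the sine curve is what produces the “training in progress” effect the prompt describes.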

In another task, I asked the LLM to animate a hypothetical feature matrix alongside a column matrix representing labels, with a rectangle highlighting each of the columns one by one. The results were pretty neat, but only after hours of prompt tuning, manual debugging, and being as explicit as possible.

AI visualizes a feature matrix. (Source: Author)
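
For illustration, the column-highlighting part of that animation boils down to sweeping a SurroundingRectangle across the matrix columns. This is a minimal sketch with made-up matrix values, not the script the LLM actually produced:

from manim import *

class FeatureMatrixScene(Scene):
    def construct(self):
        # Hypothetical feature matrix and its label column
        features = Matrix([[1, 0, 2], [0, 1, 3], [1, 1, 0]])
        labels = Matrix([[0], [1], [1]]).next_to(features, RIGHT, buff=1)
        self.play(Create(features), Create(labels))

        # Highlight each feature column, one by one
        for column in features.get_columns():
            box = SurroundingRectangle(column, color=YELLOW)
            self.play(Create(box))
            self.wait(0.5)
            self.play(FadeOut(box))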

Issues:

  1. Deprecated Code: Manim appears to be a rapidly evolving library, with many methods and attributes deprecated or renamed over time. Also, the public codebase using the Manim package seems limited. These factors led the LLMs to generate code that didn’t run on the first go, which made the debugging process extremely cumbersome (see the snippet after this list for a typical example).
  2. ChatGPT 3.5: Since I initially used ChatGPT 3.5, I realized its knowledge wasn’t up to date and lacked the latest changes in the Manim package. So I decided to give its competitors a try.
  3. Google Gemini: I’ve never been a fan of Google’s Gemini, and this experiment only strengthened that opinion. On several occasions, it generated “templates” and asked me to fill in the details, despite being explicitly asked for complete, ready-to-run code. It also tacked unnecessary explanations onto the code after generating it. I eventually dumped Gemini, though I did use it to refine my prompt and my understanding of what a good “plan of action” might look like.
  4. Anthropic’s Claude: I then tried Anthropic’s Claude for the first time. Surprisingly, it gave a decent result on the first go. Still, on many occasions it suggested code with deprecated Manim functionality, even though it was trained more recently.
  5. GPT-4: I also tried GPT-4 Turbo via the OpenAI API (and via Microsoft Copilot). The responses were only slightly more reliable and still suffered from deprecated code, which kept me from upgrading to ChatGPT Plus.
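
As a concrete example of the deprecation problem: ManimCE renamed ShowCreation to Create a while back, yet LLMs regularly emit the old name. A minimal demonstration:

from manim import *

class DeprecationDemo(Scene):
    def construct(self):
        circle = Circle()
        # Older Manim code, as LLMs often suggest — fails on current ManimCE:
        # self.play(ShowCreation(circle))
        # Current ManimCE equivalent:
        self.play(Create(circle))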

Conclusion:

Even though this 3-day experiment was extremely cumbersome and frustrating, I don’t want to leave the impression that AI-assisted animation is a waste of time. In my opinion, you can get something out of LLMs, but you should keep these things in mind:

  1. You must master the Manim framework itself (or at least its basics) to instruct the LLM appropriately and to debug the code it produces (which, as noted above, is often deprecated).
  2. You can’t rely solely on a prompt engineering technique like Chain of Thought and expect the LLM to conceptualize an entire animation by itself. You’ve got to bring some creativity of your own and let the AI interpolate through parts of it.
  3. More Manim code needs to become publicly available and make its way into LLM training datasets, so that models stop suggesting deprecated code.

If you do give this idea a shot, please do share how it went in the comments below.

AI fails to animate a smiley (Source: Author)

Thanks for reading!

GitHub Repository: https://github.com/shah-zeb-naveed/manimations


Published via Towards AI
