Can You Save on LLM Tokens Using Images Instead of Text?
Posted 2 months ago · Active about 2 months ago
pagewatch.ai · Tech · story
Sentiment: calm / mixed
Debate: 40/100
Key topics
Large Language Models
Token Optimization
Image Processing
The article explores using images instead of text to save on LLM tokens, sparking discussion on the trade-offs between token count, accuracy, and processing complexity.
Snapshot generated from the HN discussion
Discussion Activity
Moderate engagement
First comment: 6d
Peak period: 8 comments (144-156h)
Avg / period: 4.8
Comment distribution: 19 data points (based on 19 loaded comments)
Key moments
01. Story posted: Nov 1, 2025 at 6:34 PM EDT (2 months ago)
02. First comment: Nov 7, 2025 at 11:33 PM EST (6d after posting)
03. Peak activity: 8 comments in the 144-156h window, the hottest stretch of the conversation
04. Latest activity: Nov 10, 2025 at 2:51 AM EST (about 2 months ago)
ID: 45786042 · Type: story · Last synced: 11/20/2025, 12:29:33 PM
Want the full context?
Jump to the original sources
Read the primary article or dive into the live Hacker News thread when you're ready.
https://en.wikipedia.org/wiki/A_picture_is_worth_a_thousand_...
https://arxiv.org/abs/2010.11929
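For intuition on the trade-off the article weighs, here is a rough sketch of the token arithmetic, assuming tiktoken's cl100k_base encoding for the text side and the commonly cited 85-base-plus-170-per-512px-tile heuristic for the image side; both are assumptions tied to one provider's published rule of thumb, not a universal constant.

```python
import math

import tiktoken

def text_tokens(text: str) -> int:
    # Token count under one common BPE vocabulary; other models differ.
    enc = tiktoken.get_encoding("cl100k_base")
    return len(enc.encode(text))

def image_tokens(width: int, height: int) -> int:
    # Simplified version of one provider's heuristic: 85 base tokens plus
    # 170 per 512x512 tile. Real APIs rescale the image before tiling, so
    # treat this as a rough sketch, not a billing formula.
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 85 + 170 * tiles

page = "word " * 700                 # stand-in for a dense page of prose
print(text_tokens(page))             # ~700 tokens as text
print(image_tokens(1024, 1024))      # 85 + 170 * 4 = 765 tokens as an image
```

On this naive arithmetic a plain page of prose is roughly a wash, which is why the thread's trade-offs around resolution and accuracy carry the argument.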
"""
"""It's the same as LLMs being able to "decode" Base64, or work with sub-word tokens for that matter, it just learns to predict that:
<compressed representation> will be followed by (or preceded by) <decompressed representation>, or vice versa.
The how is variable. The calm paper seems to have used a MLP to compress from and ND input (N embeddings of size D) into a single D embedding and other for decompress them back
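A minimal sketch of that compress/decompress pair, assuming a PyTorch-style pair of MLPs trained as an autoencoder; the layer sizes, module names, and loss here are illustrative guesses, not the actual CALM architecture.

```python
import torch
import torch.nn as nn

class PatchCompressor(nn.Module):
    """Compress N embeddings of size D into one D-dim vector, and back.
    Hyperparameters are illustrative, not taken from the CALM paper."""
    def __init__(self, n_tokens: int = 4, d_model: int = 768, d_hidden: int = 2048):
        super().__init__()
        # Compressor: flatten the N x D block and project it down to D.
        self.compress = nn.Sequential(
            nn.Linear(n_tokens * d_model, d_hidden),
            nn.GELU(),
            nn.Linear(d_hidden, d_model),
        )
        # Decompressor: project the single D vector back up to N x D.
        self.decompress = nn.Sequential(
            nn.Linear(d_model, d_hidden),
            nn.GELU(),
            nn.Linear(d_hidden, n_tokens * d_model),
        )
        self.n_tokens, self.d_model = n_tokens, d_model

    def forward(self, x: torch.Tensor) -> tuple[torch.Tensor, torch.Tensor]:
        # x: (batch, N, D) -> z: (batch, D) -> x_hat: (batch, N, D)
        batch = x.shape[0]
        z = self.compress(x.reshape(batch, -1))
        x_hat = self.decompress(z).reshape(batch, self.n_tokens, self.d_model)
        return z, x_hat

# A reconstruction loss is what ties the compressed and decompressed views together.
model = PatchCompressor()
x = torch.randn(2, 4, 768)
z, x_hat = model(x)
loss = nn.functional.mse_loss(x_hat, x)
```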
Hit this issue while optimizing LLM request times. Ended up lowering the image resolution; lost some accuracy, but that was bearable.
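A rough sketch of that mitigation using Pillow; the 768px cap, JPEG quality, and base64 payload shape are assumptions about a typical vision-API request, not any specific provider's requirements.

```python
import base64
import io

from PIL import Image

def downscale_for_llm(path: str, max_side: int = 768) -> str:
    """Shrink an image so its longest side is at most max_side pixels,
    then return a base64 JPEG payload. Fewer pixels generally means
    fewer image tokens, at some cost in OCR/detail accuracy."""
    img = Image.open(path).convert("RGB")
    scale = max_side / max(img.size)
    if scale < 1:  # only shrink, never upscale
        new_size = (round(img.width * scale), round(img.height * scale))
        img = img.resize(new_size, Image.LANCZOS)
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=85)
    return base64.b64encode(buf.getvalue()).decode("ascii")
```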