Doing My Own Syntax Highlighting (finally)
alexwlchan.net
Key topics
Syntax Highlighting
Programming
Text Editors
The author shares their experience of implementing their own syntax highlighting, sparking discussion about the challenges and trade-offs involved in such a task.
Snapshot generated from the HN discussion
Discussion activity
- Story posted: Oct 22, 2025 at 9:15 AM EDT
- First comment: Oct 28, 2025 at 7:09 PM EDT (6d after posting)
- Peak activity: 6 comments in the 144-156h window
- Latest activity: Oct 29, 2025 at 5:27 PM EDT
- Average per period: 3.5 comments
ID: 45668641 | Type: story | Last synced: 11/20/2025, 2:27:16 PM
To quibble a bit: Color choice is mostly a matter of taste, but highlighting itself is a matter of workflows. Highlighting syntax in particular just happens to be a default most people find acceptable.
In other circumstances, the user may benefit from coloring by variable data-type, or coloring by distinct variable name, or coloring by scope, etc. Often IDEs will keep the font color, and use other channels like a highlight-box around some text, or gutter-icons.
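As a rough illustration of "coloring by distinct variable name", here is a minimal Python sketch that hashes each identifier to a stable hue. The function name `color_for_identifier` and the saturation and lightness defaults are assumptions made for this example, not something taken from the article or from any particular editor.

```python
import colorsys
import hashlib


def color_for_identifier(name: str, saturation: float = 0.55, lightness: float = 0.45) -> str:
    """Map an identifier to a hex color; the same name always gets the same color."""
    # Hash the name so the hue is deterministic across runs and files.
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    # Use the first two bytes of the digest as a hue in [0, 1).
    hue = int.from_bytes(digest[:2], "big") / 65536
    # Note the argument order: colorsys uses hue, lightness, saturation.
    r, g, b = colorsys.hls_to_rgb(hue, lightness, saturation)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))


if __name__ == "__main__":
    for identifier in ("foo", "bar", "foo"):
        print(identifier, color_for_identifier(identifier))
```

Fixing saturation and lightness and varying only the hue keeps every identifier readable against the same background, at the cost that two different names can occasionally land on similar hues.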
- make all literals the same color
- make uses of defined items colored as a lighter color of the definition
- still color conditionals and loops
- use JS so that hovering over an item highlights all other uses
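The first three rules in the list above could be sketched roughly as follows in Python. The token kind names, the hex colors, and the helpers `lighten` and `color_for` are all hypothetical choices for illustration; the hover behavior is left out because it needs a browser.

```python
from typing import Optional

# Hypothetical palette: one color shared by all literals, one for control flow.
LITERAL_COLOR = "#8a5a00"
CONTROL_FLOW_COLOR = "#7a1fa2"
DEFINITION_COLOR = "#1a4fb4"


def lighten(hex_color: str, amount: float = 0.5) -> str:
    """Blend a #rrggbb color toward white; amount=0 keeps it, amount=1 gives white."""
    channels = [int(hex_color[i:i + 2], 16) for i in (1, 3, 5)]
    mixed = [round(c + (255 - c) * amount) for c in channels]
    return "#{:02x}{:02x}{:02x}".format(*mixed)


def color_for(kind: str, definition_color: str = DEFINITION_COLOR) -> Optional[str]:
    """Return a color for a token kind, or None to leave the token unstyled."""
    if kind in {"string", "number", "boolean"}:
        return LITERAL_COLOR  # all literals share one color
    if kind == "control":
        return CONTROL_FLOW_COLOR  # conditionals and loops stay colored
    if kind == "definition":
        return definition_color
    if kind == "use":
        return lighten(definition_color)  # uses are a lighter shade of the definition
    return None
```

Deriving the "use" color from the definition color, rather than choosing it separately, keeps the two visually related when the palette changes.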
I think the highlighting should serve 2 purposes: 1. help you parse the code, and 2. put more or less emphasis on some elements.
Comments have a primary role when reading someone's code, so they deserve a distinguished color by virtue of point 2 above.
Strings are sometimes difficult to parse correctly because the symbol that starts them is the same one that ends them, so item 1 above applies.
And variable definitions have a tendency to hide in plain sight, despite being crucial to understanding a piece of code, so they match both criteria 1 and 2.
But numbers, booleans and constants can't possibly be mistaken for anything else, nor do they need to stand out more than the rest, so why highlight them?
Deemphasizing punctuation might be a good idea: I'd probably give the same treatment to some common boilerplate, too, like #include in C/C++, #[derive] in Rust, etc.
Finally, many languages make it hard to tell types and variables apart. Therefore I'd argue that types deserve their own coloring, following reason 1 above.
To add a final nitpick, the two "use" statements in the example define two symbols, "FilterType" and "Error". I think only these two words should be highlighted in blue, not the rest of the hierarchy.
Rather, the point of syntax highlighting (IMHO) is to accomplish three closely-related goals:
1. to insert obvious boundaries wherever the syntactic category of the lexeme stream changes, by changing color. This is why political maps color each country differently — it outlines what region of the map is in what country. (Note that, on its own, you don't need any given region to have any stable assigned color to achieve this effect. Political maps are often colored using the four-color theorem. Code could be too, if this is all you wanted to achieve.)
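To make the map-coloring analogy concrete, here is a small Python sketch that gives each syntactic category a color index so that categories appearing next to each other in a token stream never share one. It uses plain greedy graph coloring, which is an assumption of this example: the adjacency graph of token categories is not guaranteed to be planar, so unlike a political map it may need more than four colors. The function name `assign_boundary_colors` is hypothetical.

```python
from typing import Dict, List, Set


def assign_boundary_colors(categories: List[str]) -> Dict[str, int]:
    """Assign color indices so adjacent, different categories never share a color."""
    # Build an undirected graph: two categories are neighbors if they ever
    # appear next to each other in the stream.
    neighbors: Dict[str, Set[str]] = {}
    for category in categories:
        neighbors.setdefault(category, set())
    for left, right in zip(categories, categories[1:]):
        if left != right:
            neighbors[left].add(right)
            neighbors[right].add(left)

    # Greedy coloring in order of first appearance: take the smallest index
    # not already used by a colored neighbor.
    colors: Dict[str, int] = {}
    for category in neighbors:
        taken = {colors[n] for n in neighbors[category] if n in colors}
        index = 0
        while index in taken:
            index += 1
        colors[category] = index
    return colors


if __name__ == "__main__":
    stream = ["keyword", "identifier", "punct", "string", "punct", "identifier"]
    print(assign_boundary_colors(stream))
```

A scheme like this only serves goal 1: the index a category receives depends on the file being colored, so it provides none of the stable category-to-color association that the other goals rely on.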
2. to create a scannable visual index, with the colors serving as syntactic categories, allowing your eyes to jump around the screen, or scan the file while scrolling, "by syntactic category." (That is: to re-anchor your eyes on a line that contains the identifier `foo`, without syntax highlighting, you'd have to either read the file line-by-line; or remember "where you left" the line by the relative shapes of the lines on-screen; or literally search for `foo` in your editor. But if `foo` is an identifier, and identifiers have their own distinct color, then you can glance around the screen for all the tokens that have been syntax-highlighted as identifiers — and then, as your eye lands on each identifier, you just check whether it says `foo`.) This is a reflex you pick up after reading a lot of code in a stable syntax-highlighting scheme; you might not even be aware you do this!
3. to induce in the user a sort of syntax-category<=>color synesthesia, where you can learn to spot problems in the code simply by noticing that something is the wrong color; or that you expected a token of a certain color to be present, but it's not (this is why parens+brackets+braces are often each given their own distinct highlight color). Basically the inversion of #2.
You really only get any of these benefits to the degree that your syntax highlighting is [as the author puts it] "christmas lights diarrhea." You immediately lose benefit #1 as soon as any two syntactic categories are the same color. And you lose benefits #2 and #3 more and more as fewer things have their own distinct highlight colors.
"Fully" colorized code might be ugly as hell to just read; but when you're actually writing it, it's ergonomic.
I don't like this trend-copying at all. The post he's referring to was probably written by someone with light sensitivity.
- Code vs comments
- builtins vs other symbols
- maybe strings
But I like to rely on whitespace (blank lines and indentations) more than colors these days.
[0]: https://arxiv.org/abs/2008.06030