The "it" in AI Models Is the Dataset
Posted 4 months ago
Active 4 months ago
nonint.com
Tech
story
Key topics
Artificial Intelligence
Machine Learning
Datasets
Discussion on the importance of datasets in AI models and their impact on functionality.
Snapshot generated from the HN discussion
Discussion Activity
Light discussion
- Story posted: Sep 1, 2025 at 4:30 PM EDT
- First comment: Sep 1, 2025 at 4:56 PM EDT (26m after posting)
- Peak activity: 1 comment in 0-1h
- Average per period: 1 comment
- Latest activity: Sep 1, 2025 at 4:56 PM EDT
Discussion (1 comment)
measurablefunc
4 months ago
If the goal is to recreate the training data set, then all function approximators are extensionally equivalent, modulo the biases introduced by the architecture. By architectural bias I mean how missing pieces of the data manifold are imputed: given some point x without a matching output in the optimization corpus, different algorithms will give different results depending on how x is encoded into the internal (latent) representation of the data manifold. But even this difference is essentially averaged away by the users, because the goal is to create something that pleases the largest number of users, so everything eventually converges to the average, agreed-upon sentiment of a large enough sample of people.
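
To make the point about architectural bias concrete, here is a minimal sketch, not taken from the comment or the linked article. It assumes a toy one-dimensional training set (samples of y = x^2) and three hypothetical stand-ins for "architectures": nearest-neighbor lookup, piecewise-linear interpolation, and a Lagrange polynomial. All three reproduce the training set exactly, so they are extensionally equivalent on it, yet they impute different values at a point with no matching output in the corpus.

```python
from bisect import bisect_left

# Hypothetical training set (an assumption for illustration): y = x**2 at four points.
XS = [0.0, 1.0, 2.0, 3.0]
YS = [x * x for x in XS]


def nearest_neighbor(x):
    """Predict by copying the output of the closest training input."""
    i = min(range(len(XS)), key=lambda k: abs(XS[k] - x))
    return YS[i]


def piecewise_linear(x):
    """Predict by linear interpolation between the two bracketing training points."""
    i = bisect_left(XS, x)
    i = max(1, min(i, len(XS) - 1))  # clamp so XS[i - 1] and XS[i] both exist
    x0, x1 = XS[i - 1], XS[i]
    y0, y1 = YS[i - 1], YS[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def lagrange(x):
    """Predict with the polynomial of degree at most 3 through all training points."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(XS, YS)):
        term = yi
        for j, xj in enumerate(XS):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


MODELS = [nearest_neighbor, piecewise_linear, lagrange]

if __name__ == "__main__":
    # On the training set, every model returns the stored output.
    for x, y in zip(XS, YS):
        print(x, y, [f(x) for f in MODELS])
    # At an unseen point, each model fills the gap in its own way.
    unseen = 1.4
    print(unseen, [f(unseen) for f in MODELS])
```

The three predictions at the unseen point disagree even though the models are indistinguishable on the training data; which one counts as "right" depends on assumptions the data alone does not settle.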
ID: 45096362
Type: story
Last synced: 11/17/2025, 10:04:25 PM