Introducing the Massive Legal Embedding Benchmark (MLEB)
isaacus.com
Key topics
AI
Legal Tech
NLP
The Massive Legal Embedding Benchmark (MLEB) is introduced as a new benchmark for evaluating legal embeddings, sparking discussion on its potential impact and applications in the legal tech space.
Snapshot generated from the HN discussion
Discussion activity: light discussion; peak of 4 comments in the first hour.
Key moments
- Story posted: Oct 16, 2025 at 4:30 AM EDT
- First comment: Oct 16, 2025 at 4:41 AM EDT (11m after posting)
- Peak activity: 4 comments in 0-1h
- Latest activity: Oct 16, 2025 at 4:50 AM EDT
ID: 45602844 | Type: story | Last synced: 11/17/2025, 10:08:59 AM
Wow.
Voyage's terms say:
> you grant Voyage AI (and its successors and assigns) a worldwide, irrevocable, perpetual, royalty-free, fully paid-up, right and license to use, copy, reproduce, distribute, prepare derivative works of, display and perform the Customer Content: ... (iii) to train, improve, and otherwise further develop the Service (such as by training the artificial intelligence models we use).
Cohere's terms say:
> YOU GRANT US A ... RIGHT TO ... USE ... ANY DATA ... TO ... IMPROVE AND ENHANCE THE COHERE SOLUTION AND OUR OTHER OFFERINGS AND BENCHMARK THE FOREGOING, INCLUDING BY SHARING API DATA AND FINETUNING DATA WITH THIRD PARTIES ...
Jina's terms say:
> Jina AI shall, subject to applicable mandatory data protection requirements, be entitled to retain data uploaded to the Jina AI Systems or otherwise provided by the Customer or collected by Jina AI in the course of providing the Services and to use such data in anonymized/pseudonymized format for its business purposes including to improve its artificial intelligence applications.