Extracting relationship intelligence from large message datasets

I’m looking to process large volumes of message data (150,000+ messages) to extract structured insights about people and relationships. The goal is to turn messy, unstructured conversations into clean, analyzable data that can power features like:

  • Contact and relationship extraction - Identify people mentioned in messages, extract contact details, and classify relationship types.

  • Relationship strength mapping - Categorize connections using a simple 2x2 framework or numeric scale (e.g., weak/strong ties, intimacy scores).

  • Interest/topic detection - Surface shared interests and recurring conversation themes.

  • LLM-ready formatting - Output the data in a structured format optimized for downstream use by large language models.

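To make the feature list above concrete, here is a minimal sketch of what one extracted, LLM-ready record could look like. The field names, the weak/strong tie buckets, and the 0-1 intimacy score are illustrative assumptions, not an established schema:

```python
from dataclasses import dataclass, field, asdict
import json

# Hypothetical record shapes for the extracted data; all field
# names below are assumptions, not part of any existing spec.

@dataclass
class Contact:
    name: str
    handles: list[str] = field(default_factory=list)  # emails, phones, usernames
    relationship_type: str = "unknown"                # e.g. family, friend, colleague

@dataclass
class Relationship:
    contact: Contact
    tie_strength: str = "weak"     # one cell of a 2x2 (weak/strong x personal/professional)
    intimacy_score: float = 0.0    # or a numeric 0-1 scale
    shared_interests: list[str] = field(default_factory=list)

def to_llm_ready(rel: Relationship) -> str:
    """Serialize one relationship record to compact JSON for an LLM prompt."""
    return json.dumps(asdict(rel), separators=(",", ":"))

rel = Relationship(
    contact=Contact(name="Ada", handles=["ada@example.com"], relationship_type="colleague"),
    tie_strength="strong",
    intimacy_score=0.7,
    shared_interests=["rock climbing", "databases"],
)
print(to_llm_ready(rel))
```

Keeping records this compact matters at 150,000+ messages: the downstream LLM sees dense, uniform JSON rather than raw conversation text.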
Ideally, this would include a middleware layer that handles chunking, structuring, and metadata enrichment before sending anything to the LLM, rather than just dumping raw text. That approach would enable scalability (e.g., MapReduce-style processing), maintain context fidelity, and avoid the need for expensive LLM training.
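The middleware idea above can be sketched as a simple MapReduce-style pass: chunk the message stream, extract per-chunk signals in the map step, and merge them in the reduce step. The substring-matching extractor here is a deliberate stand-in for a real NER model or LLM call:

```python
from collections import Counter
from typing import Iterable, Iterator

def chunk(messages: list[str], size: int = 500) -> Iterator[list[str]]:
    """Split the message stream into fixed-size chunks for independent processing."""
    for i in range(0, len(messages), size):
        yield messages[i:i + size]

def map_extract(chunk_msgs: list[str], known_names: set[str]) -> Counter:
    """Map step: count contact mentions in one chunk.
    A real pipeline would call an NER model or LLM here instead of substring matching."""
    counts: Counter = Counter()
    for msg in chunk_msgs:
        for name in known_names:
            if name in msg:
                counts[name] += 1
    return counts

def reduce_merge(partials: Iterable[Counter]) -> Counter:
    """Reduce step: merge per-chunk counts into one mention table."""
    total: Counter = Counter()
    for p in partials:
        total.update(p)
    return total

messages = [
    "Lunch with Ada tomorrow?",
    "Ada and Bo fixed the bug",
    "Bo is climbing this weekend",
]
mentions = reduce_merge(map_extract(c, {"Ada", "Bo"}) for c in chunk(messages, size=2))
print(mentions)  # per-contact mention counts across all chunks
```

Because the map step only ever sees one chunk, the same code parallelizes across workers, and metadata enrichment (timestamps, thread IDs) can be attached per chunk before anything reaches the LLM.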

This capability would be beneficial for anyone building tools in the relationship management, personal CRM, or social intelligence space, especially when working with high-volume, unstructured message data.

If others are tackling something similar, I would love to hear how you approach it.

Status: In Progress
Board: 💡 How I'd like to use Storytell
Date: 8 months ago
