Data Licensing And API Access

Data Licensing for Funds & Fintechs

Build research and products on top of structured, point-in-time datasets. Deliveries include a simple schema,
a manifest with checksums, and a change log so your work is reproducible.

Who this is for

  • Quant & discretionary funds (signal research, knowledge graphs, governance maps)
  • Fintech teams (searchable UI, alerts, manager pages, foundations views)

Datasets

Conference Ideas (2011–present; coverage strongest from 2024 onward)

Concentrated idea flow from investor conferences and manager events.

  • What’s included (high level): event/series, dates, city, session; speaker & firm (normalized); company references with identifiers; timestamps for event and publication (explicit latency); optional direction and thesis tags; links to agenda/deck/transcript when available.
  • Cadence: event-driven (busy during conference weeks).

Investor Letters (2013 →)

Each record carries the manager and fund, letter date/period, title, tags, tickers mentioned, source link, and a snippet (~500 characters) suitable for UI display with attribution and link.
Where rights permit, full text can be licensed under upper tiers.

  • Cadence: event-driven with weekly roll-ups.

13F Signals (HFA proprietary)

A signals layer built from public filings (we do not resell raw 13F positions).

  • Manager normalization, position changes, concentration metrics, crowding/overlap, simple thematic tags.

Foundations / 990-PF (positions-focused)

Normalized public-equity positions and related holdings from US foundation filings, mapped to managers/issuers, with links back to source documents.

Crosswalk (optional add-on)

A light graph connecting managers, foundations, speakers, and tickers—useful for governance views, provenance, and faster discovery across datasets.

Quick sample payloads

Illustrative JSON and CSV snippets. Full samples of 100–500 rows, with schema and manifest, are available on request.

Conference Ideas — JSON

{
  "event_id": "evt_20240506_milken_001",
  "series": "Milken",
  "event_dt": "2024-05-06T14:30:00Z",
  "publish_dt": "2024-05-06T16:05:12Z",
  "lag_minutes": 95,
  "session_title": "AI & Infrastructure",
  "speaker": {"name":"Jane Doe","firm":"Example Capital","speaker_id":"spk_123"},
  "company": {"legal":"Acme Corp","ticker":"ACME","figi":"BBG000...","cik":"0000123456"},
  "direction": "long",
  "themes": ["AI","Capex","Data centers"],
  "novelty_flag": true,
  "source_url": "https://hedgefundalpha.com/conferences/..."
}
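
The latency field in the sample above can be re-derived from the two timestamps. The following is a minimal Python sketch, not the vendor's own calculation: the helper name lag_minutes is hypothetical, and rounding down to whole minutes is an assumption inferred from the sample values (95 minutes and 12 seconds reported as 95).

```python
from datetime import datetime


def lag_minutes(event_dt: str, publish_dt: str) -> int:
    """Whole minutes between the event and its publication, rounded down."""
    # Replace the trailing "Z" so datetime.fromisoformat also works on Python < 3.11.
    start = datetime.fromisoformat(event_dt.replace("Z", "+00:00"))
    end = datetime.fromisoformat(publish_dt.replace("Z", "+00:00"))
    return int((end - start).total_seconds() // 60)


# Values copied from the Conference Ideas sample payload.
record = {
    "event_dt": "2024-05-06T14:30:00Z",
    "publish_dt": "2024-05-06T16:05:12Z",
    "lag_minutes": 95,
}

computed = lag_minutes(record["event_dt"], record["publish_dt"])
print(computed == record["lag_minutes"])
```

Recomputing the lag at ingest is one way to sanity-check a delivery before relying on the supplied field.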

Investor Letters — JSON

{
  "letter_id": "ltr_2024Q4_example_001",
  "manager_id": "mgr_example",
  "fund": "Example Partners LP",
  "letter_period": "2024-Q4",
  "title": "2024 Q4 Letter",
  "publish_dt": "2025-01-15T12:00:00Z",
  "tickers": ["ABC","XYZ"],
  "snippet": "We continue to see attractive risk/reward in...",
  "rights_scope": "snippet_only",
  "source_url": "https://hedgefundalpha.com/letters/..."
}

13F Signals — JSON

{
  "manager_id": "mgr_example",
  "filing_period": "2024Q4",
  "ticker": "XYZ",
  "action": "new",
  "weight_pct": 2.3,
  "weight_delta_bps": 230,
  "top10_flag": true,
  "crowding_quantile": 0.82
}

Foundations / 990-PF — JSON

{
  "foundation_id": "fnd_abc",
  "fiscal_year": 2024,
  "issuer": "Acme Corp",
  "cusip": "004123AA0",
  "cik": "0000123456",
  "mv_usd": 1250000,
  "asset_class": "Public Equity",
  "source_url": "https://irs.gov/..."
}

Letters — CSV (micro-sample)

letter_id,manager,fund,letter_period,title,publish_dt,tickers,snippet,source_url
ltr_2024Q4_example_001,Example Manager,Example Partners LP,2024-Q4,2024 Q4 Letter,2025-01-15T12:00:00Z,"ABC|XYZ","We continue to see...",https://hedgefundalpha.com/letters/...
ltr_2023Q3_example_002,Another Manager,Alpha Fund,2023-Q3,2023 Q3 Letter,2023-10-20T09:30:00Z,"DEF","Macro headwinds easing...",https://hedgefundalpha.com/letters/...
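
As a rough illustration of ingesting the micro-sample above, this Python sketch reads the CSV and splits the pipe-delimited tickers column into a list. The function name read_letters is hypothetical, and treating "|" as the ticker separator is an assumption based only on the sample rows; the elided snippets and URLs are reproduced as they appear.

```python
import csv
import io

# The micro-sample as published, elisions included.
SAMPLE = """\
letter_id,manager,fund,letter_period,title,publish_dt,tickers,snippet,source_url
ltr_2024Q4_example_001,Example Manager,Example Partners LP,2024-Q4,2024 Q4 Letter,2025-01-15T12:00:00Z,"ABC|XYZ","We continue to see...",https://hedgefundalpha.com/letters/...
ltr_2023Q3_example_002,Another Manager,Alpha Fund,2023-Q3,2023 Q3 Letter,2023-10-20T09:30:00Z,"DEF","Macro headwinds easing...",https://hedgefundalpha.com/letters/...
"""


def read_letters(text: str) -> list[dict]:
    """Parse letter rows and turn the tickers cell into a list of symbols."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # Assumption: tickers are separated by "|" inside a single quoted cell.
        row["tickers"] = [t for t in row["tickers"].split("|") if t]
        rows.append(row)
    return rows


letters = read_letters(SAMPLE)
for letter in letters:
    print(letter["letter_id"], letter["tickers"])
```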

Parquet is our canonical format; CSV/NDJSON are available for quick QA or stream-style ingestion.

Delivery & Formats

  • Parquet (recommended) as canonical; CSV sample included; NDJSON available.
  • Transport: S3 pre-signed links or shared S3 path; secure HTTPS; Snowflake share on request.
  • Schema pack: schema.json, data_dictionary.md, manifest.json (checksums).
  • Point-in-time discipline: snapshot at ingest; corrections appended with a versioned change log (no silent rewrites).
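
To show how the manifest checksums might be used on receipt, here is a hedged Python sketch. The layout of manifest.json is not documented on this page, so the structure assumed here (a "files" list of objects with "path" and "sha256" keys) is hypothetical, as are the function names and the demo file name; adapt them to the actual schema pack.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large files are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_delivery(delivery_dir: Path) -> list[str]:
    """Return the listed paths that are missing or whose checksum differs."""
    # Assumed (hypothetical) manifest layout: {"files": [{"path": ..., "sha256": ...}]}
    manifest = json.loads((delivery_dir / "manifest.json").read_text())
    failures = []
    for entry in manifest["files"]:
        target = delivery_dir / entry["path"]
        if not target.is_file() or sha256_of(target) != entry["sha256"]:
            failures.append(entry["path"])
    return failures


# Self-contained demo in a throwaway directory with a made-up file.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    data_file = root / "letters.csv"  # hypothetical file name
    data_file.write_bytes(b"letter_id,manager\n")
    manifest = {"files": [{"path": "letters.csv", "sha256": sha256_of(data_file)}]}
    (root / "manifest.json").write_text(json.dumps(manifest))
    clean = verify_delivery(root)     # checksums match
    data_file.write_bytes(b"tampered\n")
    tampered = verify_delivery(root)  # the altered file is reported

print(clean, tampered)
```

Running a check like this before loading each delivery is one way to confirm that the files on disk are the ones the manifest describes.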

Cadence & Latency

Content is event-driven by nature (some days many updates, some days none).
We include both event and publish/ingest timestamps so teams can model lag explicitly.
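
One way to use the two timestamps in research is an as-of filter that keeps only records already published at a given moment, which avoids lookahead in backtests. The Python sketch below is illustrative only: the function names are hypothetical, and the uniform "id" key is a simplification of the dataset-specific identifiers. The timestamps are taken from the sample payloads.

```python
from datetime import datetime


def parse_utc(ts: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2024-05-06T16:05:12Z."""
    # Replace the trailing "Z" so datetime.fromisoformat also works on Python < 3.11.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def known_as_of(records: list[dict], as_of: str) -> list[dict]:
    """Keep records whose publish_dt is at or before the as-of time."""
    cutoff = parse_utc(as_of)
    return [r for r in records if parse_utc(r["publish_dt"]) <= cutoff]


# "id" is a simplified stand-in for event_id / letter_id.
records = [
    {"id": "evt_20240506_milken_001",
     "event_dt": "2024-05-06T14:30:00Z",
     "publish_dt": "2024-05-06T16:05:12Z"},
    {"id": "ltr_2024Q4_example_001",
     "publish_dt": "2025-01-15T12:00:00Z"},
]

# At 15:00 the session has taken place but is not yet published, so it is excluded.
before_publish = known_as_of(records, "2024-05-06T15:00:00Z")
after_publish = known_as_of(records, "2024-05-06T17:00:00Z")
print(len(before_publish), [r["id"] for r in after_publish])
```

Filtering on the publish timestamp rather than the event timestamp is the design choice that matters here: the event time says when something happened, while the publish time says when a subscriber could have known about it.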