Reading SEC filing data

SEC filing data is messier than it looks

EDGAR is excellent infrastructure: free, complete, and authoritative. But the filings it serves are raw. The moment you try to filter by 8-K item, count what insiders actually bought, or read a fund’s real positions, you hit edge cases the SEC never normalized away. These guides explain the traps we ran into building BetterEDGAR — and how the product resolves them — using real examples, not definitions you could get anywhere.

The guides

  • Tickers lie; CIKs are forever

    Tickers get reassigned after a delisting and company names change, so a plain search can land on the wrong filer. The CIK is the only stable identifier, and resolving the right company is its own problem.

  • 8-K items aren’t a clean taxonomy

    Items arrive as a free-text string, one filing carries several, Item 8.01 is a catch-all, and the numbers don’t map to intent. 6-Ks have no items at all.

  • Most “insider buying” isn’t buying

    Grants, option exercises, and tax withholding all show up on Form 4 but aren’t open-market trades, and filings superseded by a later amendment (Form 4/A) double-count if you don’t drop them.
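
The filter in that last guide can be sketched in a few lines of Python. This is a minimal illustration, not BetterEDGAR's implementation: the Form4Txn record, its field names, the supersedes link, and the accession numbers in the sample are all hypothetical. Only the transaction codes come from the SEC's Form 4 code list (P and S cover open-market or private purchases and sales, A is a grant or award, F is shares withheld or delivered to cover taxes or an exercise price). It also assumes you have already worked out which earlier filing each amendment replaces, which the amendment itself may not spell out.

```python
from dataclasses import dataclass
from typing import List, Optional

# Codes most analyses treat as discretionary trades: purchase and sale.
OPEN_MARKET_CODES = {"P", "S"}


@dataclass(frozen=True)
class Form4Txn:
    accession: str                    # filing the row came from (made-up values below)
    code: str                         # Form 4 transaction code, e.g. "P", "A", "M", "F"
    shares: float
    supersedes: Optional[str] = None  # hypothetical link: accession this amendment replaces


def open_market_trades(txns: List[Form4Txn]) -> List[Form4Txn]:
    """Drop rows from superseded filings, then keep only purchase/sale codes."""
    superseded = {t.supersedes for t in txns if t.supersedes}
    return [
        t for t in txns
        if t.accession not in superseded and t.code in OPEN_MARKET_CODES
    ]


sample = [
    Form4Txn("0000000001-24-000001", "P", 1000),                          # original purchase
    Form4Txn("0000000001-24-000002", "P", 1200, "0000000001-24-000001"),  # amendment correcting it
    Form4Txn("0000000001-24-000003", "A", 5000),                          # stock grant
    Form4Txn("0000000001-24-000004", "F", 300),                           # tax withholding
]

trades = open_market_trades(sample)
print([(t.accession, t.shares) for t in trades])
```

Run as written, the sample keeps only the amended purchase: the original is dropped as superseded, and the grant and withholding rows are filtered out by code.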

Other things that break in SEC data

A few more places where the raw filings fight back, and why naive parsing gets them wrong.

Item numbers don’t equal intent
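8-K Item 2.01 is a completed acquisition or disposition of assets, 2.02 is results of operations (earnings), 2.03 is a new direct financial obligation (debt). Nothing in the numbers themselves tells you that, so there’s no clean item-number-to-meaning mapping without SEC domain knowledge.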
Tickers get reused; names are ALL CAPS
A delisted ticker can be reassigned, companies rebrand, and EDGAR stores names like “MICRON TECHNOLOGY INC.” Matching a query to the right issuer over time is its own problem.
Bank structured notes flood “financing”
Dealer structured-note shelf takedowns (424B2/424B5/FWP) are technically financing but bury real equity and debt offerings, so they have to be demoted out of the way.
Exhibit filenames are cryptic
A press release arrives as ex99_1.htm with no label. Knowing it’s a press release versus an investor presentation versus a transcript takes content classification, not a filename.
XBRL concept names vary by company
There’s no universal tag for “revenue.” One company files Revenues, another RevenueFromContractWithCustomerExcludingAssessedTax, another a custom extension, so comparing the same line across issuers isn’t a lookup.
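
One common way to cope with the revenue-tag problem is to try a prioritized list of candidate concepts and take the first one the issuer actually reports. The sketch below assumes each company's facts have already been flattened into a plain dict of concept name to value. That dict shape, the three sample companies, their numbers, and the acme_ extension tag are all hypothetical. The three candidate names are real us-gaap concepts, but their order here is a judgment call, not a rule.

```python
from typing import Dict, Optional, Tuple

# Candidate us-gaap concepts for top-line revenue, most preferred first.
# The ordering is an assumption for illustration.
REVENUE_CANDIDATES = [
    "RevenueFromContractWithCustomerExcludingAssessedTax",
    "Revenues",
    "SalesRevenueNet",
]


def pick_revenue(facts: Dict[str, float]) -> Optional[Tuple[str, float]]:
    """Return (concept, value) for the first candidate present, else None."""
    for concept in REVENUE_CANDIDATES:
        if concept in facts:
            return concept, facts[concept]
    return None


# Hypothetical, already-flattened facts for three issuers.
company_a = {"Revenues": 1000.0, "NetIncomeLoss": 120.0}
company_b = {"RevenueFromContractWithCustomerExcludingAssessedTax": 2500.0}
company_c = {"acme_SubscriptionAndServicesRevenue": 900.0}  # custom extension only

for name, facts in [("A", company_a), ("B", company_b), ("C", company_c)]:
    print(name, pick_revenue(facts))
```

Company C returns None: a custom extension cannot be matched by name alone, which is the case that needs manual mapping or content-based classification.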

SEC filing glossary

The terms behind the traps above, defined in one line each.

SEC EDGAR
The SEC’s Electronic Data Gathering, Analysis, and Retrieval system — the official, free public archive of company filings. It is the source of record; BetterEDGAR links back to it.
CIK
Central Index Key: the SEC’s permanent identifier for a filer, up to 10 digits long and usually zero-padded to 10. Unlike a ticker, a CIK never changes and is unique, so it is the reliable way to pin down an issuer over time.
Accession number
The unique identifier for a single filing, formatted like 0000320193-24-000123. It points to one exact submission on EDGAR.
CUSIP
A nine-character identifier for a security. 13F holdings are reported by CUSIP, not ticker or name, so reading them means mapping CUSIPs back to companies.
Form 8-K
A “current report” used to disclose material events between regular reports — results, leadership changes, agreements — generally within four business days of the event.
Form 6-K
The report foreign private issuers furnish for material information made public abroad. Unlike an 8-K, it has no numbered item list, so its topic must be inferred from content.
Form 4
A statement of changes in beneficial ownership filed by insiders, due within two business days of the transaction. Its transaction code, not the form itself, tells you what actually happened.
Form 13F
A quarterly report of equity holdings filed by institutional managers with at least $100 million in 13F securities, due within 45 days of quarter-end. Reported by CUSIP.
Schedule 13D / 13G
Beneficial ownership reports filed after acquiring more than 5% of a class of a company’s voting equity securities. A 13D signals non-passive (often activist) intent; a 13G is the lighter passive-holder alternative.
Form 10-K / 10-Q
The annual (10-K, audited) and quarterly (10-Q) reports carrying a company’s financial statements and management discussion.
Inline XBRL
Machine-readable financial data tagged inside a filing. Coverage and concept names vary by issuer, which is why comparing one company’s numbers to another is not a simple lookup.
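
Three of the identifiers defined above have fixed formats that are cheap to validate before you join on them. The helpers below are a small sketch with hypothetical function names: zero-padding a CIK to 10 digits, splitting an accession number into its parts (the first ten digits identify the submitter, which may be the filer or its filing agent, followed by a two-digit year and a sequence number), and checking a CUSIP's ninth character with the standard modulus-10 "double-add-double" check digit. The accession number is the glossary's own example; 037833100 is Apple's common-stock CUSIP.

```python
import re

ACCESSION_RE = re.compile(r"^(\d{10})-(\d{2})-(\d{6})$")


def pad_cik(cik) -> str:
    """Zero-pad a CIK (int or digit string) to 10 digits."""
    return str(int(cik)).zfill(10)


def parse_accession(accession: str) -> dict:
    """Split an accession number into submitter ID, 2-digit year, and sequence."""
    m = ACCESSION_RE.match(accession)
    if m is None:
        raise ValueError(f"not an accession number: {accession!r}")
    filer_id, year, sequence = m.groups()
    return {"filer_id": filer_id, "year": year, "sequence": int(sequence)}


def _cusip_value(ch: str) -> int:
    """Numeric value of one CUSIP character: 0-9, A=10..Z=35, *=36, @=37, #=38."""
    if "0" <= ch <= "9":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    special = {"*": 36, "@": 37, "#": 38}
    if ch in special:
        return special[ch]
    raise ValueError(f"invalid CUSIP character: {ch!r}")


def cusip_check_digit(base8: str) -> int:
    """Compute the check digit for the first eight CUSIP characters."""
    total = 0
    for i, ch in enumerate(base8):
        v = _cusip_value(ch)
        if i % 2 == 1:               # every second character is doubled
            v *= 2
        total += v // 10 + v % 10    # add the digits of the (possibly doubled) value
    return (10 - total % 10) % 10


def is_valid_cusip(cusip: str) -> bool:
    """True if the string is nine characters and its check digit matches."""
    cusip = cusip.strip().upper()
    if len(cusip) != 9:
        return False
    try:
        return cusip_check_digit(cusip[:8]) == int(cusip[8])
    except ValueError:
        return False


print(pad_cik(320193))
print(parse_accession("0000320193-24-000123"))
print(is_valid_cusip("037833100"))
```

A passing check digit only shows the string is well-formed. Mapping a CUSIP back to a company still needs a separate reference table.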

FAQ

If SEC EDGAR is free, what is actually hard about using it?
Getting a filing is easy; making filings comparable is not. EDGAR delivers raw documents and loosely structured metadata. Turning that into something you can filter — by 8-K item, insider transaction type, exhibit type, or the right issuer — takes a lot of normalization, which is the work BetterEDGAR does.
Why is it hard to match a query to the right company?
Tickers get reused after a delisting, companies rebrand, and EDGAR stores names in all caps. The same letters can point to different issuers over time, so resolving a search to the right filer means ranking exact identifiers, current names, and former names — not a single lookup.
Does an insider “buying” on a Form 4 mean they bought on the open market?
Not necessarily. Form 4 transaction codes distinguish a real purchase (P) from a stock grant (A), an option exercise (M/X), or shares withheld for taxes (F). Many headline “insider buys” are actually compensation events, not discretionary purchases.
Is BetterEDGAR the official source?
No. SEC EDGAR remains the official source of record. BetterEDGAR is an independent interface that cleans up and links back to those public filings.
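
The ranking idea from the company-matching answer above can be sketched as a scoring function: exact identifiers beat current names, and current names beat former ones. Everything here is hypothetical and for illustration only, including the Filer record, the score weights, and the three made-up companies. It is not a description of how BetterEDGAR ranks results.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Filer:
    cik: str                                   # zero-padded to 10 digits
    name: str                                  # stored in all caps, as on EDGAR
    ticker: Optional[str] = None               # current ticker, if any
    former_names: List[str] = field(default_factory=list)
    former_tickers: List[str] = field(default_factory=list)


def normalize(text: str) -> str:
    """Case-fold, strip punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", text.casefold())
    return " ".join(cleaned.split())


def score(query: str, filer: Filer) -> int:
    """Higher is better; 0 means no match. Weights are arbitrary."""
    raw = query.strip()
    q = normalize(query)
    if raw.isdigit() and raw.zfill(10) == filer.cik:
        return 100                             # exact CIK
    if filer.ticker and raw.casefold() == filer.ticker.casefold():
        return 90                              # current ticker
    if q == normalize(filer.name):
        return 80                              # current name
    if any(q == normalize(n) for n in filer.former_names):
        return 60                              # former name
    if any(raw.casefold() == t.casefold() for t in filer.former_tickers):
        return 50                              # ticker this filer used to hold
    if q and q in normalize(filer.name):
        return 40                              # partial name match
    return 0


def resolve(query: str, filers: List[Filer]) -> List[Filer]:
    """Return matching filers, best match first."""
    scored = [(score(query, f), f) for f in filers]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [f for s, f in ranked if s > 0]


# Made-up filers: the ticker NWD was reassigned after a delisting.
filers = [
    Filer("0000000101", "NORTHWIND MINING CORP", former_tickers=["NWD"]),
    Filer("0000000202", "NEW WORLD DATA INC", ticker="NWD"),
    Filer("0000000303", "EVERGREEN SYSTEMS INC", ticker="EVG",
          former_names=["OLD PINE SOFTWARE INC"]),
]

print([f.name for f in resolve("NWD", filers)])
print([f.name for f in resolve("Old Pine Software, Inc.", filers)])
```

For the query NWD, the current holder of the ticker ranks first and the delisted former holder stays in the list as a lower-ranked candidate instead of disappearing, which is usually what a historical search needs.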

Skip the parsing — start with clean data

BetterEDGAR does this normalization so you don’t have to: filterable 8-K items, real insider trades, readable exhibit labels, and the right issuer every time, all linked back to the source SEC filing.