Reading SEC filing data
SEC filing data is messier than it looks
EDGAR is excellent infrastructure: free, complete, and authoritative. But the filings it serves are raw. The moment you try to filter by 8-K item, count what insiders actually bought, or read a fund’s real positions, you hit edge cases the SEC never normalized away. These guides explain the traps we ran into building BetterEDGAR — and how the product resolves them — using real examples, not definitions you could get anywhere.
The guides
- Tickers lie; CIKs are forever
Tickers get reassigned after a delisting and company names change, so a plain search can land on the wrong filer. The CIK is the only stable identifier, and resolving a search to the right company is its own problem.
- 8-K items aren’t a clean taxonomy
Items arrive as a free-text string, one filing can carry several, Item 8.01 is a catch-all, and the item number alone doesn’t tell you what the filing is about. 6-Ks have no items at all.
- Most “insider buying” isn’t buying
Grants, option exercises, and tax withholding all show up on Form 4 but aren’t open-market trades. And when a filing is amended, the original and the amendment double-count the same transaction unless you drop the superseded one.
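As a rough illustration of that last guide, the sketch below keeps only open-market trades and drops rows superseded by a later filing. The `Txn` record and its field names are hypothetical, and so is the rule that the latest filing for the same owner, issuer, and transaction date replaces earlier ones; neither is EDGAR's schema or BetterEDGAR's implementation. The codes themselves (P and S for open-market purchases and sales) are the SEC's Form 4 transaction codes.

```python
from dataclasses import dataclass

# SEC Form 4 transaction codes: P = open-market purchase, S = open-market sale.
# Codes such as A (grant), M (derivative exercise), and F (tax withholding)
# are deliberately left out.
OPEN_MARKET_CODES = {"P", "S"}


@dataclass(frozen=True)
class Txn:
    # Hypothetical flattened Form 4 row; field names are illustrative only.
    owner_cik: str
    issuer_cik: str
    date: str     # transaction date, ISO format (YYYY-MM-DD)
    code: str     # Form 4 transaction code
    shares: float
    filed: str    # filing date of the Form 4 or 4/A, ISO format


def open_market_trades(txns):
    """Keep open-market rows from the most recent filing per transaction."""
    latest_filed = {}
    for t in txns:
        key = (t.owner_cik, t.issuer_cik, t.date)
        # Assumption: the latest filing for this key supersedes earlier ones.
        # ISO dates compare correctly as plain strings.
        if t.filed > latest_filed.get(key, ""):
            latest_filed[key] = t.filed
    return [
        t for t in txns
        if t.code in OPEN_MARKET_CODES
        and t.filed == latest_filed[(t.owner_cik, t.issuer_cik, t.date)]
    ]
```

Real amendments are messier than this: some restate only the corrected line, so a production rule would need to compare filings row by row rather than by date alone.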
Other things that break in SEC data
A few more places where the raw filings fight back, and why naive parsing gets them wrong.
- Item numbers don’t equal intent
- 8-K Item 2.01 is an acquisition closing, 2.02 is earnings, 2.03 is new debt. Nothing in the numbers themselves says so, and there’s no clean item-number-to-meaning mapping without SEC domain knowledge.
- Tickers get reused; names are ALL CAPS
- A delisted ticker can be reassigned, companies rebrand, and EDGAR stores names like “MICRON TECHNOLOGY INC.” Matching a query to the right issuer over time is its own problem.
- Bank structured notes flood “financing”
- Dealer structured-note shelf takedowns (424B2/424B5/FWP) are technically financing, but their sheer volume buries real equity and debt offerings, so they have to be ranked below those offerings.
- Exhibit filenames are cryptic
- A press release arrives as ex99_1.htm with no label. Knowing it’s a press release versus an investor presentation versus a transcript takes content classification, not a filename.
- XBRL concept names vary by company
- There’s no universal tag for “revenue.” One company files Revenues, another NetRevenue, another a custom extension — so comparing the same line across issuers isn’t a lookup.
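To make the XBRL item concrete, one common workaround is to try a list of candidate concepts in priority order. The sketch below assumes the facts for one filing have already been loaded into a plain dict keyed by concept name, which is a simplification for this example. The three `us-gaap` names are real taxonomy elements, but the priority order is an assumption, and a company-specific extension tag still falls through to `None`.

```python
# Candidate revenue concepts, tried in order. The priority order is an
# illustrative assumption, not an official mapping.
REVENUE_CONCEPTS = [
    "us-gaap:Revenues",
    "us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax",
    "us-gaap:SalesRevenueNet",
]


def find_revenue(facts):
    """Return (concept, value) for the first candidate present, else None.

    `facts` is a hypothetical dict of concept name -> reported value.
    """
    for concept in REVENUE_CONCEPTS:
        if concept in facts:
            return concept, facts[concept]
    return None
```

Returning the matched concept alongside the value keeps the choice auditable, which matters when two issuers' "revenue" figures came from different tags.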
SEC filing glossary
The terms behind the traps above, defined in one line each.
- SEC EDGAR
- The SEC’s Electronic Data Gathering, Analysis, and Retrieval system — the official, free public archive of company filings. It is the source of record; BetterEDGAR links back to it.
- CIK
- Central Index Key: the SEC’s permanent numeric identifier for a filer, up to 10 digits and usually written zero-padded to 10. Unlike a ticker, a CIK is unique and never changes, so it is the reliable way to pin down an issuer over time.
- Accession number
- The unique identifier for a single filing, formatted like 0000320193-24-000123. It points to one exact submission on EDGAR.
- CUSIP
- A nine-character identifier for a security. 13F holdings are reported by CUSIP, not ticker or name, so reading them means mapping CUSIPs back to companies.
- Form 8-K
- A “current report” used to disclose material events between regular reports — results, leadership changes, agreements — generally within four business days of the event.
- Form 6-K
- The report foreign private issuers furnish for material information made public abroad. Unlike an 8-K, it has no numbered item list, so its topic must be inferred from content.
- Form 4
- A statement of changes in beneficial ownership filed by insiders, due within two business days of the transaction. Its transaction code, not the form itself, tells you what actually happened.
- Form 13F
- A quarterly report of equity holdings filed by institutional investment managers with at least $100 million in 13F securities, due within 45 days of quarter-end. Holdings are reported by CUSIP.
- Schedule 13D / 13G
- Beneficial ownership reports filed after crossing 5% of a class of a company’s voting equity. A 13D signals non-passive (often activist) intent; a 13G is the lighter passive-holder alternative.
- Form 10-K / 10-Q
- The annual (10-K, audited) and quarterly (10-Q, unaudited) reports carrying a company’s financial statements and management discussion.
- Inline XBRL
- Machine-readable financial data tagged inside a filing. Coverage and concept names vary by issuer, which is why comparing one company’s numbers to another is not a simple lookup.
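Two of the identifiers in this glossary, the CIK and the accession number, have fixed shapes that are easy to validate. The sketch below is a minimal example of that, assuming you only need to zero-pad a CIK and split an accession number into its three parts. The function names are made up for illustration.

```python
import re

# Accession numbers look like 0000320193-24-000123:
# a 10-digit submitter ID, a 2-digit year, and a 6-digit sequence number.
ACCESSION_RE = re.compile(r"^(\d{10})-(\d{2})-(\d{6})$")


def normalize_cik(cik):
    """Zero-pad a CIK (given as int or str) to its 10-digit form."""
    digits = str(cik).strip()
    if not (digits.isascii() and digits.isdigit()) or len(digits) > 10:
        raise ValueError(f"not a valid CIK: {cik!r}")
    return digits.zfill(10)


def parse_accession(accession):
    """Split an accession number into (submitter_id, year, sequence) strings."""
    match = ACCESSION_RE.match(accession.strip())
    if match is None:
        raise ValueError(f"not a valid accession number: {accession!r}")
    return match.groups()
```

Note that the first segment identifies whoever submitted the filing, which can be a filing agent rather than the issuer, so it should not be treated as the issuer's CIK without checking.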
FAQ
- If SEC EDGAR is free, what is actually hard about using it?
- Getting a filing is easy; making filings comparable is not. EDGAR delivers raw documents and loosely structured metadata. Turning that into something you can filter — by 8-K item, insider transaction type, exhibit type, or the right issuer — takes a lot of normalization, which is the work BetterEDGAR does.
- Why is it hard to match a query to the right company?
- Tickers get reused after a delisting, companies rebrand, and EDGAR stores names in all caps. The same letters can point to different issuers over time, so resolving a search to the right filer means ranking exact identifiers, current names, and former names — not a single lookup.
- Does an insider “buying” on a Form 4 mean they bought on the open market?
- Not necessarily. Form 4 transaction codes distinguish a real purchase (P) from a stock grant (A), an option exercise (M/X), or shares withheld for taxes (F). Many headline “insider buys” are actually compensation events, not discretionary purchases.
- Is BetterEDGAR the official source?
- No. SEC EDGAR remains the official source of record. BetterEDGAR is an independent interface that cleans up and links back to those public filings.
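The company-matching answer above describes ranking exact identifiers, current names, and former names. A minimal sketch of that idea follows. The `Filer` record, its fields, and the specific rank order are assumptions made for this example, not EDGAR's schema or a description of how BetterEDGAR ranks results.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Filer:
    # Hypothetical record; field names are illustrative only.
    cik: str
    name: str
    ticker: Optional[str] = None
    former_names: Tuple[str, ...] = ()


def _norm(text):
    """Upper-case and collapse whitespace, since EDGAR names are all caps."""
    return " ".join(text.upper().split())


def match_rank(query, filer):
    """Return a rank for the match (lower is better), or None for no match."""
    q = _norm(query)
    if q.isdigit() and q.zfill(10) == filer.cik.zfill(10):
        return 0  # exact CIK
    if filer.ticker and q == filer.ticker.upper():
        return 1  # current ticker
    if q == _norm(filer.name):
        return 2  # current name
    if any(q == _norm(n) for n in filer.former_names):
        return 3  # former name
    return None


def resolve(query, filers):
    """Return matching filers, best match first."""
    ranked = [(match_rank(query, f), f) for f in filers]
    ranked = [(r, f) for r, f in ranked if r is not None]
    ranked.sort(key=lambda pair: pair[0])
    return [f for _, f in ranked]
```

Ranking a current name above a former name is what keeps a rebranded company's old name from outranking the issuer that uses that name today.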
Skip the parsing — start with clean data
BetterEDGAR does this normalization so you don’t have to: filterable 8-K items, real insider trades, readable exhibit labels, and the right issuer every time, all linked back to the source SEC filing.