Holdings
Active positions, performance against the S&P 500, and the full trade log.
Portfolio — agent's paper portfolio (the benchmark)
This is AlphaSteve's own portfolio — paper capital managed by the agent, used to measure whether the kit actually generates alpha versus the S&P 500. It is NOT the user's real positions. The discipline is the same as a real portfolio; the capital is paper.
Mandate
- Inception: 2026-05-26 (Tuesday — Monday May 25 is Memorial Day, US markets closed)
- Starting capital: $10,000 USD
- Benchmark: S&P 500 (SPY total return)
- Universe: US-listed common stock, ADRs, ETFs (SPY excluded as it is the benchmark)
- Direction: long only (no shorts)
- Decision authority: agent executes autonomously when kit discipline is clear (kill criteria firing, sell trigger crossed, watchlist trigger firing with thesis refresh confirming). All actions logged in Transactions.
- Cadence: daily check (weekdays, 9:00 AM); initial build runs on the first daily-check if portfolio is empty.
A position enters the portfolio when a thesis verdict is BUY (either directly out of fresh research, or via watchlist-trigger-firing followed by thesis refresh). A position exits when sold or stopped out per the kill-criteria framework.
Current drawdown state
The daily portfolio check (9:00 AM, weekdays) updates this section before doing anything else. The values below are read by the drawdown-protocol to determine which band applies.
| Field | Value | Notes |
|---|---|---|
| Peak NAV | $10,000.00 | Inception value; will update on first new high |
| Current NAV | $10,000.00 | Pre-inception |
| Absolute drawdown | 0.0% | (current − peak) / peak |
| Rolling 6-mo return vs. SPY | n/a | Insufficient history |
| Rolling 6-mo return vs. RPV | n/a | Insufficient history |
| Current band | None | First band fires at −5% |
| Days in current drawdown | 0 | Counted on trading days from last peak |
| Last band-crossing event | n/a | Logged here when crossed |
| Pending mandated actions | None | Re-reads / rebooks owed from prior band crossings |
When a band crosses, the day's portfolio note records the crossing event with the recommended action, and the "Pending mandated actions" row carries the action until completed. Band thresholds, definitions, and recommended responses live in drawdown-protocol.
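The state update the daily check performs on this table can be sketched as follows. This is a minimal illustration, not the drawdown-protocol itself: the names `DrawdownState`, `update_state`, and `first_band_fired` are hypothetical, only the first band (−5%) is modeled because the remaining thresholds live in drawdown-protocol, and a NAV equal to the peak is assumed to count as zero days in drawdown.

```python
from dataclasses import dataclass

FIRST_BAND = -0.05  # "First band fires at -5%" per the table above


@dataclass
class DrawdownState:
    peak_nav: float
    current_nav: float
    days_in_drawdown: int = 0  # trading days since the last peak

    @property
    def absolute_drawdown(self) -> float:
        # (current - peak) / peak: zero at a peak, negative below it
        return (self.current_nav - self.peak_nav) / self.peak_nav


def update_state(state: DrawdownState, nav_today: float) -> DrawdownState:
    """Return the state after one trading day's closing NAV."""
    if nav_today >= state.peak_nav:
        # At or above the peak: peak moves up (or stays) and the day count restarts
        return DrawdownState(nav_today, nav_today, 0)
    # Below the peak: peak is unchanged, one more trading day in drawdown
    return DrawdownState(state.peak_nav, nav_today, state.days_in_drawdown + 1)


def first_band_fired(state: DrawdownState) -> bool:
    return state.absolute_drawdown <= FIRST_BAND
```

Calling `update_state` once per trading day, before any other step, mirrors the ordering described above.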
Master register
| Ticker | Thesis | Buy date | Conviction tier | Target size % | Current size % | Cost basis ($) | Shares | Current ($) | Position value | P&L (%) | Kill criteria fired? | Sell trigger ($) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
No active positions as of 2026-05-23. First build scheduled for 2026-05-26 (Tuesday — Memorial Day closure pushed inception one day).
Cash
| Date | Cash position | % of NAV |
|---|---|---|
| 2026-05-23 | $10,000 | 100% (pre-inception) |
Cash buffer is part of the strategy, not a residual. The deep-value framework explicitly favors holding cash when nothing screens cheap enough (see margin-of-safety-pricing).
Conviction tiers (from position-sizing-kelly)
| Tier | Target weight | When to use |
|---|---|---|
| Core 1 | 8–12% | Highest conviction, wide margin of safety, deeply researched |
| Core 2 | 5–8% | Strong conviction, good margin of safety, well-researched |
| Mid | 2–4% | Moderate conviction or developing thesis |
| Probe | 0.5–1.5% | Learning a name; not yet sized for full conviction |
Hard limits regardless of conviction:
- No single position above 15% at cost
- No single industry above 30%
- No single country above 25% (Tier 1 country)
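A rough sketch of how the tier ranges and two of the hard limits could be checked mechanically. The function names and the position dictionary keys (`ticker`, `cost`, `value`, `industry`) are hypothetical. Measuring the industry limit on current value relative to NAV is an assumption, and the country limit is left out because the scope of "Tier 1 country" is not defined in this file.

```python
from collections import defaultdict

# Target-weight ranges from the conviction-tier table, as fractions of NAV
TIER_RANGES = {
    "Core 1": (0.08, 0.12),
    "Core 2": (0.05, 0.08),
    "Mid": (0.02, 0.04),
    "Probe": (0.005, 0.015),
}
MAX_POSITION_AT_COST = 0.15
MAX_INDUSTRY = 0.30


def tier_for_weight(weight):
    """Return the first tier whose range contains the weight, else None."""
    for tier, (low, high) in TIER_RANGES.items():
        if low <= weight <= high:
            return tier
    return None  # weight falls in a gap between tiers or outside all of them


def limit_breaches(positions, nav):
    """List hard-limit breaches for a list of position dicts."""
    breaches = []
    by_industry = defaultdict(float)
    for p in positions:
        if p["cost"] / nav > MAX_POSITION_AT_COST:
            breaches.append(f"{p['ticker']}: position above 15% at cost")
        by_industry[p["industry"]] += p["value"]
    for industry, value in by_industry.items():
        if value / nav > MAX_INDUSTRY:  # assumption: measured on current value
            breaches.append(f"{industry}: industry above 30%")
    return breaches
```

The 8% boundary appears in both Core 1 and Core 2; this sketch resolves it to Core 1 simply because that tier is checked first.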
Position-management discipline (lives here, not in the thesis)
Each active position is managed against:
- Buy / build pattern — sized incrementally to target weight as conviction confirms and (where applicable) price weakens.
- Sell trigger — when central value is reached, trim to half position; when 15-25% above central value, exit fully.
- Kill criteria — specific observable events from the thesis. When any fires, the thesis is re-evaluated immediately, not on the regular cadence. If the kill is confirmed, position is exited regardless of price.
- Opportunity cost test — periodically (at least quarterly), is this position still the best use of this slot, or is there a clearly better idea in the workbench / watchlist with wider margin of safety?
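The sell-trigger rule in the list above reduces to a small decision function. This is a sketch under one stated assumption: the full-exit level is given only as a 15–25% band above central value, so the `exit_premium` parameter (default 0.20) is a hypothetical choice within that band, not a value from the kit.

```python
def sell_action(price: float, central_value: float, exit_premium: float = 0.20) -> str:
    """Map a price to 'hold', 'trim_to_half', or 'exit' for one position."""
    if not 0.15 <= exit_premium <= 0.25:
        raise ValueError("exit_premium should sit inside the 15-25% band")
    if price >= central_value * (1 + exit_premium):
        return "exit"  # far enough above central value: exit fully
    if price >= central_value:
        return "trim_to_half"  # central value reached: trim to half position
    return "hold"
```

Kill criteria are deliberately outside this function: a confirmed kill exits the position regardless of price.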
Daily / weekly tracking cadence
- Daily — pre-market scan checks each portfolio position against current price for kill-criteria proximity and trigger crossing
- Weekly — Monday digest reviews each position end-to-end (price, P&L, news, filings, insider activity, kill-criteria status)
- Per-position calibration tracker — appended at each calibration checkpoint (quarterly + 12-month + 24-month from buy date)
How to add a position
When a buy decision is made:
- Record the transaction in Transactions with date, price, size, and link to the thesis
- Add a row to this register
- Update the linked thesis's frontmatter:
  `verdict: buy`, `position_size_pct: <target>`, `buy_date: <date>`
- Move the calibration tracker from "pre-position" to "active position" mode (the schedule starts from buy date if it differs from the thesis verdict date)
- Set up automated price alerts at the sell trigger and at material kill-criteria thresholds
How to exit a position
When a sell decision is made:
- Record the transaction in Transactions with date, price, size, reason (sell trigger / kill criterion / opportunity cost / other)
- Update the register row — move size to 0%, note the exit date and exit price
- Update the thesis frontmatter:
  `verdict: held → exited`, `exit_date`, `exit_price`
- Calibration tracker continues for at least 12 months post-exit to test whether the exit decision was correct
- Consider whether the name should re-enter the watchlist if a future price level would warrant re-engagement
Transaction cost model
Even paper money is debited a realistic friction at every transaction. Without this, the Performance scoreboard understates the cost of activity and overstates the kit's edge — and a 6-month scorecard built on frictionless returns cannot be honestly compared to anything investable.
Friction is quoted as a round-trip cost on the dollar value traded and is split evenly across the two legs: half is debited on the buy and half on the sell, regardless of whether the position was profitable. Tiers:
| Liquidity tier | Examples | Round-trip cost | Notes |
|---|---|---|---|
| Large-cap liquid US equity | S&P 500 names, top NASDAQ-100, large ADRs trading > $50M/day | 10 bps | Tight spread, ample volume |
| Mid-cap US equity | Russell mid-cap, secondary-tier ADRs, $5–50M/day volume | 25 bps | Wider spreads, occasional slippage |
| Small-cap or thin ADR | Russell 2000 and below, ADRs under $5M/day, recent IPOs, post-bankruptcy emergents | 50 bps | Material spread, slippage on size |
The friction is applied at the moment of transaction in Transactions and flows into NAV via the cost basis. At 10 bps round trip, a $1,000 buy of a large-cap liquid name carries $1.00 of total friction: the buy records as $1,000.50 invested ($1,000 of stock plus $0.50, the half applied at buy), and the sell side records the remaining half against proceeds.
For ETFs (excluding SPY, which is the benchmark and therefore outside the universe), the tier is determined by the underlying basket: broad large-cap ETFs at 10 bps, sector or factor ETFs at 25 bps, narrow thematic or country ETFs at 50 bps.
The friction tiers are deliberately conservative for a paper portfolio. The point is that the 6-month scorecard reflects a realistic operating cost, not an idealized one.
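A minimal sketch of the per-leg friction arithmetic, assuming the round-trip cost is split evenly between the buy and the sell leg. The tier keys (`large`, `mid`, `small`) and the function names are hypothetical labels for the three rows of the tier table.

```python
# Round-trip cost in basis points, one entry per liquidity tier
ROUND_TRIP_BPS = {"large": 10, "mid": 25, "small": 50}


def leg_friction(notional: float, tier: str) -> float:
    """Friction debited on one leg (buy or sell): half the round-trip cost."""
    return notional * ROUND_TRIP_BPS[tier] / 10_000 / 2


def buy_cost_basis(notional: float, tier: str) -> float:
    """Dollars recorded as invested: stock purchased plus buy-leg friction."""
    return notional + leg_friction(notional, tier)


def sell_proceeds(notional: float, tier: str) -> float:
    """Dollars credited to cash: sale value minus sell-leg friction."""
    return notional - leg_friction(notional, tier)
```

For example, a $2,000 buy of a mid-cap name at 25 bps round trip carries $2.50 on the buy leg, so `buy_cost_basis(2000, "mid")` records $2,002.50. The sell leg is computed on the sale value, which usually differs from the purchase value.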
Linked
- Watchlist — the previous stage; names move here when triggers fire and thesis refreshes confirm
- Transactions — chronological log of all buys, sells, and verdict changes
- Performance — daily NAV, alpha vs. SPY / RPV / RPG, drawdown ledger
- drawdown-protocol — portfolio-level kill criteria the daily check enforces against the state above
- position-sizing-kelly — the sizing framework
- permanent-capital-loss — what position management is ultimately protecting against
- margin-of-safety-pricing — sell discipline anchors
- deliverable-suite — where the portfolio sits in the broader pipeline
- six-month-test — the calibration checkpoint that scores realized vs. expected behavior
Performance — daily NAV and alpha vs. SPY, RPV, RPG
Daily ledger of the agent's paper portfolio performance. The primary benchmark is SPY total return from the same inception date; the secondary benchmarks are RPV (Invesco S&P 500 Pure Value ETF) and RPG (Invesco S&P 500 Pure Growth ETF). Alpha is the cumulative return differential against each benchmark: positive alpha means the agent is generating returns above the benchmark; negative alpha means the benchmark would have done better.
Three benchmarks are used rather than one because a long-only deep-value portfolio is implicitly long the value factor. A kit that beats SPY but only matches RPV is not generating alpha; it is renting value-factor exposure that any value ETF would deliver. The RPG track is the opposing comparison: if value as a category is having a hostile multi-year period, the gap to RPG quantifies the regime cost. Without all three, the six-month-test cannot honestly score whether the kit's edge is methodological or factor-based.
The point of this file is not to celebrate good days. It is to measure honestly whether the kit's discipline — deep value, structural quality, margin of safety — produces excess returns over a meaningful holding period versus the right comparison. Over short horizons, alpha is noise. Over multi-year horizons, it is the verdict on the kit itself.
Inception
- Date: 2026-05-26 (Tuesday; Monday May 25 is Memorial Day, US markets closed)
- Starting NAV: $10,000.00
- Starting SPY benchmark: TBD (set at first daily-check run)
- Starting RPV benchmark: TBD (set at first daily-check run)
- Starting RPG benchmark: TBD (set at first daily-check run)
Daily ledger
| Date | NAV ($) | Daily return | Cum. return | SPY cum. | RPV cum. | RPG cum. | Alpha vs. SPY | Alpha vs. RPV | Alpha vs. RPG | Notes |
|---|---|---|---|---|---|---|---|---|---|---|
| 2026-05-23 | 10,000.00 | n/a | n/a | n/a | n/a | n/a | n/a | n/a | n/a | Pre-inception, cash only |
Monthly summary
| Month | NAV end | Month return | SPY return | RPV return | RPG return | Alpha vs. SPY | Alpha vs. RPV | Alpha vs. RPG | Notes |
|---|---|---|---|---|---|---|---|---|---|
Drawdown tracking
Both absolute drawdown (peak-to-trough on portfolio NAV) and relative drawdown vs. the SPY and RPV benchmarks are tracked. The drawdown-protocol fires on whichever is worse at any given check.
| Date | Peak NAV | Current NAV | Absolute DD % | Rolling 6-mo vs. SPY | Rolling 6-mo vs. RPV | Active band | SPY DD for comparison |
|---|---|---|---|---|---|---|---|
Attribution (quarterly)
What drove returns this quarter? Quarterly review of:
- Best and worst contributing positions
- Sectors over- and under-weight vs. benchmark
- Cash drag vs. benefit
- Decisions that mattered (entries, exits, sizing)
| Quarter | Best contributor | Worst contributor | Sector tilt vs. SPY | Key decisions |
|---|---|---|---|---|
How returns are computed
- Daily NAV = sum(position value) + cash. Position value = shares × closing price. Transaction friction per Portfolio § Transaction cost model is debited at the time of each trade and flows through to NAV via cost basis.
- Daily return = (NAV today − NAV yesterday) / NAV yesterday. Initial date returns null.
- Cumulative return = (NAV today − starting NAV) / starting NAV.
- SPY / RPV / RPG benchmarks = same percentage move applied to a hypothetical $10,000 in each ETF at the same inception date. We track each level for comparability; all three are total-return (dividends reinvested in benchmark).
- Alpha vs. each benchmark = portfolio cumulative return − benchmark cumulative return. In percentage points.
- Absolute drawdown = (current NAV − peak NAV) / peak NAV. Always negative or zero. Peak resets only on a new high.
- Rolling 6-mo vs. benchmark = trailing-130-trading-day portfolio return minus trailing-130-trading-day benchmark return. Used by drawdown-protocol to detect relative-underperformance bands even when absolute drawdown is shallow.
Dividends from benchmark ETFs are reinvested in the benchmark calculation. Dividends from portfolio positions are accumulated in cash unless a separate reinvestment instruction is added.
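The return definitions above translate directly into a few small functions. This is an illustrative sketch: the function names are hypothetical, NAV and benchmark levels are assumed to be plain lists of daily closing values in date order, and returning `None` when history is shorter than the window is an assumption that matches the "n/a, insufficient history" entries elsewhere in these files.

```python
def daily_return(nav_today: float, nav_yesterday: float) -> float:
    return (nav_today - nav_yesterday) / nav_yesterday


def cumulative_return(nav_today: float, starting_nav: float) -> float:
    return (nav_today - starting_nav) / starting_nav


def alpha_pp(portfolio_cum: float, benchmark_cum: float) -> float:
    """Alpha in percentage points from two cumulative returns given as fractions."""
    return (portfolio_cum - benchmark_cum) * 100


def rolling_relative(nav_series, bench_series, window: int = 130):
    """Trailing-window portfolio return minus trailing-window benchmark return.

    Returns None until there are more than `window` observations.
    """
    if len(nav_series) <= window or len(bench_series) <= window:
        return None
    port = nav_series[-1] / nav_series[-1 - window] - 1
    bench = bench_series[-1] / bench_series[-1 - window] - 1
    return port - bench
```

The benchmark series here stands for the hypothetical $10,000 tracked in each ETF on a total-return basis, so the same function serves SPY, RPV, and RPG.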
Periodic review
- Daily — the portfolio daily-check task updates this file automatically: today's row in the daily ledger, drawdown updated if new peak or new trough.
- Monthly — first business day of each month, the daily task computes the prior month's summary row.
- Quarterly — first business day of each new quarter, attribution analysis is appended.
- Annual — anniversary of inception, full-year retrospective compares actual returns to the calibration predictions in each thesis tracker.
Reading the alpha
A few months of positive alpha is not evidence the kit works. A few months of negative alpha is not evidence the kit fails. The signal-to-noise ratio in equity returns over short periods is poor.
The meaningful tests:
- 6 months — first scheduled checkpoint per six-month-test. Sample is small; alpha is mostly descriptive but the attribution split (selection vs. sizing, alpha vs. SPY vs. alpha vs. RPV) is already informative.
- 12 months — sample is still small but trends start to be visible. Was alpha positive against all three benchmarks on a return-weighted basis?
- 24 months — covers at least one cycle of revaluation. Alpha at this point reflects more than just random sequencing.
- 36+ months — sample is meaningful. Alpha here, against RPV in particular, is closer to a real verdict on the kit.
Until 12 months, treat alpha as descriptive (here is what happened) rather than evaluative (is the kit working). The exception is the six-month-test which uses pre-specified success and failure criteria written before any data existed, precisely to avoid the rationalization a free-form 6-month read invites.
The factor-vs-skill question
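The reason for tracking three benchmarks rather than one is that the combination of alpha vs. SPY and alpha vs. RPV separates four configurations: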
- Alpha vs. SPY positive, alpha vs. RPV positive — the kit is generating excess return above and beyond the value-factor tailwind. This is the configuration the mission claims.
- Alpha vs. SPY positive, alpha vs. RPV ~zero or negative — the kit is delivering value-factor beta, not methodological alpha. A value ETF would do the same for ten basis points of fees.
- Alpha vs. SPY ~zero, alpha vs. RPV positive — value is in a hostile regime; the kit is generating relative-to-style alpha but the style itself is being punished. Continue if the lens has not been falsified per 06-falsification.
- Alpha vs. SPY negative, alpha vs. RPV negative — neither the kit nor the style is working. This is the configuration that escalates to falsification review.
The split matters more than the headline number. From month 6 onward, state explicitly in each monthly summary which of the four configurations applies.
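The four configurations can be expressed as a small classifier. This is a sketch under assumptions: the tolerance `tol` that defines "~zero" (0.5 percentage points here) is a hypothetical choice, the label strings are illustrative, and combinations the list above does not cover are returned as `"unclassified"` rather than forced into a bucket.

```python
def classify(alpha_spy: float, alpha_rpv: float, tol: float = 0.5) -> str:
    """Classify cumulative alphas (in percentage points) into a configuration."""
    spy_pos = alpha_spy > tol
    spy_flat = abs(alpha_spy) <= tol
    spy_neg = alpha_spy < -tol
    rpv_pos = alpha_rpv > tol
    rpv_neg = alpha_rpv < -tol

    if spy_pos and rpv_pos:
        return "alpha beyond the value factor"
    if spy_pos and not rpv_pos:
        return "value-factor beta"  # RPV alpha is ~zero or negative
    if spy_flat and rpv_pos:
        return "style alpha in a hostile regime"
    if spy_neg and rpv_neg:
        return "falsification review"
    return "unclassified"  # combination not described in the list above
```

Reporting the label alongside both alpha figures keeps the split visible instead of collapsing it into one headline number.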
Linked
- Portfolio — current positions, drawdown state, transaction cost model
- Transactions — full audit trail
- drawdown-protocol — the portfolio-level kill criteria the drawdown columns feed
- six-month-test — pre-specified evaluation criteria for the first major checkpoint
- position-sizing-kelly — the sizing framework
- margin-of-safety-pricing — the cash-vs-deploy discipline
- methodology-calibration — long-run audit of whether central values are biased
- kit-debrief-001-PLTR — kit improvement log; performance feeds this
Transactions — chronological log
Every verdict change, every buy, every sell, every status migration. The audit trail for decision quality. Calibration scoring traces back through this log.
Log
| Date | Ticker | Action | Size % | Price ($) | Thesis | Rationale |
|---|---|---|---|---|---|---|
| 2026-05-23 | PLTR | Verdict: PASS-with-trigger; added to Watchlist | 0 | 135.00 (ref) | PLTR | Central value $52, buy trigger $29 (45% MoS). Price implies top-decile-decade software outcome; insider selling $6B with zero buys; no margin of safety. Watchlist for re-engagement at ~$29. |
Conventions
Actions:
- `Verdict: X`: thesis verdict assigned or changed (X = buy, pass-with-trigger, pass, avoid, short-candidate)
- `Buy`: capital committed; size and price filled in
- `Add`: added to existing position; size delta in size column
- `Trim`: partial sell; size delta in size column (negative)
- `Sell`: full exit
- `Move: X → Y`: pipeline state change (e.g., Watchlist → Portfolio when trigger fires)
- `Status check`: periodic re-review without action (less frequent log entries; mostly for material thesis updates)
Size: for buys/adds/trims, the change in position size as % of portfolio. For verdicts and status checks, 0.
Price: transaction price for trades; reference price for verdicts and status checks.
Rationale: one to three lines. The reason this action happened. Future calibration reads this column to assess whether reasoning was sound.
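One way to keep log rows consistent with these conventions is to validate them at write time. The sketch below is hypothetical: the `LogEntry` class and its field names are illustrative, and it enforces only the rules stated above (known action kinds, zero size for verdicts and status checks, negative size delta for trims).

```python
from dataclasses import dataclass

ACTION_KINDS = {"Verdict", "Buy", "Add", "Trim", "Sell", "Move", "Status check"}


@dataclass(frozen=True)
class LogEntry:
    date: str
    ticker: str
    action: str
    size_pct: float
    price: float
    thesis: str
    rationale: str

    def __post_init__(self):
        # "Verdict: X" and "Move: X -> Y" carry detail after the colon
        kind = self.action.split(":")[0].strip()
        if kind not in ACTION_KINDS:
            raise ValueError(f"unknown action kind: {kind!r}")
        if kind in {"Verdict", "Status check"} and self.size_pct != 0:
            raise ValueError("verdicts and status checks carry size 0")
        if kind == "Trim" and self.size_pct >= 0:
            raise ValueError("a trim records a negative size delta")
```

The existing PLTR row passes this check: its action begins with `Verdict:` and its size is 0.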
Periodic audit
Quarterly, re-read the last 90 days of transactions. Look for:
- Pattern of churn (frequent small trades signal hesitation or noise-trading)
- Pattern of conviction drift (positions that started Core 1 and got trimmed without a clear trigger)
- Pattern of regret (entries where rationale was thin, in hindsight)
- Pattern of validated discipline (positions where kill criteria fired and were honored)
Annually, the transactions log feeds the calibration scorecard in 10-Calibration/.
Linked
- Portfolio — live register driven by these transactions
- Watchlist — passive register; transitions to/from logged here
- position-sizing-kelly — the sizing framework that should drive every transaction
- 05-decision-framework — every action passes through the framework's gates