How to Build a Thematic Stock Screener (AI, EV, Clean Energy)

Thematic investing lives in the gap between a story and a spreadsheet. You’re betting on a structural shift, not a single quarter. But themes like artificial intelligence, electric vehicles, and clean energy are messy and fast-moving. Labels get slapped on companies that barely fit. Others stay invisible because they don’t talk about the theme, they quietly sell the picks and shovels. A good thematic stock screener helps you map that terrain, separate signal from noise, and act with discipline.

This is a practical build. I’ll focus on how to design a thematic stock screener that finds and ranks candidates across AI, EV, and clean energy, with the flexibility to reuse the same framework for other themes. I’ll also call out pitfalls I’ve learned by trial, error, and a few painful positions.

Start with a thesis you can falsify

Themes grow barnacles. Marketing, TikTok headlines, broker hype. Before you query a single ticker, write the thesis in two or three sentences and include what would disprove it. For AI, you might argue that inference workloads migrate from general purpose GPUs to specialized accelerators, favoring companies with software lock-in over commodity hardware. For EVs, you might believe adoption rises but margins compress, so value pools shift to charging networks and fleet software. For clean energy, you might see grid bottlenecks and interconnect queues as the gating factor, making transmission equipment and power electronics the scarce assets.

A falsifiable thesis gives you a north star for your screener. If you can’t describe where economic rents accrue, your filter will just collect familiar logos.

Define theme taxonomies: core, picks and shovels, enablers, passengers

A taxonomy helps sort exposure and avoid false positives. Most theme investors blend three buckets, with enablers and passengers grouped together:

Core winners directly monetizing the theme. Think leading hyperscalers driving AI workloads or a top EV manufacturer with global scale.
Picks and shovels that sell tools into the growth engine, like semiconductor equipment makers, battery materials refiners, or grid inverter manufacturers.
Enablers and passengers that benefit indirectly. Data center REITs, copper miners, industrial gas suppliers.

The taxonomic labels are not rigid. A company can straddle buckets. Nvidia looks core for AI infrastructure, yet it also functions as a picks and shovels supplier to AI application builders. Still, forcing yourself to tag companies in one of these groups clarifies what to measure and how to compare them.

Build your data spine first

A screener is only as useful as its data spine. Integrate three layers: fundamentals, theme signals, and risk context.

Fundamentals include revenue growth, gross margin, free cash flow margin, net cash or net debt, R&D intensity, and capital intensity. For hardware-centric themes like EVs and clean energy, working capital swings and cash conversion cycles matter. For AI software or semis, gross margin structure and tape-out cadence matter more.

Theme signals require a bit of craft. I’ve had success with a blend of the following:

Segment-level revenue and commentary. Map reported segments to the theme. If a company discloses “AI-related” revenue or “EV battery” revenue, capture it with a conservative interpretation. Where management guidance is vague, triangulate through shipments or capacity additions.
Capex intent. AI data center growth shows up in hyperscaler capex plans, while EV manufacturing and battery plants show up as gigawatt-hour and gigafactory announcements. Clean energy exposure appears in interconnect backlogs and inverter/transformer order books.
Supply chain linkages. Cross-reference supplier and customer disclosures. For example, match a power semiconductor company’s design wins with EV OEM models, or match HBM memory suppliers with announced GPU ramps.
Patent and hiring trends. Not a magic bullet, but sustained hiring in model optimization or power electronics can corroborate exposure.
Regulatory catalysts. Track subsidies, tax credits, and standards. The Inflation Reduction Act changed the economics for US solar and battery supply chains. Euro 7, IRA content rules, and China’s policy moves all feed the screen.

Risk context keeps you honest. Capture leverage, refinancing needs in the next 24 months, geographic revenue split, and commodity exposures. For EVs, lithium and nickel prices swing margins. For clean energy, polysilicon prices and tariff regimes matter. For AI, the GPU supply chain and data center power availability can become bottlenecks.

Translate the thesis into measurable filters

A thematic screen should not look like a generic “growth at any price” filter. Tailor your criteria to where the profits are likely to accrue.

For AI, I prefer two distinct screens: infrastructure and application. On infrastructure, I look for companies with high incremental gross margins per watt or per compute unit sold, visible capacity additions, and pricing power. That pushes you toward GPU suppliers, networking fabric vendors, HBM memory producers, and power equipment manufacturers tied to data center builds. On applications, I emphasize user distribution and switching costs, not just model performance. A small company demoing clever AI feels exciting, but a platform with millions of embedded users and strong unit economics is a different animal.

For EVs, consider unit growth and cost curves, but focus the filter on where returns are less cyclical. Charging networks with rising utilization, power electronics suppliers with design wins across multiple OEMs, and thermal management firms with higher content per vehicle tend to have better durability than a single OEM dependent on a consumer credit cycle.

For clean energy, the screen should capture grid realities. Inverter makers and transformer suppliers with multi-quarter backlogs and pricing power can sit in a sweet spot. Developers with heavy interconnect exposure need careful risk weighting. Turbine OEMs can look cheap on sales, but warranty provisions and retrofit costs can destroy returns, so include a quality-of-earnings filter that penalizes frequent “one-time” charges.

Core fields to capture in your dataset

Your columns should reflect how value is created and defended within each theme. At minimum, store:

Theme revenue share. Ideally, a conservative range, not a single point estimate. If the CEO claims 60 percent of revenue from AI, assume a lower bound unless segments corroborate.
Order backlog and book-to-bill. In semis and power equipment, book-to-bill above 1.0 with stable lead times can signal sustainable demand. In EVs, watch orders versus cancellations and fleet conversions.
Capacity and capex. For AI, track GPU wafer starts, HBM capacity by fab, and data center megawatts under construction. For clean energy, track gigawatt capacity in modules, inverters, and transformers.
Unit economics. Gross margin trajectory, contribution margin for new products, warranty provisions, and service revenue share. Many “best stocks to buy now” lists ignore that a two-point margin swing can erase the equity story.
Competitive moat signals. Share of design wins, switching cost proxies, installed base, ecosystem depth. For example, you might encode a score for a software platform with APIs deeply embedded in enterprise workflows versus a point tool.
Balance sheet strength. Net cash, maturities within two years, access to capital markets. The next rate cycle shift can punish capital-intensive stories.
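
One way to hold these fields is a small record type. The sketch below is illustrative only: the class name ThemeRecord, its field names, and the sample figures are assumptions, not a standard schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThemeRecord:
    """One row of the screener dataset (hypothetical schema)."""
    ticker: str
    theme_share_low: float    # conservative lower bound of theme revenue share, 0..1
    theme_share_high: float   # upper bound, e.g. management's claim, 0..1
    orders: float             # orders booked in the period
    billings: float           # revenue billed in the same period
    gross_margin: float       # 0..1
    net_cash: float           # negative means net debt
    debt_due_2y: float        # maturities within two years

    def book_to_bill(self) -> Optional[float]:
        """Orders divided by billings; None when billings are not positive."""
        if self.billings <= 0:
            return None
        return self.orders / self.billings

    def refinancing_gap(self) -> float:
        """Debt due within two years that positive net cash does not cover."""
        return max(0.0, self.debt_due_2y - max(0.0, self.net_cash))


# Made-up numbers for illustration.
rec = ThemeRecord("XYZ", 0.20, 0.30, 120.0, 100.0, 0.42, 50.0, 80.0)
print(rec.book_to_bill(), rec.refinancing_gap())
```

Storing the theme share as a low/high pair keeps the conservative range visible instead of collapsing it into a single point estimate.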

Where to source the data without fabricating anything

Focus on public and repeatable sources. The best raw materials are company filings, investor day decks, and transcripts. Use these to determine segment exposure and guidance details. Supplier and customer disclosures connect dots. Industry associations release shipment and installation data for solar, wind, EVs, and grid equipment. Government databases track interconnect queues and capacity additions. For hiring and patents, use official company job boards and patent offices rather than third-party counts that can misclassify.

Avoid overfitting to alternative data that looks impressive but rarely maps cleanly to profits. Web traffic, GitHub stars, or social mentions can complement, but they should not drive the screen.

Design a scoring system that respects uncertainty

Even with clean inputs, theme attribution has error bars. Code those uncertainty ranges into the score. I use banded scores rather than single-number precision. For instance, theme revenue share might be 20 to 30 percent with medium confidence. The score can reflect both the midpoint and the confidence penalty.
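
A minimal sketch of a banded exposure score follows. The penalty values in CONFIDENCE_PENALTY are assumptions chosen for illustration; calibrate them to your own tolerance for attribution error.

```python
# Assumed penalties per confidence band; not derived from any dataset.
CONFIDENCE_PENALTY = {"high": 0.0, "medium": 0.15, "low": 0.35}


def exposure_score(low: float, high: float, confidence: str) -> float:
    """Midpoint of the exposure band, shrunk by a confidence penalty."""
    if not 0.0 <= low <= high <= 1.0:
        raise ValueError("expected 0 <= low <= high <= 1")
    midpoint = (low + high) / 2.0
    return midpoint * (1.0 - CONFIDENCE_PENALTY[confidence])


# The 20 to 30 percent band with medium confidence from the text.
print(exposure_score(0.20, 0.30, "medium"))
```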

Separate presence from quality. A company at 80 percent AI revenue with low margins and rising inventory may deserve a lower rank than a company at 25 percent exposure with high returns on invested capital and pricing power. Your composite score should blend exposure, quality, and durability. If you’re building the screener for tactical trades, you can tilt toward near-term catalysts and momentum, but keep the quality dimension visible so you do not confuse a squeeze with a business.

Add guardrails against fads and survivorship bias

Every hot theme attracts marginal players that chase multiple narratives. I’ve seen battery companies pivot to hydrogen and back to EVs within two years. Guardrails help:

Minimum operating history for core metrics. Demand at least eight quarters of revenue and gross margin data unless the company is a genuine early-stage listing with a clear capital runway.
Dilution checks. If shares outstanding rise faster than revenue over a sustained period, down-rank the name unless capital raises are funding provable capacity coming online.
Disclosure discipline. Companies that frequently redefine segments or adopt new non-GAAP adjustments to smooth earnings deserve skepticism.
Inventory and receivables trends. Swelling inventories or receivables growing faster than sales often precede disappointment in hardware-heavy themes.
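
Three of these guardrails are mechanical enough to code. The sketch below assumes quarterly series that are aligned, ordered oldest to newest, and start from a positive value; the function names and flag labels are hypothetical. Disclosure discipline is left out because it needs a human read of the filings.

```python
from typing import List, Sequence


def total_growth(series: Sequence[float]) -> float:
    """Growth from the first to the last observation."""
    return series[-1] / series[0] - 1.0


def guardrail_flags(revenue: Sequence[float],
                    shares: Sequence[float],
                    inventory: Sequence[float],
                    min_quarters: int = 8) -> List[str]:
    """Return guardrail flags for one company."""
    flags: List[str] = []
    if len(revenue) < min_quarters:
        flags.append("short_history")
    if len(revenue) < 2:
        return flags  # not enough data to compare growth rates
    revenue_growth = total_growth(revenue)
    if total_growth(shares) > revenue_growth:
        flags.append("dilution")         # share count outpacing sales
    if total_growth(inventory) > revenue_growth:
        flags.append("inventory_build")  # inventory outpacing sales
    return flags


# Made-up eight-quarter series.
revenue = [100, 104, 108, 112, 116, 120, 124, 130]
shares = [50, 52, 55, 58, 61, 64, 67, 70]
inventory = [20, 21, 22, 24, 25, 27, 28, 30]
print(guardrail_flags(revenue, shares, inventory))
```

A flag here is a prompt to down-rank or investigate, not an automatic exclusion; a capital raise that funds provable capacity would still trip the dilution check.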

These guardrails reduce noise when you try to find stocks aligned with the theme rather than stories optimized for headlines.

A practical build: from query to ranked list

Start with the universe. For AI, that might be global semis, memory, networking, cloud platforms, and select software names. For EVs, add OEMs, tier-1 suppliers, charging networks, battery materials, and semiconductor content providers. For clean energy, include module and inverter makers, grid equipment, developers, yieldcos, and EPC contractors.

Next, apply baseline liquidity and financial health filters. You can require average daily value traded above a threshold to avoid slippage. Exclude companies with material going-concern warnings or imminent debt maturities if your mandate avoids restructuring risk.
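
As a rough sketch, the baseline filter can be a single predicate. The 5,000,000 threshold and the function names are placeholders; set the threshold from your own position sizes and mandate.

```python
from typing import Sequence


def average_daily_value(closes: Sequence[float], volumes: Sequence[float]) -> float:
    """Mean of close * volume over the supplied days."""
    if not closes or len(closes) != len(volumes):
        raise ValueError("closes and volumes must be non-empty and aligned")
    return sum(c * v for c, v in zip(closes, volumes)) / len(closes)


def passes_baseline(closes: Sequence[float],
                    volumes: Sequence[float],
                    going_concern_warning: bool,
                    min_daily_value: float = 5_000_000.0) -> bool:
    """True when the name is liquid enough and carries no going-concern warning."""
    if going_concern_warning:
        return False
    return average_daily_value(closes, volumes) >= min_daily_value


print(passes_baseline([10.0, 10.0], [1_000_000, 1_000_000], False))
```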

Then code the theme exposure. Use a rule-based approach first. For example, a power semiconductor firm with more than 30 percent of revenue from automotive and explicit design wins in EV platforms gets an EV tag. An inverter manufacturer with utility-scale backlog and disclosed gigawatts shipped gets a clean energy tag. A memory supplier with public HBM ramp plans and named GPU customers gets an AI infrastructure tag. For software, look for enterprise AI products with disclosed ARR and cohort retention.
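
The first three rules above translate almost directly into code. In this sketch the dictionary keys (auto_rev_share, ev_design_wins, and so on) are hypothetical field names; only the 30 percent automotive threshold comes from the text.

```python
from typing import Dict, List


def tag_company(c: Dict[str, object]) -> List[str]:
    """Apply simple rule-based theme tags to one company record."""
    tags: List[str] = []
    # EV: more than 30 percent automotive revenue plus explicit EV design wins.
    if c.get("auto_rev_share", 0.0) > 0.30 and c.get("ev_design_wins", 0) > 0:
        tags.append("EV")
    # Clean energy: utility-scale backlog plus disclosed gigawatts shipped.
    if c.get("utility_backlog", 0.0) > 0 and c.get("gw_shipped_disclosed", False):
        tags.append("clean_energy")
    # AI infrastructure: public HBM ramp plans plus named GPU customers.
    if c.get("hbm_ramp_public", False) and c.get("named_gpu_customers", 0) > 0:
        tags.append("AI_infrastructure")
    return tags


print(tag_company({"auto_rev_share": 0.35, "ev_design_wins": 3}))
```

Keeping each rule as one readable condition makes it easy to audit why a name received a tag before any statistical classifier is layered on top.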

With tagging complete, bring in the quality and durability metrics. Compute rolling gross margin trends, free cash flow margin, return on invested capital where meaningful, and backlog-to-revenue ratios. Normalize across industries. A 55 percent gross margin in software is different from 55 percent in power equipment.
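
One simple way to normalize is a z-score within each industry group. This is a sketch of that one approach, not the only option; the row layout, tickers, and margins below are made up.

```python
from collections import defaultdict
from statistics import mean, pstdev
from typing import Dict, List


def industry_zscores(rows: List[dict], field: str) -> Dict[str, float]:
    """Z-score of `field` for each ticker relative to its own industry."""
    groups = defaultdict(list)
    for r in rows:
        groups[r["industry"]].append(r[field])
    out: Dict[str, float] = {}
    for r in rows:
        values = groups[r["industry"]]
        spread = pstdev(values)
        # A group with no dispersion gets a neutral score.
        out[r["ticker"]] = 0.0 if spread == 0 else (r[field] - mean(values)) / spread
    return out


rows = [
    {"ticker": "SW_A", "industry": "software", "gross_margin": 0.55},
    {"ticker": "SW_B", "industry": "software", "gross_margin": 0.75},
    {"ticker": "SW_C", "industry": "software", "gross_margin": 0.85},
    {"ticker": "PE_A", "industry": "power_equipment", "gross_margin": 0.25},
    {"ticker": "PE_B", "industry": "power_equipment", "gross_margin": 0.35},
    {"ticker": "PE_C", "industry": "power_equipment", "gross_margin": 0.55},
]
z = industry_zscores(rows, "gross_margin")
print(z["SW_A"], z["PE_C"])
```

In this toy data the same 55 percent margin scores below average for the software name and above average for the power equipment name.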

Finally, assemble the composite score. I prefer a three-pillar structure: exposure, quality, and durability. Weight them depending on your thesis. If you’re early in a theme, emphasize quality and durability to avoid value traps. Later in a cycle, when leaders are obvious and richly priced, you might allow more exposure weight to find underfollowed enablers.
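
A minimal sketch of the three-pillar blend, assuming each pillar has already been scaled to the 0 to 1 range. The default weights are placeholders, not a recommendation.

```python
from typing import Sequence


def composite_score(exposure: float,
                    quality: float,
                    durability: float,
                    weights: Sequence[float] = (0.30, 0.35, 0.35)) -> float:
    """Weighted blend of the three pillars; weights are rescaled to sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    w_exposure, w_quality, w_durability = (w / total for w in weights)
    return w_exposure * exposure + w_quality * quality + w_durability * durability


# High exposure with weak quality versus modest exposure with strong quality.
high_exposure = composite_score(0.80, 0.20, 0.30)
high_quality = composite_score(0.25, 0.90, 0.85)
print(high_exposure, high_quality)
```

With these weights the 25 percent exposure name outranks the 80 percent exposure name, which matches the earlier point about separating presence from quality. Shifting weight toward exposure narrows that gap.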

Backtesting without lying to yourself

The temptation to curve-fit is strong. The theme changes definitions over time, and many companies did not disclose AI or EV revenue five years ago. To stay honest, backtest on features that existed then. For AI, use proxy variables like data center capex exposure and HPC share rather than retroactively counting “AI revenue.” For EVs, use automotive content per vehicle and announced EV model pipelines instead of modern EV segment revenue disclosures.
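
One way to enforce this is to record when each feature first became observable and filter on the backtest date. The feature names and dates below are hypothetical placeholders; fill them in from when the disclosures actually began for your universe.

```python
from datetime import date
from typing import Dict, List

# Hypothetical first-available dates, for illustration only.
FEATURE_FIRST_AVAILABLE: Dict[str, date] = {
    "data_center_capex_exposure": date(2015, 1, 1),
    "hpc_revenue_share": date(2016, 1, 1),
    "ai_segment_revenue": date(2023, 1, 1),
}


def usable_features(as_of: date, availability: Dict[str, date]) -> List[str]:
    """Features that already existed on the backtest date."""
    return sorted(name for name, first in availability.items() if first <= as_of)


print(usable_features(date(2019, 6, 30), FEATURE_FIRST_AVAILABLE))
```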

Look for stability of rank rather than raw returns. If the screener consistently surfaces companies that later emerge as category leaders, you’re on the right track. If the screener works only for the last two years, you’ve probably overfit to a bull narrative.
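
Rank stability between two screening dates can be measured with a Spearman rank correlation. The sketch below uses the textbook formula, which assumes no tied scores; with ties, ranks would need averaging first. The function names are my own.

```python
from typing import Dict


def ranks(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank 1 is the highest score. Assumes no ties."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {ticker: i + 1 for i, ticker in enumerate(ordered)}


def rank_stability(prev: Dict[str, float], curr: Dict[str, float]) -> float:
    """Spearman correlation of ranks over tickers present on both dates."""
    common = set(prev) & set(curr)
    n = len(common)
    if n < 2:
        raise ValueError("need at least two common tickers")
    r_prev = ranks({t: prev[t] for t in common})
    r_curr = ranks({t: curr[t] for t in common})
    d_squared = sum((r_prev[t] - r_curr[t]) ** 2 for t in common)
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


earlier = {"A": 0.9, "B": 0.7, "C": 0.5, "D": 0.3}
reversed_later = {"A": 0.3, "B": 0.5, "C": 0.7, "D": 0.9}
print(rank_stability(earlier, earlier), rank_stability(earlier, reversed_later))
```

A value near 1 means the screen ranks names consistently across periods; a value that collapses outside the most recent years is a warning sign of overfitting.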

Case sketches: how the framework handles each theme

AI infrastructure highlights the power bottlenecks. Data centers need more megawatts per rack. That pulls in switchgear, transformers, busways, thermal management, and advanced networking. Your screener should tag companies with disclosed data center revenue shares, orders tied to hyperscaler builds, and capacity expansions in relevant product lines. It should reward pricing power and backlog quality while penalizing reliance on a single hyperscaler.

AI applications split between horizontal platforms and vertical tools. A screen that only captures press release volume misses the real moat. Focus on deployed users, integration depth, and incremental gross profit per incremental compute dollar. Companies that turn increased inference costs into higher ARPU with minimal churn deserve a higher score than those giving features away to drive vanity metrics.

EVs bring the classic value-shift from OEMs to suppliers. Most EV OEMs battle scale, pricing, and warranty learning curves. Your screener should favor tier-1 suppliers with multi-OEM design wins in traction inverters, SiC devices, onboard chargers, and thermal systems. Charging networks with rising station utilization and strong site hosts can pass a durability test. Battery materials require a separate lens, balancing price realization against long-term offtake contracts and environmental permitting risk.

Clean energy divides between generation and the grid. Utility-scale solar and wind developers have become capital structure stories. Equipment makers with pricing power and multi-quarter backlogs such as inverter, transformer, and HVDC technology providers can show better return conversion. Your screener should identify backlog growth, price discipline, and service attach rates. Also consider yieldcos or infrastructure funds for income investors, but add sensitivity to interest rates and contract resets.

How to integrate valuation without letting it dominate

Valuation matters, but it should not drown the thesis when a theme is early. When AI infrastructure demand surges, leaders will look expensive on trailing metrics. Use a layered approach. Track EV to sales and EV to gross profit for capital-light models, price to free cash flow for mature cash generators, and EV to invested capital for heavy equipment. Compare to their own history, not just sector medians. A company at the 80th percentile of its own valuation range with improving returns may still be attractive.
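
Comparing a multiple to its own history reduces to a percentile rank. This is a minimal sketch with made-up EV-to-sales observations; the function name is hypothetical.

```python
from typing import Sequence


def percentile_of_history(history: Sequence[float], current: float) -> float:
    """Share of historical observations at or below the current value, in percent."""
    if not history:
        raise ValueError("history must not be empty")
    at_or_below = sum(1 for value in history if value <= current)
    return 100.0 * at_or_below / len(history)


ev_to_sales_history = [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
print(percentile_of_history(ev_to_sales_history, 11))
```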

For EVs and clean energy, adjust for cyclical margins. At the bottom of a price war, P/E can look high even as the setup improves. Conversely, during subsidy-fueled booms, seemingly cheap multiples can mask a peak.

Build the workflow so you actually use it

A screener that sits in a spreadsheet graveyard helps no one. Tie it into a weekly or monthly workflow. Pull fresh data on earnings days. Flag names where the theme exposure or backlog changed materially. Set alerts when composite scores cross thresholds so you can do deeper research rather than chase headlines.
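
Threshold alerts can be as simple as comparing two score snapshots. In this sketch the 0.6 threshold and the tickers are placeholders, and names that appear only in the latest snapshot are skipped because they have no prior score to cross from.

```python
from typing import Dict, List, Tuple


def threshold_crossings(prev: Dict[str, float],
                        curr: Dict[str, float],
                        threshold: float) -> List[Tuple[str, str]]:
    """Tickers whose composite score crossed the threshold between snapshots."""
    alerts: List[Tuple[str, str]] = []
    for ticker in sorted(curr):
        if ticker not in prev:
            continue
        if prev[ticker] < threshold <= curr[ticker]:
            alerts.append((ticker, "up"))
        elif prev[ticker] >= threshold > curr[ticker]:
            alerts.append((ticker, "down"))
    return alerts


last_week = {"AAA": 0.55, "BBB": 0.72, "CCC": 0.40}
this_week = {"AAA": 0.66, "BBB": 0.58, "CCC": 0.45, "DDD": 0.90}
print(threshold_crossings(last_week, this_week, 0.6))
```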

I keep three watchlists per theme: leaders that I would own at a fair price, enablers that I monitor for improving economics, and experiments that get small allocations only if catalysts line up. The screener feeds these lists, but dispositions depend on judgment and position sizing discipline.

Avoid the two easiest mistakes

First, don’t take management’s theme labels at face value. If a CEO says “AI is 40 percent of our pipeline,” ask what that means. Is it a rebrand of existing analytics? Is it dependent on one GPU program? Your screen should mark that as low confidence until revenue lines up.

Second, beware of the “one quarter hero.” A power equipment company might land a large grid contract and briefly inflate backlog. If margins don’t follow or if the order mix is low quality, the hangover hurts. The screener should smooth across several quarters and penalize volatility without a clear reason.
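
One possible way to smooth and penalize is to average the last few quarters of backlog and discount that average by its coefficient of variation. The four-quarter window and the 0.5 penalty factor are assumptions for illustration.

```python
from statistics import mean, pstdev
from typing import Sequence


def smoothed_backlog(backlog: Sequence[float],
                     window: int = 4,
                     vol_penalty: float = 0.5) -> float:
    """Trailing average backlog, discounted for quarter-to-quarter volatility."""
    recent = list(backlog[-window:])
    if not recent:
        raise ValueError("backlog must not be empty")
    average = mean(recent)
    if average <= 0:
        return 0.0
    coefficient_of_variation = pstdev(recent) / average
    return average * max(0.0, 1.0 - vol_penalty * coefficient_of_variation)


steady = smoothed_backlog([100, 100, 100, 100])
one_quarter_hero = smoothed_backlog([60, 60, 60, 220])
print(steady, one_quarter_hero)
```

Both series average 100 over the window, but the one driven by a single large contract scores lower. A known, high-quality order would justify overriding the penalty by hand.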

Two compact checklists to keep you disciplined

Theme exposure checklist
Do I have segment-level revenue or credible proxies for the theme?
Is there corroborating evidence from customers or suppliers?
Are capacity and capex aligned with the claimed exposure?
Is the exposure diversifying across customers and geographies?
What would invalidate the exposure claim in the next two quarters?

Quality and durability checklist
Are margins stable or expanding alongside growth?
Is backlog translating to cash, not just press releases?
Are balance sheet and refinancing timelines safe under stress?
Does the company have switching-cost moats or sticky contracts?
Are there hidden liabilities such as warranties or regulatory risks?

These lists are short by design. They force quick sanity checks before you get carried away by narratives about the best stocks to buy now.

Common edge cases and how to handle them

Conglomerates with partial exposure can pollute your results. If only 15 percent of revenue is tied to your theme but it is the fastest-growing segment, note the optionality but keep the core score anchored to current exposure. You can create a sum-of-the-parts view separately.

Geographic policy risk can swing outcomes. An inverter maker deeply reliant on a single country’s subsidy program deserves a risk discount relative to a global peer, even with identical growth. Encode a policy concentration factor into your score.
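
A policy concentration factor could be built from a Herfindahl-style index on revenue by country. This sketch uses country revenue as a proxy for subsidy dependence, which is an assumption, and the 30 percent maximum discount is a placeholder.

```python
from typing import Dict


def policy_concentration_discount(revenue_by_country: Dict[str, float],
                                  max_discount: float = 0.30) -> float:
    """Discount that grows with revenue concentration (sum of squared shares)."""
    total = sum(revenue_by_country.values())
    if total <= 0:
        return 0.0
    concentration = sum((v / total) ** 2 for v in revenue_by_country.values())
    return max_discount * concentration


single_market = policy_concentration_discount({"DE": 100.0})
global_peer = policy_concentration_discount(
    {"US": 25.0, "DE": 25.0, "IN": 25.0, "BR": 25.0}
)
print(single_market, global_peer)
```

The single-market name takes the full discount while the evenly spread peer takes a quarter of it, even if the two report identical growth.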

Early-stage listings often entice with big TAMs. Demand a path to unit economics, not just revenue growth. If operating cash flow remains negative while stock-based compensation balloons, give the durability pillar a low score until business reality catches up.

Commodity-linked names like copper miners are leveraged to themes but move with global cycles. Tag them as enablers and keep them in a separate basket so you don’t mix supply-side beta with technology adoption alpha.

Incorporating technicals without turning it into a day-trader’s tool

A small technical overlay can improve entries. I watch for price above a long-term moving average and healthy volume on up days around earnings where the fundamentals improved. But avoid making technicals your primary rank. The goal is to use the stock screener to find stocks aligned with structural shifts, then use technicals to pace your buying, stage entries, and manage risk.
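
The moving-average part of the overlay is a one-line check. The 200-day default is a common convention that I am assuming here, not a figure from the screen itself, and the volume condition is left out of this sketch.

```python
from typing import Sequence


def above_moving_average(closes: Sequence[float], window: int = 200) -> bool:
    """True when the latest close is above its simple moving average.

    Returns False when there is not enough history to fill the window.
    """
    if window <= 0 or len(closes) < window:
        return False
    average = sum(closes[-window:]) / window
    return closes[-1] > average


# Short window so the example is easy to check by hand.
print(above_moving_average([10, 11, 12, 13, 14], window=5))
```

Use the result as a timing flag on names that already passed the fundamental screen, never as a ranking input.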

From screen to portfolio: sizing and pacing

No screener substitutes for sizing judgment. Early in a theme, concentrate in durable enablers. As the winners become clearer and liquidity deepens, build positions in core names at reasonable prices. For volatile suppliers, let the screen guide you to add on backlog-supported dips. Scale into positions rather than chasing spikes. In capital-intensive names, tie adds to visible capacity milestones or regulatory approvals.

One hard rule I’ve learned to respect: avoid overexposure to a single bottleneck. In AI, the energy and cooling constraints can ripple through your suppliers. In EVs, a single chemistry decision can benefit one set of materials and hurt another. Diversify across sub-buckets within the theme.

Putting it all together

A thematic stock screener is not a magic oracle. It’s a disciplined way to encode a thesis, gather the right facts, weigh uncertainty, and keep yourself from chasing narratives. When you build it around AI, EV, and clean energy, you will find that many of the best ideas sit one step away from the headlines. The networking chip that quietly wins sockets across multiple clouds. The transformer maker with an 18-month backlog and scarce capacity. The EV thermal supplier with rising content per vehicle regardless of consumer incentives. Those names don’t always top social feeds, but they show up consistently when your data spine is honest and your scoring respects how value is created.

Treat the screener as a living map. Themes evolve. AI workloads shift, EV charging models adapt, clean energy policy toggles. Update the taxonomy, refresh the guardrails, and test whether the same metrics still predict durability. If they stop working, don’t force them. Adjust, retest on features you had at the time, and keep your process grounded.

And remember the human part. Call customers. Read engineering blogs from the companies you track. Watch how supply contracts are structured. The spreadsheet will catch the big picture, but edge often lives in the footnotes and the relationships between buyers and suppliers. That is where structural themes become cash flows you can underwrite.

Last updated: 2026-02-07 06:56:19 PM