Pricing & Profitability supportLegal Ops support

Fake Plaintiffs Firm Billing Data Generator

Name: Fake Plaintiffs Firm Billing Data Generator
Brand: Counsel Commons™
Price: 40.00 USD
Availability: InStock

ByLegal InnovAI LLCPricing & Profitability·Law Firm / Legal Business Management·jurisdiction-neutral within the US

⚠ Important · This tool is not vetted for accuracy by Counsel Commons™. Outputs are productivity aids for legal-business-management workflows — not legal advice and not a substitute for the relevant professional's judgment. Professional review is required before any firm-affecting action (the relevant CFO, COO, legal-ops lead, managing partner, or other domain professional, depending on the tool). The author affirmed the tool targets business-management use cases at upload, but accuracy and currency are not guaranteed.Treat tools like a baseline: after purchase, read every file in the bundle and modify the tool to fit your own workflows, data sources, and firm conventions before running it.

📍 Jurisdiction · Calibrated for jurisdiction-neutral within the US. Verify carefully before applying outputs to any other jurisdiction. Authority, procedure, and standards differ by state and court.

About this tool

This skill produces synthetic, structurally-realistic case-and-billing data for plaintiff / contingency-fee law firms — designed for testing, demonstration, and training of plaintiff-side profitability tools, BI dashboards, case-management migrations, and analytics prototypes. It is NOT for use in any real client matter or case-management workflow. The unit of analysis is the case (with a single recovery event or loss), not a defense-side hourly matter. The generator models the economics that contingency work requires: tiered contingency fees, cost advances treated as firm capital at risk, recovery waterfalls, Rule 1.5(e) referral splits, lien resolution (medical, ERISA, Medicare, Medicaid), statutory fee-shifting in civil rights and FLSA cases, and Common Benefit Fund deductions in mass tort. Through a 10-question intake the user specifies firm size, practice mix across 30 plaintiff practice subtypes (auto PI, premises, med mal, product, trucking, employment Title VII / ADEA / ADA / FLSA / FMLA / retaliation, Section 1983, FHA, voting rights, consumer class, securities class, antitrust, ERISA class, pharma, environmental, talc, qui tam, and more), case volume, fee structure mix, win rates, recovery ranges, referral-split economics, and intake-source spend. Output is a zipped CSV bundle (or Excel workbook) covering 13 entity types: TIMEKEEPERS, RATE_HISTORY, CLAIMANTS, CASES, TIME_ENTRIES, CASE_COSTS, CASE_OUTCOMES, CASE_DISTRIBUTIONS, REFERRAL_SPLITS, LIENS, INTAKE_SOURCES, INVOICES (sparse — hourly opt-out + fee-petition only), and PAYMENTS. Every file is paired with a SUMMARY_STATS sheet and a README.txt; the bundle filename includes "SYNTHETIC-DATA-NOT-REAL" so it cannot be mistaken for real records downstream. Realism rules embedded include: log-normal recovery distributions by practice area, lifecycle durations calibrated per practice (pre-suit 6–18 mo, litigation 12–36 mo, trial cases +6–18 mo), case-phase-appropriate timekeeper attribution, automatic cost write-offs on losses (per state ethics rules — contingency firms cannot bill clients for costs on losing cases), at least one medical lien on most settled PI cases, jurisdiction-aware lien negotiation ranges, and intake-source cost-per-case attribution that ties marketing spend to signed and won cases. All identifiers are generic and sequential — Claimant [N], Defendant [N], Insurer [N], Person [N], Lienholder [N], Docket [N]. The PII rules are strict: no medical record numbers, no SSNs, no real-looking DOBs, no phone numbers, emails, street addresses, policy numbers, claim numbers, VINs, plates, license numbers, or medically-identifying injury descriptions. City + state only. Outputs require professional review. This skill generates synthetic test data and nothing it produces should be used in or treated as a record of any real case, claimant, insurer, or recovery. Users testing analytics or pricing tools against this data should validate their tools against real (privileged) data before relying on any conclusions in production. The schema is intentionally compatible with the pricing-and-profitability skill suite authored by Legal InnovAI and available on Counsel Commons, so plaintiff-firm financials produced here can be analyzed by the same workbook templates by Legal InnovAI.

Preview before you buy:

Example input

(The Skill will ask the user to answer questions about the type of firm and practice areas and other timekeeper billing-related questions to refine the customized data output; the user does not need to enter a prompt like the example below, but may.)

Generate 3 years of synthetic data for a boutique plaintiff PI firm:
- 8 timekeepers (1 equity partner, 2 associates, 3 paralegals, 1 case manager, 1 intake specialist)
- Practice mix: Auto PI 70%, Premises 20%, Med mal 10%
- ~150 cases/year
- Fee mix: tiered contingency 90%, flat contingency 10%
- Referral % at default (25% of cases, 33% fee split)
- Default intake sources, default win rates
- Output: CSV bundle with README

Example output

For the example input above, the user would get back a zipped bundle named
"plaintiffs-billing-data-SYNTHETIC-DATA-NOT-REAL.zip" containing 13 CSV
files plus a README and a summary-statistics sheet — together, three
years of internally-consistent fake records for a boutique plaintiff
PI firm of about eight people.

The files cover the full life of the firm. See eight
timekeepers, their annual rate history, the ~445 claimants who walked
in the door, and the 450 cases those claimants generated (split across
auto accident, premises liability, and medical malpractice in roughly
the proportions you asked for). Each case carries the timeline one might
expect: an intake date, an incident date, a litigation phase that
moves through pre-suit / filed / discovery / trial prep, and an
outcome — settled, won at judgment, lost, dismissed, or still open.

Inside the cases, the time entries (about 38,000 of them) are
attributed to specific timekeepers at each phase: paralegals and
case managers heavy in pre-suit, associates building the litigation
record, the partner reviewing motions and handling trial work.

Cost-advance ledgers track the firm's own money going into each
case — medical-record fees, filing fees, expert witnesses, depo
costs — and at case close, those costs either come back from the
recovery (on wins) or get written off (on losses), the way real
contingency-firm books actually work.

For the cases that recovered, the output shows a distribution
waterfall: gross recovery, costs reimbursed, the contingency fee
calculated at the tier that applied (33% pre-suit, 40% post-filing,
45% post-trial), any referring-attorney split under Rule 1.5(e),
the liens paid out (medical, ERISA, Medicare where applicable, with
realistic negotiated reductions), and the client's net check.

About a quarter of cases have a referring attorney attached, with
the fee split documented. Most settled PI cases carry at least one
lien.

A separate intake-sources file ties the firm's marketing spend
(TV, Google ads, organic web, bar referrals, word-of-mouth) to
the cases each channel produced, with cost-per-acquired-case and
cost-per-signed-case math already done.

The README labels the dataset SYNTHETIC — NOT REAL, lists the
parameters used to generate it, and gives row counts per table.
The summary-statistics sheet shows the realism checks: win rates
landed within a couple of points of the practice-area targets,
median recoveries fell where they should, cost write-off rates
matched the loss rates, and the lien-negotiation rates came in
near the published norms (~38% average reduction).

Every identifier is generic and sequential — Claimant 1, Defendant
12, Insurer 4, Lienholder 7, Docket 219 — so the dataset can be
shared with vendors, demoed in a sales conversation, or loaded
into a testing environment without any risk of resembling a real
matter.

Screenshot examples

Part of Claimant Type data
Part of Case Outcome data
Part of Intake Source data
Part of Timekeeper data

Sanitized example, not professional advice. All sales final — use the preview to confirm fit before purchase.

Compatible models

The author has tested this tool on the providers below. The specific model list updates automatically as providers ship new models or retire old ones. Compatibility with providers not listed below is not guaranteed — the tool may not produce equivalent results outside the tested set.

✓ TestedClaude (Anthropic)Works on Claude Opus 4.7, Claude Haiku 4.5, Claude Sonnet 4.6

✓ TestedChatGPT (OpenAI)Works on GPT-5

Not testedGemini (Google)

Not testedLocal / open-weight

Not testedOther — or self-contained / no LLM

Data handling

🔒

When you run a tool, it will be through whichever LLM you choose to run it through on your end. We do not provide any platforms through which you can run a tool. How the LLM provider you choose to use handles your inputs — retention, training, routing — depends on your plan and configuration with them, not on us.FYI:Free and consumer plans often allow training on inputs by default; enterprise, team, and API plans often don't, but it is your responsibility to check your provider's data-use policy and your plan settings before sending anything firm-confidential or privileged.

Seller of record

Business name: Legal InnovAI LLC
Entity type: Verified business (Stripe-KYC'd)
Location: Colorado

This is the party you have a software-license contract with. If you aren't satisfied with the tool, please contact this party directly to work it out.

Version history

v1.0.3Current2026-05-22
```
removed some non-litigation UTBMS codes
```

v1.0.22026-05-14

Cost schemas added for improved profitability analyses

v1.0.12026-05-13
```
Fixed table linking
```
v1.0.02026-05-12

Existing buyers receive new versions free of charge. Pin to a specific version from your library if your workflow needs the exact bundle behavior of an earlier release.

Buyer reviews

No reviews yet — be the first after you buy.

Before you buy

Tools are starting points, like templates. Read every file in the bundle before running, modify for your workflow, and assess safety and legal implications for your use case.
Outputs vary run-to-run. Generative AI is non-deterministic by design — the same tool on the same input can produce different results, and outputs can vary across sessions, model versions, and provider load conditions. Your input will differ and your model may differ, so you should expect your output to vary from the example above. Variance is normal, not a defect.
All sales final. Tools are immediately downloadable digital goods.