What is credit underwriting?
Credit underwriting is the discipline of deciding, for any given customer at any given moment, two things at once: will they repay, and at what price does this loan make sense. Get the first answer wrong and you book a default. Get the second wrong and you either price the customer out (the right customer walks) or price them in too cheap (you book a loan that doesn't cover its loss-adjusted return). Indian underwriters work both questions under tight RBI rules, with five orthogonal data sources, against a customer population where 60%+ of working-age adults are still thin-file or no-file by Western standards.
The hard part isn't the model. Logistic regression with five clean features will beat a gradient-boosted ensemble with five noisy features ten times out of ten. The hard part is the data layer underneath: pulling the bureau cleanly, parsing the bank statement reliably, fetching GST filings without rate-limit pain, verifying the UAN against EPFO, catching the ghost loan before it hits your book. Most underwriting failures are data-plumbing failures dressed up as model failures.
This guide walks the underwriting surface in the order you actually meet it as an Indian lender: the regulators (RBI Master Direction on Digital Lending, the Account Aggregator framework, the 2023 FLDG circular), the five data sources every Indian underwriter touches, bureau vs alt-data trade-offs (with the comparison table credit committees actually want), risk-based pricing applied in tiers, re-scoring cadence across the active book, and the five implementation pitfalls that book defaults faster than any model can catch up.
India lending regulatory map
A handful of institutions shape the environment every Indian digital lender works under, but only one of them sets the lending rules. The Reserve Bank of India is the load-bearing one: the 2022 Digital Lending Guidelines and the subsequent Master Direction on Digital Lending define who can lend, how loans must be priced, what must be disclosed, how data can be collected, and how LSPs can partner with regulated entities (REs). The same body owns the co-lending framework and the 2023 FLDG circular, which formally permits first-loss default guarantee arrangements for REs, capped at 5% of the covered loan portfolio.
The credit bureaus (CIBIL, CRIF High Mark, Experian, Equifax) are regulated by the RBI under the Credit Information Companies (Regulation) Act, 2005. Every regulated entity is required to report tradelines monthly and is entitled to pull on consent. The four-bureau split matters: not every lender reports to every bureau, so a customer's "true" obligation profile is a union, which is why high-value lenders pull multi-bureau rather than single-bureau. Bureau pulls themselves show up on the customer's record; too many enquiries in 90 days is a negative signal.
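The "union" idea and the 90-day enquiry count can be sketched in a few lines. This is a minimal illustration, not any bureau's actual schema: the `Tradeline` fields, the `(lender, account_ref)` matching key, and the rule of keeping the larger balance when bureaus disagree are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Tradeline:
    lender: str
    account_ref: str   # hypothetical normalised account identifier
    outstanding: float  # current balance in rupees


def merge_tradelines(*bureau_reports: list[Tradeline]) -> list[Tradeline]:
    """Union tradelines across bureaus, one record per (lender, account_ref)."""
    merged: dict[tuple[str, str], Tradeline] = {}
    for report in bureau_reports:
        for t in report:
            key = (t.lender, t.account_ref)
            # Assumption: when bureaus disagree, keep the larger balance
            # as the conservative view of the obligation.
            if key not in merged or t.outstanding > merged[key].outstanding:
                merged[key] = t
    return list(merged.values())


def enquiries_in_window(enquiry_dates: list[date], as_of: date, days: int = 90) -> int:
    """Count enquiries dated within the last `days` days up to `as_of`."""
    cutoff = as_of - timedelta(days=days)
    return sum(1 for d in enquiry_dates if cutoff <= d <= as_of)
```

In practice the hard part is the matching key: lenders and account numbers are reported inconsistently across bureaus, so a real merge would need fuzzier matching than the exact-key comparison used here.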
The GST system (administered by CBIC, with GSTN operating the portal) holds the GST registry, the income-data backbone for MSME lending, and the EPFO runs UAN, the employer-confirmed-income signal for salaried borrowers. Neither body sets lending rules, but their data shapes who can be underwritten and how cheaply. The FATF shapes the AML perimeter that underwriting sits inside; FATF's 2024 Mutual Evaluation continues to influence the RBI's expectations on ongoing-monitoring effectiveness across the credit book.
The Account Aggregator framework — RBI's consented data-sharing layer — is the load-bearing change of the last three years. AA collapses "upload your bank statement PDF" into a one-tap, signed, machine-readable pull. For underwriting, AA shifts cash-flow analysis from an origination-only signal to one you can pull every cycle without friction. The practical takeaway: build your underwriting stack against AA-shaped data contracts now, even if you also accept legacy uploads — the AA volume curve makes the PDF path the long tail by 2027.
The 5 underwriting data sources in India
Five orthogonal signals carry the bulk of underwriting weight. Different lenders combine them differently; serious underwriters touch all five for any non-trivial loan.
1. Credit bureau (CIBIL / CRIF / Experian / Equifax)
The score plus the tradeline detail. Bureau tells you how the customer has handled credit in the past, what they currently owe across other lenders, and how aggressively they've been shopping for more. Coverage is uneven: prime urban customers have rich files, first-jobbers and rural customers are often thin-file. Latency is 2–5 seconds. Cost is ₹5–25 per pull depending on bureau and report depth. The first call in any underwriting stack — but never the last.
2. Bank statement (via AA or upload)
12 months of statements parsed into salary credits, EMI debits, NACH mandates, recurring spends, average balance, and bounce rate. This is the primary signal for cash-flow lending. It is reliable for salaried customers (the salary credit narration is standardised) and trickier for the self-employed (income is lumpy, and business and personal flows are mixed). AA-sourced statements arrive signed and machine-readable; PDF uploads require parsing, signature verification, and tamper checks.
3. GST returns (GSTR-1, GSTR-3B)
24 months of filings carry the income signal for MSME underwriting: turnover from GSTR-1, taxes paid from GSTR-3B, and the on-time filing rate as a discipline proxy. It works because GST filings carry a tax cost on over-reporting and penalty risk on under-reporting, so the borrower is bonded to the data. This is the unlock for self-employed and small-business credit where bureau is thin and bank statements are mixed.
4. UAN / EPFO contributions
For salaried borrowers, the 12-digit Universal Account Number ties to monthly PF contributions reported by the employer. Continuous contributions = employer-confirmed income; gaps = unemployment periods; multiple employers in 24 months = job-hopping risk. The cleanest "is this person actually employed where they say they are" signal in the Indian stack.
5. Adverse records (courts, FIRs, MNRL)
Court records and FIRs catch the customers whose risk doesn't show up in tradeline data — the ones with active civil suits, criminal complaints, or police records. MNRL (Mobile Number Revocation List) catches the customers whose registered mobile has been ported, surrendered, or reissued — a takeover vector that's invisible to bureau and bank statement alike. The three together close the long tail that pure financial signals miss.
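As one concrete example from the list above, the UAN signals (continuous contributions, gap months, number of employers) can be derived from a simple contribution history. The sketch below assumes a hypothetical input shape of `(month, employer_id)` pairs; a real EPFO passbook response will look different and needs mapping into something like this first.

```python
def pf_employment_signals(contributions: list[tuple[str, str]]) -> dict:
    """Summarise PF contribution history.

    contributions: (month as 'YYYY-MM', employer_id) pairs, one per PF credit.
    The input shape is an assumption for illustration.
    """
    def month_index(month: str) -> int:
        year, mon = month.split("-")
        return int(year) * 12 + int(mon)

    months = sorted({m for m, _ in contributions}, key=month_index)
    employers = {e for _, e in contributions}
    # A gap is any missing month between two consecutive contributing months.
    gap_months = sum(
        month_index(later) - month_index(earlier) - 1
        for earlier, later in zip(months, months[1:])
    )
    return {
        "months_covered": len(months),
        "gap_months": gap_months,
        "employer_count": len(employers),
    }
```

How many gap months or employers count as a risk flag is a policy choice and is deliberately left out of the function.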
Bureau vs alt-data — when do you use each?
Bureau is necessary but never sufficient. Alt-data — bank statements via AA, GST, UAN, adverse records — is mandatory for thin-file customers and additive for everyone else. Credit committees ask for the trade-off table directly; here it is.
| Dimension | Credit bureau | Alt-data (AA / GST / UAN) |
|---|---|---|
| Question it answers | How has this customer handled credit in the past? | What is the customer earning and spending right now? |
| Data freshness | 30–45 day reporting lag | Near real-time via AA; GST and EPFO data follow filing and contribution cycles |
| Population coverage | Strong on thick-file; weak on first-jobbers and rural | Strong on salaried + MSME; weak on cash-economy |
| Latency | 2–5 seconds per bureau call | 2–10 seconds depending on source + consent flow |
| Cost | ₹5–25 per pull | ₹2–15 per pull (AA cheaper than upload-parse pipelines) |
| Failure mode | Ghost loans (other lenders not yet reported); enquiry stacking | Forged statements (mitigated by AA); GST under-declaration |
Build the stack to call both, every time, for any loan above a trivial threshold. Bureau plus AA plus (UAN for salaried) or (GST for MSME) covers 95% of the decisioning surface. The remaining 5% — high-value loans, complex obligors, adverse-record-positive customers — gets the full court/FIR/MNRL layer plus manual review. The point of the decisioning infrastructure is to make all of that one orchestration, not five integrations.
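The "one orchestration, not five integrations" point can be illustrated with a small concurrent fan-out. This is a sketch under stated assumptions: the four `fetch_*` coroutines are hypothetical stubs standing in for real bureau, AA, UAN, and GST clients, and the GST stub is made to fail on purpose to show the fallback path.

```python
import asyncio


# Hypothetical fetchers: stand-ins for real bureau / AA / UAN / GST clients.
async def fetch_bureau(customer_id: str) -> dict:
    await asyncio.sleep(0)  # placeholder for a network call
    return {"score": 742}


async def fetch_aa_cashflow(customer_id: str) -> dict:
    await asyncio.sleep(0)
    return {"avg_monthly_credit": 85_000}


async def fetch_uan(customer_id: str) -> dict:
    await asyncio.sleep(0)
    return {"months_contributed": 24}


async def fetch_gst(customer_id: str) -> dict:
    await asyncio.sleep(0)
    raise TimeoutError("GST source unavailable")  # simulated failure


async def gather_signals(customer_id: str, segment: str) -> dict:
    """Call bureau and AA always, plus UAN (salaried) or GST (MSME), in parallel."""
    calls = {
        "bureau": fetch_bureau(customer_id),
        "aa": fetch_aa_cashflow(customer_id),
    }
    if segment == "salaried":
        calls["uan"] = fetch_uan(customer_id)
    elif segment == "msme":
        calls["gst"] = fetch_gst(customer_id)
    results = await asyncio.gather(*calls.values(), return_exceptions=True)
    # A failed source becomes None so the decision layer can fall back or refer.
    return {
        name: (None if isinstance(res, Exception) else res)
        for name, res in zip(calls, results)
    }
```

Running `asyncio.run(gather_signals("cust-1", "msme"))` returns bureau and AA data with `gst` set to `None`, which is the case where a policy engine would route to a fallback source or manual review.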
Risk-based pricing — the framework credit committees defend
Risk-based pricing is the discipline of charging each customer a rate that covers their expected loss, the cost of capital, the cost of servicing, and a target return — no more, no less. The math is well-understood; the operational difficulty is segmenting the book finely enough that pricing is defensible at every tier, without segmenting so finely that you're effectively pricing individuals (which the RBI Digital Lending Guidelines and fair lending norms push against).
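The build-up described above can be written as a simple additive formula. The sketch below is a simplification: it assumes expected loss is approximated as probability of default times loss given default (a common convention, not something this guide specifies), treats all inputs as annualised rates, and uses hypothetical parameter names.

```python
def risk_based_rate(
    pd_annual: float,       # annualised probability of default, e.g. 0.03
    lgd: float,             # loss given default as a fraction, e.g. 0.6
    cost_of_funds: float,   # annual cost of capital as a rate
    servicing_cost: float,  # annual servicing cost as a rate
    target_return: float,   # target return as a rate
    policy_floor: float = 0.0,
) -> float:
    """Annual rate covering expected loss, funding, servicing and return."""
    expected_loss = pd_annual * lgd  # assumption: EL = PD x LGD
    rate = expected_loss + cost_of_funds + servicing_cost + target_return
    # Never price below the lender's policy floor.
    return max(rate, policy_floor)
```

For instance, a 3% PD with 60% LGD, 9% cost of funds, 2% servicing cost, and a 3% target return builds up to 1.8% + 9% + 2% + 3%, or 15.8% a year.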
Three tiers cover most Indian unsecured-credit books.
Tier-0 (prime salaried, bureau score 750 and above, clean cash-flow, UAN-confirmed): Bureau pull + AA salary credit + UAN. EMI-to-income capped at 30%. Pricing at the policy floor, with minimal or no added risk premium. Customer onboarded inside 5 minutes. Default rate target ≤ 1.5% annualised.
Tier-1 (near-prime, bureau score 700–749, mixed cash-flow): Bureau + AA + UAN + (if self-employed) GST returns + adverse-record check. EMI-to-income capped at 35%. Pricing at policy floor plus standard risk premium. Onboarding inside 15 minutes. Default rate target 2.5–4% annualised.
Tier-2 (subprime / thin-file, bureau score 650–699 or no-file): Full alt-data stack: AA cash-flow, UAN if salaried, GST if MSME, adverse-record, MNRL. Smaller ticket, shorter tenor, structured to build behavioural data. EMI-to-income capped at 40%. Pricing at policy floor plus subprime premium. Default rate target 5–9% annualised, with the model trained for the higher loss curve.
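The three tiers translate naturally into a small routing function. This is a sketch, not a policy: the score bands and EMI-to-income caps come from the tiers above, while two choices are assumptions made here for illustration: a 750+ customer without clean cash-flow or UAN confirmation drops to Tier-1, and scores below the bands described return `None` for the lender's own refer-or-decline policy to handle.

```python
from typing import Optional

# EMI-to-income caps per tier, as described in the tiers above.
EMI_TO_INCOME_CAP = {"tier-0": 0.30, "tier-1": 0.35, "tier-2": 0.40}


def assign_tier(
    bureau_score: Optional[int], clean_cashflow: bool, uan_confirmed: bool
) -> Optional[str]:
    """Map a customer to a pricing tier; None means outside the described bands."""
    if bureau_score is None:
        return "tier-2"  # no-file customers go through the full alt-data stack
    if bureau_score >= 750 and clean_cashflow and uan_confirmed:
        return "tier-0"
    if bureau_score >= 700:
        # Assumption: 750+ without clean cash-flow or UAN lands here too.
        return "tier-1"
    if bureau_score >= 650:
        return "tier-2"
    return None


def within_emi_cap(tier: str, monthly_emi: float, monthly_income: float) -> bool:
    """Check total monthly EMI against the tier's EMI-to-income cap."""
    if monthly_income <= 0:
        return False
    return monthly_emi / monthly_income <= EMI_TO_INCOME_CAP[tier]
```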
What credit committees actually defend at quarterly review is the gap between the modelled loss rate and the realised loss rate at each tier. If the realised rate exceeds the modelled rate by more than 50 bps for two quarters running, the tier definitions or the underwriting cutoffs get re-tuned. The credit-analytics loop is what keeps the pricing honest.
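The re-tune rule is easy to encode as a check over quarterly loss rates. A minimal sketch, assuming loss rates are supplied as fractions (0.03 for 3%) in chronological order, one pair of lists per tier; the function and parameter names are illustrative.

```python
def needs_retune(
    modelled: list[float],
    realised: list[float],
    threshold_bps: float = 50,
    quarters: int = 2,
) -> bool:
    """True if realised loss beat modelled loss by more than `threshold_bps`
    in each of the most recent `quarters` quarters."""
    gaps_bps = [(r - m) * 10_000 for m, r in zip(modelled, realised)]
    recent = gaps_bps[-quarters:]
    # Require a full run of quarters; one bad quarter alone does not trigger.
    return len(recent) == quarters and all(gap > threshold_bps for gap in recent)
```

With a modelled rate of 3% and realised rates of 3.1%, 3.7%, and 3.8% over three quarters, the last two gaps are 70 and 80 bps, so the tier is flagged.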
Re-scoring cadence across the active book
Underwriting at origination is the first decision. Re-scoring is every decision after — and for a revolving book, every decision after is most of the lifetime credit risk.
Continuous re-scoring (revolving products). Credit cards, lines of credit, BNPL, overdraft facilities — behavioural re-scoring runs every cycle. Repayment, utilisation, fresh bureau pulls, new tradelines elsewhere all feed into a rolling score. The model that ships well treats each cycle as a fresh decision, with the score driving credit-limit increases, decreases, and freezes.
Trigger-based re-scoring (term loans). For installment products, re-score at meaningful inflection points: missed EMI, NACH bounce, request for top-up, request for tenure extension, request for foreclosure. Each trigger pulls a fresh bureau plus AA bank statement plus early warning signals like sudden multi-bureau enquiry spikes elsewhere.
Annual refresh (the whole active book). At minimum, every 12 months: bureau refresh + MNRL on the registered mobile + adverse record scan + UAN re-pull for salaried customers (catches job changes that change the income profile). Annual refresh is the catch-net for everything the trigger-based re-scoring missed.
The bug that bites: caching. Teams that cache a bureau report for 30 days "to save money" end up underwriting with stale data exactly during the windows when ghost loans are racking up. Pull fresh on every decision; the cost is dwarfed by the loss avoided.
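The three cadences can be combined into a single function that decides which signals to pull fresh for a given account event. This is a sketch: the product names, event names, and signal labels are hypothetical identifiers chosen for the example, and nothing is served from a cache.

```python
REVOLVING = {"credit_card", "line_of_credit", "bnpl", "overdraft"}
TERM_TRIGGERS = {
    "missed_emi",
    "nach_bounce",
    "top_up_request",
    "tenure_extension_request",
    "foreclosure_request",
}


def rescoring_pulls(
    product: str, event: str, months_since_refresh: int, salaried: bool
) -> set[str]:
    """Return the signals to pull fresh for this event; empty set means no re-score."""
    pulls: set[str] = set()
    # Continuous: revolving products re-score at every cycle close.
    if product in REVOLVING and event == "cycle_close":
        pulls.add("bureau")
    # Trigger-based: term loans re-score at inflection points.
    if product not in REVOLVING and event in TERM_TRIGGERS:
        pulls |= {"bureau", "aa_bank_statement", "enquiry_spike_check"}
    # Annual refresh: the catch-net across the whole active book.
    if months_since_refresh >= 12:
        pulls |= {"bureau", "mnrl", "adverse_records"}
        if salaried:
            pulls.add("uan")
    return pulls
```

Returning a set keeps the annual refresh and a trigger from double-pulling the bureau when both fire on the same day.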
Implementation pitfalls — the 5 that bite
Every underwriting team meets the same five.
1. Bureau pull as the only signal. Thin-file and no-file customers can't be scored on bureau alone — and they're 60%+ of working-age India. A stack that declines everyone the bureau can't score declines the customer set most likely to grow into a lifetime relationship. Build cash-flow plus UAN plus GST into the standard underwriting path, not as a fallback.
2. Statement parsing without de-duplication of recurring inflows. Inbound EMI reversals (failed mandates that get retried), duplicate salary credits at month-end + month-start, and recurring inter-account self-transfers all inflate apparent income if not de-duplicated. A serious bank-statement-analysis layer dedupes by amount + counterparty + narration before computing income. Without it, EMI-to-income ratios run systematically low and you book underpriced loans.
3. UAN check skipped at refresh. A salaried customer who changes employers mid-tenure has a different income, a different employer credit risk, and possibly a gap month with no PF contribution. Re-pulling UAN at the annual refresh catches the job change before the EMI bounces. Skipping it guarantees you find out at the bounce, not before.
4. Ignoring GST data at MSME underwriting. For MSME borrowers — proprietorships, small private companies — the bureau is thin, bank statements are mixed business/personal, and "income proof" is essentially absent in the traditional sense. GST returns are the load-bearing income signal and are also the regulator-blessed one (filed under tax penalty). Skipping GST and underwriting MSME on bureau + bank only systematically misprices the segment.
5. Ignoring MNRL on the registered mobile during re-scoring. A customer's mobile gets reissued. The new owner gets all OTPs. The original customer's authentication is now spoofable, and the next loan request (a top-up, a line increase) comes from a stranger. MNRL at every re-scoring decision closes this. Skipping it opens the most common account-takeover vector we see in the second half of any active book.
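For pitfall 2, de-duplication by amount, counterparty, and narration can be sketched as below. The `Credit` record, the `own_accounts` set used to drop self-transfers, and the 5-day duplicate window are assumptions for illustration; a production parser would tune the window and normalise narrations far more carefully.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Credit:
    posted: date
    amount: float
    counterparty: str
    narration: str


def dedupe_credits(
    credits: list[Credit], own_accounts: set[str], window_days: int = 5
) -> list[Credit]:
    """Drop self-transfers and near-in-time repeats of the same credit."""
    kept: list[Credit] = []
    last_seen: dict[tuple[float, str, str], date] = {}
    for c in sorted(credits, key=lambda cr: cr.posted):
        if c.counterparty in own_accounts:
            continue  # inter-account self-transfer, not income
        key = (round(c.amount, 2), c.counterparty, c.narration.strip().lower())
        previous = last_seen.get(key)
        last_seen[key] = c.posted
        # Same amount + counterparty + narration within the window: duplicate.
        if previous is not None and (c.posted - previous).days <= window_days:
            continue
        kept.append(c)
    return kept
```

A genuine monthly salary posts roughly 30 days apart and survives the window, while a month-end plus month-start double credit does not. Income is then computed over the returned list only.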
How Deepvue ships underwriting
Every API in the catalog below sits on the same auth, the same SLA, the same decisioning layer underneath. Bureau pull, bank-statement parsing, GST returns, UAN / EPFO, court / FIR, MNRL — one contract for the entire underwriting signal set, with the orchestration layer handling parallel calls, fallbacks, and re-scoring triggers.
The audit trail is the load-bearing layer. Every signal pulled, every score computed, every decision taken — logged per-customer with rationale at the moment of decision. The proprietary-score path runs on top — you can plug in your own scorecard or use a Deepvue-tuned reference model.
Sub-3-second response on the full bureau-plus-AA-plus-UAN decision. Aligned with the RBI Digital Lending Guidelines, AA-native, FLDG-ready. Live across lenders processing underwriting decisions at scale.