DBS Foundation Coding Camp 2024 · AI/ML · course

House Price Prediction

This project focuses on practical regression thinking by connecting feature engineering and model selection to a concrete stakeholder question: how to estimate house prices more reliably from property attributes and market context.

Domain: AI/ML
Role: Machine Learning Engineer
Output: Research/Case Study
Category: Regression Modeling

Project Framing

A source-backed case study built for recruiter review

This reading path makes the problem choice, evidence quality, user framing, execution decisions, and proof trail visible without overstating what the sources support.

Project Type: course

Applied regression project for housing-price estimation with business framing, feature analysis, and model comparison.

Orientation: Tech

Shows that applied ML value comes not only from prediction score, but from interpretable framing and decision relevance.

Core Stack
Python · Jupyter Notebook · Pandas · Scikit-learn

Notebook-based regression workflow using tabular housing data, preprocessing steps, feature analysis, and comparative model experimentation.

Why This Problem Mattered

Problem framing before execution

The case-study layer starts with why this problem was selected and how the context justified investment.

Problem Framing Map

Issue

Housing-price estimation becomes unreliable when feature context, market assumptions, and model selection are not framed clearly.

Context

The project focuses on practical regression thinking by tying feature engineering and model selection to a concrete pricing question instead of treating regression as a generic algorithm demo.

Why Selected

It strengthens the portfolio by showing interpretable applied ML reasoning on tabular data with business-use framing.

Solution thesis

Built a regression workflow that analyzes housing features, compares model behavior, and documents the business relevance of prediction output.

Research and Evidence

What supports the narrative

Evidence is surfaced with its source type and credibility note so the recruiter can quickly see what is directly backed versus intentionally constrained.

Business-use framing
Source type: local

The project explicitly documents regression work from a business-use perspective rather than only a statistical exercise.

Credibility: Supported by the summary, impact, metrics, and README-backed project record.

Comparative modeling
Source type: local

Multiple model families were considered before the final approach was framed.

Credibility: Backed by the metrics, responsibilities, and notebook-oriented architecture description.

Credibility Notes

  • The project is presented as applied regression reasoning and documentation, not as a deployed valuation service.
  • No pricing-product accuracy guarantee or live stakeholder adoption claim is introduced without stronger evidence.

Who The User Was

User framing stays explicit

When formal research artefacts are not available, the page still explains who the work served and why that user framing is justified by the existing sources.

Primary user
Stakeholders who need more reliable house-price estimation from property attributes and market context.

The problem framing explicitly connects model work to a real pricing question instead of abstract experimentation.

Reviewer stakeholder
Reviewers evaluating whether feature analysis and model choice are connected to decision relevance.

A major project strength is how it keeps business framing attached to the regression workflow.

Decision Flow

How design thinking translated into decisions

The goal is to show the trace from research and insight to concrete product or system decisions, then to the outcomes those decisions supported.

Design Thinking Flow

Each step keeps the movement from evidence to action explicit before the rationale expands it.

  1. Stakeholder-question framing

    Started from the need for more reliable price estimation before choosing model families.

    Signal: Decision relevance precedes model mechanics.
  2. Feature reasoning

    Connected property attributes and market context to the regression workflow so preprocessing remains interpretable.

    Signal: The project stays understandable as applied ML, not only as notebook math.
  3. Comparative evaluation

    Used multiple model families to reason about fit rather than assuming one default solution.

    Signal: The workflow encourages evaluation discipline over one-shot modeling.
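
The comparative-evaluation step above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the sources do not name the dataset, the columns, or the model families used, so the synthetic data, the column names (area_sqm, bedrooms, building_age), the three candidate models, and the helper compare_models are all assumptions chosen for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the housing table; column names are hypothetical.
rng = np.random.default_rng(42)
n = 300
X = pd.DataFrame({
    "area_sqm": rng.uniform(40, 250, n),
    "bedrooms": rng.integers(1, 6, n),
    "building_age": rng.uniform(0, 50, n),
})
# Assumed price relationship, used only to give the models something to fit.
y = (1500 * X["area_sqm"] + 20000 * X["bedrooms"]
     - 800 * X["building_age"] + rng.normal(0, 15000, n))

# Candidate model families (assumed; the source does not list them).
candidates = {
    "linear": LinearRegression(),
    "ridge": Ridge(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=42),
}

# The same shuffled folds are reused for every model so scores are comparable.
cv = KFold(n_splits=5, shuffle=True, random_state=42)


def compare_models(models, X, y, cv):
    """Return one row per model with cross-validated MAE, best model first."""
    rows = []
    for name, model in models.items():
        scores = cross_val_score(
            model, X, y, cv=cv, scoring="neg_mean_absolute_error"
        )
        # scikit-learn reports negated errors, so flip the sign back.
        rows.append({
            "model": name,
            "mae_mean": -scores.mean(),
            "mae_std": scores.std(),
        })
    return pd.DataFrame(rows).sort_values("mae_mean").reset_index(drop=True)


results = compare_models(candidates, X, y, cv)
print(results)
```

Scoring every candidate on identical folds keeps the comparison about model behavior rather than about a lucky split, which is the evaluation discipline the step describes.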

Decision Rationale

Each decision keeps the path from insight to execution visible before ending on the outcome signal.

Business-linked regression framing
Insight

Prediction models are less useful when they are disconnected from the stakeholder question they are meant to support.

Decision

Framed the project around the stakeholder need for reliable price estimates, not just model execution.

Outcome

The case reads as practical applied ML with clearer decision context.

Multiple-model comparison
Insight

Regression quality is easier to justify when multiple model behaviors are considered before settling on one approach.

Decision

Compared several model families in the notebook workflow.

Outcome

The project shows evaluation reasoning instead of a single-model assumption.

Solution and System Execution

Execution choices and delivery details

This section preserves the technical and operational substance: architecture, responsibilities, trade-offs, and implementation quality signals.

System Design

Notebook-based regression workflow using tabular housing data, preprocessing steps, feature analysis, and comparative model experimentation.
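
One way such a workflow could be wired together in scikit-learn is shown below. It is a sketch under stated assumptions: the rows, the column names (area_sqm, bedrooms, district, price), the imputation strategies, and the choice of LinearRegression are hypothetical and are not taken from the project notebook.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical housing rows with a few missing values.
housing = pd.DataFrame({
    "area_sqm": [70.0, 120.0, np.nan, 95.0, 200.0, 55.0],
    "bedrooms": [2, 3, 3, 2, 5, 1],
    "district": ["north", "south", "north", np.nan, "east", "south"],
    "price": [150000.0, 260000.0, 210000.0, 190000.0, 430000.0, 110000.0],
})

numeric_cols = ["area_sqm", "bedrooms"]
categorical_cols = ["district"]

# Numeric columns: fill gaps with the median, then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: fill gaps with the most frequent value, then one-hot encode.
categorical_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer([
    ("num", numeric_steps, numeric_cols),
    ("cat", categorical_steps, categorical_cols),
])

# Preprocessing and the regressor travel together as a single estimator.
model = Pipeline([
    ("preprocess", preprocess),
    ("regressor", LinearRegression()),
])

X = housing.drop(columns="price")
y = housing["price"]
model.fit(X, y)
predictions = model.predict(X)
print(predictions.round(0))
```

Keeping preprocessing inside the pipeline means the same steps are applied at fit and predict time, and each named step can be inspected in a notebook cell.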

Source-backed Impact

Shows that applied ML value comes not only from prediction score, but from interpretable framing and decision relevance.

Responsibilities

  • Prepared tabular housing dataset for regression analysis
  • Compared candidate model approaches
  • Documented business framing and interpretation context

Stack Decisions

  • Used notebook workflow for transparent experimentation
  • Focused on regression interpretability instead of production deployment claims
  • Preserved business framing alongside technical modeling choices
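
As an illustration of what regression interpretability can look like in practice, the sketch below fits a linear model on standardized features and ranks the coefficients by magnitude. The data is synthetic and the feature names and price relationship are assumptions made for the example; the source does not state which interpretation technique the notebook used.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; feature names are hypothetical.
rng = np.random.default_rng(0)
n = 200
features = pd.DataFrame({
    "area_sqm": rng.uniform(40, 250, n),
    "bedrooms": rng.integers(1, 6, n).astype(float),
    "building_age": rng.uniform(0, 50, n),
})
# Assumed relationship: larger and newer homes cost more.
price = (1500 * features["area_sqm"] + 20000 * features["bedrooms"]
         - 800 * features["building_age"] + rng.normal(0, 15000, n))

# Standardizing puts every feature on the same scale, so each coefficient
# reads as "price change per one standard deviation of that feature".
scaled = StandardScaler().fit_transform(features)
model = LinearRegression().fit(scaled, price)

coefficients = pd.Series(model.coef_, index=features.columns)
# Order by absolute size so the strongest drivers come first.
coefficients = coefficients.reindex(
    coefficients.abs().sort_values(ascending=False).index
)
print(coefficients.round(0))
```

A ranked coefficient table like this is one simple way to connect feature behavior to a pricing conversation, for example by stating how much the estimate moves for a typical difference in floor area.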

Trade-offs

  • Accepted lower operational maturity in exchange for clearer analytical storytelling
  • Kept scope centered on model reasoning rather than deployment infrastructure

Challenges

  • Relating technical feature behavior to understandable real-estate decision context
  • Avoiding overclaiming from a compact learning-oriented regression artefact

Outcomes and Proof

What was delivered and what can be verified

Outcome claims remain conservative and source-backed, while proof records and recruiter-safe links surface the strongest verification trail available.

Validation Signals

  • Regression framing is documented with a business-use perspective.
  • Multiple model families were considered before locking the final approach.

Retrospective and Limits

What the project proves, and what it does not

Strong case studies show both what was learned and where the current evidence stops.

Retrospective

The next iteration should add reproducibility steps, an evaluation summary, and explicit stakeholder usage examples.
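
A possible starting point for the reproducibility and evaluation-summary items is sketched below. The synthetic data, the single area feature, the seed value, and the helper names evaluation_summary and run_experiment are assumptions for illustration only; no metrics from the actual project are implied.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

SEED = 42  # one fixed seed for data generation and splitting


def evaluation_summary(y_true, y_pred):
    """Collect common regression metrics in one dictionary."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "mae": float(mean_absolute_error(y_true, y_pred)),
        "rmse": float(np.sqrt(mse)),
        "r2": float(r2_score(y_true, y_pred)),
    }


def run_experiment(seed):
    """Generate synthetic data, split, fit, and score with a fixed seed."""
    rng = np.random.default_rng(seed)
    # Hypothetical single feature: floor area in square meters.
    X = rng.uniform(40, 250, size=(300, 1))
    y = 1500 * X[:, 0] + rng.normal(0, 15000, 300)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = LinearRegression().fit(X_train, y_train)
    return evaluation_summary(y_test, model.predict(X_test))


summary = run_experiment(SEED)
print(summary)
```

Because every source of randomness is tied to the same seed, rerunning the experiment should reproduce the same summary, which makes reported numbers easier for a reviewer to verify.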

Evidence Limits

  • Current sources do not support deployment, live pricing workflow integration, or monitored production usage.
  • The project should remain framed as interpretable regression experimentation and documentation.

Lessons

  • Good problem framing improves model interpretation quality
  • Comparing multiple model families helps avoid premature algorithm choice