This post covers the research behind my IEEE paper "Forecasting the Prices using Machine Learning Techniques: Special Reference to used Mobile Phones", published at the Second International Conference on Augmented Intelligence and Sustainable Systems (ICAISS 2023). Read the full paper on IEEE Xplore →
The Problem
The secondhand smartphone market is enormous but opaque. Sellers price phones based on gut feeling, and buyers overpay or miss good deals because there's no transparent pricing model. Sites like eBay and OLX show wildly varying prices for the same model in the same condition. Can machine learning build a reliable price predictor?
The answer is yes — but the real challenge was in the data, not the model.
End-to-End ML Pipeline
Dataset Construction
We collected data on used smartphones across multiple platforms, capturing 13 attributes per listing (12 features plus the target price):
| Feature | Type | Notes |
|---|---|---|
| brand | Categorical | Apple, Samsung, OnePlus, Xiaomi, etc. |
| model | Categorical | High cardinality — grouped by generation |
| ram_gb | Numeric | 4, 6, 8, 12, 16 GB |
| storage_gb | Numeric | 64, 128, 256, 512 GB |
| battery_mah | Numeric | 2000–6000 mAh range |
| camera_mp | Numeric | Primary rear camera MP |
| screen_size_inch | Numeric | 5.0–7.0 inches |
| age_months | Numeric | Months since original release |
| condition | Ordinal | New > Like New > Good > Fair |
| has_5g | Binary | 0/1 |
| os | Categorical | iOS / Android |
| launch_price_usd | Numeric | Original MSRP at launch |
| used_price_usd | Numeric | Target variable |
```python
import pandas as pd
import numpy as np
from scipy import stats

df = pd.read_csv('used_phones.csv')

# Remove statistical outliers (prices > 3 std from mean per model)
df = df.groupby('model').apply(
    lambda x: x[np.abs(stats.zscore(x['used_price_usd'])) < 3]
).reset_index(drop=True)

# Condition: ordinal encoding (order matters)
condition_map = {'New': 4, 'Like New': 3, 'Good': 2, 'Fair': 1}
df['condition_score'] = df['condition'].map(condition_map)

print(f"Final dataset: {len(df)} listings, {df['brand'].nunique()} brands")
```
Feature Engineering
Raw features aren't always the most predictive. Four engineered features improved model performance significantly:
```python
def engineer_features(df):
    # 1. Depreciation ratio: how fast does this brand lose value?
    # Captures brand premium and perceived durability. Averaged per
    # brand so individual target values don't leak directly into the
    # feature (in production, compute this on the training split only).
    ratio = df['used_price_usd'] / df['launch_price_usd']
    df['depreciation_ratio'] = ratio.groupby(df['brand']).transform('mean')
    # Apple ~0.65 (holds value), budget Android ~0.25

    # 2. Price-per-GB storage: normalizes across storage tiers
    df['price_per_gb'] = df['launch_price_usd'] / df['storage_gb']

    # 3. Age buckets: phone depreciation is non-linear --
    # drops steeply in year 1, flattens after year 2
    df['age_bucket'] = pd.cut(
        df['age_months'],
        bins=[0, 6, 12, 24, float('inf')],
        labels=['0-6mo', '6-12mo', '1-2yr', '2yr+']
    )

    # 4. Flagship flag: premium models depreciate differently
    flagship_models = ['iPhone 14 Pro', 'iPhone 15', 'Galaxy S23 Ultra',
                       'Galaxy S24', 'Pixel 8 Pro']
    df['is_flagship'] = df['model'].isin(flagship_models).astype(int)
    return df

df = engineer_features(df)
```
Models Evaluated
```python
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error
from sklearn.preprocessing import StandardScaler
import xgboost as xgb

# One-hot encode brand (only the Apple/Samsung columns are used below)
df = pd.concat([df, pd.get_dummies(df['brand'], prefix='brand')], axis=1)

feature_cols = ['ram_gb', 'storage_gb', 'battery_mah', 'camera_mp',
                'age_months', 'condition_score', 'has_5g',
                'launch_price_usd', 'depreciation_ratio',
                'price_per_gb', 'is_flagship',
                'brand_Apple', 'brand_Samsung']  # OHE brands

X = df[feature_cols]
y = df['used_price_usd']

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    'Linear Regression': LinearRegression(),
    'Ridge': Ridge(alpha=1.0),
    'Lasso': Lasso(alpha=0.1),
    'Decision Tree': DecisionTreeRegressor(max_depth=8, random_state=42),
    'Random Forest': RandomForestRegressor(n_estimators=200, random_state=42),
    'Gradient Boosting': GradientBoostingRegressor(random_state=42),
    'XGBoost': xgb.XGBRegressor(
        n_estimators=500,
        learning_rate=0.05,
        max_depth=6,
        subsample=0.8,
        colsample_bytree=0.8,
        random_state=42
    ),
    # SVR is scale-sensitive: standardize features first
    'SVR': make_pipeline(StandardScaler(), SVR(kernel='rbf', C=100, gamma=0.1)),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    preds = model.predict(X_test)
    results[name] = {
        'R2': r2_score(y_test, preds),
        'MAE': mean_absolute_error(y_test, preds),
        'RMSE': np.sqrt(mean_squared_error(y_test, preds))
    }

results_df = pd.DataFrame(results).T.sort_values('R2', ascending=False)
print(results_df)
```
Results
| Model | R² Score | MAE (USD) | RMSE (USD) |
|---|---|---|---|
| XGBoost | 0.91 | $28.4 | $41.2 |
| Random Forest | 0.88 | $33.1 | $48.7 |
| Gradient Boosting | 0.87 | $35.6 | $51.3 |
| Decision Tree | 0.79 | $44.2 | $64.8 |
| SVR | 0.76 | $48.7 | $69.4 |
| Ridge | 0.68 | $57.3 | $80.1 |
| Linear Regression | 0.65 | $61.0 | $84.7 |
Linear regression underperformed because phone depreciation is non-linear — price drops steeply in year one and flattens after year two. Tree-based models capture this naturally through recursive splits on age_months.
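To see the effect in isolation, here is a small self-contained sketch (synthetic data, not the paper's dataset) comparing a straight-line fit to a depth-limited tree on an exponential depreciation curve:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import r2_score

# Synthetic depreciation: price decays exponentially with age plus noise
rng = np.random.default_rng(0)
age = rng.uniform(0, 48, 500).reshape(-1, 1)               # months
price = 800 * np.exp(-0.06 * age.ravel()) + rng.normal(0, 20, 500)

lin = LinearRegression().fit(age, price)
tree = DecisionTreeRegressor(max_depth=5, random_state=0).fit(age, price)

print(f"linear R2: {r2_score(price, lin.predict(age)):.3f}")
print(f"tree   R2: {r2_score(price, tree.predict(age)):.3f}")
# The tree's step-wise splits track the curve; a single line cannot.
```

On data like this the tree's piecewise-constant fit follows the steep early drop and the later plateau, while the linear model systematically over- and under-shoots at the extremes.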
SHAP Feature Importance
SHAP (SHapley Additive exPlanations) revealed which features actually drove predictions — not just model coefficients:
```python
import shap

explainer = shap.TreeExplainer(models['XGBoost'])
shap_values = explainer.shap_values(X_test)

# Top features by mean |SHAP value|
shap.summary_plot(shap_values, X_test, plot_type="bar")
# Results (approximate ranking):
# 1. launch_price_usd   -- 0.38 (most important)
# 2. ram_gb             -- 0.22
# 3. age_months         -- 0.19
# 4. depreciation_ratio -- 0.14
# 5. brand_Apple        -- 0.11
# 6. condition_score    -- 0.09
# ...

# Partial dependence: age vs price
shap.dependence_plot('age_months', shap_values, X_test)
# Reveals: steep SHAP drop from 0-12 months, plateau after 24 months
```
Key finding: Brand alone explained a large portion of price variance. Apple devices depreciate significantly more slowly: a two-year-old iPhone holds its value better than a similarly specced Android phone. The depreciation curve is the most interesting shape in the data: non-linear, brand-specific, and poorly captured by linear models.
The Depreciation Curve Visualization
*Depreciation curves by brand. Apple retains ~65% of launch price at 24 months; comparable Android devices ~45%. Both show non-linear decay that linear models can't capture.*
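The chart itself isn't reproduced here, but the curve shapes can be sketched with a simple exponential-decay model. The decay rates below are back-solved from the ~65% / ~45% retention figures, not fitted to the paper's data:

```python
import numpy as np

# Back-solve decay rate k from retention r at 24 months: r = exp(-k * 24)
def decay_rate(retention_at_24mo: float) -> float:
    return -np.log(retention_at_24mo) / 24

k_apple = decay_rate(0.65)    # ~0.018 per month
k_android = decay_rate(0.45)  # ~0.033 per month

months = np.arange(0, 37)
apple_curve = np.exp(-k_apple * months)      # fraction of launch price
android_curve = np.exp(-k_android * months)

print(f"Apple retention at 24mo:   {apple_curve[24]:.0%}")
print(f"Android retention at 24mo: {android_curve[24]:.0%}")
# Plotting these curves against months reproduces the non-linear shape:
# steep early drop, flattening after year two.
```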
What I'd Do Differently Today
- Live pricing data — scrape continuously and build an online learning model rather than a static snapshot. Used phone prices shift with new model releases.
- NLP features from listing text — "minor scratches" vs "perfect condition" carries strong price signal. A fine-tuned sentence-BERT encoder on listing descriptions would add meaningful features.
- Bayesian hyperparameter optimization — we used grid search; Optuna or Ray Tune would have found better XGBoost params faster.
- Stacked ensemble — XGBoost + Random Forest + Ridge stacked would likely push R² above 0.93.
- Seller behavior features — days listed, number of views, seller rating. These predict willingness-to-negotiate, not just fair market value.
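The stacking idea above can be sketched with scikit-learn's StackingRegressor. This is a toy illustration on synthetic data, with GradientBoostingRegressor standing in for XGBoost so the example needs only scikit-learn:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import (RandomForestRegressor,
                              GradientBoostingRegressor,
                              StackingRegressor)
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

X, y = make_regression(n_samples=1000, n_features=10, noise=10,
                       random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          random_state=42)

# Base learners' out-of-fold predictions feed a Ridge meta-model
stack = StackingRegressor(
    estimators=[
        ('rf', RandomForestRegressor(n_estimators=100, random_state=42)),
        ('gb', GradientBoostingRegressor(random_state=42)),
    ],
    final_estimator=Ridge(alpha=1.0),
)
stack.fit(X_tr, y_tr)
print(f"stacked R2: {r2_score(y_te, stack.predict(X_te)):.3f}")
```

The meta-model learns how much to trust each base learner, which is why a stacked ensemble usually matches or beats its best component.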
This was my first published research. The main lesson: three months of data collection and cleaning, one month of modeling. The 3:1 ratio wasn't a mistake — it was the right investment. Clean, well-featured data with a simple XGBoost beat dirty data with a sophisticated neural network every time we tested it.