Automated valuation models (AVMs) are machine learning models used to predict property values in industries including real estate, mortgage lending, and insurance. AVMs have become a key feature of many real estate applications because they support decisions about buying, selling, and financing properties.
As much as AVMs can help in the home buying and selling process, they can suffer from regressivity. Regressivity refers to a model's tendency to estimate values inaccurately for certain subsets of data. This may lead to systematic bias that disproportionately affects specific individuals, communities, and property types.
Even though AVMs can fall victim to regressivity, that doesn’t mean that their estimates are completely invalid. AVMs are constantly being tweaked and improved by technical teams to increase their accuracy. Through regular maintenance and oversight of the development and deployment of AVMs, we can ensure more reliable property valuations.
Understanding regressivity in AVMs
Regressivity in AVMs can occur due to various factors, such as limited training data, inadequate feature selection, or lack of regular model updates. A consistent pattern of overvaluation or undervaluation in AVM results can be a sign of regressivity.
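One way to look for a consistent pattern of overvaluation or undervaluation is to average the signed error within price tiers. The sketch below is only an illustration: the sample sales, the tier boundaries, and the function names are all hypothetical, and a real analysis would use far more data.

```python
from statistics import mean

def signed_pct_error(estimate, sale_price):
    """Positive means the AVM overvalued the property, negative means it undervalued it."""
    return (estimate - sale_price) / sale_price

def mean_error_by_tier(records, tier_edges):
    """Average the signed error of (estimate, sale_price) pairs within price tiers.

    tier_edges is an ascending list of upper bounds; prices above the last
    edge fall into a final open-ended tier.
    """
    buckets = {i: [] for i in range(len(tier_edges) + 1)}
    for estimate, sale_price in records:
        tier = sum(sale_price > edge for edge in tier_edges)
        buckets[tier].append(signed_pct_error(estimate, sale_price))
    return {tier: mean(errs) for tier, errs in buckets.items() if errs}

# Hypothetical (estimate, sale_price) pairs, for illustration only.
sales = [
    (110_000, 100_000), (132_000, 120_000),  # lower-priced homes
    (300_000, 300_000), (318_000, 320_000),  # mid-priced homes
    (720_000, 800_000), (810_000, 900_000),  # higher-priced homes
]
tier_errors = mean_error_by_tier(sales, tier_edges=[200_000, 500_000])
```

In this made-up sample, the lowest tier is overvalued by about 10% on average and the highest tier is undervalued by about 10%, which is the kind of tier-dependent pattern worth investigating.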
Regressivity can also arise from selective regression. This is a machine learning technique in which the model abstains from making a prediction when it lacks confidence in that prediction.
Selective regression is used to increase model performance, but research by MIT and the MIT-IBM Watson AI Lab has shown that it can also have the opposite effect. Training a model’s confidence measure, also known as its confidence score, on overrepresented groups of data or insufficient data can lead to inaccuracies for underrepresented groups of data.
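To make the idea concrete, here is a minimal sketch of selective regression as abstention below a confidence threshold, followed by a check of how often each group actually receives a prediction. The group labels, confidence scores, and the 0.8 threshold are assumptions made up for this example, not values from the research mentioned above.

```python
from collections import defaultdict

def selective_predict(estimate, confidence, threshold=0.8):
    """Return the estimate only when confidence clears the threshold, else abstain (None)."""
    return estimate if confidence >= threshold else None

def coverage_by_group(rows, threshold=0.8):
    """Share of properties in each group that receive a prediction at all."""
    answered = defaultdict(int)
    total = defaultdict(int)
    for group, estimate, confidence in rows:
        total[group] += 1
        if selective_predict(estimate, confidence, threshold) is not None:
            answered[group] += 1
    return {group: answered[group] / total[group] for group in total}

# Hypothetical (group, estimate, confidence) rows, for illustration only.
rows = [
    ("urban", 410_000, 0.93), ("urban", 385_000, 0.88),
    ("urban", 450_000, 0.91), ("urban", 300_000, 0.75),
    ("rural", 150_000, 0.62), ("rural", 175_000, 0.85),
]
coverage = coverage_by_group(rows)
```

A large gap in coverage between groups, such as 75% for one group and 50% for another in this toy data, suggests the confidence score deserves a closer look for the group with less coverage.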
There are a few ways to combat these instances and reduce regressivity in AVMs.
6 strategies to combat regressivity in AVMs
A few of the biggest factors that contribute to regressivity are the training data and features used by the model. Regularly updating the model, including its data and features, is necessary to prevent inaccurate results from an AVM.
It is also important to have human oversight with ongoing performance monitoring. Implementing these strategies as a part of the machine learning lifecycle can lead to fair and equitable valuations that benefit all stakeholders in real estate transactions.
1. Training data diversity
Incorporating diverse data in the subset used to train an AVM provides the model with a comprehensive understanding of the underlying patterns and relationships.
For example, diverse training data could include:
- Properties from various geographic locations in urban, suburban, and rural areas
- Diverse property types, including single-family homes, condominiums, townhouses, multi-family units, commercial properties, and vacant land
- Data collected over different time periods
- Different neighborhood and community attributes, such as proximity to schools, parks, and public transportation
Having diverse subsets of training data enables the model to generalize well and make accurate predictions on new data.
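A simple first check on diversity is to measure how much of the training set each segment makes up and flag segments that fall below a minimum share. This is a rough sketch: the `area_type` field, the row counts, and the 10% floor are hypothetical choices for illustration.

```python
from collections import Counter

def underrepresented(rows, key, min_share=0.10):
    """Return the segments whose share of the training rows is below min_share."""
    counts = Counter(row[key] for row in rows)
    total = sum(counts.values())
    return {value: n / total for value, n in counts.items() if n / total < min_share}

# Hypothetical training rows: 12 urban, 7 suburban, 1 rural.
training_rows = (
    [{"area_type": "urban"}] * 12
    + [{"area_type": "suburban"}] * 7
    + [{"area_type": "rural"}] * 1
)
flagged = underrepresented(training_rows, key="area_type")
```

Here only the rural segment is flagged, at 5% of the rows. The same function could be pointed at property type or sale year to cover the other dimensions listed above.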
2. Feature selection
Careful consideration should be given to the features and variables used in AVMs. The factors included should be relevant, objective indicators of property value, and variables that might introduce or perpetuate socioeconomic disparities should be avoided.
Selecting relevant features helps to simplify the model, so it focuses only on the most important information. Overly complex models may learn irrelevant information and cause overfitting. If a model is overfitted, it is difficult for it to generalize and understand patterns.
A simpler, more generalized model is less likely to exhibit regressivity because it avoids learning the false correlations present in irrelevant features.
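As a rough illustration of both ideas, the sketch below screens candidate features in two steps: it drops anything on an explicit exclusion list, then drops features with little linear relationship to price. The feature names, the exclusion list, and the 0.3 correlation cutoff are hypothetical, and a correlation filter is only one of many possible selection techniques.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

def screen_features(columns, target, excluded, min_abs_corr=0.3):
    """Keep features that are not excluded and that correlate with the target."""
    kept = []
    for name, values in columns.items():
        if name in excluded:
            continue  # excluded on policy grounds, regardless of predictive power
        if abs(pearson(values, target)) >= min_abs_corr:
            kept.append(name)
    return kept

# Hypothetical data, for illustration only (prices in thousands).
prices = [200, 250, 300, 350, 400]
columns = {
    "living_area_sqft": [1000, 1300, 1500, 1800, 2100],
    "listing_id_last_digit": [3, 3, 8, 3, 3],
    "neighborhood_median_income": [40, 50, 60, 70, 80],
}
kept = screen_features(columns, prices, excluded={"neighborhood_median_income"})
```

Note that the excluded variable correlates strongly with price in this toy data. The exclusion list exists precisely because a feature can be predictive and still be one the team decides not to use.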
3. Regular model updates
Real estate markets are dynamic and subject to constant changes. Performing frequent model updates allows the AVM to adapt to these changes by incorporating new data and retraining the model. When a model is more up to date, it can provide more accurate valuations that reflect the latest market conditions.
An up-to-date model also guards against model drift, which is the degradation that occurs when the underlying patterns in the data shift. Drift can cause regressivity because the model loses some of its ability to capture the relationships between property features and values. Frequently retraining the model on fresh data limits drift and helps the model remain relevant and accurate.
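One common way to quantify such a shift is the population stability index (PSI), which compares how a variable was distributed at training time with how it is distributed now. The sketch below assumes the data has already been binned into shares per price band; the bands, the shares, and the 0.1 alert level are illustrative assumptions, and teams pick their own cutoffs.

```python
from math import log

def psi(expected_shares, actual_shares, eps=1e-6):
    """Population stability index between two binned distributions.

    Each argument is a list of bin shares that sum to 1. Shares are floored
    at eps so that empty bins do not cause a division by zero or log(0).
    """
    total = 0.0
    for expected, actual in zip(expected_shares, actual_shares):
        expected, actual = max(expected, eps), max(actual, eps)
        total += (actual - expected) * log(actual / expected)
    return total

# Hypothetical share of sales in four price bands.
training_shares = [0.25, 0.25, 0.25, 0.25]
recent_shares = [0.10, 0.20, 0.30, 0.40]

drift = psi(training_shares, recent_shares)
retrain_suggested = drift > 0.1  # hypothetical alert level
```

A PSI of zero means the two distributions match. Larger values mean recent sales look less like the data the model was trained on, which is a signal to consider retraining.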
4. Ongoing performance monitoring
Continuous monitoring helps us identify issues early, so we can take corrective action as soon as possible. This requires performance metrics that quantify how well the model is performing, such as the error between estimated values and actual sale prices. Once we set a baseline for those metrics, we can automate checks that detect signs of model degradation.
If the model starts to show decreased accuracy or increased errors, those checks surface the problem promptly.
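An automated check of this kind could look like the sketch below, which computes the median absolute percentage error (MdAPE) for each period and reports the periods that exceed a baseline by more than a tolerance. The period labels, sample values, baseline, and tolerance are all hypothetical.

```python
from statistics import median

def mdape(pairs):
    """Median absolute percentage error over (estimate, sale_price) pairs."""
    return median(abs(est - price) / price for est, price in pairs)

def degraded_periods(pairs_by_period, baseline, tolerance=0.02):
    """Return the periods whose MdAPE exceeds baseline + tolerance."""
    return [
        period
        for period, pairs in pairs_by_period.items()
        if mdape(pairs) > baseline + tolerance
    ]

# Hypothetical (estimate, sale_price) pairs in thousands, grouped by month.
pairs_by_period = {
    "2024-01": [(102, 100), (97, 100), (205, 200)],
    "2024-02": [(110, 100), (92, 100), (230, 200)],
}
alerts = degraded_periods(pairs_by_period, baseline=0.03)
```

With these made-up numbers, the first month stays near the baseline while the second month's MdAPE rises to 10%, so only the second month is reported.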
5. Fairness assessments
Assessing the fairness of an AVM involves evaluating the model’s predictions across different demographics. By analyzing the model’s performance metrics for various subgroups, analysts can identify gaps in valuation accuracy and then adjust the model to achieve more objective results.
Creating fairness assessments involves defining and measuring fairness metrics specific to property valuation tasks. Metrics such as disparate impact, equalized odds, demographic parity, or predictive parity can quantify fairness by assessing whether the model’s predictions are consistent across different groups.
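As one possible adaptation of disparate impact to valuation, the sketch below treats an estimate within 10% of the sale price as the favorable outcome, computes the rate of that outcome per group, and takes the ratio of the lowest rate to the highest. The group names, the sample pairs, and the 10% tolerance are assumptions for illustration, and other definitions of the favorable outcome are equally plausible.

```python
def within_tolerance_rate(pairs, tolerance=0.10):
    """Share of (estimate, sale_price) pairs whose estimate is within tolerance of the price."""
    hits = sum(abs(est - price) / price <= tolerance for est, price in pairs)
    return hits / len(pairs)

def disparate_impact_ratio(pairs_by_group, tolerance=0.10):
    """Ratio of the lowest to the highest within-tolerance rate, plus the per-group rates."""
    rates = {
        group: within_tolerance_rate(pairs, tolerance)
        for group, pairs in pairs_by_group.items()
    }
    highest = max(rates.values())
    ratio = min(rates.values()) / highest if highest else 0.0
    return ratio, rates

# Hypothetical (estimate, sale_price) pairs in thousands.
pairs_by_group = {
    "group_a": [(100, 100), (105, 100), (95, 100), (108, 100)],
    "group_b": [(100, 100), (120, 100), (85, 100), (95, 100)],
}
ratio, rates = disparate_impact_ratio(pairs_by_group)
```

A ratio of 1.0 would mean every group is valued within tolerance equally often. In this toy data the ratio is 0.5, because one group gets an accurate estimate only half as often as the other.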
6. Human oversight
AVMs cannot improve without manual intervention from human experts. As part of the evaluation and monitoring steps in the machine learning lifecycle, human input is necessary to quality check data and identify inconsistencies.
One way to incorporate human oversight is by utilizing expert appraisal reviews. These can provide an additional layer of examination to correct any potential inaccuracies introduced by the AVM.
By leveraging human oversight in the development and improvement of AVMs, stakeholders can actively address potential issues that could lead to regressivity. They can then work to adjust the model and modify features to mitigate any errors.
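In practice, expert review can be targeted at the cases where it matters most. The sketch below queues a property for human review when the AVM estimate and an appraiser's value differ by more than a set gap. The property IDs, the values, and the 15% gap are hypothetical.

```python
def needs_review(avm_estimate, appraised_value, max_gap=0.15):
    """True when the AVM estimate differs from the appraisal by more than max_gap."""
    return abs(avm_estimate - appraised_value) / appraised_value > max_gap

# Hypothetical (property_id, avm_estimate, appraised_value) cases.
cases = [
    ("P-001", 300_000, 310_000),
    ("P-002", 250_000, 320_000),
    ("P-003", 500_000, 480_000),
]
review_queue = [
    property_id
    for property_id, avm_estimate, appraised_value in cases
    if needs_review(avm_estimate, appraised_value)
]
```

Only the second property lands in the queue here, since its estimate sits more than 20% below the appraisal. Findings from these reviews can then feed back into the data and feature adjustments described above.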
Battling regressivity at Xome
Regressivity is a challenge when developing AVMs, but it can be handled with specific processes in place. By implementing appropriate strategies that combat regressivity, stakeholders can work towards more equitable property valuations.
The technical teams at Xome® are constantly working to improve our valuation model to provide the most accurate results for our customers.
Want to try out the Xome Value®? Visit Xome.com and search our properties for auction and for sale.